* [PATCH v3 0/2] x86/fault: Further improve #PF oops messages @ 2018-12-21 21:36 Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Sean Christopherson @ 2018-12-21 21:36 UTC (permalink / raw) To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86 Cc: H. Peter Anvin, linux-kernel, Linus Torvalds, Rik van Riel, Yu-cheng Yu, Ingo Molnar Rework the messages printed for #PF oopses to display more detailed information about the fault in human readable form and to avoid conflicting messages and/or statements that may not always be accurate. v3: - Prepend a patch to reword the initial BUG message - Add sample output in the changelogs - Swap the order of the #PF lines. For most cases the three main lines show up in reverse fir-tree ordering and the cause of the fault is easy to pick out since it's the last thing highlighted by pr_alert (excepting when dumping the IDT, GDT, etc...). v2: - Explicitly call out protection keys violations - "Slightly" reword the changelog v1: https://lkml.kernel.org/r/20181207195223.23968-1-sean.j.christopherson@intel.com v2: https://lkml.kernel.org/r/20181207184423.1962-1-sean.j.christopherson@intel.com Sean Christopherson (2): x86/fault: Reword initial BUG message for unhandled page faults x86/fault: Decode and print #PF oops in human readable form arch/x86/mm/fault.c | 51 +++++++++++++++------------------------------ 1 file changed, 17 insertions(+), 34 deletions(-) -- 2.19.2 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults 2018-12-21 21:36 [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson @ 2018-12-21 21:36 ` Sean Christopherson 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson 2019-02-28 15:44 ` [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson 2 siblings, 1 reply; 10+ messages in thread From: Sean Christopherson @ 2018-12-21 21:36 UTC (permalink / raw) To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86 Cc: H. Peter Anvin, linux-kernel, Linus Torvalds, Rik van Riel, Yu-cheng Yu, Ingo Molnar Reword the NULL pointer dereference case to simply state that a NULL pointer was dereferenced, i.e. drop "unable to handle" as that implies that there are instances where the kernel actual does handle NULL pointer dereferences, which is not true barring funky exception fixup. For the non-NULL case, replace "kernel paging request" with "page fault" as the kernel can technically oops on faults that originated in user code. Dropping "kernel" also allows future patches to provide detailed information on where the fault occurred, e.g. user vs. kernel, without conflicting with the initial BUG message. In both cases, replace "at address=" with wording more appropriate to the oops, as "at" may be interpreted as stating that the address is the RIP of the instruction that faulted. Last, and probably least, further qualify the NULL-pointer path by checking that the fault actually originated in kernel code. It's technically possible for userspace to map address 0, and not printing a super specific message is the least of our worries if the kernel does manage to oops on an actual NULL pointer dereference from userspace. Before: BUG: unable to handle kernel NULL pointer dereference at ffffbeef00000000 BUG: unable to handle kernel paging request at ffffbeef00000000 After: BUG: kernel NULL pointer dereference, address = 0000000000000008 BUG: unable to handle page fault for address = ffffbeef00000000 Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: linux-kernel@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/mm/fault.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 2ff25ad33233..39dccdfef496 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -644,9 +644,12 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad from_kuid(&init_user_ns, current_uid())); } - pr_alert("BUG: unable to handle kernel %s at %px\n", - address < PAGE_SIZE ? "NULL pointer dereference" : "paging request", - (void *)address); + if (address < PAGE_SIZE && !user_mode(regs)) + pr_alert("BUG: kernel NULL pointer dereference, address = %px\n", + (void *)address); + else + pr_alert("BUG: unable to handle page fault for address = %px\n", + (void *)address); err_txt[0] = 0; -- 2.19.2 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/mm] x86/fault: Reword initial BUG message for unhandled page faults 2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson @ 2019-04-19 18:35 ` tip-bot for Sean Christopherson 0 siblings, 0 replies; 10+ messages in thread From: tip-bot for Sean Christopherson @ 2019-04-19 18:35 UTC (permalink / raw) To: linux-tip-commits Cc: dave.hansen, riel, torvalds, sean.j.christopherson, luto, linux-kernel, peterz, hpa, yu-cheng.yu, tglx, bp, mingo Commit-ID: f28b11a2abd9cf5535baa03e28fc63ad0e14f52a Gitweb: https://git.kernel.org/tip/f28b11a2abd9cf5535baa03e28fc63ad0e14f52a Author: Sean Christopherson <sean.j.christopherson@intel.com> AuthorDate: Fri, 21 Dec 2018 13:36:56 -0800 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Fri, 19 Apr 2019 19:31:15 +0200 x86/fault: Reword initial BUG message for unhandled page faults Reword the NULL pointer dereference case to simply state that a NULL pointer was dereferenced, i.e. drop "unable to handle" as that implies that there are instances where the kernel actual does handle NULL pointer dereferences, which is not true barring funky exception fixup. For the non-NULL case, replace "kernel paging request" with "page fault" as the kernel can technically oops on faults that originated in user code. Dropping "kernel" also allows future patches to provide detailed information on where the fault occurred, e.g. user vs. kernel, without conflicting with the initial BUG message. In both cases, replace "at address=" with wording more appropriate to the oops, as "at" may be interpreted as stating that the address is the RIP of the instruction that faulted. Last, and probably least, further qualify the NULL-pointer path by checking that the fault actually originated in kernel code. It's technically possible for userspace to map address 0, and not printing a super specific message is the least of our worries if the kernel does manage to oops on an actual NULL pointer dereference from userspace. Before: BUG: unable to handle kernel NULL pointer dereference at ffffbeef00000000 BUG: unable to handle kernel paging request at ffffbeef00000000 After: BUG: kernel NULL pointer dereference, address = 0000000000000008 BUG: unable to handle page fault for address = ffffbeef00000000 Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20181221213657.27628-2-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/fault.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 667f1da36208..df2c5cdef5c4 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -644,9 +644,12 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad from_kuid(&init_user_ns, current_uid())); } - pr_alert("BUG: unable to handle kernel %s at %px\n", - address < PAGE_SIZE ? "NULL pointer dereference" : "paging request", - (void *)address); + if (address < PAGE_SIZE && !user_mode(regs)) + pr_alert("BUG: kernel NULL pointer dereference, address = %px\n", + (void *)address); + else + pr_alert("BUG: unable to handle page fault for address = %px\n", + (void *)address); err_txt[0] = 0; ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form 2018-12-21 21:36 [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson @ 2018-12-21 21:36 ` Sean Christopherson 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson 2019-04-29 13:58 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Dmitry Vyukov 2019-02-28 15:44 ` [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson 2 siblings, 2 replies; 10+ messages in thread From: Sean Christopherson @ 2018-12-21 21:36 UTC (permalink / raw) To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86 Cc: H. Peter Anvin, linux-kernel, Linus Torvalds, Rik van Riel, Yu-cheng Yu, Ingo Molnar Linus pointed out that deciphering the raw #PF error code and printing a more human readable message are two different things, and also that printing the negative cases is mostly just noise[1]. For example, the USER bit doesn't mean the fault originated in user code and stating that an oops wasn't due to a protection keys violation isn't interesting since an oops on a keys violation is a one-in-a-million scenario. Remove the per-bit decoding of the error code and instead print: - the raw error code - why the fault occurred - the effective privilege level of the access - the type of access - whether the fault originated in user code or kernel code This provides the user with the information needed to triage 99.9% of oopses without polluting the log with useless information or conflating the error_code with the CPL. Sample output: BUG: kernel NULL pointer dereference, address = 0000000000000008 #PF: supervisor-privileged instruction fetch from kernel code #PF: error_code(0x0010) - not-present page BUG: unable to handle page fault for address = ffffbeef00000000 #PF: supervisor-privileged instruction fetch from kernel code #PF: error_code(0x0010) - not-present page BUG: unable to handle page fault for address = ffffc90000230000 #PF: supervisor-privileged write access from kernel code #PF: error_code(0x000b) - reserved bit violation [1] https://lkml.kernel.org/r/CAHk-=whk_fsnxVMvF1T2fFCaP2WrvSybABrLQCWLJyCvHw6NKA@mail.gmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: linux-kernel@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/mm/fault.c | 42 +++++++++++------------------------------- 1 file changed, 11 insertions(+), 31 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 39dccdfef496..a4421cbd230b 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -603,24 +603,9 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index) name, index, addr, (desc.limit0 | (desc.limit1 << 16))); } -/* - * This helper function transforms the #PF error_code bits into - * "[PROT] [USER]" type of descriptive, almost human-readable error strings: - */ -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt) -{ - if (error_code & mask) { - if (buf[0]) - strcat(buf, " "); - strcat(buf, txt); - } -} - static void show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address) { - char err_txt[64]; - if (!oops_may_print()) return; @@ -651,27 +636,22 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad pr_alert("BUG: unable to handle page fault for address = %px\n", (void *)address); - err_txt[0] = 0; - - /* - * Note: length of these appended strings including the separation space and the - * zero delimiter must fit into err_txt[]. - */ - err_str_append(error_code, err_txt, X86_PF_PROT, "[PROT]" ); - err_str_append(error_code, err_txt, X86_PF_WRITE, "[WRITE]"); - err_str_append(error_code, err_txt, X86_PF_USER, "[USER]" ); - err_str_append(error_code, err_txt, X86_PF_RSVD, "[RSVD]" ); - err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]"); - err_str_append(error_code, err_txt, X86_PF_PK, "[PK]" ); - - pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]"); + pr_alert("#PF: %s-privileged %s from %s code\n", + (error_code & X86_PF_USER) ? "user" : "supervisor", + (error_code & X86_PF_INSTR) ? "instruction fetch" : + (error_code & X86_PF_WRITE) ? "write access" : + "read access", + user_mode(regs) ? "user" : "kernel"); + pr_alert("#PF: error_code(0x%04lx) - %s\n", error_code, + !(error_code & X86_PF_PROT) ? "not-present page" : + (error_code & X86_PF_RSVD) ? "reserved bit violation" : + (error_code & X86_PF_PK) ? "protection keys violation" : + "permissions violation"); if (!(error_code & X86_PF_USER) && user_mode(regs)) { struct desc_ptr idt, gdt; u16 ldtr, tr; - pr_alert("This was a system access from user code\n"); - /* * This can happen for quite a few reasons. The more obvious * ones are faults accessing the GDT, or LDT. Perhaps -- 2.19.2 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/mm] x86/fault: Decode and print #PF oops in human readable form 2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson @ 2019-04-19 18:35 ` tip-bot for Sean Christopherson 2019-04-21 18:35 ` Borislav Petkov 2019-04-29 13:58 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Dmitry Vyukov 1 sibling, 1 reply; 10+ messages in thread From: tip-bot for Sean Christopherson @ 2019-04-19 18:35 UTC (permalink / raw) To: linux-tip-commits Cc: torvalds, hpa, bp, luto, sean.j.christopherson, peterz, tglx, mingo, linux-kernel, riel, dave.hansen, yu-cheng.yu Commit-ID: 18ea35c5ed9921867194a8efc2a0ac2d5a3c7d2a Gitweb: https://git.kernel.org/tip/18ea35c5ed9921867194a8efc2a0ac2d5a3c7d2a Author: Sean Christopherson <sean.j.christopherson@intel.com> AuthorDate: Fri, 21 Dec 2018 13:36:57 -0800 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Fri, 19 Apr 2019 19:31:16 +0200 x86/fault: Decode and print #PF oops in human readable form Linus pointed out that deciphering the raw #PF error code and printing a more human readable message are two different things, and also that printing the negative cases is mostly just noise[1]. For example, the USER bit doesn't mean the fault originated in user code and stating that an oops wasn't due to a protection keys violation isn't interesting since an oops on a keys violation is a one-in-a-million scenario. Remove the per-bit decoding of the error code and instead print: - the raw error code - why the fault occurred - the effective privilege level of the access - the type of access - whether the fault originated in user code or kernel code This provides the user with the information needed to triage 99.9% of oopses without polluting the log with useless information or conflating the error_code with the CPL. Sample output: BUG: kernel NULL pointer dereference, address = 0000000000000008 #PF: supervisor-privileged instruction fetch from kernel code #PF: error_code(0x0010) - not-present page BUG: unable to handle page fault for address = ffffbeef00000000 #PF: supervisor-privileged instruction fetch from kernel code #PF: error_code(0x0010) - not-present page BUG: unable to handle page fault for address = ffffc90000230000 #PF: supervisor-privileged write access from kernel code #PF: error_code(0x000b) - reserved bit violation [1] https://lkml.kernel.org/r/CAHk-=whk_fsnxVMvF1T2fFCaP2WrvSybABrLQCWLJyCvHw6NKA@mail.gmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/20181221213657.27628-3-sean.j.christopherson@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/fault.c | 42 +++++++++++------------------------------- 1 file changed, 11 insertions(+), 31 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index df2c5cdef5c4..74c9204c5751 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -603,24 +603,9 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index) name, index, addr, (desc.limit0 | (desc.limit1 << 16))); } -/* - * This helper function transforms the #PF error_code bits into - * "[PROT] [USER]" type of descriptive, almost human-readable error strings: - */ -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt) -{ - if (error_code & mask) { - if (buf[0]) - strcat(buf, " "); - strcat(buf, txt); - } -} - static void show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address) { - char err_txt[64]; - if (!oops_may_print()) return; @@ -651,27 +636,22 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad pr_alert("BUG: unable to handle page fault for address = %px\n", (void *)address); - err_txt[0] = 0; - - /* - * Note: length of these appended strings including the separation space and the - * zero delimiter must fit into err_txt[]. - */ - err_str_append(error_code, err_txt, X86_PF_PROT, "[PROT]" ); - err_str_append(error_code, err_txt, X86_PF_WRITE, "[WRITE]"); - err_str_append(error_code, err_txt, X86_PF_USER, "[USER]" ); - err_str_append(error_code, err_txt, X86_PF_RSVD, "[RSVD]" ); - err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]"); - err_str_append(error_code, err_txt, X86_PF_PK, "[PK]" ); - - pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]"); + pr_alert("#PF: %s-privileged %s from %s code\n", + (error_code & X86_PF_USER) ? "user" : "supervisor", + (error_code & X86_PF_INSTR) ? "instruction fetch" : + (error_code & X86_PF_WRITE) ? "write access" : + "read access", + user_mode(regs) ? "user" : "kernel"); + pr_alert("#PF: error_code(0x%04lx) - %s\n", error_code, + !(error_code & X86_PF_PROT) ? "not-present page" : + (error_code & X86_PF_RSVD) ? "reserved bit violation" : + (error_code & X86_PF_PK) ? "protection keys violation" : + "permissions violation"); if (!(error_code & X86_PF_USER) && user_mode(regs)) { struct desc_ptr idt, gdt; u16 ldtr, tr; - pr_alert("This was a system access from user code\n"); - /* * This can happen for quite a few reasons. The more obvious * ones are faults accessing the GDT, or LDT. Perhaps ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [tip:x86/mm] x86/fault: Decode and print #PF oops in human readable form 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson @ 2019-04-21 18:35 ` Borislav Petkov 2019-04-21 18:52 ` [tip:x86/mm] x86/fault: Make fault messages more succinct tip-bot for Borislav Petkov 0 siblings, 1 reply; 10+ messages in thread From: Borislav Petkov @ 2019-04-21 18:35 UTC (permalink / raw) To: hpa, torvalds, luto, sean.j.christopherson, peterz, tglx, mingo, linux-kernel, riel, dave.hansen, yu-cheng.yu Cc: linux-tip-commits On Fri, Apr 19, 2019 at 11:35:51AM -0700, tip-bot for Sean Christopherson wrote: > BUG: kernel NULL pointer dereference, address = 0000000000000008 > #PF: supervisor-privileged instruction fetch from kernel code > #PF: error_code(0x0010) - not-present page > > BUG: unable to handle page fault for address = ffffbeef00000000 > #PF: supervisor-privileged instruction fetch from kernel code > #PF: error_code(0x0010) - not-present page > > BUG: unable to handle page fault for address = ffffc90000230000 > #PF: supervisor-privileged write access from kernel code > #PF: error_code(0x000b) - reserved bit violation Writing those in human-readable form is nice. May I suggest making those messages more succinct, though - we'll be staring at them for years, after all. --- From: Borislav Petkov <bp@suse.de> Date: Sun, 21 Apr 2019 20:24:08 +0200 Subject: [PATCH] x86/fault: Make fault messages more succinct So we are going to be staring at those in the next years, let's make them more succinct. In particular: - change "address = " to "address: " - "-privileged" reads funny. It should be simply "kernel" or "user" - "from kernel code" reads funny too. "kernel mode" or "user mode" is more natural. An actual example says more than 1000 words, of course: [ 0.248370] BUG: kernel NULL pointer dereference, address: 00000000000005b8 [ 0.249120] #PF: supervisor write access in kernel mode [ 0.249717] #PF: error_code(0x0002) - not-present page Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/mm/fault.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 74c9204c5751..a0df19b0897d 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -630,13 +630,13 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad } if (address < PAGE_SIZE && !user_mode(regs)) - pr_alert("BUG: kernel NULL pointer dereference, address = %px\n", + pr_alert("BUG: kernel NULL pointer dereference, address: %px\n", (void *)address); else - pr_alert("BUG: unable to handle page fault for address = %px\n", + pr_alert("BUG: unable to handle page fault for address: %px\n", (void *)address); - pr_alert("#PF: %s-privileged %s from %s code\n", + pr_alert("#PF: %s %s in %s mode\n", (error_code & X86_PF_USER) ? "user" : "supervisor", (error_code & X86_PF_INSTR) ? "instruction fetch" : (error_code & X86_PF_WRITE) ? "write access" : -- 2.21.0 -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/mm] x86/fault: Make fault messages more succinct 2019-04-21 18:35 ` Borislav Petkov @ 2019-04-21 18:52 ` tip-bot for Borislav Petkov 0 siblings, 0 replies; 10+ messages in thread From: tip-bot for Borislav Petkov @ 2019-04-21 18:52 UTC (permalink / raw) To: linux-tip-commits; +Cc: torvalds, mingo, hpa, peterz, linux-kernel, tglx, bp Commit-ID: ea2f8d60603efbd1cb4e193a593945a2fe24d264 Gitweb: https://git.kernel.org/tip/ea2f8d60603efbd1cb4e193a593945a2fe24d264 Author: Borislav Petkov <bp@suse.de> AuthorDate: Sun, 21 Apr 2019 20:35:24 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Sun, 21 Apr 2019 20:48:51 +0200 x86/fault: Make fault messages more succinct So we are going to be staring at those in the next years, let's make them more succinct. In particular: - change "address = " to "address: " - "-privileged" reads funny. It should be simply "kernel" or "user" - "from kernel code" reads funny too. "kernel mode" or "user mode" is more natural. An actual example says more than 1000 words, of course: [ 0.248370] BUG: kernel NULL pointer dereference, address: 00000000000005b8 [ 0.249120] #PF: supervisor write access in kernel mode [ 0.249717] #PF: error_code(0x0002) - not-present page Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave.hansen@linux.intel.com Cc: luto@kernel.org Cc: riel@surriel.com Cc: sean.j.christopherson@intel.com Cc: yu-cheng.yu@intel.com Link: http://lkml.kernel.org/r/20190421183524.GC6048@zn.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/fault.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 74c9204c5751..a0df19b0897d 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -630,13 +630,13 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad } if (address < PAGE_SIZE && !user_mode(regs)) - pr_alert("BUG: kernel NULL pointer dereference, address = %px\n", + pr_alert("BUG: kernel NULL pointer dereference, address: %px\n", (void *)address); else - pr_alert("BUG: unable to handle page fault for address = %px\n", + pr_alert("BUG: unable to handle page fault for address: %px\n", (void *)address); - pr_alert("#PF: %s-privileged %s from %s code\n", + pr_alert("#PF: %s %s in %s mode\n", (error_code & X86_PF_USER) ? "user" : "supervisor", (error_code & X86_PF_INSTR) ? "instruction fetch" : (error_code & X86_PF_WRITE) ? "write access" : ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form 2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson @ 2019-04-29 13:58 ` Dmitry Vyukov 2019-04-29 14:14 ` Borislav Petkov 1 sibling, 1 reply; 10+ messages in thread From: Dmitry Vyukov @ 2019-04-29 13:58 UTC (permalink / raw) To: Sean Christopherson Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, the arch/x86 maintainers, H. Peter Anvin, LKML, Linus Torvalds, Rik van Riel, Yu-cheng Yu, Ingo Molnar, syzkaller On Fri, Dec 21, 2018 at 10:37 PM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > Linus pointed out that deciphering the raw #PF error code and printing > a more human readable message are two different things, and also that > printing the negative cases is mostly just noise[1]. For example, the > USER bit doesn't mean the fault originated in user code and stating > that an oops wasn't due to a protection keys violation isn't interesting > since an oops on a keys violation is a one-in-a-million scenario. > > Remove the per-bit decoding of the error code and instead print: > - the raw error code > - why the fault occurred > - the effective privilege level of the access > - the type of access > - whether the fault originated in user code or kernel code > > This provides the user with the information needed to triage 99.9% of > oopses without polluting the log with useless information or conflating > the error_code with the CPL. > > Sample output: > > BUG: kernel NULL pointer dereference, address = 0000000000000008 > #PF: supervisor-privileged instruction fetch from kernel code > #PF: error_code(0x0010) - not-present page > > BUG: unable to handle page fault for address = ffffbeef00000000 > #PF: supervisor-privileged instruction fetch from kernel code > #PF: error_code(0x0010) - not-present page > > BUG: unable to handle page fault for address = ffffc90000230000 > #PF: supervisor-privileged write access from kernel code > #PF: error_code(0x000b) - reserved bit violation As a note for future. I see 3 recent changes that shuffle crash format forth and back: ea2f8d60603efbd1cb4e193a593945a2fe24d264 x86/fault: Make fault messages more succinct f28b11a2abd9cf5535baa03e28fc63ad0e14f52a x86/fault: Reword initial BUG message for unhandled page faults 18ea35c5ed9921867194a8efc2a0ac2d5a3c7d2a x86/fault: Decode and print #PF oops in human readable form Ideally, such changes are coordinated with kernel testing for gradual rollout. Since kernel does not provide an official facility for crash parsing, the actual output effectively becomes part of public API. Such changes in format may lead to glues bugs, not duped bugs and missed bugs (including incorrect bisection, when something stopped being detected as a crash at all). Ideally-ideally, kernel provides some kind official output parsing facility (that can understand when kernel has bugged, when the crash starts/end and some kind of stable identity for a crash) and includes output parsing tests for all types of crashes because this types kernel sabotaging testing happens periodically. We already have samples for "address =" format, but not for "address:" yet. So I still can't update parsing for the final format and we have massively glued crash reports. > [1] https://lkml.kernel.org/r/CAHk-=whk_fsnxVMvF1T2fFCaP2WrvSybABrLQCWLJyCvHw6NKA@mail.gmail.com > > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Andy Lutomirski <luto@kernel.org> > Cc: Borislav Petkov <bp@alien8.de> > Cc: Dave Hansen <dave.hansen@linux.intel.com> > Cc: H. Peter Anvin <hpa@zytor.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Rik van Riel <riel@surriel.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> > Cc: linux-kernel@vger.kernel.org > Cc: Ingo Molnar <mingo@kernel.org> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > --- > arch/x86/mm/fault.c | 42 +++++++++++------------------------------- > 1 file changed, 11 insertions(+), 31 deletions(-) > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index 39dccdfef496..a4421cbd230b 100644 > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -603,24 +603,9 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index) > name, index, addr, (desc.limit0 | (desc.limit1 << 16))); > } > > -/* > - * This helper function transforms the #PF error_code bits into > - * "[PROT] [USER]" type of descriptive, almost human-readable error strings: > - */ > -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt) > -{ > - if (error_code & mask) { > - if (buf[0]) > - strcat(buf, " "); > - strcat(buf, txt); > - } > -} > - > static void > show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address) > { > - char err_txt[64]; > - > if (!oops_may_print()) > return; > > @@ -651,27 +636,22 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad > pr_alert("BUG: unable to handle page fault for address = %px\n", > (void *)address); > > - err_txt[0] = 0; > - > - /* > - * Note: length of these appended strings including the separation space and the > - * zero delimiter must fit into err_txt[]. > - */ > - err_str_append(error_code, err_txt, X86_PF_PROT, "[PROT]" ); > - err_str_append(error_code, err_txt, X86_PF_WRITE, "[WRITE]"); > - err_str_append(error_code, err_txt, X86_PF_USER, "[USER]" ); > - err_str_append(error_code, err_txt, X86_PF_RSVD, "[RSVD]" ); > - err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]"); > - err_str_append(error_code, err_txt, X86_PF_PK, "[PK]" ); > - > - pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]"); > + pr_alert("#PF: %s-privileged %s from %s code\n", > + (error_code & X86_PF_USER) ? "user" : "supervisor", > + (error_code & X86_PF_INSTR) ? "instruction fetch" : > + (error_code & X86_PF_WRITE) ? "write access" : > + "read access", > + user_mode(regs) ? "user" : "kernel"); > + pr_alert("#PF: error_code(0x%04lx) - %s\n", error_code, > + !(error_code & X86_PF_PROT) ? "not-present page" : > + (error_code & X86_PF_RSVD) ? "reserved bit violation" : > + (error_code & X86_PF_PK) ? "protection keys violation" : > + "permissions violation"); > > if (!(error_code & X86_PF_USER) && user_mode(regs)) { > struct desc_ptr idt, gdt; > u16 ldtr, tr; > > - pr_alert("This was a system access from user code\n"); > - > /* > * This can happen for quite a few reasons. The more obvious > * ones are faults accessing the GDT, or LDT. Perhaps > -- > 2.19.2 > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form 2019-04-29 13:58 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Dmitry Vyukov @ 2019-04-29 14:14 ` Borislav Petkov 0 siblings, 0 replies; 10+ messages in thread From: Borislav Petkov @ 2019-04-29 14:14 UTC (permalink / raw) To: Dmitry Vyukov Cc: Sean Christopherson, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers, H. Peter Anvin, LKML, Linus Torvalds, Rik van Riel, Yu-cheng Yu, Ingo Molnar, syzkaller On Mon, Apr 29, 2019 at 03:58:20PM +0200, Dmitry Vyukov wrote: > Ideally, such changes are coordinated with kernel testing for gradual > rollout. Since kernel does not provide an official facility for crash > parsing, the actual output effectively becomes part of public API. Well, printk message formats are not an API. I agree, though, that if this is how kernel testing is going to get told about crashes, then we'd need some sort of parsing specification which gets changed together with when the actual messages are changed, so that tools don't break. Or something slicker and more robust than tools parsing dmesg output... -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/2] x86/fault: Further improve #PF oops messages 2018-12-21 21:36 [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson @ 2019-02-28 15:44 ` Sean Christopherson 2 siblings, 0 replies; 10+ messages in thread From: Sean Christopherson @ 2019-02-28 15:44 UTC (permalink / raw) To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86 Cc: H. Peter Anvin, linux-kernel, Linus Torvalds, Rik van Riel, Yu, Yu-cheng, Ingo Molnar High latency ping. These still apply as-is. On Fri, Dec 21, 2018 at 01:36:55PM -0800, Sean Christopherson wrote: > Rework the messages printed for #PF oopses to display more detailed > information about the fault in human readable form and to avoid > conflicting messages and/or statements that may not always be accurate. > > v3: > - Prepend a patch to reword the initial BUG message > - Add sample output in the changelogs > - Swap the order of the #PF lines. For most cases the three main lines > show up in reverse fir-tree ordering and the cause of the fault is > easy to pick out since it's the last thing highlighted by pr_alert > (excepting when dumping the IDT, GDT, etc...). > > v2: > - Explicitly call out protection keys violations > - "Slightly" reword the changelog > > v1: https://lkml.kernel.org/r/20181207195223.23968-1-sean.j.christopherson@intel.com > v2: https://lkml.kernel.org/r/20181207184423.1962-1-sean.j.christopherson@intel.com > > Sean Christopherson (2): > x86/fault: Reword initial BUG message for unhandled page faults > x86/fault: Decode and print #PF oops in human readable form > > arch/x86/mm/fault.c | 51 +++++++++++++++------------------------------ > 1 file changed, 17 insertions(+), 34 deletions(-) > > -- > 2.19.2 > ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2019-04-29 14:14 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-12-21 21:36 [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson 2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson 2019-04-19 18:35 ` [tip:x86/mm] " tip-bot for Sean Christopherson 2019-04-21 18:35 ` Borislav Petkov 2019-04-21 18:52 ` [tip:x86/mm] x86/fault: Make fault messages more succinct tip-bot for Borislav Petkov 2019-04-29 13:58 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Dmitry Vyukov 2019-04-29 14:14 ` Borislav Petkov 2019-02-28 15:44 ` [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.