linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Rik van Riel <riel@surriel.com>,
	Yu-cheng Yu <yu-cheng.yu@intel.com>,
	Ingo Molnar <mingo@kernel.org>,
	syzkaller <syzkaller@googlegroups.com>
Subject: Re: [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form
Date: Mon, 29 Apr 2019 15:58:20 +0200	[thread overview]
Message-ID: <CACT4Y+Z96anrG1da1-8VXA8BLMd-RA16ysicA=X-CUoSuGAw0Q@mail.gmail.com> (raw)
In-Reply-To: <20181221213657.27628-3-sean.j.christopherson@intel.com>

On Fri, Dec 21, 2018 at 10:37 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Linus pointed out that deciphering the raw #PF error code and printing
> a more human readable message are two different things, and also that
> printing the negative cases is mostly just noise[1].  For example, the
> USER bit doesn't mean the fault originated in user code and stating
> that an oops wasn't due to a protection keys violation isn't interesting
> since an oops on a keys violation is a one-in-a-million scenario.
>
> Remove the per-bit decoding of the error code and instead print:
>   - the raw error code
>   - why the fault occurred
>   - the effective privilege level of the access
>   - the type of access
>   - whether the fault originated in user code or kernel code
>
> This provides the user with the information needed to triage 99.9% of
> oopses without polluting the log with useless information or conflating
> the error_code with the CPL.
>
> Sample output:
>
>     BUG: kernel NULL pointer dereference, address = 0000000000000008
>     #PF: supervisor-privileged instruction fetch from kernel code
>     #PF: error_code(0x0010) - not-present page
>
>     BUG: unable to handle page fault for address = ffffbeef00000000
>     #PF: supervisor-privileged instruction fetch from kernel code
>     #PF: error_code(0x0010) - not-present page
>
>     BUG: unable to handle page fault for address = ffffc90000230000
>     #PF: supervisor-privileged write access from kernel code
>     #PF: error_code(0x000b) - reserved bit violation



As a note for future.
I see 3 recent changes that shuffle crash format forth and back:

ea2f8d60603efbd1cb4e193a593945a2fe24d264 x86/fault: Make fault
messages more succinct
f28b11a2abd9cf5535baa03e28fc63ad0e14f52a x86/fault: Reword initial BUG
message for unhandled page faults
18ea35c5ed9921867194a8efc2a0ac2d5a3c7d2a x86/fault: Decode and print
#PF oops in human readable form

Ideally, such changes are coordinated with kernel testing for gradual
rollout. Since kernel does not provide an official facility for crash
parsing, the actual output effectively becomes part of public API.
Such changes in format may lead to glues bugs, not duped bugs and
missed bugs (including incorrect bisection, when something stopped
being detected as a crash at all).
Ideally-ideally, kernel provides some kind official output parsing
facility (that can understand when kernel has bugged, when the crash
starts/end and some kind of stable identity for a crash) and includes
output parsing tests for all types of crashes because this types
kernel sabotaging testing happens periodically.

We already have samples for "address =" format, but not for "address:"
yet. So I still can't update parsing for the final format and we have
massively glued crash reports.






> [1] https://lkml.kernel.org/r/CAHk-=whk_fsnxVMvF1T2fFCaP2WrvSybABrLQCWLJyCvHw6NKA@mail.gmail.com
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/mm/fault.c | 42 +++++++++++-------------------------------
>  1 file changed, 11 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 39dccdfef496..a4421cbd230b 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -603,24 +603,9 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index)
>                  name, index, addr, (desc.limit0 | (desc.limit1 << 16)));
>  }
>
> -/*
> - * This helper function transforms the #PF error_code bits into
> - * "[PROT] [USER]" type of descriptive, almost human-readable error strings:
> - */
> -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt)
> -{
> -       if (error_code & mask) {
> -               if (buf[0])
> -                       strcat(buf, " ");
> -               strcat(buf, txt);
> -       }
> -}
> -
>  static void
>  show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address)
>  {
> -       char err_txt[64];
> -
>         if (!oops_may_print())
>                 return;
>
> @@ -651,27 +636,22 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
>                 pr_alert("BUG: unable to handle page fault for address = %px\n",
>                         (void *)address);
>
> -       err_txt[0] = 0;
> -
> -       /*
> -        * Note: length of these appended strings including the separation space and the
> -        * zero delimiter must fit into err_txt[].
> -        */
> -       err_str_append(error_code, err_txt, X86_PF_PROT,  "[PROT]" );
> -       err_str_append(error_code, err_txt, X86_PF_WRITE, "[WRITE]");
> -       err_str_append(error_code, err_txt, X86_PF_USER,  "[USER]" );
> -       err_str_append(error_code, err_txt, X86_PF_RSVD,  "[RSVD]" );
> -       err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]");
> -       err_str_append(error_code, err_txt, X86_PF_PK,    "[PK]"   );
> -
> -       pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]");
> +       pr_alert("#PF: %s-privileged %s from %s code\n",
> +                (error_code & X86_PF_USER)  ? "user" : "supervisor",
> +                (error_code & X86_PF_INSTR) ? "instruction fetch" :
> +                (error_code & X86_PF_WRITE) ? "write access" :
> +                                              "read access",
> +                            user_mode(regs) ? "user" : "kernel");
> +       pr_alert("#PF: error_code(0x%04lx) - %s\n", error_code,
> +                !(error_code & X86_PF_PROT) ? "not-present page" :
> +                (error_code & X86_PF_RSVD)  ? "reserved bit violation" :
> +                (error_code & X86_PF_PK)    ? "protection keys violation" :
> +                                              "permissions violation");
>
>         if (!(error_code & X86_PF_USER) && user_mode(regs)) {
>                 struct desc_ptr idt, gdt;
>                 u16 ldtr, tr;
>
> -               pr_alert("This was a system access from user code\n");
> -
>                 /*
>                  * This can happen for quite a few reasons.  The more obvious
>                  * ones are faults accessing the GDT, or LDT.  Perhaps
> --
> 2.19.2
>

  parent reply	other threads:[~2019-04-29 13:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-21 21:36 [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson
2018-12-21 21:36 ` [PATCH v3 1/2] x86/fault: Reword initial BUG message for unhandled page faults Sean Christopherson
2019-04-19 18:35   ` [tip:x86/mm] " tip-bot for Sean Christopherson
2018-12-21 21:36 ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Sean Christopherson
2019-04-19 18:35   ` [tip:x86/mm] " tip-bot for Sean Christopherson
2019-04-21 18:35     ` Borislav Petkov
2019-04-21 18:52       ` [tip:x86/mm] x86/fault: Make fault messages more succinct tip-bot for Borislav Petkov
2019-04-29 13:58   ` Dmitry Vyukov [this message]
2019-04-29 14:14     ` [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form Borislav Petkov
2019-02-28 15:44 ` [PATCH v3 0/2] x86/fault: Further improve #PF oops messages Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACT4Y+Z96anrG1da1-8VXA8BLMd-RA16ysicA=X-CUoSuGAw0Q@mail.gmail.com' \
    --to=dvyukov@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=syzkaller@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    --cc=yu-cheng.yu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).