Re: [PATCH 2/3] x86/traps: Print non-canonical address on #GP

From: Andy Lutomirski <luto@kernel.org>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jann Horn <jannh@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>, X86 ML <x86@kernel.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Alexander Potapenko <glider@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/3] x86/traps: Print non-canonical address on #GP
Date: Thu, 14 Nov 2019 10:00:35 -0800
Message-ID: <CALCETrVmaN4BgvUdsuTJ8vdkaN1JrAfBzs+W7aS2cxxDYkqn_Q@mail.gmail.com>
In-Reply-To: <20191114174630.GF24045@linux.intel.com>

On Thu, Nov 14, 2019 at 9:46 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Nov 12, 2019 at 10:10:01PM +0100, Jann Horn wrote:
> > A frequent cause of #GP exceptions is a memory access to a non-canonical
> > addresses. Unlike #PF, #GP doesn't come with a fault address in CR2, so
> > the kernel doesn't currently print the fault address for #GP.
> > Luckily, we already have the necessary infrastructure for decoding x86
> > instructions and computing the memory address that is being accessed;
> > hook it up to the #GP handler so that we can figure out whether the #GP
> > looks like it was caused by a non-canonical address, and if so, print
> > that address.
> >
> > While it is already possible to compute the faulting address manually by
> > disassembling the opcode dump and evaluating the instruction against the
> > register dump, this should make it slightly easier to identify crashes
> > at a glance.
> >
> > Signed-off-by: Jann Horn <jannh@google.com>
> > ---
> >  arch/x86/kernel/traps.c | 45 +++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 43 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index c90312146da0..479cfc6e9507 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -56,6 +56,8 @@
> >  #include <asm/mpx.h>
> >  #include <asm/vm86.h>
> >  #include <asm/umip.h>
> > +#include <asm/insn.h>
> > +#include <asm/insn-eval.h>
> >
> >  #ifdef CONFIG_X86_64
> >  #include <asm/x86_init.h>
> > @@ -509,6 +511,42 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
> >       do_trap(X86_TRAP_BR, SIGSEGV, "bounds", regs, error_code, 0, NULL);
> >  }
> >
> > +/*
> > + * On 64-bit, if an uncaught #GP occurs while dereferencing a non-canonical
> > + * address, print that address.
> > + */
> > +static void print_kernel_gp_address(struct pt_regs *regs)
> > +{
> > +#ifdef CONFIG_X86_64
> > +     u8 insn_bytes[MAX_INSN_SIZE];
> > +     struct insn insn;
> > +     unsigned long addr_ref;
> > +
> > +     if (probe_kernel_read(insn_bytes, (void *)regs->ip, MAX_INSN_SIZE))
> > +             return;
> > +
> > +     kernel_insn_init(&insn, insn_bytes, MAX_INSN_SIZE);
> > +     insn_get_modrm(&insn);
> > +     insn_get_sib(&insn);
> > +     addr_ref = (unsigned long)insn_get_addr_ref(&insn, regs);
> > +
> > +     /*
> > +      * If insn_get_addr_ref() failed or we got a canonical address in the
> > +      * kernel half, bail out.
> > +      */
> > +     if ((addr_ref | __VIRTUAL_MASK) == ~0UL)
> > +             return;
> > +     /*
> > +      * For the user half, check against TASK_SIZE_MAX; this way, if the
> > +      * access crosses the canonical address boundary, we don't miss it.
> > +      */
> > +     if (addr_ref <= TASK_SIZE_MAX)
>
> Any objection to open coding the upper bound instead of using
> TASK_SIZE_MAX to make the threshold more obvious?
>
> > +             return;
> > +
> > +     pr_alert("dereferencing non-canonical address 0x%016lx\n", addr_ref);
>
> Printing the raw address will confuse users in the case where the access
> straddles the lower canonical boundary.  Maybe combine this with open
> coding the straddle case?  With a rough heuristic to hedge a bit for
> instructions whose operand size isn't accurately reflected in opnd_bytes.
>
>         if (addr_ref > __VIRTUAL_MASK)
>                 pr_alert("dereferencing non-canonical address 0x%016lx\n", addr_ref);
>         else if ((addr_ref + insn.opnd_bytes - 1) > __VIRTUAL_MASK)
>                 pr_alert("straddling non-canonical boundary 0x%016lx - 0x%016lx\n",
>                          addr_ref, addr_ref + insn.opnd_bytes - 1);
>         else if ((addr_ref + PAGE_SIZE - 1) > __VIRTUAL_MASK)
>                 pr_alert("potentially straddling non-canonical boundary 0x%016lx - 0x%016lx\n",
>                          addr_ref, addr_ref + PAGE_SIZE - 1);

This is unnecessarily complicated, and I suspect that Jann had the
right idea but just didn't quite explain it enough.  The secret here
is that TASK_SIZE_MAX is a full page below the canonical boundary
(thanks, Intel, for screwing up SYSRET), so, if we get #GP for an
address above TASK_SIZE_MAX, then it's either a #GP for a different
reason or it's a genuine non-canonical access.

So I think that just a comment about this would be enough.
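
To spell out the arithmetic, here is a rough userspace sketch of the two
checks from the patch. It assumes 4-level paging (47-bit user addresses)
and 4 KiB pages; the SKETCH_* constants and both helper names are
stand-ins made up for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: 4-level paging, 4 KiB pages. */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_VIRTUAL_MASK  ((1ULL << 47) - 1)	/* last canonical user address */
#define SKETCH_TASK_SIZE_MAX ((1ULL << 47) - SKETCH_PAGE_SIZE)

/* Hypothetical helper: true if addr lies in the non-canonical hole. */
static bool sketch_is_noncanonical(uint64_t addr)
{
	/* Canonical means bits 63..47 are all zeros or all ones. */
	uint64_t upper = addr >> 47;

	return upper != 0 && upper != 0x1ffff;
}

/* Hypothetical helper mirroring the two checks in the quoted patch. */
static bool sketch_worth_reporting(uint64_t addr_ref)
{
	/* Decode failure (all ones) or canonical kernel-half address. */
	if ((addr_ref | SKETCH_VIRTUAL_MASK) == UINT64_MAX)
		return false;

	/*
	 * TASK_SIZE_MAX sits a full page below the canonical boundary, so
	 * an access starting at or below it would have to be longer than
	 * a page to reach the hole.
	 */
	if (addr_ref <= SKETCH_TASK_SIZE_MAX)
		return false;

	return true;
}
```

Under these assumptions, an address one byte above SKETCH_TASK_SIZE_MAX is
still canonical, but sketch_worth_reporting() flags it anyway, which is
how the straddling case gets covered without open coding it.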

*However*, the printout should at least hedge a bit and say something
like "probably dereferencing non-canonical address", since there are
plenty of ways to get #GP with an operand that is nominally
non-canonical but where the actual cause of #GP is different.  And I
think this code should be skipped entirely if error_code != 0.
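
Something like the following is what I have in mind, written as a
standalone userspace sketch rather than as a patch. The function name,
the buffer-based interface and the SKETCH_* constants (4-level paging,
4 KiB pages) are all assumptions for illustration; the real handler
would call pr_alert() directly.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumptions for illustration only: 4-level paging, 4 KiB pages. */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_VIRTUAL_MASK  ((1ULL << 47) - 1)
#define SKETCH_TASK_SIZE_MAX ((1ULL << 47) - SKETCH_PAGE_SIZE)

/*
 * Hypothetical helper: formats the hint into buf and returns true if a
 * hint should be printed, false otherwise.
 */
static bool sketch_gp_hint(unsigned long error_code, uint64_t addr_ref,
			   char *buf, size_t len)
{
	/* A non-zero error code means the #GP has some other cause. */
	if (error_code != 0)
		return false;

	/* Decode failure (all ones) or canonical kernel-half address. */
	if ((addr_ref | SKETCH_VIRTUAL_MASK) == UINT64_MAX)
		return false;

	/* At or below TASK_SIZE_MAX: too far from the boundary to matter. */
	if (addr_ref <= SKETCH_TASK_SIZE_MAX)
		return false;

	/* Hedged wording: the operand may not be the real cause of the #GP. */
	snprintf(buf, len,
		 "probably dereferencing non-canonical address 0x%016" PRIx64,
		 addr_ref);
	return true;
}
```

The error_code test goes first so that the instruction decoding can be
skipped entirely in that case.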

--Andy