From: Andy Lutomirski <luto@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>,
	Benjamin Gilbert <benjamin.gilbert@coreos.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	X86 ML <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, stable <stable@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Garnier <thgarnie@google.com>,
	Alexander Kuleshov <kuleshovmail@gmail.com>
Subject: Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs
Date: Thu, 4 Jan 2018 08:17:06 -0800
Message-ID: <CALCETrVg=XQh+9VczkoC-0oLnBHGD=5hswTmyWQUR8_TTpnDsQ@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.20.1801041320360.1771@nanos>

On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
>> <benjamin.gilbert@coreos.com> wrote:
>> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> >> to do nothing, and the read /sys/kernel/debug/page_tables/current (or
>> >> current_kernel, or whatever it's called).  The problem may be obvious.
>> >
>> > current_kernel attached.  I have not seen any crashes with
>> > free_ldt_pgtables() stubbed out.
>>
>> I haven't reproduced it, but I think I see what's wrong.  KASLR sets
>> vaddr_end to a totally bogus value.  It should be no larger than
>> LDT_BASE_ADDR.  I suspect that your vmemmap is getting randomized into
>> the LDT range.  If it weren't for that, it could just as easily land
>> in the cpu_entry_area range.  This will need fixing in all versions
>> that aren't still called KAISER.
>>
>> Our memory map code is utter shite.  This kind of bug should not be
>> possible without a giant warning at boot that something is screwed up.
>
> You're right it's utter shite and the KASLR folks who added this insanity
> of making vaddr_end depend on a gazillion of config options and not
> documenting it in mm.txt or elsewhere where it's obvious to find should
> really sit back and think hard about their half baken 'security' features.
>
> Just look at the insanity of comment above the vaddr_end ifdef maze.
>
> Benjamin, can you test the patch below please?
>
> Thanks,
>
>         tglx
>
> 8<--------------
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -12,8 +12,9 @@ ffffea0000000000 - ffffeaffffffffff (=40
>  ... unused hole ...
>  ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
>  ... unused hole ...
> -fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> +                                   vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
>  ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> @@ -37,7 +38,9 @@ ffd4000000000000 - ffd5ffffffffffff (=49
>  ... unused hole ...
>  ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
>  ... unused hole ...
> -fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
> +                                   vaddr_end for KASLR
> +fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
> +... unused hole ...
>  ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
>  # define VMALLOC_SIZE_TB       _AC(32, UL)
>  # define __VMALLOC_BASE                _AC(0xffffc90000000000, UL)
>  # define __VMEMMAP_BASE                _AC(0xffffea0000000000, UL)
> -# define LDT_PGD_ENTRY         _AC(-4, UL)
> +# define LDT_PGD_ENTRY         _AC(-3, UL)
>  # define LDT_BASE_ADDR         (LDT_PGD_ENTRY << PGDIR_SHIFT)
>  #endif

If you actually change the memory map order, you need to change the
shadow copy in mm/dump_pagetables.c, too.  I have a draft patch to
just sort the damn list, but that's not ready yet.
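
For anyone checking the quoted patch by hand, the effect of the
LDT_PGD_ENTRY change can be worked out with plain arithmetic. The
sketch below is standalone userspace C, not kernel code. It assumes
4-level paging (PGDIR_SHIFT = 39), and the names pgd_slot_base(),
LDT_BASE_OLD and LDT_BASE_NEW are made up for illustration. It only
shows that entry -4 corresponds to fffffe0000000000 and entry -3 to
fffffe8000000000, which matches the mm.txt hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4-level paging, so one PGD entry covers 2^39 bytes. */
#define PGDIR_SHIFT 39

/*
 * Hypothetical helper: base address of a PGD slot counted back from
 * the top of the 64-bit address space (-1 is the last slot, -2 the
 * one before it, and so on). The shift is done on an unsigned value
 * so the result wraps modulo 2^64 instead of being undefined.
 */
#define PGD_SLOT_BASE(entry) ((uint64_t)(int64_t)(entry) << PGDIR_SHIFT)

/* Illustrative names only; the kernel just has LDT_BASE_ADDR. */
#define LDT_BASE_OLD PGD_SLOT_BASE(-4) /* LDT_PGD_ENTRY before the patch */
#define LDT_BASE_NEW PGD_SLOT_BASE(-3) /* LDT_PGD_ENTRY after the patch  */

/* Same computation as a function, for use at run time. */
static inline uint64_t pgd_slot_base(int64_t entry)
{
	return (uint64_t)entry << PGDIR_SHIFT;
}

/* Compile-time check against the addresses listed in the mm.txt hunk. */
static_assert(LDT_BASE_OLD == UINT64_C(0xfffffe0000000000), "old LDT base");
static_assert(LDT_BASE_NEW == UINT64_C(0xfffffe8000000000), "new LDT base");
```

The two bases differ by exactly one PGD slot (2^39 bytes), so in the
4-level layout the patch moves the LDT remap up one slot and places
cpu_entry_area in the slot the LDT used to occupy.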


