From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: linux-arch@vger.kernel.org, Borislav Petkov <bp@alien8.de>,
	Nadav Amit <nadav.amit@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Brian Gerst <brgerst@gmail.com>,
	"kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>, Jann Horn <jann@thejh.net>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Andy Lutomirski <luto@kernel.org>
Subject: [PATCH v4 00/16] Virtually mapped stacks with guard pages (x86, core)
Date: Thu, 23 Jun 2016 21:22:55 -0700
Message-ID: <cover.1466741835.git.luto@kernel.org>

Since the dawn of time, a kernel stack overflow has been a real PITA
to debug, has caused nondeterministic crashes some time after the
actual overflow, and has generally been easy to exploit for root.

With this series, arches can enable HAVE_ARCH_VMAP_STACK.  Arches
that enable it (just x86 for now) get virtually mapped stacks with
guard pages.  This causes reliable faults when the stack overflows.
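
The guard-page effect can be approximated in userspace.  The sketch
below is not code from this series and does not use the kernel's
vmalloc path; it is a minimal analogy, assuming a POSIX system with
mmap() and mprotect().  The function name overflow_hits_guard() and
the layout (one PROT_NONE page directly below the "stack") are
assumptions made for illustration.  It shows that a write one byte
past the low end of the region faults immediately instead of silently
corrupting adjacent memory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;

/* Fault handler: jump back to the probe instead of killing the process. */
static void on_fault(int sig)
{
	(void)sig;
	siglongjmp(recover_point, 1);
}

/*
 * Map a hypothetical "stack" of stack_pages pages with one inaccessible
 * guard page directly below it, then write one byte below the stack.
 * Returns 1 if that write faulted, 0 if it went through, -1 on setup error.
 */
int overflow_hits_guard(size_t stack_pages)
{
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old_segv, old_bus;
	volatile char *stack_low;
	size_t len;
	char *base;
	int faulted;

	assert(page > 0);
	len = (stack_pages + 1) * (size_t)page;

	base = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* The lowest page becomes the guard page. */
	if (mprotect(base, (size_t)page, PROT_NONE) != 0) {
		munmap(base, len);
		return -1;
	}
	stack_low = base + page;

	/* Some systems report this fault as SIGBUS, so catch both. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);

	if (sigsetjmp(recover_point, 1) == 0) {
		stack_low[0] = 1;	/* lowest valid byte: fine */
		stack_low[-1] = 1;	/* one byte too far: guard page */
		faulted = 0;
	} else {
		faulted = 1;
	}

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	munmap(base, len);
	return faulted;
}
```

Calling overflow_hits_guard(4) from a test program is expected to
return 1 on a typical Linux system.  Without the mprotect() call the
same write would succeed quietly, which is the userspace counterpart
of the delayed, nondeterministic crashes described above.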

If the arch implements it well, we get a nice OOPS on stack overflow
(as opposed to panicking directly or otherwise exploding badly).  On
x86, the OOPS is nice, has a usable call trace, and the overflowing
task is killed cleanly.

On my laptop, this adds about 1.5µs of overhead to task creation,
which seems to be mainly caused by vmalloc inefficiently allocating
individual pages even when a higher-order page is available on the
freelist.

This does not address interrupt stacks.  It also does not address
the possibility of privilege escalation by a controlled stack
overflow that overwrites thread_info without hitting the guard page.
I'll send patches to address the latter issue once this series
lands.

It's worth noting that s390 has an arch-specific gcc feature that
detects stack overflows by adjusting function prologues.  Arches
with features like that may wish to avoid using vmapped stacks to
minimize the performance hit.

Ingo, would it make sense to throw it into a separate branch in
-tip?  I wouldn't mind seeing some -next testing to give people a
chance to shake out problems.  I'm particularly interested in
whether there are any drivers that expect virt_to_phys to work on
stack addresses.  (I know that virtio-net used to, but I fixed that
a while back.)

Once this lands in -tip, I'm planning on attacking thread_info.
Once thread_info is under control, we can start caching a couple of
stacks per cpu, and that should get us most of the performance back.
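
As a rough illustration of the caching idea, and not a design taken
from this series, the sketch below keeps a small fixed number of
freed stacks and hands them back out before falling back to a fresh
allocation.  Everything here is hypothetical: the names stack_cache,
stack_alloc() and stack_free(), the cache depth of two, the 16 KiB
stack size, and the use of malloc()/free() as a userspace stand-in
for the real allocator.  A single struct instance stands in for what
would be one cache per CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CACHED_STACKS 2		/* "a couple"; exact depth is an assumption */
#define STACK_SIZE (16 * 1024)		/* hypothetical stack size */

/* One of these would exist per CPU; here a single instance is used. */
struct stack_cache {
	void *slot[NR_CACHED_STACKS];
};

/* Reuse a cached stack if one is available, otherwise allocate a new one. */
void *stack_alloc(struct stack_cache *c)
{
	size_t i;

	for (i = 0; i < NR_CACHED_STACKS; i++) {
		if (c->slot[i]) {
			void *stack = c->slot[i];

			c->slot[i] = NULL;
			return stack;
		}
	}
	return malloc(STACK_SIZE);	/* slow path */
}

/* Park a freed stack in an empty slot; release it only if the cache is full. */
void stack_free(struct stack_cache *c, void *stack)
{
	size_t i;

	assert(stack != NULL);
	for (i = 0; i < NR_CACHED_STACKS; i++) {
		if (!c->slot[i]) {
			c->slot[i] = stack;
			return;
		}
	}
	free(stack);
}
```

With this shape, a create/exit/create sequence on the same CPU hits
the cached slot and skips the slow allocation path, which is where
the task-creation overhead mentioned earlier comes from.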

Changes from v3:
 - Fix rxrpc and bluetooth, which used scatterlists pointed at the stack
 - Add some acks and cc's

Changes from v2:
 - Delete kernel_unmap_pages_in_pgd rather than hardening it (Borislav)
 - Fix sub-page stack accounting better (Josh)

Changes from v1:
 - Fix rewind_stack_and_do_exit (Josh)
 - Fix deadlock under load
 - Clean up generic stack vmalloc code
 - Many other minor fixes
 
Andy Lutomirski (14):
  bluetooth: Switch SMP to crypto_cipher_encrypt_one()
  x86/cpa: In populate_pgd, don't set the pgd entry until it's populated
  x86/mm: Remove kernel_unmap_pages_in_pgd() and
    efi_cleanup_page_tables()
  mm: Track NR_KERNEL_STACK in KiB instead of number of stacks
  mm: Fix memcg stack accounting for sub-page stacks
  dma-api: Teach the "DMA-from-stack" check about vmapped stacks
  fork: Add generic vmalloced stack support
  x86/die: Don't try to recover from an OOPS on a non-default stack
  x86/dumpstack: When OOPSing, rewind the stack before do_exit
  x86/dumpstack: When dumping stack bytes due to OOPS, start with
    regs->sp
  x86/dumpstack: Try harder to get a call trace on stack overflow
  x86/dumpstack/64: Handle faults when printing the "Stack:" part of an
    OOPS
  x86/mm/64: Enable vmapped stacks
  x86/mm: Improve stack-overflow #PF handling

Herbert Xu (1):
  rxrpc: Avoid using stack memory in SG lists in rxkad

Ingo Molnar (1):
  x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()

 arch/Kconfig                         |  29 ++++++++++
 arch/ia64/include/asm/thread_info.h  |   2 +-
 arch/x86/Kconfig                     |   1 +
 arch/x86/entry/entry_32.S            |  11 ++++
 arch/x86/entry/entry_64.S            |  11 ++++
 arch/x86/include/asm/efi.h           |   1 -
 arch/x86/include/asm/pgtable_types.h |   2 -
 arch/x86/include/asm/switch_to.h     |  28 +++++++++-
 arch/x86/include/asm/traps.h         |   6 ++
 arch/x86/kernel/dumpstack.c          |  19 ++++++-
 arch/x86/kernel/dumpstack_32.c       |   4 +-
 arch/x86/kernel/dumpstack_64.c       |  16 +++++-
 arch/x86/kernel/traps.c              |  32 +++++++++++
 arch/x86/mm/fault.c                  |  39 +++++++++++++
 arch/x86/mm/init_64.c                |  27 ---------
 arch/x86/mm/pageattr.c               |  32 +----------
 arch/x86/mm/tlb.c                    |  15 +++++
 arch/x86/platform/efi/efi.c          |   2 -
 arch/x86/platform/efi/efi_32.c       |   3 -
 arch/x86/platform/efi/efi_64.c       |   5 --
 drivers/base/node.c                  |   3 +-
 fs/proc/meminfo.c                    |   2 +-
 include/linux/memcontrol.h           |   2 +-
 include/linux/mmzone.h               |   2 +-
 include/linux/sched.h                |  15 +++++
 kernel/fork.c                        |  86 ++++++++++++++++++++++-------
 lib/dma-debug.c                      |  39 +++++++++++--
 mm/memcontrol.c                      |   2 +-
 mm/page_alloc.c                      |   3 +-
 net/bluetooth/smp.c                  |  67 ++++++++++-------------
 net/rxrpc/ar-internal.h              |   1 +
 net/rxrpc/rxkad.c                    | 103 +++++++++++++++--------------------
 32 files changed, 400 insertions(+), 210 deletions(-)

-- 
2.5.5
