From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Brian Gerst <brgerst@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andi Kleen <ak@linux.intel.com>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Andy Lutomirski <luto@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/3] x86/entry: Clear extra registers beyond syscall arguments for 64bit kernels
Date: Tue, 6 Feb 2018 09:48:47 +0100
Message-ID: <20180206084847.6lzfnumwlf3ehmvh@gmail.com>
In-Reply-To: <20180205200550.mhimujp7wltuwzod@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> [...] so I implemented a real, per function register usage tracking.
> 
> For the x86 defconfig kernel the results are:
> 
>   r11: used in   1704 fns, not used in  43310 fns, usage ratio:    3.8%
>   r10: used in   3809 fns, not used in  41205 fns, usage ratio:    8.5%
>   r15: used in   6599 fns, not used in  38415 fns, usage ratio:   14.7%
>    r9: used in   8120 fns, not used in  36894 fns, usage ratio:   18.0%
>   r14: used in   9243 fns, not used in  35771 fns, usage ratio:   20.5%
>    r8: used in  12614 fns, not used in  32400 fns, usage ratio:   28.0%
>   r13: used in  12708 fns, not used in  32306 fns, usage ratio:   28.2%
>   r12: used in  17144 fns, not used in  27870 fns, usage ratio:   38.1%
>   rbp: used in  23289 fns, not used in  21725 fns, usage ratio:   51.7%
>   rcx: used in  23897 fns, not used in  21117 fns, usage ratio:   53.1%
>   rbx: used in  29226 fns, not used in  15788 fns, usage ratio:   64.9%
>   rdx: used in  33205 fns, not used in  11809 fns, usage ratio:   73.8%
>   rsi: used in  35415 fns, not used in   9599 fns, usage ratio:   78.7%
>   rdi: used in  40628 fns, not used in   4386 fns, usage ratio:   90.3%
>   rax: used in  43120 fns, not used in   1894 fns, usage ratio:   95.8%

So here's the next (and probably final) chapter of x86-64 register allocation 
statistics: out of curiosity I let this analysis run overnight on all 4 kernel 
configs, to see the register usage patterns of the distro and allyesconfig kernels 
as well.
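
For readers who want to reproduce this kind of tally: a per-function count can be approximated by scanning AT&T-syntax disassembly (for example `objdump -d` output) and recording which registers each function mentions. The sketch below is only an illustration under that assumption, it is not the script that produced the numbers in this mail; the names `registers_per_function()` and `usage_ratio()` are made up here, and implicit register uses (such as those of `mul` or `syscall`) are not detected.

```python
import re
from collections import defaultdict

# Map every sub-register name to its 64-bit register (rsp is left out,
# matching the tables in this mail).
ALIASES = {}
for letter in "abcd":
    for name in (f"r{letter}x", f"e{letter}x", f"{letter}x",
                 f"{letter}l", f"{letter}h"):
        ALIASES[name] = f"r{letter}x"
for base in ("bp", "si", "di"):
    for name in (f"r{base}", f"e{base}", base, f"{base}l"):
        ALIASES[name] = f"r{base}"
for idx in range(8, 16):
    for suffix in ("", "d", "w", "b"):
        ALIASES[f"r{idx}{suffix}"] = f"r{idx}"

# objdump prints function starts as "<address> <name>:".
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
# AT&T syntax prefixes register operands with '%'.
REG_RE = re.compile(r"%([a-z0-9]+)")


def registers_per_function(disasm_text):
    """Return {function name: set of 64-bit registers it mentions}."""
    used = defaultdict(set)
    current = None
    for line in disasm_text.splitlines():
        match = FUNC_RE.match(line)
        if match:
            current = match.group(1)
            used[current]  # count functions that touch no tracked register too
            continue
        if current is None:
            continue
        for name in REG_RE.findall(line):
            reg = ALIASES.get(name)
            if reg:
                used[current].add(reg)
    return used


def usage_ratio(used, reg):
    """Fraction of functions that mention `reg` at least once."""
    if not used:
        return 0.0
    return sum(reg in regs for regs in used.values()) / len(used)


# Tiny hand-written sample in objdump -d layout (hypothetical functions).
SAMPLE = """\
0000000000000000 <foo>:
   0:\t48 89 f8             \tmov    %rdi,%rax
   3:\t41 ba 01 00 00 00    \tmov    $0x1,%r10d
   9:\tc3                   \tret

0000000000000010 <bar>:
  10:\t31 c0                \txor    %eax,%eax
  12:\tc3                   \tret
"""

if __name__ == "__main__":
    usage = registers_per_function(SAMPLE)
    for reg in ("rax", "r10", "r11"):
        print(f"{reg}: {usage_ratio(usage, reg):.1%}")
```

Folding sub-registers such as %eax or %r10d into their 64-bit parent matters here, because a write to any part of the register is enough to disturb a user-controlled value.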

Here are all the per-function register allocation probabilities in a single table:

  REG      allnoconfig       localconfig      distroconfig      allyesconfig
  --------------------------------------------------------------------------
  rax:           94.6%             95.8%             94.3%             96.2%
  rbx:           46.9%             64.9%             67.6%             90.4%
  rcx:           47.8%             53.1%             57.9%             52.7%
  rdx:           66.0%             73.8%             76.0%             74.3%
  rbp:           36.2%             51.7%             55.5%             81.5%
  rsi:           64.8%             78.7%             81.3%             85.0%
  rdi:           79.9%             90.3%             92.1%             94.3%
   r8:           21.9%             28.0%             31.9%             29.7%
   r9:           13.9%             18.0%             20.4%             18.3%
  r10:            9.3%              8.5%              8.4%              4.7%
  r11:            4.9%              3.8%              4.5%              1.6%
  r12:           25.6%             38.1%             42.4%             69.3%
  r13:           18.3%             28.2%             31.5%             57.1%
  r14:           13.3%             20.5%             22.8%             46.1%
  r15:            9.3%             14.7%             16.4%             36.6%

These numbers underline the overall conclusions that we have reached so far:

 - We should clear all of R10-R15 in syscalls and R8-R15 in parameter-less
   entries (IRQs, NMIs, exceptions, etc.) - like the latest series from Dan does.

 - We should probably strive to clear R8-R9 for system calls that don't use them -
   which is ~98% of them. In particular R9, with its comparatively low (~20%)
   allocation probability, could survive deep into the kernel: 5-deep call chains
   still have a ~30% chance of leaving R9 intact - and call chains as deep as 10
   could still realistically have a ~10% residual probability of leaving R9 intact.
   We don't do this yet.

 - Smaller kernels are statistically easier to attack via Spectre, as long as the
   gadget is present in the smaller kernel. In particular heavily stripped down
   64-bit kernels might be attackable via R8-R9 (~22%, ~14%) and also RBP (~36%)
   to a certain degree. This means that the RBP clearing introduced by this series
   is very much relevant: because RBP is not part of the C function call argument
   passing ABI, its allocation frequency is much lower than that of other GP
   registers. Unfortunately R8/R9 values will survive through system calls,
   because we restore them in do_syscall_64().
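
The call-chain figures in the R8-R9 item above follow from a simple model: if each function in a chain independently allocates the register with probability p, the value survives n nested calls with probability (1 - p)^n. A minimal sketch of that arithmetic, assuming independence between functions (which real call chains only approximate) and using a hypothetical helper name:

```python
def survival_probability(alloc_prob, depth):
    """Chance that a register is still untouched after `depth` nested calls,
    assuming every function allocates it independently with `alloc_prob`."""
    return (1.0 - alloc_prob) ** depth


if __name__ == "__main__":
    # R9 with a ~20% per-function allocation probability.
    for depth in (1, 5, 10):
        print(f"depth {depth:2d}: {survival_probability(0.20, depth):.3f}")
    # depth  1: 0.800
    # depth  5: 0.328
    # depth 10: 0.107
```

The same function can be fed any percentage from the table, for example 0.139 for R9 on allnoconfig, to see how much further a value is likely to survive on a stripped down kernel.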

There's a somewhat surprising pattern as well: the register allocation probability 
of R10 and R11 _decreases_ as the kernel gets more complex. For all other 
registers the allocation probability increases with increasing kernel complexity, 
which is intuitive: larger functions with higher register pressure will use more 
registers.

So this result is counter-intuitive - my best guess is that it's some sort of GCC 
register allocation artifact. Here's the comparison of the code generation of a 
distro versus an allyesconfig kernel:

  #                             distro-config        allyesconfig
  #
  # nr of =y .config options:            4871                9553
  # nr of functions:                   190477              249340
  # nr of instructions:              10329411            20223765
  # nr of register uses:             16907185            33413619
  #
  # instructions per function:             54                  81
  #
  #
  # r10 used in:                        15404 fns           11300 fns
  # r10 not used in:                   167714 fns          228114 fns
  # r10 usage ratio:                      8.4%                4.7%
  #
  # r11 used in:                         8224 fns            3876 fns
  # r11 not used in:                   174894 fns          235538 fns
  # r11 usage ratio:                      4.5%                1.6%

I don't know which kernel option (out of thousands) causes R10/R11 to be used much 
less frequently in a significantly larger kernel.

Note that even the absolute count of functions with R10/R11 use decreases in the 
allyesconfig kernel, so I don't think it can be caused by the extra 
instrumentation bloat of features like CONFIG_GCOV_KERNEL=y.
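
As a cross-check, the usage ratios in the comparison above can be recomputed from the raw "used in" / "not used in" counts. This is just the arithmetic, with the counts copied from the table; the helper name `ratio_percent()` is made up for this sketch:

```python
# (used, not used) function counts, copied from the comparison table.
COUNTS = {
    ("distro-config", "r10"): (15404, 167714),
    ("distro-config", "r11"): (8224, 174894),
    ("allyesconfig", "r10"): (11300, 228114),
    ("allyesconfig", "r11"): (3876, 235538),
}


def ratio_percent(used, unused):
    """Share of functions that use the register, in percent."""
    return 100.0 * used / (used + unused)


if __name__ == "__main__":
    for (config, reg), (used, unused) in COUNTS.items():
        print(f"{config:14s} {reg}: {used:6d} fns, "
              f"{ratio_percent(used, unused):4.1f}%")
    # distro-config  r10:  15404 fns,  8.4%
    # distro-config  r11:   8224 fns,  4.5%
    # allyesconfig   r10:  11300 fns,  4.7%
    # allyesconfig   r11:   3876 fns,  1.6%
```

Both the numerator and the ratio drop for R10 and R11 when going from the distro config to allyesconfig, which is the point made above about absolute counts.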

The basic inlining and optimization settings are the same and neither has 
branch-instrumentation enabled:

  #                           distro-config       allyesconfig
  #
  CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y       CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
  CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y            CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
  # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set        # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
  CONFIG_OPTIMIZE_INLINING=y                      CONFIG_OPTIMIZE_INLINING=y

  CONFIG_BRANCH_PROFILE_NONE=y                    CONFIG_BRANCH_PROFILE_NONE=y

While no-one will build and boot an allyesconfig kernel (other than me), the 
numbers are still indicative: we should keep in mind the possibility that a Linux 
distro enabling seemingly benign non-default kernel options can lower the 
allocation probability of R10/R11 significantly.

Thanks,

	Ingo
