From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "jannh@google.com" <jannh@google.com>,
	"keescook@chromium.org" <keescook@chromium.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Van De Ven, Arjan" <arjan.van.de.ven@intel.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"Accardi, Kristen C" <kristen.c.accardi@intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>,
	"Hansen, Dave" <dave.hansen@intel.com>
Subject: Re: [PATCH 0/3] KASLR feature to randomize each loadable module
Date: Thu, 21 Jun 2018 18:59:59 +0000	[thread overview]
Message-ID: <1529607615.29548.202.camel@intel.com> (raw)
In-Reply-To: <CAG48ez2uuQkSS9DLz6j5HbpuxaHMyAVYGMM+xoZEo51N=sHmdg@mail.gmail.com>

On Thu, 2018-06-21 at 15:37 +0200, Jann Horn wrote:
> On Thu, Jun 21, 2018 at 12:34 AM Kees Cook <keescook@chromium.org>
> wrote:
> > And most systems have <200 modules, really. I have 113 on a desktop
> > right now, 63 on a server. So this looks like a trivial win.
> But note that the eBPF JIT also uses module_alloc(). Every time a BPF
> program (this includes seccomp filters!) is JIT-compiled by the
> kernel, another module_alloc() allocation is made. For example, on my
> desktop machine, I have a bunch of seccomp-sandboxed processes thanks
> to Chrome. If I enable the net.core.bpf_jit_enable sysctl and open a
> few Chrome tabs, BPF JIT allocations start showing up between
> modules:
> 
> # grep -C1 bpf_jit_binary_alloc /proc/vmallocinfo | cut -d' ' -f 2-
>   20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
>   12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
>   20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
> --
>   20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
>   12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
>   36864 load_module+0x1326/0x2ab0 pages=8 vmalloc N0=8
> --
>   20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
>   12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
>   40960 load_module+0x1326/0x2ab0 pages=9 vmalloc N0=9
> --
>   20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
>   12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
>  253952 load_module+0x1326/0x2ab0 pages=61 vmalloc N0=61
> 
> If you use Chrome with Site Isolation, you have a few dozen open
> tabs,
> and the BPF JIT is enabled, reaching a few hundred allocations might
> not be that hard.
> 
> Also: What's the impact on memory usage? Is this going to increase
> the
> number of pagetables that need to be allocated by the kernel per
> module_alloc() by 4K or 8K or so?
Thanks, it seems it might require some extra memory.  I'll look into
exactly how much.
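As a rough sketch of the worst case (my assumptions, not measurements:
each randomized allocation lands in its own otherwise-empty 2 MiB PMD
region, so it needs its own 4 KiB PTE page, while the PMD/PUD pages are
shared across the whole module area):

```shell
# Back-of-envelope for extra page-table memory, assuming one new
# 4 KiB PTE page per allocation because each lands in its own
# 2 MiB PMD region (PMD/PUD levels are shared).
allocs=200        # e.g. loaded modules + BPF JIT programs
pte_page=4096     # bytes per last-level page table
echo $(( allocs * pte_page / 1024 )) KiB   # prints: 800 KiB
```

So on the order of a few hundred KiB for a typical system, if that
assumption holds; I'll confirm with real numbers.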

I didn't include eBPF JIT allocations in the randomization estimates,
but it looks like they are usually smaller than a page.  So, with the
slight leap that the estimate based on the larger normal modules is the
worst case, you should still get ~800 modules at 18 bits.  After that
it will start to go down toward 10 bits, so in either case it at least
won't regress the randomness of the existing algorithm.
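For reference, the 18 bits is roughly the number of page-aligned
candidate positions in the module area (assuming the 1 GiB x86-64
module space and 4 KiB slots; the exact constants are in the patch):

```shell
# Where ~18 bits of entropy comes from: candidate slot count in the
# randomization area, assuming 4 KiB-aligned positions in 1 GiB.
area=$(( 1 << 30 ))      # 1 GiB module space
slot=$(( 1 << 12 ))      # 4 KiB alignment
slots=$(( area / slot ))
echo $slots              # prints: 262144, i.e. 2^18 positions
```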

> > 
> > > 
> > > As for fragmentation, this algorithm reduces the average number
> > > of modules that
> > > can be loaded without an allocation failure by about 6% (~17000
> > > to ~16000)
> > > (p<0.05). It can also reduce the largest module executable
> > > section that can be
> > > loaded by half to ~500MB in the worst case.
> > Given that we only have 8312 tristate Kconfig items, I think 16000
> > will remain just fine. And even large modules (i915) are under
> > 2MB...
> > 
> > > 
> > > The new __vmalloc_node_try_addr function uses the existing
> > > function
> > > __vmalloc_node_range, in order to introduce this algorithm with
> > > the least
> > > invasive change. The side effect is that each time there is a
> > > collision when
> > > trying to allocate in the random area a TLB flush will be
> > > triggered. There is
> > > a more complex, more efficient implementation that can be used
> > > instead if
> > > there is interest in improving performance.
> > The only time when module loading speed is noticeable, I would
> > think,
> > would be boot time. Have you done any boot time delta analysis? I
> > wouldn't expect it to change hardly at all, but it's probably a
> > good
> > idea to actually test it. :)
> If you have a forking server that applies seccomp filters on each
> fork, or something like that, you might care about those TLB flushes.
> 

I can test this as well.
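As a first-order model of the flush overhead (my assumption, not from
the patch): each failed placement attempt triggers one TLB flush, and
if a fraction p of candidate slots is occupied, an allocation retries
about 1/(1-p) times on average before succeeding.

```shell
# Hypothetical collision model: expected extra TLB flushes per
# allocation at a given occupancy of the randomization area.
occupied=50                          # percent of slots in use
tries=$(( 100 / (100 - occupied) ))  # expected attempts (integer approx)
echo $(( tries - 1 ))                # prints: 1 (extra flush per alloc)
```

So the flush cost should stay small until the area gets quite full;
the benchmark numbers will show whether that matters for the
seccomp-on-fork case.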

Thread overview: 29+ messages
2018-06-20 22:09 [PATCH 0/3] KASLR feature to randomize each loadable module Rick Edgecombe
2018-06-20 22:09 ` [PATCH 1/3] vmalloc: Add __vmalloc_node_try_addr function Rick Edgecombe
2018-06-20 22:16   ` Randy Dunlap
2018-06-20 22:35     ` Kees Cook
2018-06-20 22:44       ` Randy Dunlap
2018-06-20 23:05         ` Kees Cook
2018-06-20 23:16           ` Randy Dunlap
2018-06-20 22:26   ` Matthew Wilcox
2018-06-21 22:02     ` Edgecombe, Rick P
2018-06-20 22:09 ` [PATCH 2/3] x86/modules: Increase randomization for modules Rick Edgecombe
2018-06-20 22:09 ` [PATCH 3/3] vmalloc: Add debugfs modfraginfo Rick Edgecombe
2018-06-21  0:53   ` kbuild test robot
2018-06-21  1:17   ` kbuild test robot
2018-06-21 12:32   ` Jann Horn
2018-06-21 18:56     ` Edgecombe, Rick P
2018-06-20 22:33 ` [PATCH 0/3] KASLR feature to randomize each loadable module Kees Cook
2018-06-21 13:37   ` Jann Horn
2018-06-21 13:39     ` Jann Horn
2018-06-21 18:59     ` Edgecombe, Rick P [this message]
2018-06-21 21:23       ` Daniel Borkmann
2018-06-21 18:56   ` Edgecombe, Rick P
