From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Kees Cook <keescook@chromium.org>,
	gthelen@google.com, labbott@fedoraproject.org,
	kernel-hardening@lists.openwall.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4] mm: SLAB freelist randomization
Date: Wed, 27 Apr 2016 09:55:05 +0900
Message-ID: <20160427005505.GA6336@js1304-P5Q-DELUXE>
In-Reply-To: <20160426161743.f831225a4efb3eb04debe402@linux-foundation.org>

On Tue, Apr 26, 2016 at 04:17:43PM -0700, Andrew Morton wrote:
> On Tue, 26 Apr 2016 09:21:10 -0700 Thomas Garnier <thgarnie@google.com> wrote:
> 
> > Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the
> > SLAB freelist. The list is randomized during initialization of a new set
> > of pages. The order on different freelist sizes is pre-computed at boot
> > for performance. Each kmem_cache has its own randomized freelist. Before
> > pre-computed lists are available freelists are generated
> > dynamically. This security feature reduces the predictability of the
> > kernel SLAB allocator against heap overflows rendering attacks much less
> > stable.
> > 
> > For example this attack against SLUB (also applicable against SLAB)
> > would be affected:
> > https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/
> > 
> > Also, since v4.6 the freelist was moved at the end of the SLAB. It means
> > a controllable heap is opened to new attacks not yet publicly discussed.
> > A kernel heap overflow can be transformed to multiple use-after-free.
> > This feature makes this type of attack harder too.
> > 
> > To generate entropy, we use get_random_bytes_arch because 0 bits of
> > entropy is available in the boot stage. In the worse case this function
> > will fallback to the get_random_bytes sub API. We also generate a shift
> > random number to shift pre-computed freelist for each new set of pages.
> > 
> > The config option name is not specific to the SLAB as this approach will
> > be extended to other allocators like SLUB.
> > 
> > Performance results highlighted no major changes:
> > 
> > Hackbench (running 90 10 times):
> > 
> > Before average: 0.0698
> > After average: 0.0663 (-5.01%)
> > 
> > slab_test 1 run on boot. Difference only seen on the 2048 size test
> > being the worse case scenario covered by freelist randomization. New
> > slab pages are constantly being created on the 10000 allocations.
> > Variance should be mainly due to getting new pages every few
> > allocations.
> > 
> > ...
> >
> > --- a/include/linux/slab_def.h
> > +++ b/include/linux/slab_def.h
> > @@ -80,6 +80,10 @@ struct kmem_cache {
> >  	struct kasan_cache kasan_info;
> >  #endif
> >  
> > +#ifdef CONFIG_FREELIST_RANDOM
> > +	void *random_seq;
> > +#endif
> > +
> >  	struct kmem_cache_node *node[MAX_NUMNODES];
> >  };
> >  
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 0c66640..73453d0 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1742,6 +1742,15 @@ config SLOB
> >  
> >  endchoice
> >  
> > +config FREELIST_RANDOM
> > +	default n
> > +	depends on SLAB
> > +	bool "SLAB freelist randomization"
> > +	help
> > +	  Randomizes the freelist order used on creating new SLABs. This
> > +	  security feature reduces the predictability of the kernel slab
> > +	  allocator against heap overflows.
> 
> Against the v2 patch I didst observe:
> 
> : CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague.
> : CONFIG_SLAB_FREELIST_RANDOM would be better.  I mean, what Kconfig
> : identifier could be used for implementing randomisation in
> : slub/slob/etc once CONFIG_FREELIST_RANDOM is used up?
> 
> but this pearl appeared to pass unnoticed.
> 
> >  config SLUB_CPU_PARTIAL
> >  	default y
> >  	depends on SLUB && SMP
> > diff --git a/mm/slab.c b/mm/slab.c
> > index b82ee6b..0ed728a 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -1230,6 +1230,61 @@ static void __init set_up_node(struct kmem_cache *cachep, int index)
> >  	}
> >  }
> >  
> > +#ifdef CONFIG_FREELIST_RANDOM
> > +static void freelist_randomize(struct rnd_state *state, freelist_idx_t *list,
> > +			size_t count)
> > +{
> > +	size_t i;
> > +	unsigned int rand;
> > +
> > +	for (i = 0; i < count; i++)
> > +		list[i] = i;
> > +
> > +	/* Fisher-Yates shuffle */
> > +	for (i = count - 1; i > 0; i--) {
> > +		rand = prandom_u32_state(state);
> > +		rand %= (i + 1);
> > +		swap(list[i], list[rand]);
> > +	}
> > +}
> > +
> > +/* Create a random sequence per cache */
> > +static int cache_random_seq_create(struct kmem_cache *cachep)
> > +{
> > +	unsigned int seed, count = cachep->num;
> > +	struct rnd_state state;
> > +
> > +	if (count < 2)
> > +		return 0;
> > +
> > +	/* If it fails, we will just use the global lists */
> > +	cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL);
> > +	if (!cachep->random_seq)
> > +		return -ENOMEM;
> 
> OK, no BUG.  If this happens, kmem_cache_init_late() will go BUG
> instead ;)
> 
> Questions for slab maintainers:
> 
> What's going on with the gfp_flags in there?  kmem_cache_init_late()
> passes GFP_NOWAIT into enable_cpucache().
> 
> a) why the heck does it do that?  It's __init code!

Until a certain point during boot, interrupts must not be enabled.
In the slab subsystem, passing __GFP_DIRECT_RECLAIM causes interrupts
to be enabled while a new slab page is allocated. GFP_NOWAIT is used
to prevent that.

Anyway, I audited the code, and kmem_cache_init_late() could use
__GFP_DIRECT_RECLAIM because it is called after interrupts are enabled,
which means it is safe to manipulate interrupts at that point. (See
the kmem_cache_init_late() call in start_kernel().)
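
To illustrate the flag dependence, here is a minimal userspace sketch,
not the kernel code. The MODEL_* bit values, model_irqs_enabled and
model_grow_slab() are hypothetical names made up for this example; the
real flag definitions live in include/linux/gfp.h. It only models the
rule that the slab growth path turns interrupts on when the caller's
flags allow direct reclaim:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in type and illustrative bit values (assumptions, not the kernel's). */
typedef unsigned int gfp_t;
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_KSWAPD_RECLAIM 0x2u
#define MODEL_GFP_IO             0x4u
#define MODEL_GFP_FS             0x8u
#define MODEL_GFP_NOWAIT (MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_KERNEL (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM | \
			  MODEL_GFP_IO | MODEL_GFP_FS)

/* Models the CPU interrupt state; starts disabled, as in early boot. */
static bool model_irqs_enabled;

/* A caller may block only if direct reclaim is permitted. */
static bool model_allow_blocking(gfp_t flags)
{
	return (flags & MODEL_GFP_DIRECT_RECLAIM) != 0;
}

/*
 * Models growing a slab: entered with interrupts off, and interrupts
 * are switched on around the page allocation only when blocking is
 * allowed. Returns true if interrupts were enabled during the call.
 */
static bool model_grow_slab(gfp_t flags)
{
	bool touched = false;

	assert(!model_irqs_enabled);	/* caller must enter with IRQs off */

	if (model_allow_blocking(flags)) {
		model_irqs_enabled = true;	/* stands for local_irq_enable() */
		touched = true;
	}

	/* ... the page allocation would happen here ... */

	if (touched)
		model_irqs_enabled = false;	/* stands for local_irq_disable() */

	return touched;
}
```

In this model, model_grow_slab(MODEL_GFP_NOWAIT) never touches the
interrupt state, while model_grow_slab(MODEL_GFP_KERNEL) does, which is
why the flag choice matters before interrupts may be enabled.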

> 
> b) if there's a legit reason then your new cache_random_seq_create()
> should be getting its gfp_t from its caller, rather than blindly
> assuming GFP_KERNEL.

In any case, ignoring the gfp argument provided by the caller isn't
good practice.
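
As a rough userspace sketch of what passing the flags through could
look like (an assumption about the shape, not the actual follow-up
patch): model_kcalloc(), model_rand() and model_random_seq_create() are
hypothetical names, xorshift32 merely stands in for prandom_u32_state(),
and freelist_idx_t is fixed to unsigned char here for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef unsigned int gfp_t;		/* stand-in for the kernel type */
typedef unsigned char freelist_idx_t;	/* stand-in, limits count to 256 */

/* Hypothetical allocator hook that records the flags it was given. */
static gfp_t model_last_gfp;

static void *model_kcalloc(size_t n, size_t size, gfp_t gfp)
{
	model_last_gfp = gfp;
	return calloc(n, size);
}

/* xorshift32: a placeholder generator, not what the patch uses. */
static uint32_t model_rand(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Builds a shuffled index sequence. The gfp flags come from the caller
 * and are handed to the allocator unchanged instead of being hard-coded.
 * Returns NULL when count < 2 or when the allocation fails.
 */
static freelist_idx_t *model_random_seq_create(size_t count, uint32_t seed,
					       gfp_t gfp)
{
	freelist_idx_t *seq;
	size_t i;

	assert(seed != 0);	/* xorshift32 needs a nonzero state */
	assert(count <= 256);	/* indices must fit in freelist_idx_t */

	if (count < 2)
		return NULL;

	seq = model_kcalloc(count, sizeof(*seq), gfp);
	if (!seq)
		return NULL;

	for (i = 0; i < count; i++)
		seq[i] = (freelist_idx_t)i;

	/* Fisher-Yates shuffle, same loop structure as in the quoted patch */
	for (i = count - 1; i > 0; i--) {
		size_t j = model_rand(&seed) % (i + 1);
		freelist_idx_t tmp = seq[i];

		seq[i] = seq[j];
		seq[j] = tmp;
	}

	return seq;
}
```

The caller owns the returned buffer and releases it with free() in this
model. Whatever the seed, the result is a permutation of 0..count-1.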

> c) kmem_cache_init_late() goes BUG on ENOMEM.  Generally that's OK in
> __init code: we assume infinite memory during bootup.  But it's really
> quite weird to use GFP_NOWAIT and then to go BUG if GFP_NOWAIT had its
> predictable outcome (ie: failure).

I don't think the BUG() here is weird. It just means that if we can't
initialize the slab subsystem properly, the machine cannot run
properly, so we BUG().

> Finally, all callers of enable_cpucache() (and hence of
> cache_random_seq_create()) are __init, so we're unnecessarily bloating
> up vmlinux.  Could someone please take a look at this as a separate
> thing?

That's not true. It is called whenever a new kmem_cache is created.
I don't know the concrete reason why setup_cpu_cache() is tagged
__init_refok, but it looks like that needs to be fixed.

I will look at it soon.

Thanks.
