All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Laura Abbott <labbott@redhat.com>, Linux-MM <linux-mm@kvack.org>,
	linux-security-module <linux-security-module@vger.kernel.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>
Subject: Re: [PATCH 1/3] mm: security: introduce the init_allocations=1 boot option
Date: Tue, 23 Apr 2019 12:00:11 -0700	[thread overview]
Message-ID: <CAGXu5j+TsBgf0jw+05p8L_APKJ4+wnfPxL6hGmoHrddt-KuHuw@mail.gmail.com> (raw)
In-Reply-To: <20190418154208.131118-2-glider@google.com>

On Thu, Apr 18, 2019 at 8:42 AM Alexander Potapenko <glider@google.com> wrote:
> This option adds the possibility to initialize newly allocated pages and
> heap objects with zeroes. This is needed to prevent possible information
> leaks and make the control-flow bugs that depend on uninitialized values
> more deterministic.
>
> Initialization is done at allocation time at the places where checks for
> __GFP_ZERO are performed. We don't initialize slab caches with
> constructors to preserve their semantics. To reduce runtime costs of
> checking cachep->ctor we replace a call to memset with a call to
> cachep->poison_fn, which is only executed if the memory block needs to
> be initialized.
>
> For kernel testing purposes filling allocations with a nonzero pattern
> would be more suitable, but may require platform-specific code. To have
> a simple baseline we've decided to start with zero-initialization.

The memory tagging future may be worth mentioning here too. We'll
always need an allocation-time hook. What we do in that hook (tag,
zero, poison) can be manage from there.

> No performance optimizations are done at the moment to reduce double
> initialization of memory regions.

Isn't this addressed in later patches?

>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: James Morris <jmorris@namei.org>
> Cc: "Serge E. Hallyn" <serge@hallyn.com>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Kostya Serebryany <kcc@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Sandeep Patil <sspatil@android.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Jann Horn <jannh@google.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: linux-mm@kvack.org
> Cc: linux-security-module@vger.kernel.org
> Cc: kernel-hardening@lists.openwall.com
> ---
>  drivers/infiniband/core/uverbs_ioctl.c |  2 +-
>  include/linux/mm.h                     |  8 ++++++++
>  include/linux/slab_def.h               |  1 +
>  include/linux/slub_def.h               |  1 +
>  kernel/kexec_core.c                    |  2 +-
>  mm/dmapool.c                           |  2 +-
>  mm/page_alloc.c                        | 18 +++++++++++++++++-
>  mm/slab.c                              | 12 ++++++------
>  mm/slab.h                              |  1 +
>  mm/slab_common.c                       | 15 +++++++++++++++
>  mm/slob.c                              |  2 +-
>  mm/slub.c                              |  8 ++++----
>  net/core/sock.c                        |  2 +-
>  13 files changed, 58 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
> index e1379949e663..f31234906be2 100644
> --- a/drivers/infiniband/core/uverbs_ioctl.c
> +++ b/drivers/infiniband/core/uverbs_ioctl.c
> @@ -127,7 +127,7 @@ __malloc void *_uverbs_alloc(struct uverbs_attr_bundle *bundle, size_t size,
>         res = (void *)pbundle->internal_buffer + pbundle->internal_used;
>         pbundle->internal_used =
>                 ALIGN(new_used, sizeof(*pbundle->internal_buffer));
> -       if (flags & __GFP_ZERO)
> +       if (want_init_memory(flags))
>                 memset(res, 0, size);
>         return res;
>  }
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 76769749b5a5..b38b71a5efaa 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2597,6 +2597,14 @@ static inline void kernel_poison_pages(struct page *page, int numpages,
>                                         int enable) { }
>  #endif
>
> +DECLARE_STATIC_KEY_FALSE(init_allocations);

I'd like a CONFIG to control this default. We can keep the boot-time
option to change it, but I think a CONFIG is warranted here.

> +static inline bool want_init_memory(gfp_t flags)
> +{
> +       if (static_branch_unlikely(&init_allocations))
> +               return true;
> +       return flags & __GFP_ZERO;
> +}
> +
>  #ifdef CONFIG_DEBUG_PAGEALLOC
>  extern bool _debug_pagealloc_enabled;
>  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
> diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
> index 9a5eafb7145b..9dfe9eb639d7 100644
> --- a/include/linux/slab_def.h
> +++ b/include/linux/slab_def.h
> @@ -37,6 +37,7 @@ struct kmem_cache {
>
>         /* constructor func */
>         void (*ctor)(void *obj);
> +       void (*poison_fn)(struct kmem_cache *c, void *object);
>
>  /* 4) cache creation/removal */
>         const char *name;
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index d2153789bd9f..afb928cb7c20 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -99,6 +99,7 @@ struct kmem_cache {
>         gfp_t allocflags;       /* gfp flags to use on each alloc */
>         int refcount;           /* Refcount for slab cache destroy */
>         void (*ctor)(void *);
> +       void (*poison_fn)(struct kmem_cache *c, void *object);
>         unsigned int inuse;             /* Offset to metadata */
>         unsigned int align;             /* Alignment */
>         unsigned int red_left_pad;      /* Left redzone padding size */
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index d7140447be75..be84f5f95c97 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -315,7 +315,7 @@ static struct page *kimage_alloc_pages(gfp_t gfp_mask, unsigned int order)
>                 arch_kexec_post_alloc_pages(page_address(pages), count,
>                                             gfp_mask);
>
> -               if (gfp_mask & __GFP_ZERO)
> +               if (want_init_memory(gfp_mask))
>                         for (i = 0; i < count; i++)
>                                 clear_highpage(pages + i);
>         }
> diff --git a/mm/dmapool.c b/mm/dmapool.c
> index 76a160083506..796e38160d39 100644
> --- a/mm/dmapool.c
> +++ b/mm/dmapool.c
> @@ -381,7 +381,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
>  #endif
>         spin_unlock_irqrestore(&pool->lock, flags);
>
> -       if (mem_flags & __GFP_ZERO)
> +       if (want_init_memory(mem_flags))
>                 memset(retval, 0, pool->size);
>
>         return retval;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d96ca5bc555b..e2a21d866ac9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -133,6 +133,22 @@ unsigned long totalcma_pages __read_mostly;
>
>  int percpu_pagelist_fraction;
>  gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
> +bool want_init_allocations __read_mostly;

This can be a stack variable in early_init_allocations() -- it's never
used again.

> +EXPORT_SYMBOL(want_init_allocations);
> +DEFINE_STATIC_KEY_FALSE(init_allocations);
> +
> +static int __init early_init_allocations(char *buf)
> +{
> +       int ret;
> +
> +       if (!buf)
> +               return -EINVAL;
> +       ret = kstrtobool(buf, &want_init_allocations);
> +       if (want_init_allocations)
> +               static_branch_enable(&init_allocations);

With the CONFIG, this should have a _disable on an "else" here...

> +       return ret;
> +}
> +early_param("init_allocations", early_init_allocations);

Does early_init_allocations() get called before things like
prep_new_page() are up and running to call want_init_memory()?

>
>  /*
>   * A cached value of the page's pageblock's migratetype, used when the page is
> @@ -2014,7 +2030,7 @@ static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags
>
>         post_alloc_hook(page, order, gfp_flags);
>
> -       if (!free_pages_prezeroed() && (gfp_flags & __GFP_ZERO))
> +       if (!free_pages_prezeroed() && want_init_memory(gfp_flags))
>                 for (i = 0; i < (1 << order); i++)
>                         clear_highpage(page + i);
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 47a380a486ee..dcc5b73cf767 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3331,8 +3331,8 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
>         local_irq_restore(save_flags);
>         ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
>
> -       if (unlikely(flags & __GFP_ZERO) && ptr)
> -               memset(ptr, 0, cachep->object_size);
> +       if (unlikely(want_init_memory(flags)) && ptr)
> +               cachep->poison_fn(cachep, ptr);

So... this _must_ zero when __GFP_ZERO is present, so I'm not sure
"poison_fn" is the right name, and it likely needs to take the "flags"
argument.

>
>         slab_post_alloc_hook(cachep, flags, 1, &ptr);
>         return ptr;
> @@ -3388,8 +3388,8 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
>         objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
>         prefetchw(objp);
>
> -       if (unlikely(flags & __GFP_ZERO) && objp)
> -               memset(objp, 0, cachep->object_size);
> +       if (unlikely(want_init_memory(flags)) && objp)
> +               cachep->poison_fn(cachep, objp);

Same.

>
>         slab_post_alloc_hook(cachep, flags, 1, &objp);
>         return objp;
> @@ -3596,9 +3596,9 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>         cache_alloc_debugcheck_after_bulk(s, flags, size, p, _RET_IP_);
>
>         /* Clear memory outside IRQ disabled section */
> -       if (unlikely(flags & __GFP_ZERO))
> +       if (unlikely(want_init_memory(flags)))
>                 for (i = 0; i < size; i++)
> -                       memset(p[i], 0, s->object_size);
> +                       s->poison_fn(s, p[i]);

Same.

>
>         slab_post_alloc_hook(s, flags, size, p);
>         /* FIXME: Trace call missing. Christoph would like a bulk variant */
> diff --git a/mm/slab.h b/mm/slab.h
> index 43ac818b8592..3b541e8970ee 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -27,6 +27,7 @@ struct kmem_cache {
>         const char *name;       /* Slab name for sysfs */
>         int refcount;           /* Use counter */
>         void (*ctor)(void *);   /* Called on object slot creation */
> +       void (*poison_fn)(struct kmem_cache *c, void *object);

How about naming it just "initialize"?

>         struct list_head list;  /* List of all slab caches on the system */
>  };
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 58251ba63e4a..37810114b2ea 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -360,6 +360,16 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align,
>         return NULL;
>  }
>
> +static void poison_zero(struct kmem_cache *c, void *object)
> +{
> +       memset(object, 0, c->object_size);
> +}
> +
> +static void poison_dont(struct kmem_cache *c, void *object)
> +{
> +       /* Do nothing. Use for caches with constructors. */
> +}
> +
>  static struct kmem_cache *create_cache(const char *name,
>                 unsigned int object_size, unsigned int align,
>                 slab_flags_t flags, unsigned int useroffset,
> @@ -381,6 +391,10 @@ static struct kmem_cache *create_cache(const char *name,
>         s->size = s->object_size = object_size;
>         s->align = align;
>         s->ctor = ctor;
> +       if (ctor)
> +               s->poison_fn = poison_dont;
> +       else
> +               s->poison_fn = poison_zero;

As mentioned, we must still always zero when __GFP_ZERO is present.

>         s->useroffset = useroffset;
>         s->usersize = usersize;
>
> @@ -974,6 +988,7 @@ void __init create_boot_cache(struct kmem_cache *s, const char *name,
>         s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
>         s->useroffset = useroffset;
>         s->usersize = usersize;
> +       s->poison_fn = poison_zero;
>
>         slab_init_memcg_params(s);
>
> diff --git a/mm/slob.c b/mm/slob.c
> index 307c2c9feb44..18981a71e962 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -330,7 +330,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
>                 BUG_ON(!b);
>                 spin_unlock_irqrestore(&slob_lock, flags);
>         }
> -       if (unlikely(gfp & __GFP_ZERO))
> +       if (unlikely(want_init_memory(gfp)))
>                 memset(b, 0, size);
>         return b;
>  }
> diff --git a/mm/slub.c b/mm/slub.c
> index d30ede89f4a6..e4efb6575510 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2750,8 +2750,8 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>                 stat(s, ALLOC_FASTPATH);
>         }
>
> -       if (unlikely(gfpflags & __GFP_ZERO) && object)
> -               memset(object, 0, s->object_size);
> +       if (unlikely(want_init_memory(gfpflags)) && object)
> +               s->poison_fn(s, object);
>
>         slab_post_alloc_hook(s, gfpflags, 1, &object);
>
> @@ -3172,11 +3172,11 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>         local_irq_enable();
>
>         /* Clear memory outside IRQ disabled fastpath loop */
> -       if (unlikely(flags & __GFP_ZERO)) {
> +       if (unlikely(want_init_memory(flags))) {
>                 int j;
>
>                 for (j = 0; j < i; j++)
> -                       memset(p[j], 0, s->object_size);
> +                       s->poison_fn(s, p[j]);
>         }
>
>         /* memcg and kmem_cache debug support */
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 782343bb925b..99b288a19b39 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1601,7 +1601,7 @@ static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority,
>                 sk = kmem_cache_alloc(slab, priority & ~__GFP_ZERO);
>                 if (!sk)
>                         return sk;
> -               if (priority & __GFP_ZERO)
> +               if (want_init_memory(priority))
>                         sk_prot_clear_nulls(sk, prot->obj_size);
>         } else
>                 sk = kmalloc(prot->obj_size, priority);
> --
> 2.21.0.392.gf8f6787159e-goog
>

Looking good. :) Thanks for working on this!

-- 
Kees Cook

  parent reply	other threads:[~2019-04-23 19:00 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-18 15:42 [PATCH 0/3] RFC: add init_allocations=1 boot option Alexander Potapenko
2019-04-18 15:42 ` Alexander Potapenko
2019-04-18 15:42 ` [PATCH 1/3] mm: security: introduce the " Alexander Potapenko
2019-04-18 15:42   ` Alexander Potapenko
2019-04-18 16:35   ` Dave Hansen
2019-04-18 16:43     ` Alexander Potapenko
2019-04-18 16:43       ` Alexander Potapenko
2019-04-18 16:50       ` Alexander Potapenko
2019-04-18 16:50         ` Alexander Potapenko
2019-04-23  8:31     ` Michal Hocko
2019-04-18 22:08   ` Randy Dunlap
2019-04-23 19:00   ` Kees Cook [this message]
2019-04-23 19:00     ` Kees Cook
2019-04-26 12:12     ` Alexander Potapenko
2019-04-26 12:12       ` Alexander Potapenko
2019-04-23 20:36   ` Dave Hansen
2019-04-26 14:14   ` Christopher Lameter
2019-04-26 14:14     ` Christopher Lameter
     [not found]   ` <alpine.DEB.2.21.1904260911570.8340@nuc-kabylake>
2019-04-26 15:24     ` Christopher Lameter
2019-04-26 15:24       ` Christopher Lameter
2019-04-26 15:48       ` Alexander Potapenko
2019-04-26 15:48         ` Alexander Potapenko
2019-04-18 15:42 ` [PATCH 2/3] gfp: mm: introduce __GFP_NOINIT Alexander Potapenko
2019-04-18 15:42   ` Alexander Potapenko
2019-04-18 16:52   ` Dave Hansen
2019-04-23 19:14     ` Kees Cook
2019-04-23 19:14       ` Kees Cook
2019-04-23 20:40       ` Dave Hansen
2019-04-23 19:11   ` Kees Cook
2019-04-23 19:11     ` Kees Cook
2019-04-18 15:42 ` [PATCH 3/3] RFC: net: apply __GFP_NOINIT to AF_UNIX sk_buff allocations Alexander Potapenko
2019-04-18 15:42   ` Alexander Potapenko
2019-04-23 19:17   ` Kees Cook
2019-04-23 19:17     ` Kees Cook
2019-04-18 15:44 ` [PATCH 0/3] RFC: add init_allocations=1 boot option Alexander Potapenko
2019-04-18 15:44   ` Alexander Potapenko
2019-04-18 22:07 ` Randy Dunlap
2019-04-23 18:49 ` Kees Cook
2019-04-23 18:49   ` Kees Cook
2019-04-26 12:39   ` Alexander Potapenko
2019-04-26 12:39     ` Alexander Potapenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGXu5j+TsBgf0jw+05p8L_APKJ4+wnfPxL6hGmoHrddt-KuHuw@mail.gmail.com \
    --to=keescook@chromium.org \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=labbott@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=linux-security-module@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.