From: Kees Cook <keescook@chromium.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: Add additional consistency check
Date: Tue, 11 Apr 2017 09:05:24 -0700
Message-ID: <CAGXu5jJkJeJYYicXmng0REgEamuxzKrKzq_gtJ2dv5BEN4BkUA@mail.gmail.com>
In-Reply-To: <20170411141956.GP6729@dhcp22.suse.cz>

On Tue, Apr 11, 2017 at 7:19 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Tue 11-04-17 07:14:01, Kees Cook wrote:
>> On Tue, Apr 11, 2017 at 6:46 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Mon 10-04-17 21:58:22, Kees Cook wrote:
>> >> On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote:
>> >> > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote:
>> >> >> On Tue, 4 Apr 2017, Michal Hocko wrote:
>> >> >>
>> >> >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote:
>> >> >> > > On Tue, 4 Apr 2017, Michal Hocko wrote:
>> >> >> > >
>> >> >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply
>> >> >> > > > leak that memory?
>> >> >> > >
>> >> >> > > Because it is a serious bug to attempt to free a non slab object using
>> >> >> > > slab operations. This is often the result of memory corruption, coding
>> >> >> > > errors etc. The system needs to stop right there.
>> >> >> >
>> >> >> > Why when an alternative is a memory leak?
>> >> >>
>> >> >> Because the slab allocators fail also in case you free an object multiple
>> >> >> times etc etc. Continuation is supported by enabling a special resiliency
>> >> >> feature via the kernel command line. The alternative is selectable but not
>> >> >> the default.
>> >> >
>> >> > I disagree! We should try to continue as long as we _know_ that the
>> >> > internal state of the allocator is still consistent and a further
>> >> > operation will not spread the corruption even more. This is clearly not
>> >> > the case for an invalid pointer to kfree.
>> >> >
>> >> > I can see why checking for an early allocator corruption is not always
>> >> > feasible and you can only detect after-the-fact but this is not the case
>> >> > here and putting your system down just because some buggy code is trying
>> >> > to free something it hasn't allocated is not really useful. I completely
>> >> > agree with Linus that we overuse BUG way too much and this is just
>> >> > another example of it.
>> >>
>> >> Instead of the proposed BUG here, what's the correct "safe" return value?
>> >
>> > I would assume that _you_ as the one who proposes the change would take
>> > some time to read and understand the code and know this answer. This is
>> > how we do changes to the kernel: have an objective, understand the code
>> > and generate the patch.
>> >
>> > I am really sad that this particular patch has shown that you didn't
>> > bother to consider the latter part and blindly applied something that you
>> > haven't thought through properly. Please try harder next time.
>>
>> Our objectives are different: I want the kernel to immediately stop
>> when corruption is detected. Since others are interested in making it
>> survivable, I was hoping to get a hint about what such an improvement
>> would look like.
>
> I do not think sprinkling BUG_ONs will help that objective. And BUG_ON
> under IRQ disable is likely not helping to make an error survivable...

Yes, agreed. Handling it cleanly is always better.

>> Instead of this condescending attitude, can you
>> provide constructive help that will get our users closer to the safe
>> kernel operation we're all interested in?
>
> I would do something like...
> ---
> diff --git a/mm/slab.c b/mm/slab.c
> index bd63450a9b16..87c99a5e9e18 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -393,10 +393,15 @@ static inline void set_store_user_dirty(struct kmem_cache *cachep) {}
>  static int slab_max_order = SLAB_MAX_ORDER_LO;
>  static bool slab_max_order_set __initdata;
>
> +static inline struct kmem_cache *page_to_cache(struct page *page)
> +{
> +       return page->slab_cache;
> +}
> +
>  static inline struct kmem_cache *virt_to_cache(const void *obj)
>  {
>         struct page *page = virt_to_head_page(obj);
> -       return page->slab_cache;
> +       return page_to_cache(page);
>  }
>
>  static inline void *index_to_obj(struct kmem_cache *cache, struct page *page,
> @@ -3813,14 +3818,18 @@ void kfree(const void *objp)
>  {
>         struct kmem_cache *c;
>         unsigned long flags;
> +       struct page *page;
>
>         trace_kfree(_RET_IP_, objp);
>
>         if (unlikely(ZERO_OR_NULL_PTR(objp)))
>                 return;
> +       page = virt_to_head_page(objp);
> +       if (CHECK_DATA_CORRUPTION(!PageSlab(page)))
> +               return;
>         local_irq_save(flags);
>         kfree_debugcheck(objp);
> -       c = virt_to_cache(objp);
> +       c = page_to_cache(page);
>         debug_check_no_locks_freed(objp, c->object_size);
>
>         debug_check_no_obj_freed(objp, c->object_size);

Awesome! Thank you very much! I'll play with this.

-Kees

-- 
Kees Cook
Pixel Security
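
For readers following along, here is a minimal userspace sketch of the idea
in the quoted diff: look up the page first, and if it is not a slab page,
report the problem and return (leaking the memory) before any allocator
state is touched. Everything named toy_* is hypothetical and only models
the kernel's struct page, PageSlab() and CHECK_DATA_CORRUPTION(); it is not
the mm/slab.c code. Note that, as far as I know, the real
CHECK_DATA_CORRUPTION() also expects a format string after the condition,
which the toy helper mirrors with a message argument.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct kmem_cache (hypothetical, for illustration). */
struct toy_cache {
	size_t object_size;
	unsigned long nr_freed;	/* counts objects handed back to the cache */
};

/* Toy stand-in for struct page (hypothetical, for illustration). */
struct toy_page {
	bool is_slab;			/* models PageSlab(page) */
	struct toy_cache *slab_cache;	/* only valid when is_slab is true */
};

/* Models CHECK_DATA_CORRUPTION(): report and return true if cond holds. */
static bool toy_check_corruption(bool cond, const char *msg)
{
	if (cond)
		fprintf(stderr, "data corruption: %s\n", msg);
	return cond;
}

/*
 * Models the reworked kfree(): validate the page before touching any
 * allocator state. Returns true if the object was freed, false if the
 * call was a no-op (NULL pointer) or the object was deliberately leaked.
 */
static bool toy_kfree(struct toy_page *page, const void *objp)
{
	if (objp == NULL)
		return false;

	/* Bail out early: a non-slab page has no usable slab_cache. */
	if (toy_check_corruption(!page->is_slab, "kfree of non-slab object"))
		return false;

	/* Only now is it safe to dereference page->slab_cache. */
	page->slab_cache->nr_freed++;
	return true;
}
```

The point of the ordering is that the check runs before the cache pointer
is dereferenced (and, in the kernel version, before interrupts are
disabled), so a bad pointer costs a leak and a report rather than a crash
in the middle of the free path.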
