linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Popov <alex.popov@linux.com>
To: Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Alexander Potapenko <glider@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	David Howells <dhowells@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Laura Abbott <labbott@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Daniel Micay <danielmicay@gmail.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Pavel Machek <pavel@denx.de>,
	Valentin Schneider <valentin.schneider@arm.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Linux-MM <linux-mm@kvack.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	notify@kernel.org
Subject: Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
Date: Tue, 6 Oct 2020 20:56:09 +0300	[thread overview]
Message-ID: <1b5cf312-f7bb-87ce-6658-5ca741c2e790@linux.com> (raw)
In-Reply-To: <CAG48ez1tNU_7n8qtnxTYZ5qt-upJ81Fcb0P2rZe38ARK=iyBkA@mail.gmail.com>

On 06.10.2020 01:56, Jann Horn wrote:
> On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
>> On 29.09.2020 21:35, Alexander Popov wrote:
>>> This is the second version of the heap quarantine prototype for the Linux
>>> kernel. I performed a deeper evaluation of its security properties and
>>> developed new features like quarantine randomization and integration with
>>> init_on_free. That is fun! See below for more details.
>>>
>>>
>>> Rationale
>>> =========
>>>
>>> Use-after-free vulnerabilities in the Linux kernel are very popular for
>>> exploitation. There are many examples, some of them:
>>>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Hello Jann, thanks for your reply.

> I don't think your proposed mitigation would work with much
> reliability against this bug; the attacker has full control over the
> timing of the original use and the following use, so an attacker
> should be able to trigger the kmem_cache_free(), then spam enough new
> VMAs and delete them to flush out the quarantine, and then do heap
> spraying as normal, or something like that.

The randomized quarantine will release the vulnerable object at an unpredictable
moment (patch 4/6).

So I think the control over the time of the use-after-free access doesn't help
attackers, if they don't have an "infinite spray" -- unlimited ability to store
controlled data in the kernelspace objects of the needed size without freeing them.

"Unlimited", because the quarantine size is 1/32 of whole memory.
"Without freeing", because freed objects are erased by init_on_free before going
to randomized heap quarantine (patch 3/6).

Would you agree?

> Also, note that here, if the reallocation fails, the kernel still
> wouldn't crash because the dangling object is not accessed further if
> the address range stored in it doesn't match the fault address. So an
> attacker could potentially try multiple times, and if the object
> happens to be on the quarantine the first time, that wouldn't really
> be a showstopper, you'd just try again.

Freed objects are filled by zero before going to quarantine (patch 3/6).
Would it cause a null pointer dereference on unsuccessful try?

>>>  https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
> 
> I think that here, again, the free() and the dangling pointer use were
> caused by separate syscalls, meaning the attacker had control over
> that timing?

As I wrote above, I think attacker's control over this timing is required for a
successful attack, but is not enough for bypassing randomized quarantine.

>>>  https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> 
> Haven't looked at that one in detail.
> 
>>> Use-after-free exploits usually employ heap spraying technique.
>>> Generally it aims to put controlled bytes at a predetermined memory
>>> location on the heap.
> 
> Well, not necessarily "predetermined". Depending on the circumstances,
> you don't necessarily need to know which address you're writing to;
> and you might not even need to overwrite a specific object, but
> instead just have to overwrite one out of a bunch of objects, no
> matter which.

Yes, of course, I didn't mean a "predetermined memory address".
Maybe "definite memory location" is a better phrase for that.

>>> Heap spraying for exploiting use-after-free in the Linux kernel relies on
>>> the fact that on kmalloc(), the slab allocator returns the address of
>>> the memory that was recently freed.
> 
> Yeah; and that behavior is pretty critical for performance. The longer
> it's been since a newly allocated object was freed, the higher the
> chance that you'll end up having to go further down the memory cache
> hierarchy.

Yes. That behaviour is fast, however very convenient for use-after-free
exploitation...

>>> So allocating a kernel object with
>>> the same size and controlled contents allows overwriting the vulnerable
>>> freed object.
> 
> The vmacache exploit you linked to doesn't do that, it frees the
> object all the way back to the page allocator and then sprays 4MiB of
> memory from the page allocator. (Because VMAs use their own
> kmem_cache, and the kmem_cache wasn't merged with any interesting
> ones, and I saw no good way to exploit the bug by reallocating another
> VMA over the old VMA back then. Although of course that doesn't mean
> that there is no such way.)

Sorry, my mistake.
Exploit examples with heap spraying that fit my description:
 - CVE-2017-6074 https://www.openwall.com/lists/oss-security/2017/02/26/2
 - CVE-2017-2636 https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html
 - CVE-2016-8655 https://seclists.org/oss-sec/2016/q4/607
 - CVE-2017-15649
https://ssd-disclosure.com/ssd-advisory-linux-kernel-af_packet-use-after-free/

> [...]
>>> Security properties
>>> ===================
>>>
>>> For researching security properties of the heap quarantine I developed 2 lkdtm
>>> tests (see the patch 5/6).
>>>
>>> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
>>> from a separate kmem_cache and then allocates 400000 similar objects.
>>> I.e. this test performs an original heap spraying technique for use-after-free
>>> exploitation.
>>>
>>> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
>>> reallocated and overwritten:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: FAIL: attempt 0: freed object is reallocated
>>>
>>> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
>>> the freed object:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: OK: original heap spraying hasn't succeed
>>>
>>> That happens because pushing an object through the quarantine requires _both_
>>> allocating and freeing memory. Objects are released from the quarantine on
>>> new memory allocations, but only when the quarantine size is over the limit.
>>> And the quarantine size grows on new memory freeing.
>>>
>>> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
>>> It allocates and frees an object from a separate kmem_cache and then performs
>>> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
>>> This test effectively pushes the object through the heap quarantine and
>>> reallocates it after it returns back to the allocator freelist:
> [...]
>>> As you can see, the number of the allocations that are needed for overwriting
>>> the vulnerable object is almost the same. That would be good for stable
>>> use-after-free exploitation and should not be allowed.
>>> That's why I developed the quarantine randomization (see the patch 4/6).
>>>
>>> This randomization required very small hackish changes of the heap quarantine
>>> mechanism. At first all quarantine batches are filled by objects. Then during
>>> the quarantine reducing I randomly choose and free 1/2 of objects from a
>>> randomly chosen batch. Now the randomized quarantine releases the freed object
>>> at an unpredictable moment:
>>>    lkdtm: Target object is reallocated at attempt 107884
> [...]
>>>    lkdtm: Target object is reallocated at attempt 87343
> 
> Those numbers are fairly big. At that point you might not even fit
> into L3 cache anymore, right? You'd often be hitting DRAM for new
> allocations? And for many slabs, you might end using much more memory
> for the quarantine than for actual in-use allocations.

Yes. The original quarantine size is
  (totalram_pages() << PAGE_SHIFT) / QUARANTINE_FRACTION
where
  #define QUARANTINE_FRACTION 32

> It seems to me like, for this to stop attacks with a high probability,
> you'd have to reserve a huge chunk of kernel memory for the
> quarantines 

Yes, that's how it works now.

> - even if the attacker doesn't know anything about the
> status of the quarantine (which isn't necessarily the case, depending
> on whether the attacker can abuse microarchitectural data leakage, or
> if the attacker can trigger a pure data read through the dangling
> pointer), they should still be able to win with a probability around
> quarantine_size/allocated_memory_size if they have a heap spraying
> primitive without strict limits.

Not sure about this probability evaluation.
I will try calculating it taking the quarantine parameters into account.

>>> However, this randomization alone would not disturb the attacker, because
>>> the quarantine stores the attacker's data (the payload) in the sprayed objects.
>>> I.e. the reallocated and overwritten vulnerable object contains the payload
>>> until the next reallocation (very bad).
>>>
>>> Hence heap objects should be erased before going to the heap quarantine.
>>> Moreover, filling them by zeros gives a chance to detect use-after-free
>>> accesses to non-zero data while an object stays in the quarantine (nice!).
>>> That functionality already exists in the kernel, it's called init_on_free.
>>> I integrated it with CONFIG_SLAB_QUARANTINE in the patch 3/6.
>>>
>>> During that work I found a bug: in CONFIG_SLAB init_on_free happens too
>>> late, and heap objects go to the KASAN quarantine being dirty. See the fix
>>> in the patch 2/6.
> [...]
>> I've made various tests on real hardware and in virtual machines:
>>  1) network throughput test using iperf
>>      server: iperf -s -f K
>>      client: iperf -c 127.0.0.1 -t 60 -f K
>>  2) scheduler stress test
>>      hackbench -s 4000 -l 500 -g 15 -f 25 -P
>>  3) building the defconfig kernel
>>      time make -j2
>>
>> I compared Linux kernel 5.9.0-rc6 with:
>>  - init_on_free=off,
>>  - init_on_free=on,
>>  - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).
>>
>> Each test was performed 5 times. I will show the mean values.
>> If you are interested, I can share all the results and calculate standard deviation.
>>
>> Real hardware, Intel Core i7-6500U CPU
>>  1) Network throughput test with iperf
>>      init_on_free=off: 5467152.2 KBytes/sec
>>      init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
>>  2) Scheduler stress test with hackbench
>>      init_on_free=off: 8.5364s
>>      init_on_free=on: 8.9858s (+5.3% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)
> 
> These numbers seem really high for a mitigation, especially if that
> performance hit does not really buy you deterministic protection
> against many bugs.

Right, I agree.

It's a probabilistic protection, and the probability should be calculated.
I'll work on that.

> [...]
>> N.B. There was NO performance optimization made for this version of the heap
>> quarantine prototype. The main effort was put into researching its security
>> properties (hope for your feedback). Performance optimization will be done in
>> further steps, if we see that my work is worth doing.
> 
> But you are pretty much inherently limited in terms of performance by
> the effect the quarantine has on the data cache, right?

Yes.
However, the quarantine parameters can be adjusted.

> It seems to me like, if you want to make UAF exploitation harder at
> the heap allocator layer, you could do somewhat more effective things
> with a probably much smaller performance budget. Things like
> preventing the reallocation of virtual kernel addresses with different
> types, such that an attacker can only replace a UAF object with
> another object of the same type. (That is not an idea I like very much
> either, but I would like it more than this proposal.) (E.g. some
> browsers implement things along those lines, I believe.)

That's interesting, thank you.

Best regards,
Alexander

  parent reply	other threads:[~2020-10-06 17:56 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier Alexander Popov
2020-09-30 12:50   ` Alexander Potapenko
2020-10-01 19:48     ` Alexander Popov
2020-12-03 19:50     ` Alexander Popov
2020-12-03 20:49       ` Andrew Morton
2020-12-04 11:54         ` Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 3/6] mm: Integrate SLAB_QUARANTINE with init_on_free Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 4/6] mm: Implement slab quarantine randomization Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 5/6] lkdtm: Add heap quarantine tests Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 6/6] mm: Add heap quarantine verbose debugging (not for merge) Alexander Popov
2020-10-01 19:42 ` [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
2020-10-05 22:56   ` Jann Horn
2020-10-06  0:44     ` Matthew Wilcox
2020-10-06  0:48       ` Jann Horn
2020-10-06  2:09       ` Kees Cook
2020-10-06  2:16         ` Jann Horn
2020-10-06  2:19         ` Daniel Micay
2020-10-06  8:35         ` Christopher Lameter
2020-10-06  8:32       ` Christopher Lameter
2020-10-06 17:56     ` Alexander Popov [this message]
2020-10-06 18:37       ` Jann Horn
2020-10-06 19:25         ` Alexander Popov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1b5cf312-f7bb-87ce-6658-5ca741c2e790@linux.com \
    --to=alex.popov@linux.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@google.com \
    --cc=arnd@arndb.de \
    --cc=aryabinin@virtuozzo.com \
    --cc=cl@linux.com \
    --cc=danielmicay@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dvyukov@google.com \
    --cc=ebiederm@xmission.com \
    --cc=glider@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=jannh@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=krzk@kernel.org \
    --cc=labbott@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=masahiroy@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=notify@kernel.org \
    --cc=patrick.bellasi@arm.com \
    --cc=pavel@denx.de \
    --cc=penberg@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=valentin.schneider@arm.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).