Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free

From: Alexander Popov <alex.popov@linux.com>
To: Jann Horn <jannh@google.com>, Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Alexander Potapenko <glider@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	David Howells <dhowells@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Laura Abbott <labbott@redhat.com>, Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Daniel Micay <danielmicay@gmail.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Pavel Machek <pavel@denx.de>,
	Valentin Schneider <valentin.schneider@arm.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Linux-MM <linux-mm@kvack.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	notify@kernel.org
Subject: Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
Date: Tue, 6 Oct 2020 20:56:09 +0300	[thread overview]
Message-ID: <1b5cf312-f7bb-87ce-6658-5ca741c2e790@linux.com> (raw)
In-Reply-To: <CAG48ez1tNU_7n8qtnxTYZ5qt-upJ81Fcb0P2rZe38ARK=iyBkA@mail.gmail.com>

On 06.10.2020 01:56, Jann Horn wrote:
> On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
>> On 29.09.2020 21:35, Alexander Popov wrote:
>>> This is the second version of the heap quarantine prototype for the Linux
>>> kernel. I performed a deeper evaluation of its security properties and
>>> developed new features like quarantine randomization and integration with
>>> init_on_free. That is fun! See below for more details.
>>>
>>>
>>> Rationale
>>> =========
>>>
>>> Use-after-free vulnerabilities in the Linux kernel are very popular for
>>> exploitation. There are many examples, some of them:
>>>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Hello Jann, thanks for your reply.

> I don't think your proposed mitigation would work with much
> reliability against this bug; the attacker has full control over the
> timing of the original use and the following use, so an attacker
> should be able to trigger the kmem_cache_free(), then spam enough new
> VMAs and delete them to flush out the quarantine, and then do heap
> spraying as normal, or something like that.

The randomized quarantine will release the vulnerable object at an unpredictable
moment (patch 4/6).

So I think the control over the time of the use-after-free access doesn't help
attackers, if they don't have an "infinite spray" -- unlimited ability to store
controlled data in the kernelspace objects of the needed size without freeing them.

"Unlimited", because the quarantine size is 1/32 of whole memory.
"Without freeing", because freed objects are erased by init_on_free before going
to randomized heap quarantine (patch 3/6).

Would you agree?

> Also, note that here, if the reallocation fails, the kernel still
> wouldn't crash because the dangling object is not accessed further if
> the address range stored in it doesn't match the fault address. So an
> attacker could potentially try multiple times, and if the object
> happens to be on the quarantine the first time, that wouldn't really
> be a showstopper, you'd just try again.

Freed objects are filled by zero before going to quarantine (patch 3/6).
Would it cause a null pointer dereference on unsuccessful try?

>>>  https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
> 
> I think that here, again, the free() and the dangling pointer use were
> caused by separate syscalls, meaning the attacker had control over
> that timing?

As I wrote above, I think attacker's control over this timing is required for a
successful attack, but is not enough for bypassing randomized quarantine.

>>>  https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> 
> Haven't looked at that one in detail.
> 
>>> Use-after-free exploits usually employ heap spraying technique.
>>> Generally it aims to put controlled bytes at a predetermined memory
>>> location on the heap.
> 
> Well, not necessarily "predetermined". Depending on the circumstances,
> you don't necessarily need to know which address you're writing to;
> and you might not even need to overwrite a specific object, but
> instead just have to overwrite one out of a bunch of objects, no
> matter which.

Yes, of course, I didn't mean a "predetermined memory address".
Maybe "definite memory location" is a better phrase for that.

>>> Heap spraying for exploiting use-after-free in the Linux kernel relies on
>>> the fact that on kmalloc(), the slab allocator returns the address of
>>> the memory that was recently freed.
> 
> Yeah; and that behavior is pretty critical for performance. The longer
> it's been since a newly allocated object was freed, the higher the
> chance that you'll end up having to go further down the memory cache
> hierarchy.

Yes. That behaviour is fast, however very convenient for use-after-free
exploitation...

>>> So allocating a kernel object with
>>> the same size and controlled contents allows overwriting the vulnerable
>>> freed object.
> 
> The vmacache exploit you linked to doesn't do that, it frees the
> object all the way back to the page allocator and then sprays 4MiB of
> memory from the page allocator. (Because VMAs use their own
> kmem_cache, and the kmem_cache wasn't merged with any interesting
> ones, and I saw no good way to exploit the bug by reallocating another
> VMA over the old VMA back then. Although of course that doesn't mean
> that there is no such way.)

Sorry, my mistake.
Exploit examples with heap spraying that fit my description:
 - CVE-2017-6074 https://www.openwall.com/lists/oss-security/2017/02/26/2
 - CVE-2017-2636 https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html
 - CVE-2016-8655 https://seclists.org/oss-sec/2016/q4/607
 - CVE-2017-15649
https://ssd-disclosure.com/ssd-advisory-linux-kernel-af_packet-use-after-free/

> [...]
>>> Security properties
>>> ===================
>>>
>>> For researching security properties of the heap quarantine I developed 2 lkdtm
>>> tests (see the patch 5/6).
>>>
>>> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
>>> from a separate kmem_cache and then allocates 400000 similar objects.
>>> I.e. this test performs an original heap spraying technique for use-after-free
>>> exploitation.
>>>
>>> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
>>> reallocated and overwritten:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: FAIL: attempt 0: freed object is reallocated
>>>
>>> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
>>> the freed object:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: OK: original heap spraying hasn't succeed
>>>
>>> That happens because pushing an object through the quarantine requires _both_
>>> allocating and freeing memory. Objects are released from the quarantine on
>>> new memory allocations, but only when the quarantine size is over the limit.
>>> And the quarantine size grows on new memory freeing.
>>>
>>> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
>>> It allocates and frees an object from a separate kmem_cache and then performs
>>> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
>>> This test effectively pushes the object through the heap quarantine and
>>> reallocates it after it returns back to the allocator freelist:
> [...]
>>> As you can see, the number of the allocations that are needed for overwriting
>>> the vulnerable object is almost the same. That would be good for stable
>>> use-after-free exploitation and should not be allowed.
>>> That's why I developed the quarantine randomization (see the patch 4/6).
>>>
>>> This randomization required very small hackish changes of the heap quarantine
>>> mechanism. At first all quarantine batches are filled by objects. Then during
>>> the quarantine reducing I randomly choose and free 1/2 of objects from a
>>> randomly chosen batch. Now the randomized quarantine releases the freed object
>>> at an unpredictable moment:
>>>    lkdtm: Target object is reallocated at attempt 107884
> [...]
>>>    lkdtm: Target object is reallocated at attempt 87343
> 
> Those numbers are fairly big. At that point you might not even fit
> into L3 cache anymore, right? You'd often be hitting DRAM for new
> allocations? And for many slabs, you might end using much more memory
> for the quarantine than for actual in-use allocations.

Yes. The original quarantine size is
  (totalram_pages() << PAGE_SHIFT) / QUARANTINE_FRACTION
where
  #define QUARANTINE_FRACTION 32

> It seems to me like, for this to stop attacks with a high probability,
> you'd have to reserve a huge chunk of kernel memory for the
> quarantines 

Yes, that's how it works now.

> - even if the attacker doesn't know anything about the
> status of the quarantine (which isn't necessarily the case, depending
> on whether the attacker can abuse microarchitectural data leakage, or
> if the attacker can trigger a pure data read through the dangling
> pointer), they should still be able to win with a probability around
> quarantine_size/allocated_memory_size if they have a heap spraying
> primitive without strict limits.

Not sure about this probability evaluation.
I will try calculating it taking the quarantine parameters into account.

>>> However, this randomization alone would not disturb the attacker, because
>>> the quarantine stores the attacker's data (the payload) in the sprayed objects.
>>> I.e. the reallocated and overwritten vulnerable object contains the payload
>>> until the next reallocation (very bad).
>>>
>>> Hence heap objects should be erased before going to the heap quarantine.
>>> Moreover, filling them by zeros gives a chance to detect use-after-free
>>> accesses to non-zero data while an object stays in the quarantine (nice!).
>>> That functionality already exists in the kernel, it's called init_on_free.
>>> I integrated it with CONFIG_SLAB_QUARANTINE in the patch 3/6.
>>>
>>> During that work I found a bug: in CONFIG_SLAB init_on_free happens too
>>> late, and heap objects go to the KASAN quarantine being dirty. See the fix
>>> in the patch 2/6.
> [...]
>> I've made various tests on real hardware and in virtual machines:
>>  1) network throughput test using iperf
>>      server: iperf -s -f K
>>      client: iperf -c 127.0.0.1 -t 60 -f K
>>  2) scheduler stress test
>>      hackbench -s 4000 -l 500 -g 15 -f 25 -P
>>  3) building the defconfig kernel
>>      time make -j2
>>
>> I compared Linux kernel 5.9.0-rc6 with:
>>  - init_on_free=off,
>>  - init_on_free=on,
>>  - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).
>>
>> Each test was performed 5 times. I will show the mean values.
>> If you are interested, I can share all the results and calculate standard deviation.
>>
>> Real hardware, Intel Core i7-6500U CPU
>>  1) Network throughput test with iperf
>>      init_on_free=off: 5467152.2 KBytes/sec
>>      init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
>>  2) Scheduler stress test with hackbench
>>      init_on_free=off: 8.5364s
>>      init_on_free=on: 8.9858s (+5.3% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)
> 
> These numbers seem really high for a mitigation, especially if that
> performance hit does not really buy you deterministic protection
> against many bugs.

Right, I agree.

It's a probabilistic protection, and the probability should be calculated.
I'll work on that.

> [...]
>> N.B. There was NO performance optimization made for this version of the heap
>> quarantine prototype. The main effort was put into researching its security
>> properties (hope for your feedback). Performance optimization will be done in
>> further steps, if we see that my work is worth doing.
> 
> But you are pretty much inherently limited in terms of performance by
> the effect the quarantine has on the data cache, right?

Yes.
However, the quarantine parameters can be adjusted.

> It seems to me like, if you want to make UAF exploitation harder at
> the heap allocator layer, you could do somewhat more effective things
> with a probably much smaller performance budget. Things like
> preventing the reallocation of virtual kernel addresses with different
> types, such that an attacker can only replace a UAF object with
> another object of the same type. (That is not an idea I like very much
> either, but I would like it more than this proposal.) (E.g. some
> browsers implement things along those lines, I believe.)

That's interesting, thank you.

Best regards,
Alexander