Kernel-hardening Archive on lore.kernel.org
* [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
@ 2020-09-29 18:35 Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN Alexander Popov
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Hello everyone! I'm requesting your comments on this patch series.

This is the second version of the heap quarantine prototype for the Linux
kernel. I performed a deeper evaluation of its security properties and
developed new features like quarantine randomization and integration with
init_on_free. That is fun! See below for more details.


Rationale
=========

Use-after-free vulnerabilities in the Linux kernel are very popular for
exploitation. There are many examples; here are a few:
 https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
 https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
 https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html

Use-after-free exploits usually employ the heap spraying technique, which
generally aims to put controlled bytes at a predetermined memory location
on the heap.

Heap spraying for exploiting use-after-free in the Linux kernel relies on
the fact that on kmalloc(), the slab allocator returns the address of
the memory that was recently freed. So allocating a kernel object with
the same size and controlled contents allows overwriting the vulnerable
freed object.
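The reuse behavior that spraying relies on can be illustrated with a minimal user-space model of a LIFO slab freelist. This is purely illustrative: the names and sizes are made up, and it is not the kernel's allocator code.

```c
/*
 * Hypothetical illustration (not kernel code): a tiny LIFO freelist.
 * The slot freed most recently is the first one handed out by the next
 * same-size allocation, so an attacker who allocates a controlled
 * object right after the buggy free overlays the victim object.
 */
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 64
#define NR_SLOTS  8

static char slab[NR_SLOTS][SLOT_SIZE];
static void *freelist[NR_SLOTS];
static int free_top;

static void cache_init(void)
{
	free_top = 0;
	for (int i = NR_SLOTS - 1; i >= 0; i--)
		freelist[free_top++] = slab[i];
}

static void *cache_alloc(void)
{
	/* LIFO: hand out the most recently freed slot first */
	return free_top ? freelist[--free_top] : NULL;
}

static void cache_free(void *p)
{
	freelist[free_top++] = p;
}

/* Returns 1 if a "spray" allocation reuses the slot freed just before it. */
int spray_reuses_slot(void)
{
	void *victim, *spray;

	cache_init();
	victim = cache_alloc();
	cache_free(victim);             /* the use-after-free bug frees it... */
	spray = cache_alloc();          /* ...and the sprayed object lands there */
	memset(spray, 0x41, SLOT_SIZE); /* controlled bytes now overlay the victim */
	return spray == victim;
}
```

Here the `memset()` stands in for spraying a controlled object; in a real exploit, the controlled contents form the fake object that replaces the victim.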

I've found an easy way to break heap spraying for use-after-free
exploitation. I extracted the slab freelist quarantine from the KASAN
functionality and called it CONFIG_SLAB_QUARANTINE. Please see patch 1/6.

If this feature is enabled, freed allocations are stored in the quarantine
queue where they wait for actual freeing. So they can't be instantly
reallocated and overwritten by use-after-free exploits.

N.B. Heap spraying for out-of-bounds exploitation is a different technique;
the heap quarantine doesn't break it.


Security properties
===================

To research the security properties of the heap quarantine, I developed two
lkdtm tests (see patch 5/6).

The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
from a separate kmem_cache and then allocates 400000 similar objects.
I.e. this test performs the original heap spraying technique for
use-after-free exploitation.

If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
reallocated and overwritten:
  # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
   lkdtm: Performing direct entry HEAP_SPRAY
   lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
   lkdtm: Original heap spraying: allocate 400000 objects of size 333...
   lkdtm: FAIL: attempt 0: freed object is reallocated

If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
the freed object:
  # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
   lkdtm: Performing direct entry HEAP_SPRAY
   lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
   lkdtm: Original heap spraying: allocate 400000 objects of size 333...
   lkdtm: OK: original heap spraying hasn't succeed

That happens because pushing an object through the quarantine requires _both_
allocating and freeing memory. Objects are released from the quarantine on
new memory allocations, but only when the quarantine size is over the limit.
And the quarantine size grows as memory is freed.
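A toy model of that alloc/free interplay might look as follows. The names and the tiny quarantine limit are hypothetical, chosen for the demo; the real logic lives in mm/kasan/quarantine.c.

```c
/*
 * Illustrative model only: frees enqueue object ids into a FIFO
 * quarantine; the allocator's freelist gets them back only during an
 * allocation, and only once the quarantine is over its limit.
 */
#include <stddef.h>

#define Q_LIMIT 4  /* pretend quarantine capacity, in objects */

static int quarantine[64];  /* FIFO of quarantined object ids */
static int q_len;
static int freelist[64];    /* allocator freelist (LIFO) */
static int fl_len;
static int next_id;

static void obj_free(int id)
{
	quarantine[q_len++] = id;  /* grows on free; nothing is released here */
}

static int obj_alloc(void)
{
	/* quarantine_reduce() analogue: runs on allocation, only over the limit */
	while (q_len > Q_LIMIT) {
		freelist[fl_len++] = quarantine[0];  /* oldest object truly freed */
		for (int i = 1; i < q_len; i++)
			quarantine[i - 1] = quarantine[i];
		q_len--;
	}
	return fl_len ? freelist[--fl_len] : next_id++;  /* fresh id otherwise */
}

/* How many alloc+free rounds until the victim id is handed out again? */
int rounds_until_reuse(void)
{
	int victim = obj_alloc();
	int rounds = 0;

	obj_free(victim);
	for (;;) {
		int id = obj_alloc();

		rounds++;
		if (id == victim)
			return rounds;
		obj_free(id);
	}
}
```

With a limit of 4 objects, the victim only comes back on the 5th round; scaled up to the real quarantine size, this is why lkdtm_PUSH_THROUGH_QUARANTINE needs hundreds of thousands of attempts.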

That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
It allocates and frees an object from a separate kmem_cache and then performs
kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
This test effectively pushes the object through the heap quarantine and
reallocates it after it returns back to the allocator freelist:
  # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
   lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
   lkdtm: Allocated and freed spray_cache object 000000008fdb15c3 of size 333
   lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
   lkdtm: Target object is reallocated at attempt 182994
  # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
   lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
   lkdtm: Allocated and freed spray_cache object 000000004e223cbe of size 333
   lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
   lkdtm: Target object is reallocated at attempt 186830
  # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
   lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
   lkdtm: Allocated and freed spray_cache object 000000007663a058 of size 333
   lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
   lkdtm: Target object is reallocated at attempt 182010

As you can see, the number of allocations needed to overwrite the vulnerable
object is almost the same across runs. That would enable stable
use-after-free exploitation and should not be allowed.
That's why I developed the quarantine randomization (see patch 4/6).

This randomization required only small, somewhat hackish changes to the heap
quarantine mechanism. At first, all quarantine batches are filled with
objects. Then, during quarantine reduction, I free a randomly chosen half of
the objects from a randomly chosen batch. Now the randomized quarantine
releases the freed object at an unpredictable moment:
   lkdtm: Target object is reallocated at attempt 107884
   lkdtm: Target object is reallocated at attempt 265641
   lkdtm: Target object is reallocated at attempt 100030
   lkdtm: Target object is NOT reallocated in 400000 attempts
   lkdtm: Target object is reallocated at attempt 204731
   lkdtm: Target object is reallocated at attempt 359333
   lkdtm: Target object is reallocated at attempt 289349
   lkdtm: Target object is reallocated at attempt 119893
   lkdtm: Target object is reallocated at attempt 225202
   lkdtm: Target object is reallocated at attempt 87343
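The randomized reduction described above can be sketched like this. It is an illustrative model with a self-contained PRNG and made-up sizes; the actual implementation is in patch 4/6.

```c
/*
 * Illustrative sketch of randomized quarantine reduction: pick a batch
 * at random and release a randomly chosen half of its objects, so the
 * release moment of any single object becomes unpredictable.
 */
#include <stdint.h>

#define NR_BATCHES  4
#define BATCH_OBJS  8

static int batch[NR_BATCHES][BATCH_OBJS];  /* nonzero = object present */

static uint32_t rng_state = 2463534242u;

static uint32_t xorshift32(void)  /* tiny deterministic PRNG for the demo */
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return rng_state;
}

static void fill_batches(void)
{
	for (int b = 0; b < NR_BATCHES; b++)
		for (int i = 0; i < BATCH_OBJS; i++)
			batch[b][i] = 1;
}

/* Release half of a randomly chosen batch; returns the batch index. */
static int randomized_reduce(void)
{
	int b = xorshift32() % NR_BATCHES;
	int released = 0;

	while (released < BATCH_OBJS / 2) {
		int i = xorshift32() % BATCH_OBJS;

		if (batch[b][i]) {  /* "free" a random still-present object */
			batch[b][i] = 0;
			released++;
		}
	}
	return b;
}

/* Fill the batches, reduce once, and count what's left in the hit batch. */
int reduce_and_count(void)
{
	int b;

	fill_batches();
	b = randomized_reduce();

	int n = 0;
	for (int i = 0; i < BATCH_OBJS; i++)
		n += batch[b][i];
	return n;
}
```

Exactly half of the chosen batch survives each reduction, but which batch and which objects are hit is up to the PRNG.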

However, this randomization alone would not stop the attacker, because
the quarantine stores the attacker's data (the payload) in the sprayed objects.
I.e. the reallocated and overwritten vulnerable object contains the payload
until the next reallocation (very bad).

Hence heap objects should be erased before entering the heap quarantine.
Moreover, filling them with zeros makes it possible to detect use-after-free
accesses to non-zero data while an object stays in the quarantine (nice!).
That functionality already exists in the kernel; it's called init_on_free.
I integrated it with CONFIG_SLAB_QUARANTINE in patch 3/6.
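The erase-then-check idea can be sketched as follows, with hypothetical helper names standing in for the kernel's init_on_free path.

```c
/*
 * Illustrative only: the object is zeroed when it enters the quarantine
 * (init_on_free), so any nonzero byte found while it sits there betrays
 * a use-after-free write that happened in between.
 */
#include <string.h>

#define OBJ_SIZE 32

static unsigned char quarantined_obj[OBJ_SIZE];

static void quarantine_enter(unsigned char *obj)
{
	memset(obj, 0, OBJ_SIZE);  /* erase the payload before quarantining */
}

/* Returns 1 if the object is still all-zero, 0 if a UAF write dirtied it. */
static int quarantine_check(const unsigned char *obj)
{
	for (int i = 0; i < OBJ_SIZE; i++)
		if (obj[i])
			return 0;
	return 1;
}

int demo(int do_uaf_write)
{
	memset(quarantined_obj, 0x41, OBJ_SIZE);  /* attacker payload before free */
	quarantine_enter(quarantined_obj);
	if (do_uaf_write)
		quarantined_obj[5] = 0xfe;  /* dangling-pointer write while quarantined */
	return quarantine_check(quarantined_obj);
}
```

The zeroing both destroys the sprayed payload and turns the quarantined object into a tripwire for stray writes.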

During that work I found a bug: in CONFIG_SLAB, init_on_free happens too
late, and heap objects enter the KASAN quarantine still dirty. See the fix
in patch 2/6.

For a deeper understanding of the heap quarantine's inner workings, I attach
patch 6/6, which adds verbose debugging (not for merge).
It's very helpful; see the output example:
   quarantine: PUT 508992 to tail batch 123, whole sz 65118872, batch sz 508854
   quarantine: whole sz exceed max by 494552, REDUCE head batch 0 by 415392, leave 396304
   quarantine: data level in batches:
     0 - 77%
     1 - 108%
     2 - 83%
     3 - 21%
   ...
     125 - 75%
     126 - 12%
     127 - 108%
   quarantine: whole sz exceed max by 79160, REDUCE head batch 12 by 14160, leave 17608
   quarantine: whole sz exceed max by 65000, REDUCE head batch 75 by 218328, leave 195232
   quarantine: PUT 508992 to tail batch 124, whole sz 64979984, batch sz 508854
   ...


Changes in v2
=============

 - Added heap quarantine randomization (the patch 4/6).

 - Integrated CONFIG_SLAB_QUARANTINE with init_on_free (the patch 3/6).

 - Fixed late init_on_free in CONFIG_SLAB (the patch 2/6).

 - Added lkdtm_PUSH_THROUGH_QUARANTINE test.

 - Added the quarantine verbose debugging (the patch 6/6, not for merge).

 - Improved the descriptions according to the feedback from Kees Cook
   and Matthew Wilcox.

 - Made fixes recommended by Kees Cook:

   * Avoided BUG_ON() in kasan_cache_create() by handling the error and
     reporting with WARN_ON().

   * Created a separate kmem_cache for new lkdtm tests.

   * Fixed kasan_track.pid type to pid_t.


TODO for the next prototypes
============================

1. Performance evaluation and optimization.
   I would really appreciate your ideas about performance testing of a
   kernel with the heap quarantine. The first prototype was tested with
   hackbench and kernel build timing (which showed very different numbers).
   Earlier the developers similarly tested init_on_free functionality.
   However, Brad Spengler said on Twitter that such a testing method
   is poor.

2. Complete separation of CONFIG_SLAB_QUARANTINE from KASAN (feedback
   from Andrey Konovalov).

3. Adding a kernel boot parameter for enabling/disabling the heap quarantine
   (feedback from Kees Cook).

4. Testing the heap quarantine in near-OOM situations (feedback from
   Pavel Machek).

5. Does this work somehow help or disturb the integration of the
   Memory Tagging for the Linux kernel?

6. After rebasing the series onto v5.9.0-rc6, a CONFIG_SLAB kernel started to
   show warnings about a few slab caches that have no room for the additional
   metadata. This needs more investigation. I believe it affects KASAN's bug
   detection abilities as well. Warning example:
     WARNING: CPU: 0 PID: 0 at mm/kasan/slab_quarantine.c:38 kasan_cache_create+0x37/0x50
     Modules linked in:
     CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc6+ #1
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
     RIP: 0010:kasan_cache_create+0x37/0x50
     ...
     Call Trace:
      __kmem_cache_create+0x74/0x250
      create_boot_cache+0x6d/0x91
      create_kmalloc_cache+0x57/0x93
      new_kmalloc_cache+0x39/0x47
      create_kmalloc_caches+0x33/0xd9
      start_kernel+0x25b/0x532
      secondary_startup_64+0xb6/0xc0

Thanks in advance for your feedback.
Best regards,
Alexander


Alexander Popov (6):
  mm: Extract SLAB_QUARANTINE from KASAN
  mm/slab: Perform init_on_free earlier
  mm: Integrate SLAB_QUARANTINE with init_on_free
  mm: Implement slab quarantine randomization
  lkdtm: Add heap quarantine tests
  mm: Add heap quarantine verbose debugging (not for merge)

 drivers/misc/lkdtm/core.c  |   2 +
 drivers/misc/lkdtm/heap.c  | 110 +++++++++++++++++++++++++++++++++++++
 drivers/misc/lkdtm/lkdtm.h |   2 +
 include/linux/kasan.h      | 107 ++++++++++++++++++++----------------
 include/linux/slab_def.h   |   2 +-
 include/linux/slub_def.h   |   2 +-
 init/Kconfig               |  14 +++++
 mm/Makefile                |   3 +-
 mm/kasan/Makefile          |   2 +
 mm/kasan/kasan.h           |  75 +++++++++++++------------
 mm/kasan/quarantine.c      | 102 ++++++++++++++++++++++++++++++----
 mm/kasan/slab_quarantine.c | 106 +++++++++++++++++++++++++++++++++++
 mm/page_alloc.c            |  22 ++++++++
 mm/slab.c                  |   5 +-
 mm/slub.c                  |   2 +-
 15 files changed, 455 insertions(+), 101 deletions(-)
 create mode 100644 mm/kasan/slab_quarantine.c

-- 
2.26.2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier Alexander Popov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Heap spraying is an exploitation technique that aims to put controlled
bytes at a predetermined memory location on the heap. Heap spraying for
exploiting use-after-free in the Linux kernel relies on the fact that on
kmalloc(), the slab allocator returns the address of the memory that was
recently freed. Allocating a kernel object with the same size and
controlled contents allows overwriting the vulnerable freed object.

Let's extract slab freelist quarantine from KASAN functionality and
call it CONFIG_SLAB_QUARANTINE. This feature breaks widespread heap
spraying technique for exploiting use-after-free vulnerabilities
in the kernel code.

If this feature is enabled, freed allocations are stored in the quarantine
queue where they wait for actual freeing. So they can't be instantly
reallocated and overwritten by use-after-free exploits.

N.B. Heap spraying for out-of-bounds exploitation is a different technique;
heap quarantine doesn't break it.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 include/linux/kasan.h      | 107 ++++++++++++++++++++-----------------
 include/linux/slab_def.h   |   2 +-
 include/linux/slub_def.h   |   2 +-
 init/Kconfig               |  13 +++++
 mm/Makefile                |   3 +-
 mm/kasan/Makefile          |   2 +
 mm/kasan/kasan.h           |  75 +++++++++++++-------------
 mm/kasan/quarantine.c      |   2 +
 mm/kasan/slab_quarantine.c | 106 ++++++++++++++++++++++++++++++++++++
 mm/slub.c                  |   2 +-
 10 files changed, 225 insertions(+), 89 deletions(-)
 create mode 100644 mm/kasan/slab_quarantine.c

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 087fba34b209..b837216f760c 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -42,32 +42,14 @@ void kasan_unpoison_task_stack(struct task_struct *task);
 void kasan_alloc_pages(struct page *page, unsigned int order);
 void kasan_free_pages(struct page *page, unsigned int order);
 
-void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
-			slab_flags_t *flags);
-
 void kasan_poison_slab(struct page *page);
 void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
 void kasan_poison_object_data(struct kmem_cache *cache, void *object);
 void * __must_check kasan_init_slab_obj(struct kmem_cache *cache,
 					const void *object);
 
-void * __must_check kasan_kmalloc_large(const void *ptr, size_t size,
-						gfp_t flags);
 void kasan_kfree_large(void *ptr, unsigned long ip);
 void kasan_poison_kfree(void *ptr, unsigned long ip);
-void * __must_check kasan_kmalloc(struct kmem_cache *s, const void *object,
-					size_t size, gfp_t flags);
-void * __must_check kasan_krealloc(const void *object, size_t new_size,
-					gfp_t flags);
-
-void * __must_check kasan_slab_alloc(struct kmem_cache *s, void *object,
-					gfp_t flags);
-bool kasan_slab_free(struct kmem_cache *s, void *object, unsigned long ip);
-
-struct kasan_cache {
-	int alloc_meta_offset;
-	int free_meta_offset;
-};
 
 /*
  * These functions provide a special case to support backing module
@@ -107,10 +89,6 @@ static inline void kasan_disable_current(void) {}
 static inline void kasan_alloc_pages(struct page *page, unsigned int order) {}
 static inline void kasan_free_pages(struct page *page, unsigned int order) {}
 
-static inline void kasan_cache_create(struct kmem_cache *cache,
-				      unsigned int *size,
-				      slab_flags_t *flags) {}
-
 static inline void kasan_poison_slab(struct page *page) {}
 static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
 					void *object) {}
@@ -122,17 +100,65 @@ static inline void *kasan_init_slab_obj(struct kmem_cache *cache,
 	return (void *)object;
 }
 
+static inline void kasan_kfree_large(void *ptr, unsigned long ip) {}
+static inline void kasan_poison_kfree(void *ptr, unsigned long ip) {}
+static inline void kasan_free_shadow(const struct vm_struct *vm) {}
+static inline void kasan_remove_zero_shadow(void *start, unsigned long size) {}
+static inline void kasan_unpoison_slab(const void *ptr) {}
+
+static inline int kasan_module_alloc(void *addr, size_t size)
+{
+	return 0;
+}
+
+static inline int kasan_add_zero_shadow(void *start, unsigned long size)
+{
+	return 0;
+}
+
+static inline size_t kasan_metadata_size(struct kmem_cache *cache)
+{
+	return 0;
+}
+
+#endif /* CONFIG_KASAN */
+
+struct kasan_cache {
+	int alloc_meta_offset;
+	int free_meta_offset;
+};
+
+#if defined(CONFIG_KASAN) || defined(CONFIG_SLAB_QUARANTINE)
+
+void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
+			slab_flags_t *flags);
+void * __must_check kasan_kmalloc_large(const void *ptr, size_t size,
+						gfp_t flags);
+void * __must_check kasan_kmalloc(struct kmem_cache *s, const void *object,
+					size_t size, gfp_t flags);
+void * __must_check kasan_krealloc(const void *object, size_t new_size,
+					gfp_t flags);
+void * __must_check kasan_slab_alloc(struct kmem_cache *s, void *object,
+					gfp_t flags);
+bool kasan_slab_free(struct kmem_cache *s, void *object, unsigned long ip);
+
+#else /* CONFIG_KASAN || CONFIG_SLAB_QUARANTINE */
+
+static inline void kasan_cache_create(struct kmem_cache *cache,
+				      unsigned int *size,
+				      slab_flags_t *flags) {}
+
 static inline void *kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags)
 {
 	return ptr;
 }
-static inline void kasan_kfree_large(void *ptr, unsigned long ip) {}
-static inline void kasan_poison_kfree(void *ptr, unsigned long ip) {}
+
 static inline void *kasan_kmalloc(struct kmem_cache *s, const void *object,
 				size_t size, gfp_t flags)
 {
 	return (void *)object;
 }
+
 static inline void *kasan_krealloc(const void *object, size_t new_size,
 				 gfp_t flags)
 {
@@ -144,43 +170,28 @@ static inline void *kasan_slab_alloc(struct kmem_cache *s, void *object,
 {
 	return object;
 }
+
 static inline bool kasan_slab_free(struct kmem_cache *s, void *object,
 				   unsigned long ip)
 {
 	return false;
 }
-
-static inline int kasan_module_alloc(void *addr, size_t size) { return 0; }
-static inline void kasan_free_shadow(const struct vm_struct *vm) {}
-
-static inline int kasan_add_zero_shadow(void *start, unsigned long size)
-{
-	return 0;
-}
-static inline void kasan_remove_zero_shadow(void *start,
-					unsigned long size)
-{}
-
-static inline void kasan_unpoison_slab(const void *ptr) { }
-static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }
-
-#endif /* CONFIG_KASAN */
+#endif /* CONFIG_KASAN || CONFIG_SLAB_QUARANTINE */
 
 #ifdef CONFIG_KASAN_GENERIC
-
 #define KASAN_SHADOW_INIT 0
-
-void kasan_cache_shrink(struct kmem_cache *cache);
-void kasan_cache_shutdown(struct kmem_cache *cache);
 void kasan_record_aux_stack(void *ptr);
-
 #else /* CONFIG_KASAN_GENERIC */
+static inline void kasan_record_aux_stack(void *ptr) {}
+#endif /* CONFIG_KASAN_GENERIC */
 
+#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_SLAB_QUARANTINE)
+void kasan_cache_shrink(struct kmem_cache *cache);
+void kasan_cache_shutdown(struct kmem_cache *cache);
+#else /* CONFIG_KASAN_GENERIC || CONFIG_SLAB_QUARANTINE */
 static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
 static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
-static inline void kasan_record_aux_stack(void *ptr) {}
-
-#endif /* CONFIG_KASAN_GENERIC */
+#endif /* CONFIG_KASAN_GENERIC || CONFIG_SLAB_QUARANTINE */
 
 #ifdef CONFIG_KASAN_SW_TAGS
 
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 9eb430c163c2..fc7548f27512 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -72,7 +72,7 @@ struct kmem_cache {
 	int obj_offset;
 #endif /* CONFIG_DEBUG_SLAB */
 
-#ifdef CONFIG_KASAN
+#if defined(CONFIG_KASAN) || defined(CONFIG_SLAB_QUARANTINE)
 	struct kasan_cache kasan_info;
 #endif
 
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 1be0ed5befa1..71020cee9fd2 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -124,7 +124,7 @@ struct kmem_cache {
 	unsigned int *random_seq;
 #endif
 
-#ifdef CONFIG_KASAN
+#if defined(CONFIG_KASAN) || defined(CONFIG_SLAB_QUARANTINE)
 	struct kasan_cache kasan_info;
 #endif
 
diff --git a/init/Kconfig b/init/Kconfig
index d6a0b31b13dc..358c8ce818f4 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1931,6 +1931,19 @@ config SLAB_FREELIST_HARDENED
 	  sanity-checking than others. This option is most effective with
 	  CONFIG_SLUB.
 
+config SLAB_QUARANTINE
+	bool "Enable slab freelist quarantine"
+	depends on !KASAN && (SLAB || SLUB)
+	help
+	  Enable slab freelist quarantine to delay reusing of freed slab
+	  objects. If this feature is enabled, freed objects are stored
+	  in the quarantine queue where they wait for actual freeing.
+	  So they can't be instantly reallocated and overwritten by
+	  use-after-free exploits. In other words, this feature mitigates
+	  heap spraying technique for exploiting use-after-free
+	  vulnerabilities in the kernel code.
+	  KASAN also employs this feature for use-after-free detection.
+
 config SHUFFLE_PAGE_ALLOCATOR
 	bool "Page allocator randomization"
 	default SLAB_FREELIST_RANDOM && ACPI_NUMA
diff --git a/mm/Makefile b/mm/Makefile
index d5649f1c12c0..c052bc616a88 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -52,7 +52,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   mm_init.o percpu.o slab_common.o \
 			   compaction.o vmacache.o \
 			   interval_tree.o list_lru.o workingset.o \
-			   debug.o gup.o $(mmu-y)
+			   debug.o gup.o kasan/ $(mmu-y)
 
 # Give 'page_alloc' its own module-parameter namespace
 page-alloc-y := page_alloc.o
@@ -80,7 +80,6 @@ obj-$(CONFIG_KSM) += ksm.o
 obj-$(CONFIG_PAGE_POISONING) += page_poison.o
 obj-$(CONFIG_SLAB) += slab.o
 obj-$(CONFIG_SLUB) += slub.o
-obj-$(CONFIG_KASAN)	+= kasan/
 obj-$(CONFIG_FAILSLAB) += failslab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_MEMTEST)		+= memtest.o
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index 370d970e5ab5..f6367d56a4d0 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -32,3 +32,5 @@ CFLAGS_tags_report.o := $(CC_FLAGS_KASAN_RUNTIME)
 obj-$(CONFIG_KASAN) := common.o init.o report.o
 obj-$(CONFIG_KASAN_GENERIC) += generic.o generic_report.o quarantine.o
 obj-$(CONFIG_KASAN_SW_TAGS) += tags.o tags_report.o
+
+obj-$(CONFIG_SLAB_QUARANTINE) += slab_quarantine.o quarantine.o
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index ac499456740f..6692177177a2 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -5,6 +5,43 @@
 #include <linux/kasan.h>
 #include <linux/stackdepot.h>
 
+struct qlist_node {
+	struct qlist_node *next;
+};
+
+struct kasan_track {
+	pid_t pid;
+	depot_stack_handle_t stack;
+};
+
+struct kasan_free_meta {
+	/* This field is used while the object is in the quarantine.
+	 * Otherwise it might be used for the allocator freelist.
+	 */
+	struct qlist_node quarantine_link;
+#ifdef CONFIG_KASAN_GENERIC
+	struct kasan_track free_track;
+#endif
+};
+
+struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
+					const void *object);
+
+#if defined(CONFIG_KASAN_GENERIC) && \
+	(defined(CONFIG_SLAB) || defined(CONFIG_SLUB)) || \
+	defined(CONFIG_SLAB_QUARANTINE)
+void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache);
+void quarantine_reduce(void);
+void quarantine_remove_cache(struct kmem_cache *cache);
+#else
+static inline void quarantine_put(struct kasan_free_meta *info,
+				struct kmem_cache *cache) { }
+static inline void quarantine_reduce(void) { }
+static inline void quarantine_remove_cache(struct kmem_cache *cache) { }
+#endif
+
+#ifdef CONFIG_KASAN
+
 #define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
 #define KASAN_SHADOW_MASK       (KASAN_SHADOW_SCALE_SIZE - 1)
 
@@ -87,17 +124,8 @@ struct kasan_global {
 #endif
 };
 
-/**
- * Structures to keep alloc and free tracks *
- */
-
 #define KASAN_STACK_DEPTH 64
 
-struct kasan_track {
-	u32 pid;
-	depot_stack_handle_t stack;
-};
-
 #ifdef CONFIG_KASAN_SW_TAGS_IDENTIFY
 #define KASAN_NR_FREE_STACKS 5
 #else
@@ -121,23 +149,8 @@ struct kasan_alloc_meta {
 #endif
 };
 
-struct qlist_node {
-	struct qlist_node *next;
-};
-struct kasan_free_meta {
-	/* This field is used while the object is in the quarantine.
-	 * Otherwise it might be used for the allocator freelist.
-	 */
-	struct qlist_node quarantine_link;
-#ifdef CONFIG_KASAN_GENERIC
-	struct kasan_track free_track;
-#endif
-};
-
 struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
 					const void *object);
-struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
-					const void *object);
 
 static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
 {
@@ -178,18 +191,6 @@ void kasan_set_free_info(struct kmem_cache *cache, void *object, u8 tag);
 struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
 				void *object, u8 tag);
 
-#if defined(CONFIG_KASAN_GENERIC) && \
-	(defined(CONFIG_SLAB) || defined(CONFIG_SLUB))
-void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache);
-void quarantine_reduce(void);
-void quarantine_remove_cache(struct kmem_cache *cache);
-#else
-static inline void quarantine_put(struct kasan_free_meta *info,
-				struct kmem_cache *cache) { }
-static inline void quarantine_reduce(void) { }
-static inline void quarantine_remove_cache(struct kmem_cache *cache) { }
-#endif
-
 #ifdef CONFIG_KASAN_SW_TAGS
 
 void print_tags(u8 addr_tag, const void *addr);
@@ -296,4 +297,6 @@ void __hwasan_storeN_noabort(unsigned long addr, size_t size);
 
 void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
 
+#endif /* CONFIG_KASAN */
+
 #endif
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 4c5375810449..61666263c53e 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -145,7 +145,9 @@ static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 	if (IS_ENABLED(CONFIG_SLAB))
 		local_irq_save(flags);
 
+#ifdef CONFIG_KASAN
 	*(u8 *)kasan_mem_to_shadow(object) = KASAN_KMALLOC_FREE;
+#endif
 	___cache_free(cache, object, _THIS_IP_);
 
 	if (IS_ENABLED(CONFIG_SLAB))
diff --git a/mm/kasan/slab_quarantine.c b/mm/kasan/slab_quarantine.c
new file mode 100644
index 000000000000..493c994ff87b
--- /dev/null
+++ b/mm/kasan/slab_quarantine.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * The layer providing KASAN slab quarantine separately without the
+ * main KASAN functionality.
+ *
+ * Author: Alexander Popov <alex.popov@linux.com>
+ *
+ * This feature breaks widespread heap spraying technique used for
+ * exploiting use-after-free vulnerabilities in the kernel code.
+ *
+ * Heap spraying is an exploitation technique that aims to put controlled
+ * bytes at a predetermined memory location on the heap. Heap spraying for
+ * exploiting use-after-free in the Linux kernel relies on the fact that on
+ * kmalloc(), the slab allocator returns the address of the memory that was
+ * recently freed. Allocating a kernel object with the same size and
+ * controlled contents allows overwriting the vulnerable freed object.
+ *
+ * If freed allocations are stored in the quarantine queue where they wait
+ * for actual freeing, they can't be instantly reallocated and overwritten
+ * by use-after-free exploits.
+ *
+ * N.B. Heap spraying for out-of-bounds exploitation is another technique,
+ * heap quarantine doesn't break it.
+ */
+
+#include <linux/kasan.h>
+#include <linux/bug.h>
+#include <linux/slab.h>
+#include <linux/mm.h>
+#include "../slab.h"
+#include "kasan.h"
+
+void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
+			slab_flags_t *flags)
+{
+	cache->kasan_info.alloc_meta_offset = 0;
+
+	if (WARN_ON(*size + sizeof(struct kasan_free_meta) > KMALLOC_MAX_SIZE)) {
+		cache->kasan_info.free_meta_offset = 0;
+		return;
+	}
+
+	if (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
+	     cache->object_size < sizeof(struct kasan_free_meta)) {
+		cache->kasan_info.free_meta_offset = *size;
+		*size += sizeof(struct kasan_free_meta);
+	}
+
+	*flags |= SLAB_KASAN;
+}
+
+struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
+				      const void *object)
+{
+	BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
+	return (void *)object + cache->kasan_info.free_meta_offset;
+}
+
+bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip)
+{
+	quarantine_put(get_free_info(cache, object), cache);
+	return true;
+}
+
+static void *reduce_helper(const void *ptr, gfp_t flags)
+{
+	if (gfpflags_allow_blocking(flags))
+		quarantine_reduce();
+
+	return (void *)ptr;
+}
+
+void * __must_check kasan_kmalloc_large(const void *ptr, size_t size,
+						gfp_t flags)
+{
+	return reduce_helper(ptr, flags);
+}
+
+void * __must_check kasan_krealloc(const void *object, size_t size, gfp_t flags)
+{
+	return reduce_helper(object, flags);
+}
+
+void * __must_check kasan_slab_alloc(struct kmem_cache *cache, void *object,
+					gfp_t flags)
+{
+	return reduce_helper(object, flags);
+}
+
+void * __must_check kasan_kmalloc(struct kmem_cache *cache, const void *object,
+				size_t size, gfp_t flags)
+{
+	return reduce_helper(object, flags);
+}
+EXPORT_SYMBOL(kasan_kmalloc);
+
+void kasan_cache_shrink(struct kmem_cache *cache)
+{
+	quarantine_remove_cache(cache);
+}
+
+void kasan_cache_shutdown(struct kmem_cache *cache)
+{
+	if (!__kmem_cache_empty(cache))
+		quarantine_remove_cache(cache);
+}
diff --git a/mm/slub.c b/mm/slub.c
index d4177aecedf6..6e276ed7606c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3143,7 +3143,7 @@ static __always_inline void slab_free(struct kmem_cache *s, struct page *page,
 		do_slab_free(s, page, head, tail, cnt, addr);
 }
 
-#ifdef CONFIG_KASAN_GENERIC
+#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_SLAB_QUARANTINE)
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
 {
 	do_slab_free(cache, virt_to_head_page(x), x, NULL, 1, addr);
-- 
2.26.2



* [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-09-30 12:50   ` Alexander Potapenko
  2020-09-29 18:35 ` [PATCH RFC v2 3/6] mm: Integrate SLAB_QUARANTINE with init_on_free Alexander Popov
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Currently with CONFIG_SLAB, init_on_free happens too late, and heap
objects go to the heap quarantine still dirty. Let's move the memory
clearing before the kasan_slab_free() call to fix that.
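
The ordering this patch establishes can be sketched with a small user-space model (illustrative only; the fixed-size pointer array and cache_free_fixed() are assumptions standing in for the kernel's real quarantine machinery): the object must be cleared before it enters the quarantine, otherwise the quarantined copy keeps the stale payload.

```c
/*
 * User-space sketch of the free path after this patch: erase first,
 * then quarantine. The toy pointer array stands in for the real
 * quarantine queue; nothing here is kernel code.
 */
#include <stddef.h>
#include <string.h>

#define QUARANTINE_SLOTS 16

void *quarantine[QUARANTINE_SLOTS];
int q_len;

/* Models __cache_free(): memset() runs before the kasan_slab_free()
 * step, so the quarantined copy never carries attacker-controlled bytes. */
void cache_free_fixed(void *objp, size_t object_size)
{
	if (q_len >= QUARANTINE_SLOTS)
		return;			/* toy model: drop when full */
	memset(objp, 0, object_size);	/* init_on_free equivalent */
	quarantine[q_len++] = objp;	/* quarantine put equivalent */
}
```

Freeing a 0xA5-filled object through this path leaves only zeroed bytes visible in the quarantine slot.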

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 mm/slab.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 3160dff6fd76..5140203c5b76 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3414,6 +3414,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
 static __always_inline void __cache_free(struct kmem_cache *cachep, void *objp,
 					 unsigned long caller)
 {
+	if (unlikely(slab_want_init_on_free(cachep)))
+		memset(objp, 0, cachep->object_size);
+
 	/* Put the object into the quarantine, don't touch it for now. */
 	if (kasan_slab_free(cachep, objp, _RET_IP_))
 		return;
@@ -3432,8 +3435,6 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
 	struct array_cache *ac = cpu_cache_get(cachep);
 
 	check_irq_off();
-	if (unlikely(slab_want_init_on_free(cachep)))
-		memset(objp, 0, cachep->object_size);
 	kmemleak_free_recursive(objp, cachep->flags);
 	objp = cache_free_debugcheck(cachep, objp, caller);
 	memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
-- 
2.26.2



* [PATCH RFC v2 3/6] mm: Integrate SLAB_QUARANTINE with init_on_free
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 4/6] mm: Implement slab quarantine randomization Alexander Popov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Having a slab quarantine without memory erasing is harmful.
If the quarantined objects are not cleaned and still contain data, then:
  1. they remain useful for use-after-free exploitation,
  2. there is no chance to detect use-after-free accesses.
So we want the quarantined objects to be erased.
Enable init_on_free, which cleans objects before placing them into
the quarantine. CONFIG_PAGE_POISONING should be disabled, since it
cuts off init_on_free.
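
The second point can be sketched in user space (the helper name quarantine_check_dirty() is made up for illustration and is not a kernel API): once objects are zeroed before quarantining, any non-zero byte found while the object waits in the quarantine is evidence of a use-after-free write.

```c
/*
 * Illustrative user-space helper: count bytes modified while a
 * (previously zeroed) object sat in the quarantine. A non-zero
 * result indicates a use-after-free write to the object.
 */
#include <stddef.h>

size_t quarantine_check_dirty(const unsigned char *objp, size_t size)
{
	size_t dirty = 0;

	for (size_t i = 0; i < size; i++)
		if (objp[i] != 0)
			dirty++;
	return dirty;
}
```

Without the erasing, the whole object is "dirty" from the start and this kind of check is impossible.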

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 init/Kconfig    |  3 ++-
 mm/page_alloc.c | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/init/Kconfig b/init/Kconfig
index 358c8ce818f4..cd4cee71fd4e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1933,7 +1933,8 @@ config SLAB_FREELIST_HARDENED
 
 config SLAB_QUARANTINE
 	bool "Enable slab freelist quarantine"
-	depends on !KASAN && (SLAB || SLUB)
+	depends on !KASAN && (SLAB || SLUB) && !PAGE_POISONING
+	select INIT_ON_FREE_DEFAULT_ON
 	help
 	  Enable slab freelist quarantine to delay reusing of freed slab
 	  objects. If this feature is enabled, freed objects are stored
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fab5e97dc9ca..f67118e88500 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -168,6 +168,27 @@ static int __init early_init_on_alloc(char *buf)
 }
 early_param("init_on_alloc", early_init_on_alloc);
 
+#ifdef CONFIG_SLAB_QUARANTINE
+static int __init early_init_on_free(char *buf)
+{
+	/*
+	 * Having slab quarantine without memory erasing is harmful.
+	 * If the quarantined objects are not cleaned and contain data, then:
+	 *  1. they will be useful for use-after-free exploitation,
+	 *  2. use-after-free access may not be detected.
+	 * So we want the quarantined objects to be erased.
+	 *
+	 * Enable init_on_free that cleans objects before placing them into
+	 * the quarantine. CONFIG_PAGE_POISONING should be disabled since it
+	 * cuts off init_on_free.
+	 */
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON));
+	BUILD_BUG_ON(IS_ENABLED(CONFIG_PAGE_POISONING));
+	pr_info("mem auto-init: init_on_free is on for CONFIG_SLAB_QUARANTINE\n");
+
+	return 0;
+}
+#else /* CONFIG_SLAB_QUARANTINE */
 static int __init early_init_on_free(char *buf)
 {
 	int ret;
@@ -184,6 +205,7 @@ static int __init early_init_on_free(char *buf)
 		static_branch_disable(&init_on_free);
 	return ret;
 }
+#endif /* CONFIG_SLAB_QUARANTINE */
 early_param("init_on_free", early_init_on_free);
 
 /*
-- 
2.26.2



* [PATCH RFC v2 4/6] mm: Implement slab quarantine randomization
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
                   ` (2 preceding siblings ...)
  2020-09-29 18:35 ` [PATCH RFC v2 3/6] mm: Integrate SLAB_QUARANTINE with init_on_free Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 5/6] lkdtm: Add heap quarantine tests Alexander Popov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

The randomization is very important for the slab quarantine's security
properties. Without it, the number of kmalloc()+kfree() calls needed for
overwriting the vulnerable object is almost the same across runs. That
would be good for stable use-after-free exploitation, and we should not
allow that.

This commit contains very compact and hackish changes that introduce
the quarantine randomization. At first, all quarantine batches are filled
with objects. Then, during quarantine reduction, we randomly choose and
free half of the objects from a randomly chosen batch. The randomized
quarantine now releases the freed object at an unpredictable moment,
which is harmful to the heap spraying technique employed by
use-after-free exploits.
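
The "part 2" randomization below can be modeled in user space (a sketch under stated assumptions: a plain singly linked list and rand() stand in for the kernel's qlist and get_random_int()): each node of the source list independently moves to the destination with probability 1/2.

```c
/*
 * User-space model of randomly splitting a quarantine batch:
 * move roughly half of the nodes from one list to another.
 * Not kernel code; rand() replaces get_random_int().
 */
#include <stdlib.h>

struct qnode {
	struct qnode *next;
};

struct qlist {
	struct qnode *head;
	int count;
};

void qlist_put(struct qlist *q, struct qnode *n)
{
	n->next = q->head;
	q->head = n;
	q->count++;
}

void qlist_move_random(struct qlist *from, struct qlist *to)
{
	struct qnode *curr = from->head;

	from->head = NULL;
	from->count = 0;
	while (curr) {
		struct qnode *next = curr->next;

		/* Each node goes to the destination with probability 1/2. */
		if (rand() % 2 == 0)
			qlist_put(to, curr);
		else
			qlist_put(from, curr);
		curr = next;
	}
}
```

Every node ends up in exactly one of the two lists, so no quarantined object is lost or duplicated by the split.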

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 mm/kasan/quarantine.c | 79 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 69 insertions(+), 10 deletions(-)

diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 61666263c53e..4ce100605086 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -29,6 +29,7 @@
 #include <linux/srcu.h>
 #include <linux/string.h>
 #include <linux/types.h>
+#include <linux/random.h>
 
 #include "../slab.h"
 #include "kasan.h"
@@ -89,8 +90,13 @@ static void qlist_move_all(struct qlist_head *from, struct qlist_head *to)
 }
 
 #define QUARANTINE_PERCPU_SIZE (1 << 20)
+
+#ifdef CONFIG_KASAN
 #define QUARANTINE_BATCHES \
 	(1024 > 4 * CONFIG_NR_CPUS ? 1024 : 4 * CONFIG_NR_CPUS)
+#else
+#define QUARANTINE_BATCHES 128
+#endif
 
 /*
  * The object quarantine consists of per-cpu queues and a global queue,
@@ -110,10 +116,7 @@ DEFINE_STATIC_SRCU(remove_cache_srcu);
 /* Maximum size of the global queue. */
 static unsigned long quarantine_max_size;
 
-/*
- * Target size of a batch in global_quarantine.
- * Usually equal to QUARANTINE_PERCPU_SIZE unless we have too much RAM.
- */
+/* Target size of a batch in global_quarantine. */
 static unsigned long quarantine_batch_size;
 
 /*
@@ -191,7 +194,12 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
 
 	q = this_cpu_ptr(&cpu_quarantine);
 	qlist_put(q, &info->quarantine_link, cache->size);
+#ifdef CONFIG_KASAN
 	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
+#else
+	if (unlikely(q->bytes > min_t(size_t, QUARANTINE_PERCPU_SIZE,
+					READ_ONCE(quarantine_batch_size)))) {
+#endif
 		qlist_move_all(q, &temp);
 
 		raw_spin_lock(&quarantine_lock);
@@ -204,7 +212,7 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
 			new_tail = quarantine_tail + 1;
 			if (new_tail == QUARANTINE_BATCHES)
 				new_tail = 0;
-			if (new_tail != quarantine_head)
+			if (new_tail != quarantine_head || !IS_ENABLED(CONFIG_KASAN))
 				quarantine_tail = new_tail;
 		}
 		raw_spin_unlock(&quarantine_lock);
@@ -213,12 +221,43 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
 	local_irq_restore(flags);
 }
 
+static void qlist_move_random(struct qlist_head *from, struct qlist_head *to)
+{
+	struct qlist_node *curr;
+
+	if (unlikely(qlist_empty(from)))
+		return;
+
+	curr = from->head;
+	qlist_init(from);
+	while (curr) {
+		struct qlist_node *next = curr->next;
+		struct kmem_cache *obj_cache = qlink_to_cache(curr);
+		int rnd = get_random_int();
+
+		/*
+		 * Hackish quarantine randomization, part 2:
+		 * move only 1/2 of objects to the destination list.
+		 * TODO: use random bits sparingly for better performance.
+		 */
+		if (rnd % 2 == 0)
+			qlist_put(to, curr, obj_cache->size);
+		else
+			qlist_put(from, curr, obj_cache->size);
+
+		curr = next;
+	}
+}
+
 void quarantine_reduce(void)
 {
-	size_t total_size, new_quarantine_size, percpu_quarantines;
+	size_t total_size;
 	unsigned long flags;
 	int srcu_idx;
 	struct qlist_head to_free = QLIST_INIT;
+#ifdef CONFIG_KASAN
+	size_t new_quarantine_size, percpu_quarantines;
+#endif
 
 	if (likely(READ_ONCE(quarantine_size) <=
 		   READ_ONCE(quarantine_max_size)))
@@ -236,12 +275,12 @@ void quarantine_reduce(void)
 	srcu_idx = srcu_read_lock(&remove_cache_srcu);
 	raw_spin_lock_irqsave(&quarantine_lock, flags);
 
-	/*
-	 * Update quarantine size in case of hotplug. Allocate a fraction of
-	 * the installed memory to quarantine minus per-cpu queue limits.
-	 */
+	/* Update quarantine size in case of hotplug */
 	total_size = (totalram_pages() << PAGE_SHIFT) /
 		QUARANTINE_FRACTION;
+
+#ifdef CONFIG_KASAN
+	/* Subtract per-cpu queue limits from total quarantine size */
 	percpu_quarantines = QUARANTINE_PERCPU_SIZE * num_online_cpus();
 	new_quarantine_size = (total_size < percpu_quarantines) ?
 		0 : total_size - percpu_quarantines;
@@ -257,6 +296,26 @@ void quarantine_reduce(void)
 		if (quarantine_head == QUARANTINE_BATCHES)
 			quarantine_head = 0;
 	}
+#else /* CONFIG_KASAN */
+	/*
+	 * Don't subtract per-cpu queue limits from total quarantine
+	 * size to consume all quarantine slots.
+	 */
+	WRITE_ONCE(quarantine_max_size, total_size);
+	WRITE_ONCE(quarantine_batch_size, total_size / QUARANTINE_BATCHES);
+
+	/*
+	 * Hackish quarantine randomization, part 1:
+	 * pick a random batch for reducing.
+	 */
+	if (likely(quarantine_size > quarantine_max_size)) {
+		do {
+			quarantine_head = get_random_int() % QUARANTINE_BATCHES;
+		} while (quarantine_head == quarantine_tail);
+		qlist_move_random(&global_quarantine[quarantine_head], &to_free);
+		WRITE_ONCE(quarantine_size, quarantine_size - to_free.bytes);
+	}
+#endif
 
 	raw_spin_unlock_irqrestore(&quarantine_lock, flags);
 
-- 
2.26.2



* [PATCH RFC v2 5/6] lkdtm: Add heap quarantine tests
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
                   ` (3 preceding siblings ...)
  2020-09-29 18:35 ` [PATCH RFC v2 4/6] mm: Implement slab quarantine randomization Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-09-29 18:35 ` [PATCH RFC v2 6/6] mm: Add heap quarantine verbose debugging (not for merge) Alexander Popov
  2020-10-01 19:42 ` [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
  6 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Add tests for CONFIG_SLAB_QUARANTINE.

The HEAP_SPRAY test aims to reallocate a recently freed heap object.
It allocates and frees an object from a separate kmem_cache and then
allocates 400000 similar objects from it. I.e. this test performs the
classic heap spraying technique for use-after-free exploitation.
If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
reallocated and overwritten, which is required for a successful attack.

The PUSH_THROUGH_QUARANTINE test allocates and frees an object from a
separate kmem_cache and then performs kmem_cache_alloc()+kmem_cache_free()
400000 times. This test pushes the object through the heap quarantine and
reallocates it after it returns to the allocator freelist.
If CONFIG_SLAB_QUARANTINE is enabled, this test should show that the
randomized quarantine releases the freed object at an unpredictable
moment, which makes use-after-free exploitation much harder.
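
What PUSH_THROUGH_QUARANTINE measures can be sketched in user space (a model under the assumption of a plain FIFO quarantine with a fixed slot budget; model_alloc()/model_free() are invented names): without randomization, the freed pointer comes back after a fixed, predictable number of alloc+free rounds, which is exactly the predictability the randomized quarantine removes.

```c
/*
 * User-space sketch of a non-randomized (FIFO) quarantine holding up
 * to QSLOTS pointers. The allocator model hands back the oldest
 * quarantined pointer once the quarantine is full. Callers are
 * expected to alternate model_alloc()/model_free(); overflow is not
 * modeled.
 */
#include <stddef.h>

#define QSLOTS 128

void *fifo[QSLOTS];
int head, tail, fill;

/* Free: always place the pointer into the quarantine. */
void model_free(void *p)
{
	fifo[tail] = p;
	tail = (tail + 1) % QSLOTS;
	if (fill < QSLOTS)
		fill++;
}

/* Alloc: reuse the oldest quarantined pointer once the quarantine is
 * full, otherwise pretend to return fresh memory (NULL here). */
void *model_alloc(void)
{
	void *p;

	if (fill < QSLOTS)
		return NULL;
	p = fifo[head];
	head = (head + 1) % QSLOTS;
	fill--;
	return p;
}
```

In this deterministic model the target pointer is always reused on the same round, mirroring the nearly constant attempt numbers observed without randomization.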

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 drivers/misc/lkdtm/core.c  |   2 +
 drivers/misc/lkdtm/heap.c  | 110 +++++++++++++++++++++++++++++++++++++
 drivers/misc/lkdtm/lkdtm.h |   2 +
 3 files changed, 114 insertions(+)

diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index a5e344df9166..6be5ca49ae6b 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -126,6 +126,8 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(SLAB_FREE_DOUBLE),
 	CRASHTYPE(SLAB_FREE_CROSS),
 	CRASHTYPE(SLAB_FREE_PAGE),
+	CRASHTYPE(HEAP_SPRAY),
+	CRASHTYPE(PUSH_THROUGH_QUARANTINE),
 	CRASHTYPE(SOFTLOCKUP),
 	CRASHTYPE(HARDLOCKUP),
 	CRASHTYPE(SPINLOCKUP),
diff --git a/drivers/misc/lkdtm/heap.c b/drivers/misc/lkdtm/heap.c
index 1323bc16f113..f666a08d9462 100644
--- a/drivers/misc/lkdtm/heap.c
+++ b/drivers/misc/lkdtm/heap.c
@@ -10,6 +10,7 @@
 static struct kmem_cache *double_free_cache;
 static struct kmem_cache *a_cache;
 static struct kmem_cache *b_cache;
+static struct kmem_cache *spray_cache;
 
 /*
  * This tries to stay within the next largest power-of-2 kmalloc cache
@@ -204,6 +205,112 @@ static void ctor_a(void *region)
 { }
 static void ctor_b(void *region)
 { }
+static void ctor_spray(void *region)
+{ }
+
+#define SPRAY_LENGTH 400000
+#define SPRAY_ITEM_SIZE 333
+
+void lkdtm_HEAP_SPRAY(void)
+{
+	int *addr;
+	int **spray_addrs = NULL;
+	unsigned long i = 0;
+
+	addr = kmem_cache_alloc(spray_cache, GFP_KERNEL);
+	if (!addr) {
+		pr_info("Can't allocate memory in spray_cache cache\n");
+		return;
+	}
+
+	memset(addr, 0xA5, SPRAY_ITEM_SIZE);
+	kmem_cache_free(spray_cache, addr);
+	pr_info("Allocated and freed spray_cache object %p of size %d\n",
+					addr, SPRAY_ITEM_SIZE);
+
+	spray_addrs = kcalloc(SPRAY_LENGTH, sizeof(int *), GFP_KERNEL);
+	if (!spray_addrs) {
+		pr_info("Unable to allocate memory for spray_addrs\n");
+		return;
+	}
+
+	pr_info("Original heap spraying: allocate %d objects of size %d...\n",
+					SPRAY_LENGTH, SPRAY_ITEM_SIZE);
+	for (i = 0; i < SPRAY_LENGTH; i++) {
+		spray_addrs[i] = kmem_cache_alloc(spray_cache, GFP_KERNEL);
+		if (!spray_addrs[i]) {
+			pr_info("Can't allocate memory in spray_cache cache\n");
+			break;
+		}
+
+		memset(spray_addrs[i], 0x42, SPRAY_ITEM_SIZE);
+
+		if (spray_addrs[i] == addr) {
+			pr_info("FAIL: attempt %lu: freed object is reallocated\n", i);
+			break;
+		}
+	}
+
+	if (i == SPRAY_LENGTH)
+		pr_info("OK: original heap spraying hasn't succeeded\n");
+
+	for (i = 0; i < SPRAY_LENGTH; i++) {
+		if (spray_addrs[i])
+			kmem_cache_free(spray_cache, spray_addrs[i]);
+	}
+
+	kfree(spray_addrs);
+}
+
+/*
+ * Pushing an object through the quarantine requires both allocating and
+ * freeing memory. Objects are released from the quarantine on new memory
+ * allocations, but only when the quarantine size is over the limit.
+ * And the quarantine size grows on new memory freeing.
+ *
+ * This test should show that the randomized quarantine will release the
+ * freed object at an unpredictable moment.
+ */
+void lkdtm_PUSH_THROUGH_QUARANTINE(void)
+{
+	int *addr;
+	int *push_addr;
+	unsigned long i;
+
+	addr = kmem_cache_alloc(spray_cache, GFP_KERNEL);
+	if (!addr) {
+		pr_info("Can't allocate memory in spray_cache cache\n");
+		return;
+	}
+
+	memset(addr, 0xA5, SPRAY_ITEM_SIZE);
+	kmem_cache_free(spray_cache, addr);
+	pr_info("Allocated and freed spray_cache object %p of size %d\n",
+					addr, SPRAY_ITEM_SIZE);
+
+	pr_info("Push through quarantine: allocate and free %d objects of size %d...\n",
+					SPRAY_LENGTH, SPRAY_ITEM_SIZE);
+	for (i = 0; i < SPRAY_LENGTH; i++) {
+		push_addr = kmem_cache_alloc(spray_cache, GFP_KERNEL);
+		if (!push_addr) {
+			pr_info("Can't allocate memory in spray_cache cache\n");
+			break;
+		}
+
+		memset(push_addr, 0x42, SPRAY_ITEM_SIZE);
+		kmem_cache_free(spray_cache, push_addr);
+
+		if (push_addr == addr) {
+			pr_info("Target object is reallocated at attempt %lu\n", i);
+			break;
+		}
+	}
+
+	if (i == SPRAY_LENGTH) {
+		pr_info("Target object is NOT reallocated in %d attempts\n",
+					SPRAY_LENGTH);
+	}
+}
 
 void __init lkdtm_heap_init(void)
 {
@@ -211,6 +318,8 @@ void __init lkdtm_heap_init(void)
 					      64, 0, 0, ctor_double_free);
 	a_cache = kmem_cache_create("lkdtm-heap-a", 64, 0, 0, ctor_a);
 	b_cache = kmem_cache_create("lkdtm-heap-b", 64, 0, 0, ctor_b);
+	spray_cache = kmem_cache_create("lkdtm-heap-spray",
+					SPRAY_ITEM_SIZE, 0, 0, ctor_spray);
 }
 
 void __exit lkdtm_heap_exit(void)
@@ -218,4 +327,5 @@ void __exit lkdtm_heap_exit(void)
 	kmem_cache_destroy(double_free_cache);
 	kmem_cache_destroy(a_cache);
 	kmem_cache_destroy(b_cache);
+	kmem_cache_destroy(spray_cache);
 }
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 8878538b2c13..d6b4b0708359 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -45,6 +45,8 @@ void lkdtm_READ_BUDDY_AFTER_FREE(void);
 void lkdtm_SLAB_FREE_DOUBLE(void);
 void lkdtm_SLAB_FREE_CROSS(void);
 void lkdtm_SLAB_FREE_PAGE(void);
+void lkdtm_HEAP_SPRAY(void);
+void lkdtm_PUSH_THROUGH_QUARANTINE(void);
 
 /* lkdtm_perms.c */
 void __init lkdtm_perms_init(void);
-- 
2.26.2



* [PATCH RFC v2 6/6] mm: Add heap quarantine verbose debugging (not for merge)
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
                   ` (4 preceding siblings ...)
  2020-09-29 18:35 ` [PATCH RFC v2 5/6] lkdtm: Add heap quarantine tests Alexander Popov
@ 2020-09-29 18:35 ` Alexander Popov
  2020-10-01 19:42 ` [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
  6 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-09-29 18:35 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel, Alexander Popov
  Cc: notify

Add verbose debugging for deeper understanding of the heap quarantine
inner workings (this patch is not for merge).

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 mm/kasan/quarantine.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 4ce100605086..98cd6e963755 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -203,6 +203,12 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
 		qlist_move_all(q, &temp);
 
 		raw_spin_lock(&quarantine_lock);
+
+		pr_info("quarantine: PUT %zu to tail batch %d, whole sz %zu, batch sz %lu\n",
+				temp.bytes, quarantine_tail,
+				READ_ONCE(quarantine_size),
+				READ_ONCE(quarantine_batch_size));
+
 		WRITE_ONCE(quarantine_size, quarantine_size + temp.bytes);
 		qlist_move_all(&temp, &global_quarantine[quarantine_tail]);
 		if (global_quarantine[quarantine_tail].bytes >=
@@ -313,7 +319,22 @@ void quarantine_reduce(void)
 			quarantine_head = get_random_int() % QUARANTINE_BATCHES;
 		} while (quarantine_head == quarantine_tail);
 		qlist_move_random(&global_quarantine[quarantine_head], &to_free);
+		pr_info("quarantine: whole sz exceed max by %lu, REDUCE head batch %d by %zu, leave %zu\n",
+				quarantine_size - quarantine_max_size,
+				quarantine_head, to_free.bytes,
+				global_quarantine[quarantine_head].bytes);
 		WRITE_ONCE(quarantine_size, quarantine_size - to_free.bytes);
+
+		if (quarantine_head == 0) {
+			unsigned long i;
+
+			pr_info("quarantine: data level in batches:");
+			for (i = 0; i < QUARANTINE_BATCHES; i++) {
+				pr_info("  %lu - %lu%%\n",
+					i, global_quarantine[i].bytes *
+						100 / quarantine_batch_size);
+			}
+		}
 	}
 #endif
 
-- 
2.26.2



* Re: [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-09-29 18:35 ` [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier Alexander Popov
@ 2020-09-30 12:50   ` Alexander Potapenko
  2020-10-01 19:48     ` Alexander Popov
  2020-12-03 19:50     ` Alexander Popov
  0 siblings, 2 replies; 24+ messages in thread
From: Alexander Potapenko @ 2020-09-30 12:50 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux Memory Management List,
	Kernel Hardening, LKML, notify

On Tue, Sep 29, 2020 at 8:35 PM Alexander Popov <alex.popov@linux.com> wrote:
>
> Currently with CONFIG_SLAB, init_on_free happens too late, and heap
> objects go to the heap quarantine still dirty. Let's move the memory
> clearing before the kasan_slab_free() call to fix that.
>
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Alexander Potapenko <glider@google.com>

> ---
>  mm/slab.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 3160dff6fd76..5140203c5b76 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3414,6 +3414,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
>  static __always_inline void __cache_free(struct kmem_cache *cachep, void *objp,
>                                          unsigned long caller)
>  {
> +       if (unlikely(slab_want_init_on_free(cachep)))
> +               memset(objp, 0, cachep->object_size);
> +
>         /* Put the object into the quarantine, don't touch it for now. */
>         if (kasan_slab_free(cachep, objp, _RET_IP_))
>                 return;
> @@ -3432,8 +3435,6 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
>         struct array_cache *ac = cpu_cache_get(cachep);
>
>         check_irq_off();
> -       if (unlikely(slab_want_init_on_free(cachep)))
> -               memset(objp, 0, cachep->object_size);
>         kmemleak_free_recursive(objp, cachep->flags);
>         objp = cache_free_debugcheck(cachep, objp, caller);
>         memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
> --
> 2.26.2
>


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
                   ` (5 preceding siblings ...)
  2020-09-29 18:35 ` [PATCH RFC v2 6/6] mm: Add heap quarantine verbose debugging (not for merge) Alexander Popov
@ 2020-10-01 19:42 ` Alexander Popov
  2020-10-05 22:56   ` Jann Horn
  6 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2020-10-01 19:42 UTC (permalink / raw)
  To: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, linux-mm, kernel-hardening,
	linux-kernel
  Cc: notify, Alexander Popov

Hello! I have some performance numbers. Please see below.

On 29.09.2020 21:35, Alexander Popov wrote:
> Hello everyone! Requesting your comments.
> 
> This is the second version of the heap quarantine prototype for the Linux
> kernel. I performed a deeper evaluation of its security properties and
> developed new features like quarantine randomization and integration with
> init_on_free. That is fun! See below for more details.
> 
> 
> Rationale
> =========
> 
> Use-after-free vulnerabilities in the Linux kernel are very popular for
> exploitation. There are many examples, some of them:
>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
>  https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
>  https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> 
> Use-after-free exploits usually employ the heap spraying technique,
> which generally aims to put controlled bytes at a predetermined memory
> location on the heap.
> 
> Heap spraying for exploiting use-after-free in the Linux kernel relies on
> the fact that on kmalloc(), the slab allocator returns the address of
> the memory that was recently freed. So allocating a kernel object with
> the same size and controlled contents allows overwriting the vulnerable
> freed object.
> 
> I've found an easy way to break the heap spraying for use-after-free
> exploitation. I extracted slab freelist quarantine from KASAN functionality
> and called it CONFIG_SLAB_QUARANTINE. Please see patch 1/6.
> 
> If this feature is enabled, freed allocations are stored in the quarantine
> queue where they wait for actual freeing. So they can't be instantly
> reallocated and overwritten by use-after-free exploits.
> 
> N.B. Heap spraying for out-of-bounds exploitation is another technique,
> heap quarantine doesn't break it.
> 
> 
> Security properties
> ===================
> 
> For researching security properties of the heap quarantine I developed 2 lkdtm
> tests (see the patch 5/6).
> 
> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
> from a separate kmem_cache and then allocates 400000 similar objects.
> I.e. this test performs the classic heap spraying technique for use-after-free
> exploitation.
> 
> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
> reallocated and overwritten:
>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>    lkdtm: Performing direct entry HEAP_SPRAY
>    lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>    lkdtm: FAIL: attempt 0: freed object is reallocated
> 
> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
> the freed object:
>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>    lkdtm: Performing direct entry HEAP_SPRAY
>    lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>    lkdtm: OK: original heap spraying hasn't succeeded
> 
> That happens because pushing an object through the quarantine requires _both_
> allocating and freeing memory. Objects are released from the quarantine on
> new memory allocations, but only when the quarantine size is over the limit.
> And the quarantine size grows on new memory freeing.
> 
> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
> It allocates and frees an object from a separate kmem_cache and then performs
> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
> This test effectively pushes the object through the heap quarantine and
> reallocates it after it returns to the allocator freelist:
>   # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
>    lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
>    lkdtm: Allocated and freed spray_cache object 000000008fdb15c3 of size 333
>    lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
>    lkdtm: Target object is reallocated at attempt 182994
>   # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
>    lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
>    lkdtm: Allocated and freed spray_cache object 000000004e223cbe of size 333
>    lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
>    lkdtm: Target object is reallocated at attempt 186830
>   # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
>    lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
>    lkdtm: Allocated and freed spray_cache object 000000007663a058 of size 333
>    lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
>    lkdtm: Target object is reallocated at attempt 182010
> 
> As you can see, the number of allocations needed for overwriting
> the vulnerable object is almost the same across runs. That would be good
> for stable use-after-free exploitation and should not be allowed.
> That's why I developed the quarantine randomization (see the patch 4/6).
> 
> This randomization required very small, hackish changes to the heap quarantine
> mechanism. At first, all quarantine batches are filled with objects. Then,
> during quarantine reduction, I randomly choose and free half of the objects
> from a randomly chosen batch. Now the randomized quarantine releases the freed
> object at an unpredictable moment:
>    lkdtm: Target object is reallocated at attempt 107884
>    lkdtm: Target object is reallocated at attempt 265641
>    lkdtm: Target object is reallocated at attempt 100030
>    lkdtm: Target object is NOT reallocated in 400000 attempts
>    lkdtm: Target object is reallocated at attempt 204731
>    lkdtm: Target object is reallocated at attempt 359333
>    lkdtm: Target object is reallocated at attempt 289349
>    lkdtm: Target object is reallocated at attempt 119893
>    lkdtm: Target object is reallocated at attempt 225202
>    lkdtm: Target object is reallocated at attempt 87343
> 
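The randomized reduction described above can be modeled in user space roughly as follows. This is an illustrative sketch with invented names (quar_batch, quarantine_reduce_random, and the batch/slot counts are all made up for this example), not the code from patch 4/6:

```c
/* User-space sketch (not kernel code) of randomized quarantine
 * reduction: batches fill with quarantined objects, and on reduction a
 * randomly chosen batch gives up a random half of its objects, so the
 * moment a given object leaves the quarantine is unpredictable. */
#include <assert.h>
#include <stdlib.h>

#define QUAR_BATCHES 128
#define BATCH_SLOTS  512

struct quar_batch {
	void *objs[BATCH_SLOTS];
	int count;
};

static struct quar_batch batches[QUAR_BATCHES];

/* Free roughly half of the objects of one randomly chosen batch. */
static int quarantine_reduce_random(void)
{
	struct quar_batch *b = &batches[rand() % QUAR_BATCHES];
	int to_free = b->count / 2;

	for (int i = 0; i < to_free; i++) {
		int victim = rand() % b->count;

		free(b->objs[victim]);
		/* compact: move the last object into the freed slot */
		b->objs[victim] = b->objs[--b->count];
	}
	return to_free;
}
```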
> However, this randomization alone would not stop the attacker, because
> the quarantine stores the attacker's data (the payload) in the sprayed
> objects. In other words, the reallocated and overwritten vulnerable object
> contains the payload until the next reallocation (very bad).
> 
> Hence heap objects should be erased before going to the heap quarantine.
> Moreover, filling them with zeros gives a chance to detect use-after-free
> accesses to non-zero data while an object stays in the quarantine (nice!).
> That functionality already exists in the kernel; it's called init_on_free.
> I integrated it with CONFIG_SLAB_QUARANTINE in patch 3/6.
> 
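A minimal user-space model of the two properties combined above: the object is erased when it enters the quarantine, so a later check can detect a use-after-free write that left non-zero bytes in it. The function names are invented for this sketch and are not the kernel's init_on_free implementation:

```c
/* Illustrative sketch: zero-on-free plus detection of dirty
 * (use-after-free-written) objects while they sit in the quarantine. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* init_on_free analogue: wipe the payload before quarantining it */
static void quarantine_put(void *obj, size_t size)
{
	memset(obj, 0, size);
}

/* While the object is quarantined, any non-zero byte means someone
 * wrote through a dangling pointer. */
static bool quarantine_check_clean(const void *obj, size_t size)
{
	const unsigned char *p = obj;

	for (size_t i = 0; i < size; i++)
		if (p[i] != 0)
			return false;
	return true;
}
```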
> During that work I found a bug: in CONFIG_SLAB, init_on_free happens too
> late, and heap objects go to the KASAN quarantine while still dirty. See
> the fix in patch 2/6.
> 
> For a deeper understanding of the heap quarantine's inner workings, I
> attach patch 6/6, which adds verbose debugging (not for merge).
> It's very helpful; see this output example:
>    quarantine: PUT 508992 to tail batch 123, whole sz 65118872, batch sz 508854
>    quarantine: whole sz exceed max by 494552, REDUCE head batch 0 by 415392, leave 396304
>    quarantine: data level in batches:
>      0 - 77%
>      1 - 108%
>      2 - 83%
>      3 - 21%
>    ...
>      125 - 75%
>      126 - 12%
>      127 - 108%
>    quarantine: whole sz exceed max by 79160, REDUCE head batch 12 by 14160, leave 17608
>    quarantine: whole sz exceed max by 65000, REDUCE head batch 75 by 218328, leave 195232
>    quarantine: PUT 508992 to tail batch 124, whole sz 64979984, batch sz 508854
>    ...
> 
> 
> Changes in v2
> =============
> 
>  - Added heap quarantine randomization (the patch 4/6).
> 
>  - Integrated CONFIG_SLAB_QUARANTINE with init_on_free (the patch 3/6).
> 
>  - Fixed late init_on_free in CONFIG_SLAB (the patch 2/6).
> 
>  - Added lkdtm_PUSH_THROUGH_QUARANTINE test.
> 
>  - Added the quarantine verbose debugging (the patch 6/6, not for merge).
> 
>  - Improved the descriptions according to the feedback from Kees Cook
>    and Matthew Wilcox.
> 
>  - Made fixes recommended by Kees Cook:
> 
>    * Avoided BUG_ON() in kasan_cache_create() by handling the error and
>      reporting with WARN_ON().
> 
>    * Created a separate kmem_cache for new lkdtm tests.
> 
>    * Fixed kasan_track.pid type to pid_t.
> 
> 
> TODO for the next prototypes
> ============================
> 
> 1. Performance evaluation and optimization.
>    I would really appreciate your ideas about performance testing of a
>    kernel with the heap quarantine. The first prototype was tested with
>    hackbench and kernel build timing (which showed very different numbers).
>    Earlier, the developers tested the init_on_free functionality in a
>    similar way. However, Brad Spengler says on Twitter that such a
>    testing method is poor.

I've run various tests on real hardware and in virtual machines:
 1) network throughput test using iperf
     server: iperf -s -f K
     client: iperf -c 127.0.0.1 -t 60 -f K
 2) scheduler stress test
     hackbench -s 4000 -l 500 -g 15 -f 25 -P
 3) building the defconfig kernel
     time make -j2

I compared Linux kernel 5.9.0-rc6 with:
 - init_on_free=off,
 - init_on_free=on,
 - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).

Each test was performed 5 times; I show the mean values below.
If you are interested, I can share all the results and calculate the standard deviations.
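For reference, the mean and sample variance over such a 5-run series can be computed with a tiny helper like the following (purely illustrative, not part of the series; the standard deviation is the square root of the variance):

```c
/* Mean and sample variance of a series of benchmark results. */
#include <assert.h>
#include <stddef.h>

static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / n;
}

/* sample variance (n - 1 denominator); stddev = sqrt(variance) */
static double variance(const double *v, size_t n)
{
	double m = mean(v, n), acc = 0.0;

	for (size_t i = 0; i < n; i++)
		acc += (v[i] - m) * (v[i] - m);
	return acc / (n - 1);
}
```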

Real hardware, Intel Core i7-6500U CPU
 1) Network throughput test with iperf
     init_on_free=off: 5467152.2 KBytes/sec
     init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off)
     CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
 2) Scheduler stress test with hackbench
     init_on_free=off: 8.5364s
     init_on_free=on: 8.9858s (+5.3% vs init_on_free=off)
     CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)
 3) Building the defconfig kernel:
     init_on_free=off: 10m54.475s
     init_on_free=on: 11m5.745s (+1.7% vs init_on_free=off)
     CONFIG_SLAB_QUARANTINE: 11m13.291s (+1.1% vs init_on_free=on)

Virtual machine, QEMU/KVM
 1) Network throughput test with iperf
     init_on_free=off: 3554237.4 KBytes/sec
     init_on_free=on: 2828887.4 KBytes/sec (-20.4% vs init_on_free=off)
     CONFIG_SLAB_QUARANTINE: 2587308.2 KBytes/sec (-8.5% vs init_on_free=on)
 2) Scheduler stress test with hackbench
     init_on_free=off: 19.3602s
     init_on_free=on: 20.8854s (+7.9% vs init_on_free=off)
     CONFIG_SLAB_QUARANTINE: 30.0746s (+44.0% vs init_on_free=on)

We can see that the results of these tests are quite diverse.
Your interpretation of the results and ideas for other tests are welcome.

N.B. NO performance optimization has been made for this version of the heap
quarantine prototype. The main effort was put into researching its security
properties (I hope for your feedback). Performance optimization will be done
in further steps, if we see that this work is worth pursuing.

> 2. Complete separation of CONFIG_SLAB_QUARANTINE from KASAN (feedback
>    from Andrey Konovalov).
> 
> 3. Adding a kernel boot parameter for enabling/disabling the heap quarantine
>    (feedback from Kees Cook).
> 
> 4. Testing the heap quarantine in near-OOM situations (feedback from
>    Pavel Machek).
> 
> 5. Does this work somehow help or disturb the integration of the
>    Memory Tagging for the Linux kernel?
> 
> 6. After rebasing the series onto v5.9.0-rc6, a CONFIG_SLAB kernel started
>    to show warnings about a few slab caches that have no space for
>    additional metadata. This needs more investigation; I believe it affects
>    KASAN's bug detection abilities as well. Warning example:
>      WARNING: CPU: 0 PID: 0 at mm/kasan/slab_quarantine.c:38 kasan_cache_create+0x37/0x50
>      Modules linked in:
>      CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc6+ #1
>      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
>      RIP: 0010:kasan_cache_create+0x37/0x50
>      ...
>      Call Trace:
>       __kmem_cache_create+0x74/0x250
>       create_boot_cache+0x6d/0x91
>       create_kmalloc_cache+0x57/0x93
>       new_kmalloc_cache+0x39/0x47
>       create_kmalloc_caches+0x33/0xd9
>       start_kernel+0x25b/0x532
>       secondary_startup_64+0xb6/0xc0
> 
> Thanks in advance for your feedback.
> Best regards,
> Alexander
> 
> 
> Alexander Popov (6):
>   mm: Extract SLAB_QUARANTINE from KASAN
>   mm/slab: Perform init_on_free earlier
>   mm: Integrate SLAB_QUARANTINE with init_on_free
>   mm: Implement slab quarantine randomization
>   lkdtm: Add heap quarantine tests
>   mm: Add heap quarantine verbose debugging (not for merge)
> 
>  drivers/misc/lkdtm/core.c  |   2 +
>  drivers/misc/lkdtm/heap.c  | 110 +++++++++++++++++++++++++++++++++++++
>  drivers/misc/lkdtm/lkdtm.h |   2 +
>  include/linux/kasan.h      | 107 ++++++++++++++++++++----------------
>  include/linux/slab_def.h   |   2 +-
>  include/linux/slub_def.h   |   2 +-
>  init/Kconfig               |  14 +++++
>  mm/Makefile                |   3 +-
>  mm/kasan/Makefile          |   2 +
>  mm/kasan/kasan.h           |  75 +++++++++++++------------
>  mm/kasan/quarantine.c      | 102 ++++++++++++++++++++++++++++++----
>  mm/kasan/slab_quarantine.c | 106 +++++++++++++++++++++++++++++++++++
>  mm/page_alloc.c            |  22 ++++++++
>  mm/slab.c                  |   5 +-
>  mm/slub.c                  |   2 +-
>  15 files changed, 455 insertions(+), 101 deletions(-)
>  create mode 100644 mm/kasan/slab_quarantine.c
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-09-30 12:50   ` Alexander Potapenko
@ 2020-10-01 19:48     ` Alexander Popov
  2020-12-03 19:50     ` Alexander Popov
  1 sibling, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-10-01 19:48 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux Memory Management List,
	Kernel Hardening, LKML, notify

On 30.09.2020 15:50, Alexander Potapenko wrote:
> On Tue, Sep 29, 2020 at 8:35 PM Alexander Popov <alex.popov@linux.com> wrote:
>>
>> Currently in CONFIG_SLAB, init_on_free happens too late, and heap
>> objects go to the heap quarantine while still dirty. Let's move the
>> memory clearing before the kasan_slab_free() call to fix that.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> Reviewed-by: Alexander Potapenko <glider@google.com>

Thanks for the review, Alexander!

Do you have any idea how this patch series relates to the Memory Tagging
support that is currently being developed?

Best regards,
Alexander


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-01 19:42 ` [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
@ 2020-10-05 22:56   ` Jann Horn
  2020-10-06  0:44     ` Matthew Wilcox
  2020-10-06 17:56     ` Alexander Popov
  0 siblings, 2 replies; 24+ messages in thread
From: Jann Horn @ 2020-10-05 22:56 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Kees Cook, Will Deacon, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux-MM, Kernel Hardening,
	kernel list, notify

On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
> On 29.09.2020 21:35, Alexander Popov wrote:
> > This is the second version of the heap quarantine prototype for the Linux
> > kernel. I performed a deeper evaluation of its security properties and
> > developed new features like quarantine randomization and integration with
> > init_on_free. That is fun! See below for more details.
> >
> >
> > Rationale
> > =========
> >
> > Use-after-free vulnerabilities in the Linux kernel are very popular for
> > exploitation. There are many examples, some of them:
> >  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

I don't think your proposed mitigation would work with much
reliability against this bug; the attacker has full control over the
timing of the original use and the following use, so an attacker
should be able to trigger the kmem_cache_free(), then spam enough new
VMAs and delete them to flush out the quarantine, and then do heap
spraying as normal, or something like that.

Also, note that here, if the reallocation fails, the kernel still
wouldn't crash because the dangling object is not accessed further if
the address range stored in it doesn't match the fault address. So an
attacker could potentially try multiple times, and if the object
happens to be in the quarantine the first time, that wouldn't really
be a showstopper; you'd just try again.

> >  https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1

I think that here, again, the free() and the dangling pointer use were
caused by separate syscalls, meaning the attacker had control over
that timing?

> >  https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html

Haven't looked at that one in detail.

> > Use-after-free exploits usually employ heap spraying technique.
> > Generally it aims to put controlled bytes at a predetermined memory
> > location on the heap.

Well, not necessarily "predetermined". Depending on the circumstances,
you don't necessarily need to know which address you're writing to;
and you might not even need to overwrite a specific object, but
instead just have to overwrite one out of a bunch of objects, no
matter which.

> > Heap spraying for exploiting use-after-free in the Linux kernel relies on
> > the fact that on kmalloc(), the slab allocator returns the address of
> > the memory that was recently freed.

Yeah; and that behavior is pretty critical for performance. The longer
it's been since a newly allocated object was freed, the higher the
chance that you'll end up having to go further down the memory cache
hierarchy.

> > So allocating a kernel object with
> > the same size and controlled contents allows overwriting the vulnerable
> > freed object.

The vmacache exploit you linked to doesn't do that, it frees the
object all the way back to the page allocator and then sprays 4MiB of
memory from the page allocator. (Because VMAs use their own
kmem_cache, and the kmem_cache wasn't merged with any interesting
ones, and I saw no good way to exploit the bug by reallocating another
VMA over the old VMA back then. Although of course that doesn't mean
that there is no such way.)

[...]
> > Security properties
> > ===================
> >
> > For researching security properties of the heap quarantine I developed 2 lkdtm
> > tests (see the patch 5/6).
> >
> > The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
> > from a separate kmem_cache and then allocates 400000 similar objects.
> > I.e. this test performs an original heap spraying technique for use-after-free
> > exploitation.
> >
> > If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
> > reallocated and overwritten:
> >   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
> >    lkdtm: Performing direct entry HEAP_SPRAY
> >    lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
> >    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
> >    lkdtm: FAIL: attempt 0: freed object is reallocated
> >
> > If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
> > the freed object:
> >   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
> >    lkdtm: Performing direct entry HEAP_SPRAY
> >    lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
> >    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
> >    lkdtm: OK: original heap spraying hasn't succeed
> >
> > That happens because pushing an object through the quarantine requires _both_
> > allocating and freeing memory. Objects are released from the quarantine on
> > new memory allocations, but only when the quarantine size is over the limit.
> > And the quarantine size grows on new memory freeing.
> >
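The fill/drain asymmetry described above can be modeled as a size-limited FIFO. A rough user-space sketch (the names, the byte limit, and the per-entry bookkeeping are invented for illustration; the kernel implementation works in batches):

```c
/* Toy model: freeing grows the quarantine; allocating only drains the
 * oldest entries once the total size exceeds the limit. An object is
 * therefore reusable only after enough of *both* frees and allocations. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define QUAR_MAX_BYTES 4096

struct quar_entry {
	void *obj;
	size_t size;
	struct quar_entry *next;
};

static struct quar_entry *head, *tail;
static size_t quar_bytes;

/* Freeing an object only grows the quarantine. */
static void quar_on_free(void *obj, size_t size)
{
	struct quar_entry *e = malloc(sizeof(*e));

	e->obj = obj;
	e->size = size;
	e->next = NULL;
	if (tail)
		tail->next = e;
	else
		head = e;
	tail = e;
	quar_bytes += size;
}

/* Allocation drains the oldest entries, but only past the limit. */
static size_t quar_on_alloc(void)
{
	size_t drained = 0;

	while (quar_bytes > QUAR_MAX_BYTES && head) {
		struct quar_entry *e = head;

		head = e->next;
		if (!head)
			tail = NULL;
		quar_bytes -= e->size;
		free(e->obj); /* object finally returns to the allocator */
		free(e);
		drained++;
	}
	return drained;
}
```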
> > That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
> > It allocates and frees an object from a separate kmem_cache and then performs
> > kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
> > This test effectively pushes the object through the heap quarantine and
> > reallocates it after it returns to the allocator freelist:
[...]
> > As you can see, the number of allocations needed to overwrite the
> > vulnerable object is almost the same across runs. That predictability
> > would enable stable use-after-free exploitation and should not be allowed.
> > That's why I developed quarantine randomization (see patch 4/6).
> >
> > This randomization required only small, somewhat hackish changes to the
> > heap quarantine mechanism. First, all quarantine batches are filled with
> > objects. Then, during quarantine reduction, I free a randomly chosen half
> > of the objects in a randomly chosen batch. As a result, the randomized
> > quarantine releases the freed object at an unpredictable moment:
> >    lkdtm: Target object is reallocated at attempt 107884
[...]
> >    lkdtm: Target object is reallocated at attempt 87343

Those numbers are fairly big. At that point you might not even fit
into L3 cache anymore, right? You'd often be hitting DRAM for new
allocations? And for many slabs, you might end up using much more
memory for the quarantine than for actual in-use allocations.

It seems to me like, for this to stop attacks with a high probability,
you'd have to reserve a huge chunk of kernel memory for the
quarantines - even if the attacker doesn't know anything about the
status of the quarantine (which isn't necessarily the case, depending
on whether the attacker can abuse microarchitectural data leakage, or
if the attacker can trigger a pure data read through the dangling
pointer), they should still be able to win with a probability around
quarantine_size/allocated_memory_size if they have a heap spraying
primitive without strict limits.

> > However, this randomization alone would not stop the attacker, because
> > the quarantine stores the attacker's data (the payload) in the sprayed
> > objects. In other words, the reallocated and overwritten vulnerable object
> > contains the payload until the next reallocation (very bad).
> >
> > Hence heap objects should be erased before going to the heap quarantine.
> > Moreover, filling them with zeros gives a chance to detect use-after-free
> > accesses to non-zero data while an object stays in the quarantine (nice!).
> > That functionality already exists in the kernel; it's called init_on_free.
> > I integrated it with CONFIG_SLAB_QUARANTINE in patch 3/6.
> >
> > During that work I found a bug: in CONFIG_SLAB, init_on_free happens too
> > late, and heap objects go to the KASAN quarantine while still dirty. See
> > the fix in patch 2/6.
[...]
> I've run various tests on real hardware and in virtual machines:
>  1) network throughput test using iperf
>      server: iperf -s -f K
>      client: iperf -c 127.0.0.1 -t 60 -f K
>  2) scheduler stress test
>      hackbench -s 4000 -l 500 -g 15 -f 25 -P
>  3) building the defconfig kernel
>      time make -j2
>
> I compared Linux kernel 5.9.0-rc6 with:
>  - init_on_free=off,
>  - init_on_free=on,
>  - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).
>
> Each test was performed 5 times; I show the mean values below.
> If you are interested, I can share all the results and calculate the standard deviations.
>
> Real hardware, Intel Core i7-6500U CPU
>  1) Network throughput test with iperf
>      init_on_free=off: 5467152.2 KBytes/sec
>      init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off)
>      CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
>  2) Scheduler stress test with hackbench
>      init_on_free=off: 8.5364s
>      init_on_free=on: 8.9858s (+5.3% vs init_on_free=off)
>      CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)

These numbers seem really high for a mitigation, especially if that
performance hit does not really buy you deterministic protection
against many bugs.

[...]
> N.B. NO performance optimization has been made for this version of the heap
> quarantine prototype. The main effort was put into researching its security
> properties (I hope for your feedback). Performance optimization will be done
> in further steps, if we see that this work is worth pursuing.

But you are pretty much inherently limited in terms of performance by
the effect the quarantine has on the data cache, right?

It seems to me like, if you want to make UAF exploitation harder at
the heap allocator layer, you could do somewhat more effective things
with a probably much smaller performance budget. Things like
preventing the reallocation of virtual kernel addresses with different
types, such that an attacker can only replace a UAF object with
another object of the same type. (That is not an idea I like very much
either, but I would like it more than this proposal.) (E.g. some
browsers implement things along those lines, I believe.)


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-05 22:56   ` Jann Horn
@ 2020-10-06  0:44     ` Matthew Wilcox
  2020-10-06  0:48       ` Jann Horn
                         ` (2 more replies)
  2020-10-06 17:56     ` Alexander Popov
  1 sibling, 3 replies; 24+ messages in thread
From: Matthew Wilcox @ 2020-10-06  0:44 UTC (permalink / raw)
  To: Jann Horn
  Cc: Alexander Popov, Kees Cook, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify

On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> It seems to me like, if you want to make UAF exploitation harder at
> the heap allocator layer, you could do somewhat more effective things
> with a probably much smaller performance budget. Things like
> preventing the reallocation of virtual kernel addresses with different
> types, such that an attacker can only replace a UAF object with
> another object of the same type. (That is not an idea I like very much
> either, but I would like it more than this proposal.) (E.g. some
> browsers implement things along those lines, I believe.)

The slab allocator already has that functionality.  We call it
TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
by a measurable amount, it wouldn't be a terribly hard sell ...


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  0:44     ` Matthew Wilcox
@ 2020-10-06  0:48       ` Jann Horn
  2020-10-06  2:09       ` Kees Cook
  2020-10-06  8:32       ` Christopher Lameter
  2 siblings, 0 replies; 24+ messages in thread
From: Jann Horn @ 2020-10-06  0:48 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Popov, Kees Cook, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify

On Tue, Oct 6, 2020 at 2:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance budget. Things like
> > preventing the reallocation of virtual kernel addresses with different
> > types, such that an attacker can only replace a UAF object with
> > another object of the same type. (That is not an idea I like very much
> > either, but I would like it more than this proposal.) (E.g. some
> > browsers implement things along those lines, I believe.)
>
> The slab allocator already has that functionality.  We call it
> TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> by a measurable amount, it wouldn't be a terribly hard sell ...

TYPESAFE_BY_RCU just forces an RCU grace period before the
reallocation; I'm thinking of something more drastic, like completely
refusing to give back the memory, or using vmalloc for slabs where
that's safe (reusing physical but not virtual addresses across types).
And, to make it more effective, something like a compiler plugin to
isolate kmalloc(sizeof(<type>)) allocations by type beyond just size
classes.


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  0:44     ` Matthew Wilcox
  2020-10-06  0:48       ` Jann Horn
@ 2020-10-06  2:09       ` Kees Cook
  2020-10-06  2:16         ` Jann Horn
                           ` (2 more replies)
  2020-10-06  8:32       ` Christopher Lameter
  2 siblings, 3 replies; 24+ messages in thread
From: Kees Cook @ 2020-10-06  2:09 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jann Horn, Alexander Popov, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify

On Tue, Oct 06, 2020 at 01:44:14AM +0100, Matthew Wilcox wrote:
> On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance budget. Things like
> > preventing the reallocation of virtual kernel addresses with different
> > types, such that an attacker can only replace a UAF object with
> > another object of the same type. (That is not an idea I like very much
> > either, but I would like it more than this proposal.) (E.g. some
> > browsers implement things along those lines, I believe.)
> 
> The slab allocator already has that functionality.  We call it
> TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> by a measurable amount, it wouldn't be a terribly hard sell ...

Isn't the "easy" version of this already controlled by slab_merge? (i.e.
do not share same-sized/flagged kmem_caches between different caches)

The bigger trouble is the kmalloc caches, which don't have types
associated with them. Having implicit kmem caches based on the type
being allocated there would need some pretty extensive plumbing, I
think?

-- 
Kees Cook


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  2:09       ` Kees Cook
@ 2020-10-06  2:16         ` Jann Horn
  2020-10-06  2:19         ` Daniel Micay
  2020-10-06  8:35         ` Christopher Lameter
  2 siblings, 0 replies; 24+ messages in thread
From: Jann Horn @ 2020-10-06  2:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Matthew Wilcox, Alexander Popov, Will Deacon, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify

On Tue, Oct 6, 2020 at 4:09 AM Kees Cook <keescook@chromium.org> wrote:
> On Tue, Oct 06, 2020 at 01:44:14AM +0100, Matthew Wilcox wrote:
> > On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > > It seems to me like, if you want to make UAF exploitation harder at
> > > the heap allocator layer, you could do somewhat more effective things
> > > with a probably much smaller performance budget. Things like
> > > preventing the reallocation of virtual kernel addresses with different
> > > types, such that an attacker can only replace a UAF object with
> > > another object of the same type. (That is not an idea I like very much
> > > either, but I would like it more than this proposal.) (E.g. some
> > > browsers implement things along those lines, I believe.)
> >
> > The slab allocator already has that functionality.  We call it
> > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> > by a measurable amount, it wouldn't be a terribly hard sell ...
>
> Isn't the "easy" version of this already controlled by slab_merge? (i.e.
> do not share same-sized/flagged kmem_caches between different caches)

Yes, but slab_merge still normally frees slab pages to the page allocator.

> The large trouble are the kmalloc caches, which don't have types
> associated with them. Having implicit kmem caches based on the type
> being allocated there would need some pretty extensive plumbing, I
> think?

Well, a bit of plumbing, at least. You'd need to teach the compiler
frontend to grab type names from sizeof() and stuff that type
information somewhere, e.g. by generating an extra function argument
referring to the type, or something like that. Could be as simple as a
reference to a bss section variable that encodes the type in the name,
and the linker already has the logic to automatically deduplicate
those across compilation units - that way, on the compiler side, a
pure frontend plugin might do the job?
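
A toy user-space model of that idea follows. All of these names (typed_alloc, TYPED_ALLOC, the cache table) are hypothetical; a real implementation would rely on the compiler/linker support described above to emit one deduplicated symbol per type, where this sketch simply stringizes the type name at the call site:

```c
/* Per-type isolation sketch: memory freed under one type is only ever
 * handed out again for the same type, so a UAF object can only be
 * replaced by another object of the same type. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TYPES 16 /* at most this many distinct types in the toy table */

struct type_cache {
	const char *name; /* type-name key produced by the macros below */
	void *freelist;   /* singly linked list of freed objects */
};

static struct type_cache caches[MAX_TYPES];

static struct type_cache *cache_for(const char *name)
{
	for (int i = 0; i < MAX_TYPES; i++) {
		if (!caches[i].name) {
			caches[i].name = name;
			return &caches[i];
		}
		if (strcmp(caches[i].name, name) == 0)
			return &caches[i];
	}
	abort(); /* toy table full */
}

static void *typed_alloc(size_t size, const char *type_name)
{
	struct type_cache *c = cache_for(type_name);

	if (c->freelist) { /* reuse only within the same type */
		void *obj = c->freelist;

		c->freelist = *(void **)obj;
		return obj;
	}
	/* reserve room for the freelist link in tiny objects */
	return malloc(size < sizeof(void *) ? sizeof(void *) : size);
}

static void typed_free(void *obj, const char *type_name)
{
	struct type_cache *c = cache_for(type_name);

	*(void **)obj = c->freelist;
	c->freelist = obj;
}

/* Where a compiler plugin would emit a per-type symbol, this sketch
 * uses the stringized type name from the call site as the key. */
#define TYPED_ALLOC(type)     typed_alloc(sizeof(type), #type)
#define TYPED_FREE(obj, type) typed_free(obj, #type)
```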


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  2:09       ` Kees Cook
  2020-10-06  2:16         ` Jann Horn
@ 2020-10-06  2:19         ` Daniel Micay
  2020-10-06  8:35         ` Christopher Lameter
  2 siblings, 0 replies; 24+ messages in thread
From: Daniel Micay @ 2020-10-06  2:19 UTC (permalink / raw)
  To: Kees Cook
  Cc: Matthew Wilcox, Jann Horn, Alexander Popov, Will Deacon,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Andrey Konovalov,
	Pavel Machek, Valentin Schneider, kasan-dev, Linux-MM,
	Kernel Hardening, kernel list, notify

It will reuse the memory for other things when the whole slab is freed
though. Not really realistic to change that without it being backed by
virtual memory along with higher-level management of regions to avoid
intense fragmentation and metadata waste. It would depend a lot on
having much finer-grained slab caches, otherwise it's not going to be
much of an alternative to a quarantine feature. Even then, a
quarantine feature is still useful, but is less suitable for a
mainstream feature due to performance cost. Even a small quarantine
has a fairly high performance cost.


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  0:44     ` Matthew Wilcox
  2020-10-06  0:48       ` Jann Horn
  2020-10-06  2:09       ` Kees Cook
@ 2020-10-06  8:32       ` Christopher Lameter
  2 siblings, 0 replies; 24+ messages in thread
From: Christopher Lameter @ 2020-10-06  8:32 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jann Horn, Alexander Popov, Kees Cook, Will Deacon,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify



On Tue, 6 Oct 2020, Matthew Wilcox wrote:

> On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance budget. Things like
> > preventing the reallocation of virtual kernel addresses with different
> > types, such that an attacker can only replace a UAF object with
> > another object of the same type. (That is not an idea I like very much
> > either, but I would like it more than this proposal.) (E.g. some
> > browsers implement things along those lines, I believe.)
>
> The slab allocator already has that functionality.  We call it
> TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> by a measurable amount, it wouldn't be a terribly hard sell ...

TYPESAFE functionality switches a lot of debugging off, because it also
allows speculative accesses to an object after it has been freed (this is
required for RCU safety, since the object may be freed during an RCU grace
period in which it is still being accessed). I do not think you would like that.



* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06  2:09       ` Kees Cook
  2020-10-06  2:16         ` Jann Horn
  2020-10-06  2:19         ` Daniel Micay
@ 2020-10-06  8:35         ` Christopher Lameter
  2 siblings, 0 replies; 24+ messages in thread
From: Christopher Lameter @ 2020-10-06  8:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Matthew Wilcox, Jann Horn, Alexander Popov, Will Deacon,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Pavel Machek, Valentin Schneider, kasan-dev,
	Linux-MM, Kernel Hardening, kernel list, notify


On Mon, 5 Oct 2020, Kees Cook wrote:

> > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> > by a measurable amount, it wouldn't be a terribly hard sell ...
>
> Isn't the "easy" version of this already controlled by slab_merge? (i.e.
> do not share same-sized/flagged kmem_caches between different caches)

Right.

> The large trouble is the kmalloc caches, which don't have types
> associated with them. Having implicit kmem caches based on the type
> being allocated there would need some pretty extensive plumbing, I
> think?

Actually typifying those accesses may get rid of a lot of kmalloc
allocations and could help to ease the management and control of objects.

It may be a big task though given the ubiquity of kmalloc and the need to
create a massive amount of new slab caches. This is going to reduce the
cache hit rate significantly.



* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-05 22:56   ` Jann Horn
  2020-10-06  0:44     ` Matthew Wilcox
@ 2020-10-06 17:56     ` Alexander Popov
  2020-10-06 18:37       ` Jann Horn
  1 sibling, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2020-10-06 17:56 UTC (permalink / raw)
  To: Jann Horn, Kees Cook
  Cc: Will Deacon, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Masahiro Yamada, Masami Hiramatsu, Steven Rostedt,
	Peter Zijlstra, Krzysztof Kozlowski, Patrick Bellasi,
	David Howells, Eric Biederman, Johannes Weiner, Laura Abbott,
	Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux-MM, Kernel Hardening,
	kernel list, notify

On 06.10.2020 01:56, Jann Horn wrote:
> On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
>> On 29.09.2020 21:35, Alexander Popov wrote:
>>> This is the second version of the heap quarantine prototype for the Linux
>>> kernel. I performed a deeper evaluation of its security properties and
>>> developed new features like quarantine randomization and integration with
>>> init_on_free. That is fun! See below for more details.
>>>
>>>
>>> Rationale
>>> =========
>>>
>>> Use-after-free vulnerabilities in the Linux kernel are very popular for
>>> exploitation. There are many examples, some of them:
>>>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Hello Jann, thanks for your reply.

> I don't think your proposed mitigation would work with much
> reliability against this bug; the attacker has full control over the
> timing of the original use and the following use, so an attacker
> should be able to trigger the kmem_cache_free(), then spam enough new
> VMAs and delete them to flush out the quarantine, and then do heap
> spraying as normal, or something like that.

The randomized quarantine will release the vulnerable object at an unpredictable
moment (patch 4/6).

So I think the control over the time of the use-after-free access doesn't help
attackers, if they don't have an "infinite spray" -- unlimited ability to store
controlled data in the kernelspace objects of the needed size without freeing them.

"Unlimited", because the quarantine size is 1/32 of whole memory.
"Without freeing", because freed objects are erased by init_on_free before going
to randomized heap quarantine (patch 3/6).

Would you agree?

> Also, note that here, if the reallocation fails, the kernel still
> wouldn't crash because the dangling object is not accessed further if
> the address range stored in it doesn't match the fault address. So an
> attacker could potentially try multiple times, and if the object
> happens to be on the quarantine the first time, that wouldn't really
> be a showstopper, you'd just try again.

Freed objects are filled with zeros before going to the quarantine (patch 3/6).
Would that cause a null pointer dereference on an unsuccessful try?

>>>  https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
> 
> I think that here, again, the free() and the dangling pointer use were
> caused by separate syscalls, meaning the attacker had control over
> that timing?

As I wrote above, I think attacker's control over this timing is required for a
successful attack, but is not enough for bypassing randomized quarantine.

>>>  https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> 
> Haven't looked at that one in detail.
> 
>>> Use-after-free exploits usually employ heap spraying technique.
>>> Generally it aims to put controlled bytes at a predetermined memory
>>> location on the heap.
> 
> Well, not necessarily "predetermined". Depending on the circumstances,
> you don't necessarily need to know which address you're writing to;
> and you might not even need to overwrite a specific object, but
> instead just have to overwrite one out of a bunch of objects, no
> matter which.

Yes, of course, I didn't mean a "predetermined memory address".
Maybe "definite memory location" is a better phrase for that.

>>> Heap spraying for exploiting use-after-free in the Linux kernel relies on
>>> the fact that on kmalloc(), the slab allocator returns the address of
>>> the memory that was recently freed.
> 
> Yeah; and that behavior is pretty critical for performance. The longer
> it's been since a newly allocated object was freed, the higher the
> chance that you'll end up having to go further down the memory cache
> hierarchy.

Yes. That behaviour is fast; however, it is also very convenient for
use-after-free exploitation...

>>> So allocating a kernel object with
>>> the same size and controlled contents allows overwriting the vulnerable
>>> freed object.
> 
> The vmacache exploit you linked to doesn't do that, it frees the
> object all the way back to the page allocator and then sprays 4MiB of
> memory from the page allocator. (Because VMAs use their own
> kmem_cache, and the kmem_cache wasn't merged with any interesting
> ones, and I saw no good way to exploit the bug by reallocating another
> VMA over the old VMA back then. Although of course that doesn't mean
> that there is no such way.)

Sorry, my mistake.
Exploit examples with heap spraying that fit my description:
 - CVE-2017-6074 https://www.openwall.com/lists/oss-security/2017/02/26/2
 - CVE-2017-2636 https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html
 - CVE-2016-8655 https://seclists.org/oss-sec/2016/q4/607
 - CVE-2017-15649
https://ssd-disclosure.com/ssd-advisory-linux-kernel-af_packet-use-after-free/

> [...]
>>> Security properties
>>> ===================
>>>
>>> For researching security properties of the heap quarantine I developed 2 lkdtm
>>> tests (see the patch 5/6).
>>>
>>> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
>>> from a separate kmem_cache and then allocates 400000 similar objects.
>>> I.e. this test performs an original heap spraying technique for use-after-free
>>> exploitation.
>>>
>>> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
>>> reallocated and overwritten:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: FAIL: attempt 0: freed object is reallocated
>>>
>>> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
>>> the freed object:
>>>   # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
>>>    lkdtm: Performing direct entry HEAP_SPRAY
>>>    lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
>>>    lkdtm: Original heap spraying: allocate 400000 objects of size 333...
>>>    lkdtm: OK: original heap spraying hasn't succeed
>>>
>>> That happens because pushing an object through the quarantine requires _both_
>>> allocating and freeing memory. Objects are released from the quarantine on
>>> new memory allocations, but only when the quarantine size is over the limit.
>>> And the quarantine size grows on new memory freeing.
>>>
>>> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
>>> It allocates and frees an object from a separate kmem_cache and then performs
>>> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
>>> This test effectively pushes the object through the heap quarantine and
>>> reallocates it after it returns back to the allocator freelist:
> [...]
>>> As you can see, the number of the allocations that are needed for overwriting
>>> the vulnerable object is almost the same. That would be good for stable
>>> use-after-free exploitation and should not be allowed.
>>> That's why I developed the quarantine randomization (see the patch 4/6).
>>>
>>> This randomization required only small, hackish changes to the heap quarantine
>>> mechanism. At first, all quarantine batches are filled with objects. Then,
>>> during quarantine reduction, I randomly choose and free half of the objects
>>> from a randomly chosen batch. Now the randomized quarantine releases the freed
>>> object at an unpredictable moment:
>>>    lkdtm: Target object is reallocated at attempt 107884
> [...]
>>>    lkdtm: Target object is reallocated at attempt 87343
> 
> Those numbers are fairly big. At that point you might not even fit
> into L3 cache anymore, right? You'd often be hitting DRAM for new
> allocations? And for many slabs, you might end using much more memory
> for the quarantine than for actual in-use allocations.

Yes. The original quarantine size is
  (totalram_pages() << PAGE_SHIFT) / QUARANTINE_FRACTION
where
  #define QUARANTINE_FRACTION 32

> It seems to me like, for this to stop attacks with a high probability,
> you'd have to reserve a huge chunk of kernel memory for the
> quarantines 

Yes, that's how it works now.

> - even if the attacker doesn't know anything about the
> status of the quarantine (which isn't necessarily the case, depending
> on whether the attacker can abuse microarchitectural data leakage, or
> if the attacker can trigger a pure data read through the dangling
> pointer), they should still be able to win with a probability around
> quarantine_size/allocated_memory_size if they have a heap spraying
> primitive without strict limits.

Not sure about this probability evaluation.
I will try calculating it taking the quarantine parameters into account.

>>> However, this randomization alone would not disturb the attacker, because
>>> the quarantine stores the attacker's data (the payload) in the sprayed objects.
>>> I.e. the reallocated and overwritten vulnerable object contains the payload
>>> until the next reallocation (very bad).
>>>
>>> Hence heap objects should be erased before going to the heap quarantine.
>>> Moreover, filling them with zeros gives a chance to detect use-after-free
>>> accesses to non-zero data while an object stays in the quarantine (nice!).
>>> That functionality already exists in the kernel, it's called init_on_free.
>>> I integrated it with CONFIG_SLAB_QUARANTINE in the patch 3/6.
>>>
>>> During that work I found a bug: in CONFIG_SLAB init_on_free happens too
>>> late, and heap objects go to the KASAN quarantine being dirty. See the fix
>>> in the patch 2/6.
> [...]
>> I've made various tests on real hardware and in virtual machines:
>>  1) network throughput test using iperf
>>      server: iperf -s -f K
>>      client: iperf -c 127.0.0.1 -t 60 -f K
>>  2) scheduler stress test
>>      hackbench -s 4000 -l 500 -g 15 -f 25 -P
>>  3) building the defconfig kernel
>>      time make -j2
>>
>> I compared Linux kernel 5.9.0-rc6 with:
>>  - init_on_free=off,
>>  - init_on_free=on,
>>  - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).
>>
>> Each test was performed 5 times. I will show the mean values.
>> If you are interested, I can share all the results and calculate standard deviation.
>>
>> Real hardware, Intel Core i7-6500U CPU
>>  1) Network throughput test with iperf
>>      init_on_free=off: 5467152.2 KBytes/sec
>>      init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
>>  2) Scheduler stress test with hackbench
>>      init_on_free=off: 8.5364s
>>      init_on_free=on: 8.9858s (+5.3% vs init_on_free=off)
>>      CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)
> 
> These numbers seem really high for a mitigation, especially if that
> performance hit does not really buy you deterministic protection
> against many bugs.

Right, I agree.

It's a probabilistic protection, and the probability should be calculated.
I'll work on that.

> [...]
>> N.B. There was NO performance optimization made for this version of the heap
>> quarantine prototype. The main effort was put into researching its security
>> properties (hope for your feedback). Performance optimization will be done in
>> further steps, if we see that my work is worth doing.
> 
> But you are pretty much inherently limited in terms of performance by
> the effect the quarantine has on the data cache, right?

Yes.
However, the quarantine parameters can be adjusted.

> It seems to me like, if you want to make UAF exploitation harder at
> the heap allocator layer, you could do somewhat more effective things
> with a probably much smaller performance budget. Things like
> preventing the reallocation of virtual kernel addresses with different
> types, such that an attacker can only replace a UAF object with
> another object of the same type. (That is not an idea I like very much
> either, but I would like it more than this proposal.) (E.g. some
> browsers implement things along those lines, I believe.)

That's interesting, thank you.

Best regards,
Alexander


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06 17:56     ` Alexander Popov
@ 2020-10-06 18:37       ` Jann Horn
  2020-10-06 19:25         ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Jann Horn @ 2020-10-06 18:37 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Kees Cook, Will Deacon, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux-MM, Kernel Hardening,
	kernel list, notify

On Tue, Oct 6, 2020 at 7:56 PM Alexander Popov <alex.popov@linux.com> wrote:
>
> On 06.10.2020 01:56, Jann Horn wrote:
> > On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
> >> On 29.09.2020 21:35, Alexander Popov wrote:
> >>> This is the second version of the heap quarantine prototype for the Linux
> >>> kernel. I performed a deeper evaluation of its security properties and
> >>> developed new features like quarantine randomization and integration with
> >>> init_on_free. That is fun! See below for more details.
> >>>
> >>>
> >>> Rationale
> >>> =========
> >>>
> >>> Use-after-free vulnerabilities in the Linux kernel are very popular for
> >>> exploitation. There are many examples, some of them:
> >>>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
>
> Hello Jann, thanks for your reply.
>
> > I don't think your proposed mitigation would work with much
> > reliability against this bug; the attacker has full control over the
> > timing of the original use and the following use, so an attacker
> > should be able to trigger the kmem_cache_free(), then spam enough new
> > VMAs and delete them to flush out the quarantine, and then do heap
> > spraying as normal, or something like that.
>
> The randomized quarantine will release the vulnerable object at an unpredictable
> moment (patch 4/6).
>
> So I think the control over the time of the use-after-free access doesn't help
> attackers, if they don't have an "infinite spray" -- unlimited ability to store
> controlled data in the kernelspace objects of the needed size without freeing them.
>
> "Unlimited", because the quarantine size is 1/32 of whole memory.
> "Without freeing", because freed objects are erased by init_on_free before going
> to randomized heap quarantine (patch 3/6).
>
> Would you agree?

But you have a single quarantine (per CPU) for all objects, right? So
for a UAF on slab A, the attacker can just spam allocations and
deallocations on slab B to almost deterministically flush everything
in slab A back to the SLUB freelists?

> > Also, note that here, if the reallocation fails, the kernel still
> > wouldn't crash because the dangling object is not accessed further if
> > the address range stored in it doesn't match the fault address. So an
> > attacker could potentially try multiple times, and if the object
> > happens to be on the quarantine the first time, that wouldn't really
> > be a showstopper, you'd just try again.
>
> Freed objects are filled with zeros before going to the quarantine (patch 3/6).
> Would that cause a null pointer dereference on an unsuccessful try?

Not as far as I can tell.

[...]
> >> N.B. There was NO performance optimization made for this version of the heap
> >> quarantine prototype. The main effort was put into researching its security
> >> properties (hope for your feedback). Performance optimization will be done in
> >> further steps, if we see that my work is worth doing.
> >
> > But you are pretty much inherently limited in terms of performance by
> > the effect the quarantine has on the data cache, right?
>
> Yes.
> However, the quarantine parameters can be adjusted.
>
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance budget. Things like
> > preventing the reallocation of virtual kernel addresses with different
> > types, such that an attacker can only replace a UAF object with
> > another object of the same type. (That is not an idea I like very much
> > either, but I would like it more than this proposal.) (E.g. some
> > browsers implement things along those lines, I believe.)
>
> That's interesting, thank you.

Just as some more context of how I think about this:

Preventing memory corruption, outside of stuff like core memory
management code, isn't really all *that* hard. There are schemes out
there for hardware that reliably protects the integrity of data
pointers, and such things. And if people can do that in hardware, we
can also emulate that, and we'll get the same protection in software.

The hard part is making it reasonably fast. And if you are willing to
accept the kind of performance impact that comes with gigantic
quarantine queues, there might be more effective things to spend that
performance on?


* Re: [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free
  2020-10-06 18:37       ` Jann Horn
@ 2020-10-06 19:25         ` Alexander Popov
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-10-06 19:25 UTC (permalink / raw)
  To: Jann Horn
  Cc: Kees Cook, Will Deacon, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux-MM, Kernel Hardening,
	kernel list, notify

On 06.10.2020 21:37, Jann Horn wrote:
> On Tue, Oct 6, 2020 at 7:56 PM Alexander Popov <alex.popov@linux.com> wrote:
>>
>> On 06.10.2020 01:56, Jann Horn wrote:
>>> On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote:
>>>> On 29.09.2020 21:35, Alexander Popov wrote:
>>>>> This is the second version of the heap quarantine prototype for the Linux
>>>>> kernel. I performed a deeper evaluation of its security properties and
>>>>> developed new features like quarantine randomization and integration with
>>>>> init_on_free. That is fun! See below for more details.
>>>>>
>>>>>
>>>>> Rationale
>>>>> =========
>>>>>
>>>>> Use-after-free vulnerabilities in the Linux kernel are very popular for
>>>>> exploitation. There are many examples, some of them:
>>>>>  https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
>>
>> Hello Jann, thanks for your reply.
>>
>>> I don't think your proposed mitigation would work with much
>>> reliability against this bug; the attacker has full control over the
>>> timing of the original use and the following use, so an attacker
>>> should be able to trigger the kmem_cache_free(), then spam enough new
>>> VMAs and delete them to flush out the quarantine, and then do heap
>>> spraying as normal, or something like that.
>>
>> The randomized quarantine will release the vulnerable object at an unpredictable
>> moment (patch 4/6).
>>
>> So I think the control over the time of the use-after-free access doesn't help
>> attackers, if they don't have an "infinite spray" -- unlimited ability to store
>> controlled data in the kernelspace objects of the needed size without freeing them.
>>
>> "Unlimited", because the quarantine size is 1/32 of whole memory.
>> "Without freeing", because freed objects are erased by init_on_free before going
>> to randomized heap quarantine (patch 3/6).
>>
>> Would you agree?
> 
> But you have a single quarantine (per CPU) for all objects, right? So
> for a UAF on slab A, the attacker can just spam allocations and
> deallocations on slab B to almost deterministically flush everything
> in slab A back to the SLUB freelists?

Aaaahh! Nice shot Jann, I see.

Another slab cache can be used to flush the randomized quarantine, so eventually
the vulnerable object returns to the allocator freelist in its own cache, and the
original heap spraying technique can be used again.

For now I think the idea of a global quarantine for all slab objects is dead.

Thank you.

Best regards,
Alexander


* Re: [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-09-30 12:50   ` Alexander Potapenko
  2020-10-01 19:48     ` Alexander Popov
@ 2020-12-03 19:50     ` Alexander Popov
  2020-12-03 20:49       ` Andrew Morton
  1 sibling, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2020-12-03 19:50 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Kees Cook, Jann Horn, Will Deacon, Andrey Ryabinin,
	Dmitry Vyukov, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux Memory Management List,
	Kernel Hardening, LKML, notify

On 30.09.2020 15:50, Alexander Potapenko wrote:
> On Tue, Sep 29, 2020 at 8:35 PM Alexander Popov <alex.popov@linux.com> wrote:
>>
>> Currently in CONFIG_SLAB init_on_free happens too late, and heap
>> objects go to the heap quarantine being dirty. Let's move memory
>> clearing before calling kasan_slab_free() to fix that.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> Reviewed-by: Alexander Potapenko <glider@google.com>

Hello!

Can this particular patch be considered for the mainline kernel?


Note: I summarized the results of the experiment with the Linux kernel heap
quarantine in a short article, for future reference:
https://a13xp0p0v.github.io/2020/11/30/slab-quarantine.html

Best regards,
Alexander

>> ---
>>  mm/slab.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 3160dff6fd76..5140203c5b76 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -3414,6 +3414,9 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
>>  static __always_inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>                                          unsigned long caller)
>>  {
>> +       if (unlikely(slab_want_init_on_free(cachep)))
>> +               memset(objp, 0, cachep->object_size);
>> +
>>         /* Put the object into the quarantine, don't touch it for now. */
>>         if (kasan_slab_free(cachep, objp, _RET_IP_))
>>                 return;
>> @@ -3432,8 +3435,6 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
>>         struct array_cache *ac = cpu_cache_get(cachep);
>>
>>         check_irq_off();
>> -       if (unlikely(slab_want_init_on_free(cachep)))
>> -               memset(objp, 0, cachep->object_size);
>>         kmemleak_free_recursive(objp, cachep->flags);
>>         objp = cache_free_debugcheck(cachep, objp, caller);
>>         memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
>> --
>> 2.26.2
>>
> 
> 



* Re: [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-12-03 19:50     ` Alexander Popov
@ 2020-12-03 20:49       ` Andrew Morton
  2020-12-04 11:54         ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2020-12-03 20:49 UTC (permalink / raw)
  To: alex.popov
  Cc: Alexander Potapenko, Kees Cook, Jann Horn, Will Deacon,
	Andrey Ryabinin, Dmitry Vyukov, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux Memory Management List,
	Kernel Hardening, LKML, notify

On Thu, 3 Dec 2020 22:50:27 +0300 Alexander Popov <alex.popov@linux.com> wrote:

> On 30.09.2020 15:50, Alexander Potapenko wrote:
> > On Tue, Sep 29, 2020 at 8:35 PM Alexander Popov <alex.popov@linux.com> wrote:
> >>
> >> Currently in CONFIG_SLAB init_on_free happens too late, and heap
> >> objects go to the heap quarantine being dirty. Let's move memory
> >> clearing before calling kasan_slab_free() to fix that.
> >>
> >> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> > Reviewed-by: Alexander Potapenko <glider@google.com>
> 
> Hello!
> 
> Can this particular patch be considered for the mainline kernel?

All patches are considered ;) And merged if they're reviewed, tested,
judged useful, etc.

If you think this particular patch should be fast-tracked then please
send it as a non-RFC, standalone patch.  Please also enhance the
changelog so that it actually explains what goes wrong.  Presumably
"objects go to the heap quarantine being dirty" causes some
user-visible problem?  What is that problem?



* Re: [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier
  2020-12-03 20:49       ` Andrew Morton
@ 2020-12-04 11:54         ` Alexander Popov
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2020-12-04 11:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Potapenko, Kees Cook, Jann Horn, Will Deacon,
	Andrey Ryabinin, Dmitry Vyukov, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Masahiro Yamada, Masami Hiramatsu,
	Steven Rostedt, Peter Zijlstra, Krzysztof Kozlowski,
	Patrick Bellasi, David Howells, Eric Biederman, Johannes Weiner,
	Laura Abbott, Arnd Bergmann, Greg Kroah-Hartman, Daniel Micay,
	Andrey Konovalov, Matthew Wilcox, Pavel Machek,
	Valentin Schneider, kasan-dev, Linux Memory Management List,
	Kernel Hardening, LKML, notify

On 03.12.2020 23:49, Andrew Morton wrote:
> On Thu, 3 Dec 2020 22:50:27 +0300 Alexander Popov <alex.popov@linux.com> wrote:
> 
>> On 30.09.2020 15:50, Alexander Potapenko wrote:
>>> On Tue, Sep 29, 2020 at 8:35 PM Alexander Popov <alex.popov@linux.com> wrote:
>>>>
>>>> Currently in CONFIG_SLAB init_on_free happens too late, and heap
>>>> objects go to the heap quarantine being dirty. Let's move memory
>>>> clearing before calling kasan_slab_free() to fix that.
>>>>
>>>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>>> Reviewed-by: Alexander Potapenko <glider@google.com>
>>
>> Hello!
>>
>> Can this particular patch be considered for the mainline kernel?
> 
> All patches are considered ;) And merged if they're reviewed, tested,
> judged useful, etc.
> 
> If you think this particular patch should be fast-tracked then please
> send it as a non-RFC, standalone patch.  Please also enhance the
> changelog so that it actually explains what goes wrong.  Presumably
> "objects go to the heap quarantine being dirty" causes some
> user-visible problem?  What is that problem?

Ok, thanks!
I'll improve the commit message and send the patch separately.

Best regards,
Alexander


Thread overview: 24+ messages
2020-09-29 18:35 [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 1/6] mm: Extract SLAB_QUARANTINE from KASAN Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 2/6] mm/slab: Perform init_on_free earlier Alexander Popov
2020-09-30 12:50   ` Alexander Potapenko
2020-10-01 19:48     ` Alexander Popov
2020-12-03 19:50     ` Alexander Popov
2020-12-03 20:49       ` Andrew Morton
2020-12-04 11:54         ` Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 3/6] mm: Integrate SLAB_QUARANTINE with init_on_free Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 4/6] mm: Implement slab quarantine randomization Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 5/6] lkdtm: Add heap quarantine tests Alexander Popov
2020-09-29 18:35 ` [PATCH RFC v2 6/6] mm: Add heap quarantine verbose debugging (not for merge) Alexander Popov
2020-10-01 19:42 ` [PATCH RFC v2 0/6] Break heap spraying needed for exploiting use-after-free Alexander Popov
2020-10-05 22:56   ` Jann Horn
2020-10-06  0:44     ` Matthew Wilcox
2020-10-06  0:48       ` Jann Horn
2020-10-06  2:09       ` Kees Cook
2020-10-06  2:16         ` Jann Horn
2020-10-06  2:19         ` Daniel Micay
2020-10-06  8:35         ` Christopher Lameter
2020-10-06  8:32       ` Christopher Lameter
2020-10-06 17:56     ` Alexander Popov
2020-10-06 18:37       ` Jann Horn
2020-10-06 19:25         ` Alexander Popov
