* [PATCH v4 0/7] SLAB support for KASAN
@ 2016-02-26 16:48 Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 1/7] kasan: Modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() Alexander Potapenko
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

This patch set implements SLAB support for KASAN.

Unlike SLUB, SLAB doesn't store allocation/deallocation stacks for heap
objects, so we reimplement this feature in lib/stackdepot.c. The intention
is to ultimately switch SLUB to this implementation as well, which will
save a lot of memory (right now SLUB bloats each object by 256 bytes to
store the allocation/deallocation stacks).

Also, neither SLUB nor SLAB delays the reuse of freed memory chunks,
although such a delay is necessary for better detection of use-after-free
errors. We introduce a memory quarantine (mm/kasan/quarantine.c), which
allows delayed reuse of deallocated memory.
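
The idea, roughly (a minimal illustrative sketch; the qlink/quarantine_*
names are hypothetical, not the actual mm/kasan/quarantine.c interface):
freed objects are parked on a FIFO instead of being handed back to the
allocator, and the oldest entries are only really freed once the
quarantine grows past a size limit.

struct qlink {
	struct qlink *next;
	size_t size;
};

static struct qlink *quar_head, *quar_tail;
static size_t quar_bytes;
#define QUARANTINE_MAX_BYTES	(1UL << 20)	/* illustrative limit */

/* Instead of freeing 'obj' right away, reuse its memory as a FIFO link
 * (assumes the object is at least sizeof(struct qlink) bytes).
 */
static void quarantine_put(void *obj, size_t size)
{
	struct qlink *link = obj;

	link->next = NULL;
	link->size = size;
	if (quar_tail)
		quar_tail->next = link;
	else
		quar_head = link;
	quar_tail = link;
	quar_bytes += size;
}

/* Really free the oldest entries once the quarantine grows too big. */
static void quarantine_reduce(void (*really_free)(void *))
{
	while (quar_bytes > QUARANTINE_MAX_BYTES && quar_head) {
		struct qlink *link = quar_head;

		quar_head = link->next;
		if (!quar_head)
			quar_tail = NULL;
		quar_bytes -= link->size;
		really_free(link);
	}
}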

Alexander Potapenko (7):
  kasan: Modify kmalloc_large_oob_right(), add
    kmalloc_pagealloc_oob_right()
  mm, kasan: SLAB support
  mm, kasan: Added GFP flags to KASAN API
  arch, ftrace: For KASAN put hard/soft IRQ entries into separate
    sections
  mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2
  mm: kasan: Initial memory quarantine implementation
---
v2: - merged two patches that touched kmalloc_large_oob_right
    - moved stackdepot implementation to lib/
    - moved IRQ definitions to include/linux/interrupt.h

v3: - minor description changes
    - store deallocation info in the "mm, kasan: SLAB support" patch

v4: - fix kbuild error reports
---
 Documentation/kasan.txt              |   5 +-
 arch/arm/kernel/vmlinux.lds.S        |   1 +
 arch/arm64/kernel/vmlinux.lds.S      |   1 +
 arch/blackfin/kernel/vmlinux.lds.S   |   1 +
 arch/c6x/kernel/vmlinux.lds.S        |   1 +
 arch/metag/kernel/vmlinux.lds.S      |   1 +
 arch/microblaze/kernel/vmlinux.lds.S |   1 +
 arch/mips/kernel/vmlinux.lds.S       |   1 +
 arch/nios2/kernel/vmlinux.lds.S      |   1 +
 arch/openrisc/kernel/vmlinux.lds.S   |   1 +
 arch/parisc/kernel/vmlinux.lds.S     |   1 +
 arch/powerpc/kernel/vmlinux.lds.S    |   1 +
 arch/s390/kernel/vmlinux.lds.S       |   1 +
 arch/sh/kernel/vmlinux.lds.S         |   1 +
 arch/sparc/kernel/vmlinux.lds.S      |   1 +
 arch/tile/kernel/vmlinux.lds.S       |   1 +
 arch/x86/kernel/Makefile             |   1 +
 arch/x86/kernel/vmlinux.lds.S        |   1 +
 include/asm-generic/vmlinux.lds.h    |  12 +-
 include/linux/ftrace.h               |  11 --
 include/linux/interrupt.h            |  20 +++
 include/linux/kasan.h                |  63 ++++++--
 include/linux/slab.h                 |  10 +-
 include/linux/slab_def.h             |  14 ++
 include/linux/slub_def.h             |  11 ++
 include/linux/stackdepot.h           |  32 ++++
 kernel/softirq.c                     |   2 +-
 kernel/trace/trace_functions_graph.c |   1 +
 lib/Kconfig.kasan                    |   4 +-
 lib/Makefile                         |   7 +
 lib/stackdepot.c                     | 274 +++++++++++++++++++++++++++++++
 lib/test_kasan.c                     |  59 ++++++-
 mm/Makefile                          |   1 +
 mm/kasan/Makefile                    |   4 +
 mm/kasan/kasan.c                     | 221 +++++++++++++++++++++++--
 mm/kasan/kasan.h                     |  45 ++++++
 mm/kasan/quarantine.c                | 306 +++++++++++++++++++++++++++++++++++
 mm/kasan/report.c                    |  71 ++++++--
 mm/mempool.c                         |  23 +--
 mm/page_alloc.c                      |   2 +-
 mm/slab.c                            |  58 ++++++-
 mm/slab.h                            |   2 +
 mm/slab_common.c                     |   8 +-
 mm/slub.c                            |  21 +--
 44 files changed, 1214 insertions(+), 90 deletions(-)
 create mode 100644 include/linux/stackdepot.h
 create mode 100644 lib/stackdepot.c
 create mode 100644 mm/kasan/quarantine.c

-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v4 1/7] kasan: Modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 2/7] mm, kasan: SLAB support Alexander Potapenko
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Rename kmalloc_large_oob_right() to kmalloc_pagealloc_oob_right(), as that
test only checks the page allocator functionality.
Also reimplement kmalloc_large_oob_right() so that the test allocates a
chunk of memory that is large enough, yet still does not trigger the page
allocator fallback.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v2: - Merged "kasan: Change the behavior of kmalloc_large_oob_right" and
  "kasan: Changed kmalloc_large_oob_right, added kmalloc_pagealloc_oob_right"
  from v1

v3: - Minor description changes
---
 lib/test_kasan.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index c32f3b0..90ad74f 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -65,11 +65,34 @@ static noinline void __init kmalloc_node_oob_right(void)
 	kfree(ptr);
 }
 
-static noinline void __init kmalloc_large_oob_right(void)
+#ifdef CONFIG_SLUB
+static noinline void __init kmalloc_pagealloc_oob_right(void)
 {
 	char *ptr;
 	size_t size = KMALLOC_MAX_CACHE_SIZE + 10;
 
+	/* Allocate a chunk that does not fit into a SLUB cache to trigger
+	 * the page allocator fallback.
+	 */
+	pr_info("kmalloc pagealloc allocation: out-of-bounds to right\n");
+	ptr = kmalloc(size, GFP_KERNEL);
+	if (!ptr) {
+		pr_err("Allocation failed\n");
+		return;
+	}
+
+	ptr[size] = 0;
+	kfree(ptr);
+}
+#endif
+
+static noinline void __init kmalloc_large_oob_right(void)
+{
+	char *ptr;
+	size_t size = KMALLOC_MAX_CACHE_SIZE - 256;
+	/* Allocate a chunk that is large enough, but still fits into a slab
+	 * and does not trigger the page allocator fallback in SLUB.
+	 */
 	pr_info("kmalloc large allocation: out-of-bounds to right\n");
 	ptr = kmalloc(size, GFP_KERNEL);
 	if (!ptr) {
@@ -324,6 +347,9 @@ static int __init kmalloc_tests_init(void)
 	kmalloc_oob_right();
 	kmalloc_oob_left();
 	kmalloc_node_oob_right();
+#ifdef CONFIG_SLUB
+	kmalloc_pagealloc_oob_right();
+#endif
 	kmalloc_large_oob_right();
 	kmalloc_oob_krealloc_more();
 	kmalloc_oob_krealloc_less();
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v4 2/7] mm, kasan: SLAB support
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 1/7] kasan: Modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-29 15:10   ` Andrey Ryabinin
  2016-02-26 16:48 ` [PATCH v4 3/7] mm, kasan: Added GFP flags to KASAN API Alexander Potapenko
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Add KASAN hooks to SLAB allocator.

This patch is based on the "mm: kasan: unified support for SLUB and
SLAB allocators" patch originally prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v3: - minor description changes
    - store deallocation info in kasan_slab_free()

v4: - fix kbuild compile-time warnings in print_track()
---
 Documentation/kasan.txt  |   5 +--
 include/linux/kasan.h    |  12 ++++++
 include/linux/slab.h     |   6 +++
 include/linux/slab_def.h |  14 +++++++
 include/linux/slub_def.h |  11 +++++
 lib/Kconfig.kasan        |   4 +-
 mm/Makefile              |   1 +
 mm/kasan/kasan.c         | 102 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/kasan/kasan.h         |  34 ++++++++++++++++
 mm/kasan/report.c        |  61 +++++++++++++++++++++++-----
 mm/slab.c                |  46 ++++++++++++++++++---
 mm/slab_common.c         |   2 +-
 12 files changed, 277 insertions(+), 21 deletions(-)

diff --git a/Documentation/kasan.txt b/Documentation/kasan.txt
index aa1e0c9..7dd95b3 100644
--- a/Documentation/kasan.txt
+++ b/Documentation/kasan.txt
@@ -12,8 +12,7 @@ KASAN uses compile-time instrumentation for checking every memory access,
 therefore you will need a GCC version 4.9.2 or later. GCC 5.0 or later is
 required for detection of out-of-bounds accesses to stack or global variables.
 
-Currently KASAN is supported only for x86_64 architecture and requires the
-kernel to be built with the SLUB allocator.
+Currently KASAN is supported only for x86_64 architecture.
 
 1. Usage
 ========
@@ -27,7 +26,7 @@ inline are compiler instrumentation types. The former produces smaller binary
 the latter is 1.1 - 2 times faster. Inline instrumentation requires a GCC
 version 5.0 or later.
 
-Currently KASAN works only with the SLUB memory allocator.
+KASAN works with both SLUB and SLAB memory allocators.
 For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.
 
 To disable instrumentation for specific files or directories, add a line
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 4b9f85c..4405a35 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -46,6 +46,9 @@ void kasan_unpoison_shadow(const void *address, size_t size);
 void kasan_alloc_pages(struct page *page, unsigned int order);
 void kasan_free_pages(struct page *page, unsigned int order);
 
+void kasan_cache_create(struct kmem_cache *cache, size_t *size,
+			unsigned long *flags);
+
 void kasan_poison_slab(struct page *page);
 void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
 void kasan_poison_object_data(struct kmem_cache *cache, void *object);
@@ -59,6 +62,11 @@ void kasan_krealloc(const void *object, size_t new_size);
 void kasan_slab_alloc(struct kmem_cache *s, void *object);
 void kasan_slab_free(struct kmem_cache *s, void *object);
 
+struct kasan_cache {
+	int alloc_meta_offset;
+	int free_meta_offset;
+};
+
 int kasan_module_alloc(void *addr, size_t size);
 void kasan_free_shadow(const struct vm_struct *vm);
 
@@ -72,6 +80,10 @@ static inline void kasan_disable_current(void) {}
 static inline void kasan_alloc_pages(struct page *page, unsigned int order) {}
 static inline void kasan_free_pages(struct page *page, unsigned int order) {}
 
+static inline void kasan_cache_create(struct kmem_cache *cache,
+				      size_t *size,
+				      unsigned long *flags) {}
+
 static inline void kasan_poison_slab(struct page *page) {}
 static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
 					void *object) {}
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3627d5c..840e652 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -92,6 +92,12 @@
 # define SLAB_ACCOUNT		0x00000000UL
 #endif
 
+#ifdef CONFIG_KASAN
+#define SLAB_KASAN		0x08000000UL
+#else
+#define SLAB_KASAN		0x00000000UL
+#endif
+
 /* The following flags affect the page allocator grouping pages by mobility */
 #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
 #define SLAB_TEMPORARY		SLAB_RECLAIM_ACCOUNT	/* Objects are short-lived */
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index cf139d3..f57232c 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -73,8 +73,22 @@ struct kmem_cache {
 #ifdef CONFIG_MEMCG
 	struct memcg_cache_params memcg_params;
 #endif
+#ifdef CONFIG_KASAN
+	struct kasan_cache kasan_info;
+#endif
 
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };
 
+static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+				void *x) {
+	void *object = x - (x - page->s_mem) % cache->size;
+	void *last_object = page->s_mem + (cache->num - 1) * cache->size;
+
+	if (unlikely(object > last_object))
+		return last_object;
+	else
+		return object;
+}
+
 #endif	/* _LINUX_SLAB_DEF_H */
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index b7e57927..c6970a0 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -129,4 +129,15 @@ static inline void *virt_to_obj(struct kmem_cache *s,
 void object_err(struct kmem_cache *s, struct page *page,
 		u8 *object, char *reason);
 
+static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+				void *x) {
+	void *object = x - (x - page_address(page)) % cache->size;
+	void *last_object = page_address(page) +
+		(page->objects - 1) * cache->size;
+	if (unlikely(object > last_object))
+		return last_object;
+	else
+		return object;
+}
+
 #endif /* _LINUX_SLUB_DEF_H */
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index 0fee5ac..0e4d2b3 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -5,7 +5,7 @@ if HAVE_ARCH_KASAN
 
 config KASAN
 	bool "KASan: runtime memory debugger"
-	depends on SLUB_DEBUG
+	depends on SLUB_DEBUG || (SLAB && !DEBUG_SLAB)
 	select CONSTRUCTORS
 	help
 	  Enables kernel address sanitizer - runtime memory debugger,
@@ -16,6 +16,8 @@ config KASAN
 	  This feature consumes about 1/8 of available memory and brings about
 	  ~x3 performance slowdown.
 	  For better error detection enable CONFIG_STACKTRACE.
+	  Currently CONFIG_KASAN doesn't work with CONFIG_DEBUG_SLAB
+	  (the resulting kernel does not boot).
 
 choice
 	prompt "Instrumentation type"
diff --git a/mm/Makefile b/mm/Makefile
index 2ed4319..d675b37 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -3,6 +3,7 @@
 #
 
 KASAN_SANITIZE_slab_common.o := n
+KASAN_SANITIZE_slab.o := n
 KASAN_SANITIZE_slub.o := n
 
 mmu-y			:= nommu.o
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index bc0a8d8..d26ffb4 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -314,6 +314,59 @@ void kasan_free_pages(struct page *page, unsigned int order)
 				KASAN_FREE_PAGE);
 }
 
+#ifdef CONFIG_SLAB
+/*
+ * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
+ * For larger allocations larger redzones are used.
+ */
+static size_t optimal_redzone(size_t object_size)
+{
+	int rz =
+		object_size <= 64        - 16   ? 16 :
+		object_size <= 128       - 32   ? 32 :
+		object_size <= 512       - 64   ? 64 :
+		object_size <= 4096      - 128  ? 128 :
+		object_size <= (1 << 14) - 256  ? 256 :
+		object_size <= (1 << 15) - 512  ? 512 :
+		object_size <= (1 << 16) - 1024 ? 1024 : 2048;
+	return rz;
+}
+
+void kasan_cache_create(struct kmem_cache *cache, size_t *size,
+			unsigned long *flags)
+{
+	int redzone_adjust;
+	/* Make sure the adjusted size is still less than
+	 * KMALLOC_MAX_CACHE_SIZE.
+	 * TODO: this check is only useful for SLAB, but not SLUB. We'll need
+	 * to skip it for SLUB when it starts using kasan_cache_create().
+	 */
+	if (*size > KMALLOC_MAX_CACHE_SIZE -
+	    sizeof(struct kasan_alloc_meta) -
+	    sizeof(struct kasan_free_meta))
+		return;
+	*flags |= SLAB_KASAN;
+	/* Add alloc meta. */
+	cache->kasan_info.alloc_meta_offset = *size;
+	*size += sizeof(struct kasan_alloc_meta);
+
+	/* Add free meta. */
+	if (cache->flags & SLAB_DESTROY_BY_RCU || cache->ctor ||
+	    cache->object_size < sizeof(struct kasan_free_meta)) {
+		cache->kasan_info.free_meta_offset = *size;
+		*size += sizeof(struct kasan_free_meta);
+	}
+	redzone_adjust = optimal_redzone(cache->object_size) -
+		(*size - cache->object_size);
+	if (redzone_adjust > 0)
+		*size += redzone_adjust;
+	*size = min(KMALLOC_MAX_CACHE_SIZE,
+		    max(*size,
+			cache->object_size +
+			optimal_redzone(cache->object_size)));
+}
+#endif
+
 void kasan_poison_slab(struct page *page)
 {
 	kasan_poison_shadow(page_address(page),
@@ -331,8 +384,36 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
 	kasan_poison_shadow(object,
 			round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
 			KASAN_KMALLOC_REDZONE);
+#ifdef CONFIG_SLAB
+	if (cache->flags & SLAB_KASAN) {
+		struct kasan_alloc_meta *alloc_info =
+			get_alloc_info(cache, object);
+		alloc_info->state = KASAN_STATE_INIT;
+	}
+#endif
+}
+
+static inline void set_track(struct kasan_track *track)
+{
+	track->cpu = raw_smp_processor_id();
+	track->pid = current->pid;
+	track->when = jiffies;
 }
 
+#ifdef CONFIG_SLAB
+struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
+					const void *object)
+{
+	return (void *)object + cache->kasan_info.alloc_meta_offset;
+}
+
+struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
+				      const void *object)
+{
+	return (void *)object + cache->kasan_info.free_meta_offset;
+}
+#endif
+
 void kasan_slab_alloc(struct kmem_cache *cache, void *object)
 {
 	kasan_kmalloc(cache, object, cache->object_size);
@@ -347,6 +428,17 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
 	if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
 		return;
 
+#ifdef CONFIG_SLAB
+	if (cache->flags & SLAB_KASAN) {
+		struct kasan_free_meta *free_info =
+			get_free_info(cache, object);
+		struct kasan_alloc_meta *alloc_info =
+			get_alloc_info(cache, object);
+		alloc_info->state = KASAN_STATE_FREE;
+		set_track(&free_info->track);
+	}
+#endif
+
 	kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
 }
 
@@ -366,6 +458,16 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
 	kasan_unpoison_shadow(object, size);
 	kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
 		KASAN_KMALLOC_REDZONE);
+#ifdef CONFIG_SLAB
+	if (cache->flags & SLAB_KASAN) {
+		struct kasan_alloc_meta *alloc_info =
+			get_alloc_info(cache, object);
+
+		alloc_info->state = KASAN_STATE_ALLOC;
+		alloc_info->alloc_size = size;
+		set_track(&alloc_info->track);
+	}
+#endif
 }
 EXPORT_SYMBOL(kasan_kmalloc);
 
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 4f6c62e..7b9e4ab9 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -54,6 +54,40 @@ struct kasan_global {
 #endif
 };
 
+/*
+ * Structures to keep alloc and free tracks.
+ */
+
+enum kasan_state {
+	KASAN_STATE_INIT,
+	KASAN_STATE_ALLOC,
+	KASAN_STATE_FREE
+};
+
+struct kasan_track {
+	u64 cpu : 6;			/* for NR_CPUS = 64 */
+	u64 pid : 16;			/* 65536 processes */
+	u64 when : 42;			/* ~140 years */
+};
+
+struct kasan_alloc_meta {
+	u32 state : 2;	/* enum kasan_state */
+	u32 alloc_size : 30;
+	struct kasan_track track;
+};
+
+struct kasan_free_meta {
+	/* Allocator freelist pointer, unused by KASAN. */
+	void **freelist;
+	struct kasan_track track;
+};
+
+struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
+					const void *object);
+struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
+					const void *object);
+
+
 static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
 {
 	return (void *)(((unsigned long)shadow_addr - KASAN_SHADOW_OFFSET)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 12f222d..2c1407f 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -115,6 +115,44 @@ static inline bool init_task_stack_addr(const void *addr)
 			sizeof(init_thread_union.stack));
 }
 
+#ifdef CONFIG_SLAB
+static void print_track(struct kasan_track *track)
+{
+	pr_err("PID = %u, CPU = %u, timestamp = %lu\n", track->pid,
+	       track->cpu, (unsigned long)track->when);
+}
+
+static void print_object(struct kmem_cache *cache, void *object)
+{
+	struct kasan_alloc_meta *alloc_info = get_alloc_info(cache, object);
+	struct kasan_free_meta *free_info;
+
+	pr_err("Object at %p, in cache %s\n", object, cache->name);
+	if (!(cache->flags & SLAB_KASAN))
+		return;
+	switch (alloc_info->state) {
+	case KASAN_STATE_INIT:
+		pr_err("Object not allocated yet\n");
+		break;
+	case KASAN_STATE_ALLOC:
+		pr_err("Object allocated with size %u bytes.\n",
+		       alloc_info->alloc_size);
+		pr_err("Allocation:\n");
+		print_track(&alloc_info->track);
+		break;
+	case KASAN_STATE_FREE:
+		pr_err("Object freed, allocated with size %u bytes\n",
+		       alloc_info->alloc_size);
+		free_info = get_free_info(cache, object);
+		pr_err("Allocation:\n");
+		print_track(&alloc_info->track);
+		pr_err("Deallocation:\n");
+		print_track(&free_info->track);
+		break;
+	}
+}
+#endif
+
 static void print_address_description(struct kasan_access_info *info)
 {
 	const void *addr = info->access_addr;
@@ -126,17 +164,14 @@ static void print_address_description(struct kasan_access_info *info)
 		if (PageSlab(page)) {
 			void *object;
 			struct kmem_cache *cache = page->slab_cache;
-			void *last_object;
-
-			object = virt_to_obj(cache, page_address(page), addr);
-			last_object = page_address(page) +
-				page->objects * cache->size;
-
-			if (unlikely(object > last_object))
-				object = last_object; /* we hit into padding */
-
+			object = nearest_obj(cache, page,
+						(void *)info->access_addr);
+#ifdef CONFIG_SLAB
+			print_object(cache, object);
+#else
 			object_err(cache, page, object,
-				"kasan: bad access detected");
+					"kasan: bad access detected");
+#endif
 			return;
 		}
 		dump_page(page, "kasan: bad access detected");
@@ -146,8 +181,9 @@ static void print_address_description(struct kasan_access_info *info)
 		if (!init_task_stack_addr(addr))
 			pr_err("Address belongs to variable %pS\n", addr);
 	}
-
+#ifdef CONFIG_SLUB
 	dump_stack();
+#endif
 }
 
 static bool row_is_guilty(const void *row, const void *guilty)
@@ -233,6 +269,9 @@ static void kasan_report_error(struct kasan_access_info *info)
 		dump_stack();
 	} else {
 		print_error_description(info);
+#ifdef CONFIG_SLAB
+		dump_stack();
+#endif
 		print_address_description(info);
 		print_shadow_for_address(info->first_bad_addr);
 	}
diff --git a/mm/slab.c b/mm/slab.c
index 621fbcb..805b39b 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2196,6 +2196,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 #endif
 #endif
 
+	kasan_cache_create(cachep, &size, &flags);
 	/*
 	 * Determine if the slab management is 'on' or 'off' slab.
 	 * (bootstrapping cannot cope with offslab caches so don't do
@@ -2503,8 +2504,13 @@ static void cache_init_objs(struct kmem_cache *cachep,
 		 * cache which they are a constructor for.  Otherwise, deadlock.
 		 * They must also be threaded.
 		 */
-		if (cachep->ctor && !(cachep->flags & SLAB_POISON))
+		if (cachep->ctor && !(cachep->flags & SLAB_POISON)) {
+			kasan_unpoison_object_data(cachep,
+						   objp + obj_offset(cachep));
 			cachep->ctor(objp + obj_offset(cachep));
+			kasan_poison_object_data(
+				cachep, objp + obj_offset(cachep));
+		}
 
 		if (cachep->flags & SLAB_RED_ZONE) {
 			if (*dbg_redzone2(cachep, objp) != RED_INACTIVE)
@@ -2519,8 +2525,11 @@ static void cache_init_objs(struct kmem_cache *cachep,
 			kernel_map_pages(virt_to_page(objp),
 					 cachep->size / PAGE_SIZE, 0);
 #else
-		if (cachep->ctor)
+		if (cachep->ctor) {
+			kasan_unpoison_object_data(cachep, objp);
 			cachep->ctor(objp);
+			kasan_poison_object_data(cachep, objp);
+		}
 #endif
 		set_obj_status(page, i, OBJECT_FREE);
 		set_free_obj(page, i, i);
@@ -2650,6 +2659,7 @@ static int cache_grow(struct kmem_cache *cachep,
 
 	slab_map_pages(cachep, page, freelist);
 
+	kasan_poison_slab(page);
 	cache_init_objs(cachep, page);
 
 	if (gfpflags_allow_blocking(local_flags))
@@ -3364,7 +3374,10 @@ free_done:
 static inline void __cache_free(struct kmem_cache *cachep, void *objp,
 				unsigned long caller)
 {
-	struct array_cache *ac = cpu_cache_get(cachep);
+	struct array_cache *ac;
+
+	kasan_slab_free(cachep, objp);
+	ac = cpu_cache_get(cachep);
 
 	check_irq_off();
 	kmemleak_free_recursive(objp, cachep->flags);
@@ -3403,6 +3416,8 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,
 void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 {
 	void *ret = slab_alloc(cachep, flags, _RET_IP_);
+	if (ret)
+		kasan_slab_alloc(cachep, ret);
 
 	trace_kmem_cache_alloc(_RET_IP_, ret,
 			       cachep->object_size, cachep->size, flags);
@@ -3432,6 +3447,8 @@ kmem_cache_alloc_trace(struct kmem_cache *cachep, gfp_t flags, size_t size)
 
 	ret = slab_alloc(cachep, flags, _RET_IP_);
 
+	if (ret)
+		kasan_kmalloc(cachep, ret, size);
 	trace_kmalloc(_RET_IP_, ret,
 		      size, cachep->size, flags);
 	return ret;
@@ -3455,6 +3472,8 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 {
 	void *ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
 
+	if (ret)
+		kasan_slab_alloc(cachep, ret);
 	trace_kmem_cache_alloc_node(_RET_IP_, ret,
 				    cachep->object_size, cachep->size,
 				    flags, nodeid);
@@ -3473,6 +3492,8 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
 
 	ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
 
+	if (ret)
+		kasan_kmalloc(cachep, ret, size);
 	trace_kmalloc_node(_RET_IP_, ret,
 			   size, cachep->size,
 			   flags, nodeid);
@@ -3485,11 +3506,16 @@ static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
 {
 	struct kmem_cache *cachep;
+	void *ret;
 
 	cachep = kmalloc_slab(size, flags);
 	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
 		return cachep;
-	return kmem_cache_alloc_node_trace(cachep, flags, node, size);
+	ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
+	if (ret)
+		kasan_kmalloc(cachep, ret, size);
+
+	return ret;
 }
 
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
@@ -3523,6 +3549,8 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 		return cachep;
 	ret = slab_alloc(cachep, flags, caller);
 
+	if (ret)
+		kasan_kmalloc(cachep, ret, size);
 	trace_kmalloc(caller, ret,
 		      size, cachep->size, flags);
 
@@ -4240,10 +4268,18 @@ module_init(slab_proc_init);
  */
 size_t ksize(const void *objp)
 {
+	size_t size;
+
 	BUG_ON(!objp);
 	if (unlikely(objp == ZERO_SIZE_PTR))
 		return 0;
 
-	return virt_to_cache(objp)->object_size;
+	size = virt_to_cache(objp)->object_size;
+	/* We assume that ksize callers could use whole allocated area,
+	 * so we need to unpoison this area.
+	 */
+	kasan_krealloc(objp, size);
+
+	return size;
 }
 EXPORT_SYMBOL(ksize);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 065b7bd..bf04ec7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -35,7 +35,7 @@ struct kmem_cache *kmem_cache;
  */
 #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
-		SLAB_FAILSLAB)
+		SLAB_FAILSLAB | SLAB_KASAN)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
 			 SLAB_NOTRACK | SLAB_ACCOUNT)
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v4 3/7] mm, kasan: Added GFP flags to KASAN API
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 1/7] kasan: Modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 2/7] mm, kasan: SLAB support Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-26 16:48 ` [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections Alexander Potapenko
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Add GFP flags to KASAN hooks for future patches to use.

This patch is based on the "mm: kasan: unified support for SLUB and
SLAB allocators" patch originally prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v4: - fix kbuild compilation error (missing parameter for kasan_kmalloc())
---
 include/linux/kasan.h | 19 +++++++++++--------
 include/linux/slab.h  |  4 ++--
 mm/kasan/kasan.c      | 15 ++++++++-------
 mm/mempool.c          | 16 ++++++++--------
 mm/slab.c             | 14 +++++++-------
 mm/slab_common.c      |  4 ++--
 mm/slub.c             | 17 +++++++++--------
 7 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 4405a35..bf71ab0 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -53,13 +53,14 @@ void kasan_poison_slab(struct page *page);
 void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
 void kasan_poison_object_data(struct kmem_cache *cache, void *object);
 
-void kasan_kmalloc_large(const void *ptr, size_t size);
+void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags);
 void kasan_kfree_large(const void *ptr);
 void kasan_kfree(void *ptr);
-void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size);
-void kasan_krealloc(const void *object, size_t new_size);
+void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size,
+		  gfp_t flags);
+void kasan_krealloc(const void *object, size_t new_size, gfp_t flags);
 
-void kasan_slab_alloc(struct kmem_cache *s, void *object);
+void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags);
 void kasan_slab_free(struct kmem_cache *s, void *object);
 
 struct kasan_cache {
@@ -90,14 +91,16 @@ static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
 static inline void kasan_poison_object_data(struct kmem_cache *cache,
 					void *object) {}
 
-static inline void kasan_kmalloc_large(void *ptr, size_t size) {}
+static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {}
 static inline void kasan_kfree_large(const void *ptr) {}
 static inline void kasan_kfree(void *ptr) {}
 static inline void kasan_kmalloc(struct kmem_cache *s, const void *object,
-				size_t size) {}
-static inline void kasan_krealloc(const void *object, size_t new_size) {}
+				size_t size, gfp_t flags) {}
+static inline void kasan_krealloc(const void *object, size_t new_size,
+				 gfp_t flags) {}
 
-static inline void kasan_slab_alloc(struct kmem_cache *s, void *object) {}
+static inline void kasan_slab_alloc(struct kmem_cache *s, void *object,
+				   gfp_t flags) {}
 static inline void kasan_slab_free(struct kmem_cache *s, void *object) {}
 
 static inline int kasan_module_alloc(void *addr, size_t size) { return 0; }
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 840e652..b873343 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -367,7 +367,7 @@ static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s,
 {
 	void *ret = kmem_cache_alloc(s, flags);
 
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, flags);
 	return ret;
 }
 
@@ -378,7 +378,7 @@ kmem_cache_alloc_node_trace(struct kmem_cache *s,
 {
 	void *ret = kmem_cache_alloc_node(s, gfpflags, node);
 
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, gfpflags);
 	return ret;
 }
 #endif /* CONFIG_TRACING */
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index d26ffb4..95b2267 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -414,9 +414,9 @@ struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
 }
 #endif
 
-void kasan_slab_alloc(struct kmem_cache *cache, void *object)
+void kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
 {
-	kasan_kmalloc(cache, object, cache->object_size);
+	kasan_kmalloc(cache, object, cache->object_size, flags);
 }
 
 void kasan_slab_free(struct kmem_cache *cache, void *object)
@@ -442,7 +442,8 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
 	kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
 }
 
-void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
+void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
+		   gfp_t flags)
 {
 	unsigned long redzone_start;
 	unsigned long redzone_end;
@@ -471,7 +472,7 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
 }
 EXPORT_SYMBOL(kasan_kmalloc);
 
-void kasan_kmalloc_large(const void *ptr, size_t size)
+void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
 {
 	struct page *page;
 	unsigned long redzone_start;
@@ -490,7 +491,7 @@ void kasan_kmalloc_large(const void *ptr, size_t size)
 		KASAN_PAGE_REDZONE);
 }
 
-void kasan_krealloc(const void *object, size_t size)
+void kasan_krealloc(const void *object, size_t size, gfp_t flags)
 {
 	struct page *page;
 
@@ -500,9 +501,9 @@ void kasan_krealloc(const void *object, size_t size)
 	page = virt_to_head_page(object);
 
 	if (unlikely(!PageSlab(page)))
-		kasan_kmalloc_large(object, size);
+		kasan_kmalloc_large(object, size, flags);
 	else
-		kasan_kmalloc(page->slab_cache, object, size);
+		kasan_kmalloc(page->slab_cache, object, size, flags);
 }
 
 void kasan_kfree(void *ptr)
diff --git a/mm/mempool.c b/mm/mempool.c
index 004d42b..b47c8a7 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -112,12 +112,12 @@ static void kasan_poison_element(mempool_t *pool, void *element)
 		kasan_free_pages(element, (unsigned long)pool->pool_data);
 }
 
-static void kasan_unpoison_element(mempool_t *pool, void *element)
+static void kasan_unpoison_element(mempool_t *pool, void *element, gfp_t flags)
 {
 	if (pool->alloc == mempool_alloc_slab)
-		kasan_slab_alloc(pool->pool_data, element);
+		kasan_slab_alloc(pool->pool_data, element, flags);
 	if (pool->alloc == mempool_kmalloc)
-		kasan_krealloc(element, (size_t)pool->pool_data);
+		kasan_krealloc(element, (size_t)pool->pool_data, flags);
 	if (pool->alloc == mempool_alloc_pages)
 		kasan_alloc_pages(element, (unsigned long)pool->pool_data);
 }
@@ -130,13 +130,13 @@ static void add_element(mempool_t *pool, void *element)
 	pool->elements[pool->curr_nr++] = element;
 }
 
-static void *remove_element(mempool_t *pool)
+static void *remove_element(mempool_t *pool, gfp_t flags)
 {
 	void *element = pool->elements[--pool->curr_nr];
 
 	BUG_ON(pool->curr_nr < 0);
 	check_element(pool, element);
-	kasan_unpoison_element(pool, element);
+	kasan_unpoison_element(pool, element, flags);
 	return element;
 }
 
@@ -154,7 +154,7 @@ void mempool_destroy(mempool_t *pool)
 		return;
 
 	while (pool->curr_nr) {
-		void *element = remove_element(pool);
+		void *element = remove_element(pool, GFP_KERNEL);
 		pool->free(element, pool->pool_data);
 	}
 	kfree(pool->elements);
@@ -250,7 +250,7 @@ int mempool_resize(mempool_t *pool, int new_min_nr)
 	spin_lock_irqsave(&pool->lock, flags);
 	if (new_min_nr <= pool->min_nr) {
 		while (new_min_nr < pool->curr_nr) {
-			element = remove_element(pool);
+			element = remove_element(pool, GFP_KERNEL);
 			spin_unlock_irqrestore(&pool->lock, flags);
 			pool->free(element, pool->pool_data);
 			spin_lock_irqsave(&pool->lock, flags);
@@ -336,7 +336,7 @@ repeat_alloc:
 
 	spin_lock_irqsave(&pool->lock, flags);
 	if (likely(pool->curr_nr)) {
-		element = remove_element(pool);
+		element = remove_element(pool, gfp_temp);
 		spin_unlock_irqrestore(&pool->lock, flags);
 		/* paired with rmb in mempool_free(), read comment there */
 		smp_wmb();
diff --git a/mm/slab.c b/mm/slab.c
index 805b39b..52a7a8d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3417,7 +3417,7 @@ void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 {
 	void *ret = slab_alloc(cachep, flags, _RET_IP_);
 	if (ret)
-		kasan_slab_alloc(cachep, ret);
+		kasan_slab_alloc(cachep, ret, flags);
 
 	trace_kmem_cache_alloc(_RET_IP_, ret,
 			       cachep->object_size, cachep->size, flags);
@@ -3448,7 +3448,7 @@ kmem_cache_alloc_trace(struct kmem_cache *cachep, gfp_t flags, size_t size)
 	ret = slab_alloc(cachep, flags, _RET_IP_);
 
 	if (ret)
-		kasan_kmalloc(cachep, ret, size);
+		kasan_kmalloc(cachep, ret, size, flags);
 	trace_kmalloc(_RET_IP_, ret,
 		      size, cachep->size, flags);
 	return ret;
@@ -3473,7 +3473,7 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 	void *ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
 
 	if (ret)
-		kasan_slab_alloc(cachep, ret);
+		kasan_slab_alloc(cachep, ret, flags);
 	trace_kmem_cache_alloc_node(_RET_IP_, ret,
 				    cachep->object_size, cachep->size,
 				    flags, nodeid);
@@ -3493,7 +3493,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *cachep,
 	ret = slab_alloc_node(cachep, flags, nodeid, _RET_IP_);
 
 	if (ret)
-		kasan_kmalloc(cachep, ret, size);
+		kasan_kmalloc(cachep, ret, size, flags);
 	trace_kmalloc_node(_RET_IP_, ret,
 			   size, cachep->size,
 			   flags, nodeid);
@@ -3513,7 +3513,7 @@ __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
 		return cachep;
 	ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
 	if (ret)
-		kasan_kmalloc(cachep, ret, size);
+		kasan_kmalloc(cachep, ret, size, flags);
 
 	return ret;
 }
@@ -3550,7 +3550,7 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 	ret = slab_alloc(cachep, flags, caller);
 
 	if (ret)
-		kasan_kmalloc(cachep, ret, size);
+		kasan_kmalloc(cachep, ret, size, flags);
 	trace_kmalloc(caller, ret,
 		      size, cachep->size, flags);
 
@@ -4278,7 +4278,7 @@ size_t ksize(const void *objp)
 	/* We assume that ksize callers could use whole allocated area,
 	 * so we need to unpoison this area.
 	 */
-	kasan_krealloc(objp, size);
+	kasan_krealloc(objp, size, GFP_NOWAIT);
 
 	return size;
 }
diff --git a/mm/slab_common.c b/mm/slab_common.c
index bf04ec7..538f616 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1009,7 +1009,7 @@ void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
 	page = alloc_kmem_pages(flags, order);
 	ret = page ? page_address(page) : NULL;
 	kmemleak_alloc(ret, size, 1, flags);
-	kasan_kmalloc_large(ret, size);
+	kasan_kmalloc_large(ret, size, flags);
 	return ret;
 }
 EXPORT_SYMBOL(kmalloc_order);
@@ -1190,7 +1190,7 @@ static __always_inline void *__do_krealloc(const void *p, size_t new_size,
 		ks = ksize(p);
 
 	if (ks >= new_size) {
-		kasan_krealloc((void *)p, new_size);
+		kasan_krealloc((void *)p, new_size, flags);
 		return (void *)p;
 	}
 
diff --git a/mm/slub.c b/mm/slub.c
index d8fbd4a..2978695 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1272,7 +1272,7 @@ static inline void dec_slabs_node(struct kmem_cache *s, int node,
 static inline void kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
 {
 	kmemleak_alloc(ptr, size, 1, flags);
-	kasan_kmalloc_large(ptr, size);
+	kasan_kmalloc_large(ptr, size, flags);
 }
 
 static inline void kfree_hook(const void *x)
@@ -1306,7 +1306,7 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
 		kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
 		kmemleak_alloc_recursive(object, s->object_size, 1,
 					 s->flags, flags);
-		kasan_slab_alloc(s, object);
+		kasan_slab_alloc(s, object, flags);
 	}
 	memcg_kmem_put_cache(s);
 }
@@ -2584,7 +2584,7 @@ void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
 {
 	void *ret = slab_alloc(s, gfpflags, _RET_IP_);
 	trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags);
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, gfpflags);
 	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_trace);
@@ -2612,7 +2612,7 @@ void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
 	trace_kmalloc_node(_RET_IP_, ret,
 			   size, s->size, gfpflags, node);
 
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, gfpflags);
 	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node_trace);
@@ -3156,7 +3156,8 @@ static void early_kmem_cache_node_alloc(int node)
 	init_object(kmem_cache_node, n, SLUB_RED_ACTIVE);
 	init_tracking(kmem_cache_node, n);
 #endif
-	kasan_kmalloc(kmem_cache_node, n, sizeof(struct kmem_cache_node));
+	kasan_kmalloc(kmem_cache_node, n, sizeof(struct kmem_cache_node),
+		      GFP_KERNEL);
 	init_kmem_cache_node(n);
 	inc_slabs_node(kmem_cache_node, node, page->objects);
 
@@ -3531,7 +3532,7 @@ void *__kmalloc(size_t size, gfp_t flags)
 
 	trace_kmalloc(_RET_IP_, ret, size, s->size, flags);
 
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, flags);
 
 	return ret;
 }
@@ -3576,7 +3577,7 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)
 
 	trace_kmalloc_node(_RET_IP_, ret, size, s->size, flags, node);
 
-	kasan_kmalloc(s, ret, size);
+	kasan_kmalloc(s, ret, size, flags);
 
 	return ret;
 }
@@ -3605,7 +3606,7 @@ size_t ksize(const void *object)
 	size_t size = __ksize(object);
 	/* We assume that ksize callers could use whole allocated area,
 	   so we need unpoison this area. */
-	kasan_krealloc(object, size);
+	kasan_krealloc(object, size, GFP_NOWAIT);
 	return size;
 }
 EXPORT_SYMBOL(ksize);
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
                   ` (2 preceding siblings ...)
  2016-02-26 16:48 ` [PATCH v4 3/7] mm, kasan: Added GFP flags to KASAN API Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-27  1:44   ` kbuild test robot
  2016-03-02 17:41   ` Steven Rostedt
  2016-02-26 16:48 ` [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB Alexander Potapenko
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

KASAN needs to know whether an allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point from the stack
trace, reducing the number of unique stack traces that need to be stored.

Move the definition of __irq_entry to <linux/interrupt.h> so that its
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro, which is similar to __irq_entry but puts the
corresponding functions into the .softirqentry.text section.
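
As a sketch of what a later KASAN patch can do with the new section
boundaries (in_irqentry_text() is an illustrative helper name, not part
of this patch):

#include <linux/interrupt.h>

/* True if 'ptr' is a return address inside a hard/soft IRQ entry point;
 * everything below such a frame in a stack trace can be dropped.
 */
static bool in_irqentry_text(unsigned long ptr)
{
	return (ptr >= (unsigned long)&__irqentry_text_start &&
		ptr < (unsigned long)&__irqentry_text_end) ||
	       (ptr >= (unsigned long)&__softirqentry_text_start &&
		ptr < (unsigned long)&__softirqentry_text_end);
}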

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v2: - per request from Steven Rostedt, moved the declarations of __softirq_entry
and __irq_entry to <linux/interrupt.h>

v3: - minor description changes
---
 arch/arm/kernel/vmlinux.lds.S        |  1 +
 arch/arm64/kernel/vmlinux.lds.S      |  1 +
 arch/blackfin/kernel/vmlinux.lds.S   |  1 +
 arch/c6x/kernel/vmlinux.lds.S        |  1 +
 arch/metag/kernel/vmlinux.lds.S      |  1 +
 arch/microblaze/kernel/vmlinux.lds.S |  1 +
 arch/mips/kernel/vmlinux.lds.S       |  1 +
 arch/nios2/kernel/vmlinux.lds.S      |  1 +
 arch/openrisc/kernel/vmlinux.lds.S   |  1 +
 arch/parisc/kernel/vmlinux.lds.S     |  1 +
 arch/powerpc/kernel/vmlinux.lds.S    |  1 +
 arch/s390/kernel/vmlinux.lds.S       |  1 +
 arch/sh/kernel/vmlinux.lds.S         |  1 +
 arch/sparc/kernel/vmlinux.lds.S      |  1 +
 arch/tile/kernel/vmlinux.lds.S       |  1 +
 arch/x86/kernel/vmlinux.lds.S        |  1 +
 include/asm-generic/vmlinux.lds.h    | 12 +++++++++++-
 include/linux/ftrace.h               | 11 -----------
 include/linux/interrupt.h            | 20 ++++++++++++++++++++
 kernel/softirq.c                     |  2 +-
 kernel/trace/trace_functions_graph.c |  1 +
 21 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 8b60fde..28b690fc 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -105,6 +105,7 @@ SECTIONS
 			*(.exception.text)
 			__exception_text_end = .;
 			IRQENTRY_TEXT
+			SOFTIRQENTRY_TEXT
 			TEXT_TEXT
 			SCHED_TEXT
 			LOCK_TEXT
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e3928f5..b9242b7 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -102,6 +102,7 @@ SECTIONS
 			*(.exception.text)
 			__exception_text_end = .;
 			IRQENTRY_TEXT
+			SOFTIRQENTRY_TEXT
 			TEXT_TEXT
 			SCHED_TEXT
 			LOCK_TEXT
diff --git a/arch/blackfin/kernel/vmlinux.lds.S b/arch/blackfin/kernel/vmlinux.lds.S
index c9eec84..d920b95 100644
--- a/arch/blackfin/kernel/vmlinux.lds.S
+++ b/arch/blackfin/kernel/vmlinux.lds.S
@@ -35,6 +35,7 @@ SECTIONS
 #endif
 		LOCK_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		KPROBES_TEXT
 #ifdef CONFIG_ROMKERNEL
 		__sinittext = .;
diff --git a/arch/c6x/kernel/vmlinux.lds.S b/arch/c6x/kernel/vmlinux.lds.S
index 5a6e141..50bc10f 100644
--- a/arch/c6x/kernel/vmlinux.lds.S
+++ b/arch/c6x/kernel/vmlinux.lds.S
@@ -72,6 +72,7 @@ SECTIONS
 		SCHED_TEXT
 		LOCK_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		KPROBES_TEXT
 		*(.fixup)
 		*(.gnu.warning)
diff --git a/arch/metag/kernel/vmlinux.lds.S b/arch/metag/kernel/vmlinux.lds.S
index e12055e..150ace9 100644
--- a/arch/metag/kernel/vmlinux.lds.S
+++ b/arch/metag/kernel/vmlinux.lds.S
@@ -24,6 +24,7 @@ SECTIONS
 	LOCK_TEXT
 	KPROBES_TEXT
 	IRQENTRY_TEXT
+	SOFTIRQENTRY_TEXT
 	*(.text.*)
 	*(.gnu.warning)
 	}
diff --git a/arch/microblaze/kernel/vmlinux.lds.S b/arch/microblaze/kernel/vmlinux.lds.S
index be9488d..0a47f04 100644
--- a/arch/microblaze/kernel/vmlinux.lds.S
+++ b/arch/microblaze/kernel/vmlinux.lds.S
@@ -36,6 +36,7 @@ SECTIONS {
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		. = ALIGN (4) ;
 		_etext = . ;
 	}
diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index 0a93e83..54d653e 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -58,6 +58,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.text.*)
 		*(.fixup)
 		*(.gnu.warning)
diff --git a/arch/nios2/kernel/vmlinux.lds.S b/arch/nios2/kernel/vmlinux.lds.S
index 326fab4..e23e895 100644
--- a/arch/nios2/kernel/vmlinux.lds.S
+++ b/arch/nios2/kernel/vmlinux.lds.S
@@ -39,6 +39,7 @@ SECTIONS
 		SCHED_TEXT
 		LOCK_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		KPROBES_TEXT
 	} =0
 	_etext = .;
diff --git a/arch/openrisc/kernel/vmlinux.lds.S b/arch/openrisc/kernel/vmlinux.lds.S
index 2d69a85..d936de4 100644
--- a/arch/openrisc/kernel/vmlinux.lds.S
+++ b/arch/openrisc/kernel/vmlinux.lds.S
@@ -50,6 +50,7 @@ SECTIONS
 	  LOCK_TEXT
 	  KPROBES_TEXT
 	  IRQENTRY_TEXT
+	  SOFTIRQENTRY_TEXT
 	  *(.fixup)
 	  *(.text.__*)
 	  _etext = .;
diff --git a/arch/parisc/kernel/vmlinux.lds.S b/arch/parisc/kernel/vmlinux.lds.S
index 308f290..f3ead0b 100644
--- a/arch/parisc/kernel/vmlinux.lds.S
+++ b/arch/parisc/kernel/vmlinux.lds.S
@@ -72,6 +72,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.text.do_softirq)
 		*(.text.sys_exit)
 		*(.text.do_sigaltstack)
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index d41fd0a..2dd91f7 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -55,6 +55,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 
 #ifdef CONFIG_PPC32
 		*(.got1)
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
index 445657f..0f41a82 100644
--- a/arch/s390/kernel/vmlinux.lds.S
+++ b/arch/s390/kernel/vmlinux.lds.S
@@ -28,6 +28,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.fixup)
 		*(.gnu.warning)
 	} :text = 0x0700
diff --git a/arch/sh/kernel/vmlinux.lds.S b/arch/sh/kernel/vmlinux.lds.S
index db88cbf..235a410 100644
--- a/arch/sh/kernel/vmlinux.lds.S
+++ b/arch/sh/kernel/vmlinux.lds.S
@@ -39,6 +39,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.fixup)
 		*(.gnu.warning)
 		_etext = .;		/* End of text section */
diff --git a/arch/sparc/kernel/vmlinux.lds.S b/arch/sparc/kernel/vmlinux.lds.S
index f1a2f68..aadd321 100644
--- a/arch/sparc/kernel/vmlinux.lds.S
+++ b/arch/sparc/kernel/vmlinux.lds.S
@@ -48,6 +48,7 @@ SECTIONS
 		LOCK_TEXT
 		KPROBES_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.gnu.warning)
 	} = 0
 	_etext = .;
diff --git a/arch/tile/kernel/vmlinux.lds.S b/arch/tile/kernel/vmlinux.lds.S
index 0e059a0..378f5d8 100644
--- a/arch/tile/kernel/vmlinux.lds.S
+++ b/arch/tile/kernel/vmlinux.lds.S
@@ -45,6 +45,7 @@ SECTIONS
     LOCK_TEXT
     KPROBES_TEXT
     IRQENTRY_TEXT
+    SOFTIRQENTRY_TEXT
     __fix_text_end = .;   /* tile-cpack won't rearrange before this */
     ALIGN_FUNCTION();
     *(.hottext*)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 74e4bf1..056a97a 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -102,6 +102,7 @@ SECTIONS
 		KPROBES_TEXT
 		ENTRY_TEXT
 		IRQENTRY_TEXT
+		SOFTIRQENTRY_TEXT
 		*(.fixup)
 		*(.gnu.warning)
 		/* End of text section */
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index c4bd0e2..b470421 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -456,7 +456,7 @@
 		*(.entry.text)						\
 		VMLINUX_SYMBOL(__entry_text_end) = .;
 
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if defined(CONFIG_FUNCTION_GRAPH_TRACER) || defined(CONFIG_KASAN)
 #define IRQENTRY_TEXT							\
 		ALIGN_FUNCTION();					\
 		VMLINUX_SYMBOL(__irqentry_text_start) = .;		\
@@ -466,6 +466,16 @@
 #define IRQENTRY_TEXT
 #endif
 
+#if defined(CONFIG_FUNCTION_GRAPH_TRACER) || defined(CONFIG_KASAN)
+#define SOFTIRQENTRY_TEXT						\
+		ALIGN_FUNCTION();					\
+		VMLINUX_SYMBOL(__softirqentry_text_start) = .;		\
+		*(.softirqentry.text)					\
+		VMLINUX_SYMBOL(__softirqentry_text_end) = .;
+#else
+#define SOFTIRQENTRY_TEXT
+#endif
+
 /* Section used for early init (in .S files) */
 #define HEAD_TEXT  *(.head.text)
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index c2b340e..4da848d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -799,16 +799,6 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth,
  */
 #define __notrace_funcgraph		notrace
 
-/*
- * We want to which function is an entrypoint of a hardirq.
- * That will help us to put a signal on output.
- */
-#define __irq_entry		 __attribute__((__section__(".irqentry.text")))
-
-/* Limits of hardirq entrypoints */
-extern char __irqentry_text_start[];
-extern char __irqentry_text_end[];
-
 #define FTRACE_NOTRACE_DEPTH 65536
 #define FTRACE_RETFUNC_DEPTH 50
 #define FTRACE_RETSTACK_ALLOC_SIZE 32
@@ -845,7 +835,6 @@ static inline void unpause_graph_tracing(void)
 #else /* !CONFIG_FUNCTION_GRAPH_TRACER */
 
 #define __notrace_funcgraph
-#define __irq_entry
 #define INIT_FTRACE_GRAPH
 
 static inline void ftrace_graph_init_task(struct task_struct *t) { }
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 0e95fcc..1dcecaf 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -673,4 +673,24 @@ extern int early_irq_init(void);
 extern int arch_probe_nr_irqs(void);
 extern int arch_early_irq_init(void);
 
+#if defined(CONFIG_FUNCTION_GRAPH_TRACER) || defined(CONFIG_KASAN)
+/*
+ * We want to know which function is an entrypoint of a hardirq or a softirq.
+ */
+#define __irq_entry		 __attribute__((__section__(".irqentry.text")))
+#define __softirq_entry  \
+	__attribute__((__section__(".softirqentry.text")))
+
+/* Limits of hardirq entrypoints */
+extern char __irqentry_text_start[];
+extern char __irqentry_text_end[];
+/* Limits of softirq entrypoints */
+extern char __softirqentry_text_start[];
+extern char __softirqentry_text_end[];
+
+#else
+#define __irq_entry
+#define __softirq_entry
+#endif
+
 #endif
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 479e443..359be4f 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -227,7 +227,7 @@ static inline bool lockdep_softirq_start(void) { return false; }
 static inline void lockdep_softirq_end(bool in_hardirq) { }
 #endif
 
-asmlinkage __visible void __do_softirq(void)
+asmlinkage __visible void __softirq_entry __do_softirq(void)
 {
 	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
 	unsigned long old_flags = current->flags;
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index a663cbb..3e6f7d4 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -8,6 +8,7 @@
  */
 #include <linux/uaccess.h>
 #include <linux/ftrace.h>
+#include <linux/interrupt.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
 
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
                   ` (3 preceding siblings ...)
  2016-02-26 16:48 ` [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-29 16:29   ` Andrey Ryabinin
  2016-02-26 16:48 ` [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2 Alexander Potapenko
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Stack depot will allow KASAN to store allocation/deallocation stack
traces for memory chunks. The stack traces are stored in a hash table and
referenced by handles which reside in the kasan_alloc_meta and
kasan_free_meta structures in the allocated memory chunks.
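
The intended usage pattern, roughly (a sketch of a caller of the new API;
save_current_stack() is an illustrative wrapper and error handling is
omitted):

#include <linux/stacktrace.h>
#include <linux/stackdepot.h>

static depot_stack_handle save_current_stack(gfp_t flags)
{
	unsigned long entries[64];
	struct stack_trace trace = {
		.entries	= entries,
		.max_entries	= ARRAY_SIZE(entries),
	};

	save_stack_trace(&trace);
	/* Identical traces are deduplicated and share one handle. */
	return depot_save_stack(&trace, flags);
}

static void print_saved_stack(depot_stack_handle handle)
{
	struct stack_trace trace;

	depot_fetch_stack(handle, &trace);
	print_stack_trace(&trace, 0);
}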

IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
duplication.

Right now stackdepot support is only enabled for the SLAB allocator.
Once the KASAN features in SLAB are on par with those in SLUB, we can
switch SLUB to stackdepot as well, removing the dependency on SLUB's own
stack bookkeeping, which wastes a lot of memory.

This patch is based on the "mm: kasan: stack depots" patch originally
prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v2: - per request from Joonsoo Kim, moved the stackdepot implementation to
lib/, as there's a plan to use it for page owner
    - added copyright comments
    - added comments about smp_load_acquire()/smp_store_release()

v3: - minor description changes
---
 arch/x86/kernel/Makefile   |   1 +
 include/linux/stackdepot.h |  32 ++++++
 lib/Makefile               |   7 ++
 lib/stackdepot.c           | 274 +++++++++++++++++++++++++++++++++++++++++++++
 mm/kasan/Makefile          |   1 +
 mm/kasan/kasan.c           |  51 ++++++++-
 mm/kasan/kasan.h           |   4 +
 mm/kasan/report.c          |   9 ++
 8 files changed, 376 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ff..500584d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -19,6 +19,7 @@ endif
 KASAN_SANITIZE_head$(BITS).o := n
 KASAN_SANITIZE_dumpstack.o := n
 KASAN_SANITIZE_dumpstack_$(BITS).o := n
+KASAN_SANITIZE_stacktrace.o := n
 
 CFLAGS_irq.o := -I$(src)/../include/asm/trace
 
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
new file mode 100644
index 0000000..b6cbe05
--- /dev/null
+++ b/include/linux/stackdepot.h
@@ -0,0 +1,32 @@
+/*
+ * A generic stack depot implementation
+ *
+ * Author: Alexander Potapenko <glider@google.com>
+ * Copyright (C) 2016 Google, Inc.
+ *
+ * Based on code by Dmitry Chernenkov.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _LINUX_STACKDEPOT_H
+#define _LINUX_STACKDEPOT_H
+
+typedef u32 depot_stack_handle;
+
+struct stack_trace;
+
+depot_stack_handle depot_save_stack(struct stack_trace *trace, gfp_t flags);
+
+void depot_fetch_stack(depot_stack_handle handle, struct stack_trace *trace);
+
+#endif
diff --git a/lib/Makefile b/lib/Makefile
index a7c26a4..10a4ae3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
 obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
 obj-$(CONFIG_IRQ_POLL) += irq_poll.o
 
+ifeq ($(CONFIG_KASAN),y)
+ifeq ($(CONFIG_SLAB),y)
+	obj-y	+= stackdepot.o
+	KASAN_SANITIZE_slub.o := n
+endif
+endif
+
 libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
 	       fdt_empty_tree.o
 $(foreach file, $(libfdt_files), \
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
new file mode 100644
index 0000000..f09b0da
--- /dev/null
+++ b/lib/stackdepot.c
@@ -0,0 +1,274 @@
+/*
+ * Generic stack depot for storing stack traces.
+ *
+ * Some debugging tools need to save stack traces of certain events which can
+ * be later presented to the user. For example, KASAN needs to save alloc and
+ * free stacks for each object, but storing two stack traces per object
+ * requires too much memory (e.g. SLUB_DEBUG needs 256 bytes per object for
+ * that).
+ *
+ * Instead, stack depot maintains a hashtable of unique stacktraces. Since alloc
+ * and free stacks repeat a lot, we save about 100x space.
+ * Stacks are never removed from the depot, so we store them one after
+ * another in a contiguous memory allocation.
+ *
+ * Author: Alexander Potapenko <glider@google.com>
+ * Copyright (C) 2016 Google, Inc.
+ *
+ * Based on code by Dmitry Chernenkov.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ */
+
+#include <linux/gfp.h>
+#include <linux/jhash.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/percpu.h>
+#include <linux/printk.h>
+#include <linux/stacktrace.h>
+#include <linux/stackdepot.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+#define DEPOT_STACK_BITS (sizeof(depot_stack_handle) * 8)
+
+#define STACK_ALLOC_ORDER 4 /* 'Slab' size order for stack depot, 16 pages */
+#define STACK_ALLOC_SIZE (1LL << (PAGE_SHIFT + STACK_ALLOC_ORDER))
+#define STACK_ALLOC_ALIGN 4
+#define STACK_ALLOC_OFFSET_BITS (STACK_ALLOC_ORDER + PAGE_SHIFT - \
+					STACK_ALLOC_ALIGN)
+#define STACK_ALLOC_INDEX_BITS (DEPOT_STACK_BITS - STACK_ALLOC_OFFSET_BITS)
+#define STACK_ALLOC_SLABS_CAP 1024
+#define STACK_ALLOC_MAX_SLABS \
+	(((1LL << (STACK_ALLOC_INDEX_BITS)) < STACK_ALLOC_SLABS_CAP) ? \
+	 (1LL << (STACK_ALLOC_INDEX_BITS)) : STACK_ALLOC_SLABS_CAP)
+
+/* The compact structure to store the reference to stacks. */
+union handle_parts {
+	depot_stack_handle handle;
+	struct {
+		u32 slabindex : STACK_ALLOC_INDEX_BITS;
+		u32 offset : STACK_ALLOC_OFFSET_BITS;
+	};
+};
+
+struct stack_record {
+	struct stack_record *next;	/* Link in the hashtable */
+	u32 hash;			/* Hash in the hashtable */
+	u32 size;			/* Number of frames in the stack */
+	union handle_parts handle;
+	unsigned long entries[1];	/* Variable-sized array of entries. */
+};
+
+static void *stack_slabs[STACK_ALLOC_MAX_SLABS];
+
+static int depot_index;
+static int next_slab_inited;
+static size_t depot_offset;
+static DEFINE_SPINLOCK(depot_lock);
+
+static bool init_stack_slab(void **prealloc)
+{
+	if (!*prealloc)
+		return false;
+	/* This smp_load_acquire() pairs with smp_store_release() to
+	 * |next_slab_inited| below and in depot_alloc_stack().
+	 */
+	if (smp_load_acquire(&next_slab_inited))
+		return true;
+	if (stack_slabs[depot_index] == NULL) {
+		stack_slabs[depot_index] = *prealloc;
+	} else {
+		stack_slabs[depot_index + 1] = *prealloc;
+		/* This smp_store_release pairs with smp_load_acquire() from
+		 * |next_slab_inited| above and in depot_save_stack().
+		 */
+		smp_store_release(&next_slab_inited, 1);
+	}
+	*prealloc = NULL;
+	return true;
+}
+
+/* Allocation of a new stack in raw storage */
+static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
+		u32 hash, void **prealloc, gfp_t alloc_flags)
+{
+	int required_size = offsetof(struct stack_record, entries) +
+		sizeof(unsigned long) * size;
+	struct stack_record *stack;
+
+	required_size = ALIGN(required_size, 1 << STACK_ALLOC_ALIGN);
+
+	if (unlikely(depot_offset + required_size > STACK_ALLOC_SIZE)) {
+		if (unlikely(depot_index + 1 >= STACK_ALLOC_MAX_SLABS)) {
+			WARN_ONCE(1, "Stack depot reached limit capacity");
+			return NULL;
+		}
+		depot_index++;
+		depot_offset = 0;
+		/* smp_store_release() here pairs with smp_load_acquire() from
+		 * |next_slab_inited| in depot_save_stack() and
+		 * init_stack_slab().
+		 */
+		if (depot_index + 1 < STACK_ALLOC_MAX_SLABS)
+			smp_store_release(&next_slab_inited, 0);
+	}
+	init_stack_slab(prealloc);
+	if (stack_slabs[depot_index] == NULL)
+		return NULL;
+
+	stack = stack_slabs[depot_index] + depot_offset;
+
+	stack->hash = hash;
+	stack->size = size;
+	stack->handle.slabindex = depot_index;
+	stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
+	__memcpy(stack->entries, entries, size * sizeof(unsigned long));
+	depot_offset += required_size;
+
+	return stack;
+}
+
+#define STACK_HASH_ORDER 20
+#define STACK_HASH_SIZE (1L << STACK_HASH_ORDER)
+#define STACK_HASH_MASK (STACK_HASH_SIZE - 1)
+#define STACK_HASH_SEED 0x9747b28c
+
+static struct stack_record *stack_table[STACK_HASH_SIZE] = {
+	[0 ...	STACK_HASH_SIZE - 1] = NULL
+};
+
+/* Calculate hash for a stack */
+static inline u32 hash_stack(unsigned long *entries, unsigned int size)
+{
+	return jhash2((u32 *)entries,
+			       size * sizeof(unsigned long) / sizeof(u32),
+			       STACK_HASH_SEED);
+}
+
+/* Find a stack that is equal to the one stored in entries in the hash */
+static inline struct stack_record *find_stack(struct stack_record *bucket,
+					     unsigned long *entries, int size,
+					     u32 hash)
+{
+	struct stack_record *found;
+
+	for (found = bucket; found; found = found->next) {
+		if (found->hash == hash &&
+		    found->size == size &&
+		    !memcmp(entries, found->entries,
+			    size * sizeof(unsigned long))) {
+			return found;
+		}
+	}
+	return NULL;
+}
+
+void depot_fetch_stack(depot_stack_handle handle, struct stack_trace *trace)
+{
+	union handle_parts parts = { .handle = handle };
+	void *slab = stack_slabs[parts.slabindex];
+	size_t offset = parts.offset << STACK_ALLOC_ALIGN;
+	struct stack_record *stack = slab + offset;
+
+	trace->nr_entries = trace->max_entries = stack->size;
+	trace->entries = stack->entries;
+	trace->skip = 0;
+}
+
+/*
+ * depot_save_stack - save stack in a stack depot.
+ * @trace - the stacktrace to save.
+ * @alloc_flags - flags for allocating additional memory if required.
+ *
+ * Returns the handle of the stack struct stored in depot.
+ */
+depot_stack_handle depot_save_stack(struct stack_trace *trace,
+				    gfp_t alloc_flags)
+{
+	u32 hash;
+	depot_stack_handle retval = 0;
+	struct stack_record *found = NULL, **bucket;
+	unsigned long flags;
+	struct page *page = NULL;
+	void *prealloc = NULL;
+
+	if (unlikely(trace->nr_entries == 0))
+		goto exit;
+
+	hash = hash_stack(trace->entries, trace->nr_entries);
+	/* Bad luck, we won't store this stack. */
+	if (hash == 0)
+		goto exit;
+
+	bucket = &stack_table[hash & STACK_HASH_MASK];
+
+	/* Fast path: look the stack trace up without locking.
+	 *
+	 * The smp_load_acquire() here pairs with smp_store_release() to
+	 * |bucket| below.
+	 */
+	found = find_stack(smp_load_acquire(bucket), trace->entries,
+			   trace->nr_entries, hash);
+	if (found)
+		goto exit;
+
+	/* Check if the current or the next stack slab need to be initialized.
+	 * If so, allocate the memory - we won't be able to do that under the
+	 * lock.
+	 *
+	 * The smp_load_acquire() here pairs with smp_store_release() to
+	 * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
+	 */
+	if (unlikely(!smp_load_acquire(&next_slab_inited))) {
+		if (!preempt_count() && !in_irq()) {
+			alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
+				__GFP_NOWARN | __GFP_NORETRY |
+				__GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
+			page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
+			if (page)
+				prealloc = page_address(page);
+		}
+	}
+
+	spin_lock_irqsave(&depot_lock, flags);
+
+	found = find_stack(*bucket, trace->entries, trace->nr_entries, hash);
+	if (!found) {
+		struct stack_record *new =
+			depot_alloc_stack(trace->entries, trace->nr_entries,
+					  hash, &prealloc, alloc_flags);
+		if (new) {
+			new->next = *bucket;
+			/* This smp_store_release() pairs with
+			 * smp_load_acquire() from |bucket| above.
+			 */
+			smp_store_release(bucket, new);
+			found = new;
+		}
+	} else if (prealloc) {
+		/*
+		 * We didn't need to store this stack trace, but let's keep
+		 * the preallocated memory for the future.
+		 */
+		WARN_ON(!init_stack_slab(&prealloc));
+	}
+
+	spin_unlock_irqrestore(&depot_lock, flags);
+exit:
+	if (prealloc)
+		/* Nobody used this memory, ok to free it. */
+		free_pages((unsigned long)prealloc, STACK_ALLOC_ORDER);
+	if (found)
+		retval = found->handle.handle;
+	return retval;
+}
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index a61460d..32bd73a 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -7,3 +7,4 @@ CFLAGS_REMOVE_kasan.o = -pg
 CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
 
 obj-y := kasan.o report.o kasan_init.o
+
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index 95b2267..088cb31 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -17,7 +17,9 @@
 #define DISABLE_BRANCH_PROFILING
 
 #include <linux/export.h>
+#include <linux/interrupt.h>
 #include <linux/init.h>
+#include <linux/kasan.h>
 #include <linux/kernel.h>
 #include <linux/kmemleak.h>
 #include <linux/memblock.h>
@@ -31,7 +33,6 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/vmalloc.h>
-#include <linux/kasan.h>
 
 #include "kasan.h"
 #include "../slab.h"
@@ -393,23 +394,67 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
 #endif
 }
 
-static inline void set_track(struct kasan_track *track)
+static inline int in_irqentry_text(unsigned long ptr)
+{
+	return (ptr >= (unsigned long)&__irqentry_text_start &&
+		ptr < (unsigned long)&__irqentry_text_end) ||
+		(ptr >= (unsigned long)&__softirqentry_text_start &&
+		 ptr < (unsigned long)&__softirqentry_text_end);
+}
+
+static inline void filter_irq_stacks(struct stack_trace *trace)
+{
+	int i;
+
+	if (!trace->nr_entries)
+		return;
+	for (i = 0; i < trace->nr_entries; i++)
+		if (in_irqentry_text(trace->entries[i])) {
+			/* Include the irqentry function into the stack. */
+			trace->nr_entries = i + 1;
+			break;
+		}
+}
+
+static inline depot_stack_handle save_stack(gfp_t flags)
+{
+	unsigned long entries[KASAN_STACK_DEPTH];
+	struct stack_trace trace = {
+		.nr_entries = 0,
+		.entries = entries,
+		.max_entries = KASAN_STACK_DEPTH,
+		.skip = 0
+	};
+
+	save_stack_trace(&trace);
+	filter_irq_stacks(&trace);
+	if (trace.nr_entries != 0 &&
+	    trace.entries[trace.nr_entries-1] == ULONG_MAX)
+		trace.nr_entries--;
+
+	return depot_save_stack(&trace, flags);
+}
+
+static inline void set_track(struct kasan_track *track, gfp_t flags)
 {
 	track->cpu = raw_smp_processor_id();
 	track->pid = current->pid;
 	track->when = jiffies;
+	track->stack = save_stack(flags);
 }
 
 #ifdef CONFIG_SLAB
 struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
 					const void *object)
 {
+	BUILD_BUG_ON(sizeof(struct kasan_alloc_meta) > 32);
 	return (void *)object + cache->kasan_info.alloc_meta_offset;
 }
 
 struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
 				      const void *object)
 {
+	BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
 	return (void *)object + cache->kasan_info.free_meta_offset;
 }
 #endif
@@ -466,7 +511,7 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
 
 		alloc_info->state = KASAN_STATE_ALLOC;
 		alloc_info->alloc_size = size;
-		set_track(&alloc_info->track);
+		set_track(&alloc_info->track, flags);
 	}
 #endif
 }
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index 7b9e4ab9..b4e5942 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -2,6 +2,7 @@
 #define __MM_KASAN_KASAN_H
 
 #include <linux/kasan.h>
+#include <linux/stackdepot.h>
 
 #define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
 #define KASAN_SHADOW_MASK       (KASAN_SHADOW_SCALE_SIZE - 1)
@@ -64,10 +65,13 @@ enum kasan_state {
 	KASAN_STATE_FREE
 };
 
+#define KASAN_STACK_DEPTH 64
+
 struct kasan_track {
 	u64 cpu : 6;			/* for NR_CPUS = 64 */
 	u64 pid : 16;			/* 65536 processes */
 	u64 when : 42;			/* ~140 years */
+	depot_stack_handle stack : sizeof(depot_stack_handle);
 };
 
 struct kasan_alloc_meta {
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 2c1407f..82d8858 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -18,6 +18,7 @@
 #include <linux/printk.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/stackdepot.h>
 #include <linux/stacktrace.h>
 #include <linux/string.h>
 #include <linux/types.h>
@@ -120,6 +121,14 @@ static void print_track(struct kasan_track *track)
 {
 	pr_err("PID = %u, CPU = %u, timestamp = %lu\n", track->pid,
 	       track->cpu, (unsigned long)track->when);
+	if (track->stack) {
+		struct stack_trace trace;
+
+		depot_fetch_stack(track->stack, &trace);
+		print_stack_trace(&trace, 0);
+	} else {
+		pr_err("(stack is not available)\n");
+	}
 }
 
 static void print_object(struct kmem_cache *cache, void *object)
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
                   ` (4 preceding siblings ...)
  2016-02-26 16:48 ` [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-29 16:31   ` Andrey Ryabinin
  2016-02-26 16:48 ` [PATCH v4 7/7] mm: kasan: Initial memory quarantine implementation Alexander Potapenko
  2016-02-26 22:28 ` [PATCH v4 0/7] SLAB support for KASAN Andrew Morton
  7 siblings, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Signed-off-by: Alexander Potapenko <glider@google.com>
---
 lib/test_kasan.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 90ad74f..82169fb 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -294,6 +294,8 @@ static noinline void __init kmalloc_uaf2(void)
 	}
 
 	ptr1[40] = 'x';
+	if (ptr1 == ptr2)
+		pr_err("Could not detect use-after-free: ptr1 == ptr2\n");
 	kfree(ptr2);
 }
 
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v4 7/7] mm: kasan: Initial memory quarantine implementation
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
                   ` (5 preceding siblings ...)
  2016-02-26 16:48 ` [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2 Alexander Potapenko
@ 2016-02-26 16:48 ` Alexander Potapenko
  2016-02-26 22:28 ` [PATCH v4 0/7] SLAB support for KASAN Andrew Morton
  7 siblings, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-26 16:48 UTC (permalink / raw)
  To: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm

Quarantine isolates freed objects in a separate queue. The objects are
returned to the allocator later, which helps to detect use-after-free
errors.

Freed objects are first added to per-cpu quarantine queues.
When a cache is destroyed or memory shrinking is requested, the objects
are moved into the global quarantine queue. Whenever a kmalloc call
allows memory reclaiming, the oldest objects are popped out of the
global queue until the total size of objects in quarantine is less than
3/4 of the maximum quarantine size (which is a fraction of installed
physical memory).
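
As a rough illustration of the sizing policy (the constants below just restate
what mm/kasan/quarantine.c in this patch defines; the 4 GB / 4 CPU figures are
an arbitrary example):

#define QUARANTINE_FRACTION	32		/* 1/32 of physical memory */
#define QUARANTINE_PERCPU_SIZE	(1 << 20)	/* 1 MB cap per per-cpu queue */

static unsigned long example_quarantine_size(unsigned long totalram_bytes,
					     int nr_cpus)
{
	/* e.g. 4 GB of RAM, 4 CPUs: 128 MB - 4 MB = 124 MB global cap */
	return totalram_bytes / QUARANTINE_FRACTION -
		QUARANTINE_PERCPU_SIZE * nr_cpus;
}

/*
 * quarantine_reduce() then frees the oldest objects until the global queue
 * drops below 3/4 of that cap (QUARANTINE_LOW_SIZE).
 */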

Right now quarantine support is only enabled in the SLAB allocator.
Unification of KASAN features in SLAB and SLUB will be done later.

This patch is based on the "mm: kasan: quarantine" patch originally
prepared by Dmitry Chernenkov.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
v2: - added copyright comments
    - per request from Joonsoo Kim made __cache_free() more straightforward
    - added comments for smp_load_acquire()/smp_store_release()

v3: - incorporate changes introduced by the "mm, kasan: SLAB support" patch

v4: - fix kbuild compile-time error (missing ___cache_free() declaration)
      and a warning (wrong format specifier)
---
 include/linux/kasan.h |  30 +++--
 lib/test_kasan.c      |  29 +++++
 mm/kasan/Makefile     |   3 +
 mm/kasan/kasan.c      |  71 ++++++++++--
 mm/kasan/kasan.h      |  11 +-
 mm/kasan/quarantine.c | 306 ++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/kasan/report.c     |   1 +
 mm/mempool.c          |   7 +-
 mm/page_alloc.c       |   2 +-
 mm/slab.c             |  14 ++-
 mm/slab.h             |   2 +
 mm/slab_common.c      |   2 +
 mm/slub.c             |   4 +-
 13 files changed, 454 insertions(+), 28 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index bf71ab0..355e722 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -44,24 +44,29 @@ static inline void kasan_disable_current(void)
 void kasan_unpoison_shadow(const void *address, size_t size);
 
 void kasan_alloc_pages(struct page *page, unsigned int order);
-void kasan_free_pages(struct page *page, unsigned int order);
+void kasan_poison_free_pages(struct page *page, unsigned int order);
 
 void kasan_cache_create(struct kmem_cache *cache, size_t *size,
 			unsigned long *flags);
+void kasan_cache_shrink(struct kmem_cache *cache);
+void kasan_cache_destroy(struct kmem_cache *cache);
 
 void kasan_poison_slab(struct page *page);
 void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
 void kasan_poison_object_data(struct kmem_cache *cache, void *object);
 
 void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags);
-void kasan_kfree_large(const void *ptr);
-void kasan_kfree(void *ptr);
+void kasan_poison_kfree_large(const void *ptr);
+void kasan_poison_kfree(void *ptr);
 void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size,
 		  gfp_t flags);
 void kasan_krealloc(const void *object, size_t new_size, gfp_t flags);
 
 void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags);
-void kasan_slab_free(struct kmem_cache *s, void *object);
+/* kasan_slab_free() returns true if the object has been put into quarantine.
+ */
+bool kasan_slab_free(struct kmem_cache *s, void *object);
+void kasan_poison_slab_free(struct kmem_cache *s, void *object);
 
 struct kasan_cache {
 	int alloc_meta_offset;
@@ -79,11 +84,14 @@ static inline void kasan_enable_current(void) {}
 static inline void kasan_disable_current(void) {}
 
 static inline void kasan_alloc_pages(struct page *page, unsigned int order) {}
-static inline void kasan_free_pages(struct page *page, unsigned int order) {}
+static inline void kasan_poison_free_pages(struct page *page,
+						unsigned int order) {}
 
 static inline void kasan_cache_create(struct kmem_cache *cache,
 				      size_t *size,
 				      unsigned long *flags) {}
+static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
+static inline void kasan_cache_destroy(struct kmem_cache *cache) {}
 
 static inline void kasan_poison_slab(struct page *page) {}
 static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
@@ -92,8 +100,8 @@ static inline void kasan_poison_object_data(struct kmem_cache *cache,
 					void *object) {}
 
 static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {}
-static inline void kasan_kfree_large(const void *ptr) {}
-static inline void kasan_kfree(void *ptr) {}
+static inline void kasan_poison_kfree_large(const void *ptr) {}
+static inline void kasan_poison_kfree(void *ptr) {}
 static inline void kasan_kmalloc(struct kmem_cache *s, const void *object,
 				size_t size, gfp_t flags) {}
 static inline void kasan_krealloc(const void *object, size_t new_size,
@@ -101,7 +109,13 @@ static inline void kasan_krealloc(const void *object, size_t new_size,
 
 static inline void kasan_slab_alloc(struct kmem_cache *s, void *object,
 				   gfp_t flags) {}
-static inline void kasan_slab_free(struct kmem_cache *s, void *object) {}
+/* kasan_slab_free() returns true if the object has been put into quarantine.
+ */
+static inline bool kasan_slab_free(struct kmem_cache *s, void *object)
+{
+	return false;
+}
+static inline void kasan_poison_slab_free(struct kmem_cache *s, void *object) {}
 
 static inline int kasan_module_alloc(void *addr, size_t size) { return 0; }
 static inline void kasan_free_shadow(const struct vm_struct *vm) {}
diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 82169fb..799c98e 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -344,6 +344,32 @@ static noinline void __init kasan_stack_oob(void)
 	*(volatile char *)p;
 }
 
+#ifdef CONFIG_SLAB
+static noinline void __init kasan_quarantine_cache(void)
+{
+	struct kmem_cache *cache = kmem_cache_create(
+			"test", 137, 8, GFP_KERNEL, NULL);
+	int i;
+
+	for (i = 0; i <  100; i++) {
+		void *p = kmem_cache_alloc(cache, GFP_KERNEL);
+
+		kmem_cache_free(cache, p);
+		p = kmalloc(sizeof(u64), GFP_KERNEL);
+		kfree(p);
+	}
+	kmem_cache_shrink(cache);
+	for (i = 0; i <  100; i++) {
+		u64 *p = kmem_cache_alloc(cache, GFP_KERNEL);
+
+		kmem_cache_free(cache, p);
+		p = kmalloc(sizeof(u64), GFP_KERNEL);
+		kfree(p);
+	}
+	kmem_cache_destroy(cache);
+}
+#endif
+
 static int __init kmalloc_tests_init(void)
 {
 	kmalloc_oob_right();
@@ -367,6 +393,9 @@ static int __init kmalloc_tests_init(void)
 	kmem_cache_oob();
 	kasan_stack_oob();
 	kasan_global_oob();
+#ifdef CONFIG_SLAB
+	kasan_quarantine_cache();
+#endif
 	return -EAGAIN;
 }
 
diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
index 32bd73a..d1db41e 100644
--- a/mm/kasan/Makefile
+++ b/mm/kasan/Makefile
@@ -8,3 +8,6 @@ CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
 
 obj-y := kasan.o report.o kasan_init.o
 
+ifdef CONFIG_SLAB
+	obj-y	+= quarantine.o
+endif
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index 088cb31..904cd9a 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -307,7 +307,7 @@ void kasan_alloc_pages(struct page *page, unsigned int order)
 		kasan_unpoison_shadow(page_address(page), PAGE_SIZE << order);
 }
 
-void kasan_free_pages(struct page *page, unsigned int order)
+void kasan_poison_free_pages(struct page *page, unsigned int order)
 {
 	if (likely(!PageHighMem(page)))
 		kasan_poison_shadow(page_address(page),
@@ -368,6 +368,20 @@ void kasan_cache_create(struct kmem_cache *cache, size_t *size,
 }
 #endif
 
+void kasan_cache_shrink(struct kmem_cache *cache)
+{
+#ifdef CONFIG_SLAB
+	quarantine_remove_cache(cache);
+#endif
+}
+
+void kasan_cache_destroy(struct kmem_cache *cache)
+{
+#ifdef CONFIG_SLAB
+	quarantine_remove_cache(cache);
+#endif
+}
+
 void kasan_poison_slab(struct page *page)
 {
 	kasan_poison_shadow(page_address(page),
@@ -464,7 +478,7 @@ void kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags)
 	kasan_kmalloc(cache, object, cache->object_size, flags);
 }
 
-void kasan_slab_free(struct kmem_cache *cache, void *object)
+void kasan_poison_slab_free(struct kmem_cache *cache, void *object)
 {
 	unsigned long size = cache->object_size;
 	unsigned long rounded_up_size = round_up(size, KASAN_SHADOW_SCALE_SIZE);
@@ -473,18 +487,43 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
 	if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
 		return;
 
+	kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
+}
+
+bool kasan_slab_free(struct kmem_cache *cache, void *object)
+{
 #ifdef CONFIG_SLAB
-	if (cache->flags & SLAB_KASAN) {
-		struct kasan_free_meta *free_info =
-			get_free_info(cache, object);
+	/* RCU slabs could be legally used after free within the RCU period */
+	if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
+		return false;
+
+	if (likely(cache->flags & SLAB_KASAN)) {
 		struct kasan_alloc_meta *alloc_info =
 			get_alloc_info(cache, object);
-		alloc_info->state = KASAN_STATE_FREE;
-		set_track(&free_info->track);
+		struct kasan_free_meta *free_info =
+			get_free_info(cache, object);
+
+		switch (alloc_info->state) {
+		case KASAN_STATE_ALLOC:
+			alloc_info->state = KASAN_STATE_QUARANTINE;
+			quarantine_put(free_info, cache);
+			set_track(&free_info->track, GFP_NOWAIT);
+			kasan_poison_slab_free(cache, object);
+			return true;
+		case KASAN_STATE_QUARANTINE:
+		case KASAN_STATE_FREE:
+			pr_err("Double free");
+			dump_stack();
+			break;
+		default:
+			break;
+		}
 	}
+	return false;
+#else
+	kasan_poison_slab_free(cache, object);
+	return false;
 #endif
-
-	kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
 }
 
 void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
@@ -493,6 +532,11 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size,
 	unsigned long redzone_start;
 	unsigned long redzone_end;
 
+#ifdef CONFIG_SLAB
+	if (flags & __GFP_RECLAIM)
+		quarantine_reduce();
+#endif
+
 	if (unlikely(object == NULL))
 		return;
 
@@ -523,6 +567,11 @@ void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
 	unsigned long redzone_start;
 	unsigned long redzone_end;
 
+#ifdef CONFIG_SLAB
+	if (flags & __GFP_RECLAIM)
+		quarantine_reduce();
+#endif
+
 	if (unlikely(ptr == NULL))
 		return;
 
@@ -551,7 +600,7 @@ void kasan_krealloc(const void *object, size_t size, gfp_t flags)
 		kasan_kmalloc(page->slab_cache, object, size, flags);
 }
 
-void kasan_kfree(void *ptr)
+void kasan_poison_kfree(void *ptr)
 {
 	struct page *page;
 
@@ -564,7 +613,7 @@ void kasan_kfree(void *ptr)
 		kasan_slab_free(page->slab_cache, ptr);
 }
 
-void kasan_kfree_large(const void *ptr)
+void kasan_poison_kfree_large(const void *ptr)
 {
 	struct page *page = virt_to_page(ptr);
 
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index b4e5942..37e0b3a 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -62,6 +62,7 @@ struct kasan_global {
 enum kasan_state {
 	KASAN_STATE_INIT,
 	KASAN_STATE_ALLOC,
+	KASAN_STATE_QUARANTINE,
 	KASAN_STATE_FREE
 };
 
@@ -81,8 +82,10 @@ struct kasan_alloc_meta {
 };
 
 struct kasan_free_meta {
-	/* Allocator freelist pointer, unused by KASAN. */
-	void **freelist;
+	/* This field is used while the object is in the quarantine.
+	 * Otherwise it might be used for the allocator freelist.
+	 */
+	void **quarantine_link;
 	struct kasan_track track;
 };
 
@@ -106,4 +109,8 @@ static inline bool kasan_report_enabled(void)
 void kasan_report(unsigned long addr, size_t size,
 		bool is_write, unsigned long ip);
 
+void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache);
+void quarantine_reduce(void);
+void quarantine_remove_cache(struct kmem_cache *cache);
+
 #endif
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
new file mode 100644
index 0000000..cf3f1fd
--- /dev/null
+++ b/mm/kasan/quarantine.c
@@ -0,0 +1,306 @@
+/*
+ * KASAN quarantine.
+ *
+ * Author: Alexander Potapenko <glider@google.com>
+ * Copyright (C) 2016 Google, Inc.
+ *
+ * Based on code by Dmitry Chernenkov.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ */
+
+#include <linux/gfp.h>
+#include <linux/hash.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/percpu.h>
+#include <linux/printk.h>
+#include <linux/shrinker.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+#include "../slab.h"
+#include "kasan.h"
+
+/* Data structure and operations for quarantine queues. */
+
+/* Each queue is a singly-linked list, which also stores the total size of
+ * objects inside of it.
+ */
+struct qlist {
+	void **head;
+	void **tail;
+	size_t bytes;
+};
+
+#define QLIST_INIT { NULL, NULL, 0 }
+
+static inline bool empty_qlist(struct qlist *q)
+{
+	return !q->head;
+}
+
+static inline void init_qlist(struct qlist *q)
+{
+	q->head = q->tail = NULL;
+	q->bytes = 0;
+}
+
+static inline void qlist_put(struct qlist *q, void **qlink, size_t size)
+{
+	if (unlikely(empty_qlist(q)))
+		q->head = qlink;
+	else
+		*q->tail = qlink;
+	q->tail = qlink;
+	*qlink = NULL;
+	q->bytes += size;
+}
+
+static inline void **qlist_remove(struct qlist *q, void ***prev,
+				 size_t size)
+{
+	void **qlink = *prev;
+
+	*prev = *qlink;
+	if (q->tail == qlink) {
+		if (q->head == qlink)
+			q->tail = NULL;
+		else
+			q->tail = (void **)prev;
+	}
+	q->bytes -= size;
+
+	return qlink;
+}
+
+static inline void qlist_move_all(struct qlist *from, struct qlist *to)
+{
+	if (unlikely(empty_qlist(from)))
+		return;
+
+	if (empty_qlist(to)) {
+		*to = *from;
+		init_qlist(from);
+		return;
+	}
+
+	*to->tail = from->head;
+	to->tail = from->tail;
+	to->bytes += from->bytes;
+
+	init_qlist(from);
+}
+
+static inline void qlist_move(struct qlist *from, void **last, struct qlist *to,
+			  size_t size)
+{
+	if (unlikely(last == from->tail)) {
+		qlist_move_all(from, to);
+		return;
+	}
+	if (empty_qlist(to))
+		to->head = from->head;
+	else
+		*to->tail = from->head;
+	to->tail = last;
+	from->head = *last;
+	*last = NULL;
+	from->bytes -= size;
+	to->bytes += size;
+}
+
+
+/* The object quarantine consists of per-cpu queues and a global queue,
+ * guarded by quarantine_lock.
+ */
+static DEFINE_PER_CPU(struct qlist, cpu_quarantine);
+
+static struct qlist global_quarantine;
+static DEFINE_SPINLOCK(quarantine_lock);
+
+/* Maximum size of the global queue. */
+static unsigned long quarantine_size;
+
+/* The fraction of physical memory the quarantine is allowed to occupy.
+ * The quarantine doesn't support the memory shrinker with the SLAB
+ * allocator, so we keep the ratio low to avoid OOM.
+ */
+#define QUARANTINE_FRACTION 32
+
+/* smp_load_acquire() here pairs with smp_store_release() in
+ * quarantine_reduce().
+ */
+#define QUARANTINE_LOW_SIZE (smp_load_acquire(&quarantine_size) * 3 / 4)
+#define QUARANTINE_PERCPU_SIZE (1 << 20)
+
+static inline struct kmem_cache *qlink_to_cache(void **qlink)
+{
+	return virt_to_head_page(qlink)->slab_cache;
+}
+
+static inline void *qlink_to_object(void **qlink, struct kmem_cache *cache)
+{
+	struct kasan_free_meta *free_info =
+		container_of((void ***)qlink, struct kasan_free_meta,
+			     quarantine_link);
+
+	return ((void *)free_info) - cache->kasan_info.free_meta_offset;
+}
+
+static inline void qlink_free(void **qlink, struct kmem_cache *cache)
+{
+	void *object = qlink_to_object(qlink, cache);
+	struct kasan_alloc_meta *alloc_info = get_alloc_info(cache, object);
+	unsigned long flags;
+
+	local_irq_save(flags);
+	alloc_info->state = KASAN_STATE_FREE;
+	___cache_free(cache, object, _THIS_IP_);
+	local_irq_restore(flags);
+}
+
+static inline void qlist_free_all(struct qlist *q, struct kmem_cache *cache)
+{
+	void **qlink;
+
+	if (unlikely(empty_qlist(q)))
+		return;
+
+	qlink = q->head;
+	while (qlink) {
+		struct kmem_cache *obj_cache =
+			cache ? cache :	qlink_to_cache(qlink);
+		void **next = *qlink;
+
+		qlink_free(qlink, obj_cache);
+		qlink = next;
+	}
+	init_qlist(q);
+}
+
+void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
+{
+	unsigned long flags;
+	struct qlist *q;
+	struct qlist temp = QLIST_INIT;
+
+	local_irq_save(flags);
+
+	q = this_cpu_ptr(&cpu_quarantine);
+	qlist_put(q, (void **) &info->quarantine_link, cache->size);
+	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE))
+		qlist_move_all(q, &temp);
+
+	local_irq_restore(flags);
+
+	if (unlikely(!empty_qlist(&temp))) {
+		spin_lock_irqsave(&quarantine_lock, flags);
+		qlist_move_all(&temp, &global_quarantine);
+		spin_unlock_irqrestore(&quarantine_lock, flags);
+	}
+}
+
+void quarantine_reduce(void)
+{
+	size_t new_quarantine_size;
+	unsigned long flags;
+	struct qlist to_free = QLIST_INIT;
+	size_t size_to_free = 0;
+	void **last;
+
+	/* smp_load_acquire() here pairs with smp_store_release() below. */
+	if (likely(ACCESS_ONCE(global_quarantine.bytes) <=
+		   smp_load_acquire(&quarantine_size)))
+		return;
+
+	spin_lock_irqsave(&quarantine_lock, flags);
+
+	/* Update quarantine size in case of hotplug. Allocate a fraction of
+	 * the installed memory to quarantine minus per-cpu queue limits.
+	 */
+	new_quarantine_size = (ACCESS_ONCE(totalram_pages) << PAGE_SHIFT) /
+		QUARANTINE_FRACTION;
+	new_quarantine_size -= QUARANTINE_PERCPU_SIZE * num_online_cpus();
+	/* Pairs with smp_load_acquire() above and in QUARANTINE_LOW_SIZE. */
+	smp_store_release(&quarantine_size, new_quarantine_size);
+
+	last = global_quarantine.head;
+	while (last) {
+		struct kmem_cache *cache = qlink_to_cache(last);
+
+		size_to_free += cache->size;
+		if (!*last || size_to_free >
+		    global_quarantine.bytes - QUARANTINE_LOW_SIZE)
+			break;
+		last = (void **) *last;
+	}
+	qlist_move(&global_quarantine, last, &to_free, size_to_free);
+
+	spin_unlock_irqrestore(&quarantine_lock, flags);
+
+	qlist_free_all(&to_free, NULL);
+}
+
+static inline void qlist_move_cache(struct qlist *from,
+				   struct qlist *to,
+				   struct kmem_cache *cache)
+{
+	void ***prev;
+
+	if (unlikely(empty_qlist(from)))
+		return;
+
+	prev = &from->head;
+	while (*prev) {
+		void **qlink = *prev;
+		struct kmem_cache *obj_cache = qlink_to_cache(qlink);
+
+		if (obj_cache == cache) {
+			if (unlikely(from->tail == qlink))
+				from->tail = (void **) prev;
+			*prev = (void **) *qlink;
+			from->bytes -= cache->size;
+			qlist_put(to, qlink, cache->size);
+		} else
+			prev = (void ***) *prev;
+	}
+}
+
+static void per_cpu_remove_cache(void *arg)
+{
+	struct kmem_cache *cache = arg;
+	struct qlist to_free = QLIST_INIT;
+	struct qlist *q;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	q = this_cpu_ptr(&cpu_quarantine);
+	qlist_move_cache(q, &to_free, cache);
+	local_irq_restore(flags);
+
+	qlist_free_all(&to_free, cache);
+}
+
+void quarantine_remove_cache(struct kmem_cache *cache)
+{
+	unsigned long flags;
+	struct qlist to_free = QLIST_INIT;
+
+	on_each_cpu(per_cpu_remove_cache, cache, 1);
+
+	spin_lock_irqsave(&quarantine_lock, flags);
+	qlist_move_cache(&global_quarantine, &to_free, cache);
+	spin_unlock_irqrestore(&quarantine_lock, flags);
+
+	qlist_free_all(&to_free, cache);
+}
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 82d8858..958af33 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -150,6 +150,7 @@ static void print_object(struct kmem_cache *cache, void *object)
 		print_track(&alloc_info->track);
 		break;
 	case KASAN_STATE_FREE:
+	case KASAN_STATE_QUARANTINE:
 		pr_err("Object freed, allocated with size %u bytes\n",
 		       alloc_info->alloc_size);
 		free_info = get_free_info(cache, object);
diff --git a/mm/mempool.c b/mm/mempool.c
index b47c8a7..4beeeef 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -105,11 +105,12 @@ static inline void poison_element(mempool_t *pool, void *element)
 static void kasan_poison_element(mempool_t *pool, void *element)
 {
 	if (pool->alloc == mempool_alloc_slab)
-		kasan_slab_free(pool->pool_data, element);
+		kasan_poison_slab_free(pool->pool_data, element);
 	if (pool->alloc == mempool_kmalloc)
-		kasan_kfree(element);
+		kasan_poison_kfree(element);
 	if (pool->alloc == mempool_alloc_pages)
-		kasan_free_pages(element, (unsigned long)pool->pool_data);
+		kasan_poison_free_pages(element,
+					(unsigned long)pool->pool_data);
 }
 
 static void kasan_unpoison_element(mempool_t *pool, void *element, gfp_t flags)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 838ca8bb..a79ae81 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -980,7 +980,7 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
 
 	trace_mm_page_free(page, order);
 	kmemcheck_free_shadow(page, order);
-	kasan_free_pages(page, order);
+	kasan_poison_free_pages(page, order);
 
 	if (PageAnon(page))
 		page->mapping = NULL;
diff --git a/mm/slab.c b/mm/slab.c
index 52a7a8d..8884fdf 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3374,9 +3374,21 @@ free_done:
 static inline void __cache_free(struct kmem_cache *cachep, void *objp,
 				unsigned long caller)
 {
+#ifdef CONFIG_KASAN
+	if (kasan_slab_free(cachep, objp))
+		/* The object has been put into the quarantine, don't touch it
+		 * for now.
+		 */
+		return;
+#endif
+	___cache_free(cachep, objp, caller);
+}
+
+void ___cache_free(struct kmem_cache *cachep, void *objp,
+		unsigned long caller)
+{
 	struct array_cache *ac;
 
-	kasan_slab_free(cachep, objp);
 	ac = cpu_cache_get(cachep);
 
 	check_irq_off();
diff --git a/mm/slab.h b/mm/slab.h
index 2eedace..e5680ee 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -373,4 +373,6 @@ void *slab_next(struct seq_file *m, void *p, loff_t *pos);
 void slab_stop(struct seq_file *m, void *p);
 int memcg_slab_show(struct seq_file *m, void *p);
 
+void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr);
+
 #endif /* MM_SLAB_H */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 538f616..7bcb5bc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -711,6 +711,7 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	get_online_cpus();
 	get_online_mems();
 
+	kasan_cache_destroy(s);
 	mutex_lock(&slab_mutex);
 
 	s->refcount--;
@@ -749,6 +750,7 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 
 	get_online_cpus();
 	get_online_mems();
+	kasan_cache_shrink(cachep);
 	ret = __kmem_cache_shrink(cachep, false);
 	put_online_mems();
 	put_online_cpus();
diff --git a/mm/slub.c b/mm/slub.c
index 2978695..1819293 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1278,7 +1278,7 @@ static inline void kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
 static inline void kfree_hook(const void *x)
 {
 	kmemleak_free(x);
-	kasan_kfree_large(x);
+	kasan_poison_kfree_large(x);
 }
 
 static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
@@ -1333,7 +1333,7 @@ static inline void slab_free_hook(struct kmem_cache *s, void *x)
 	if (!(s->flags & SLAB_DEBUG_OBJECTS))
 		debug_check_no_obj_freed(x, s->object_size);
 
-	kasan_slab_free(s, x);
+	kasan_poison_slab_free(s, x);
 }
 
 static inline void slab_free_freelist_hook(struct kmem_cache *s,
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 0/7] SLAB support for KASAN
  2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
                   ` (6 preceding siblings ...)
  2016-02-26 16:48 ` [PATCH v4 7/7] mm: kasan: Initial memory quarantine implementation Alexander Potapenko
@ 2016-02-26 22:28 ` Andrew Morton
  7 siblings, 0 replies; 30+ messages in thread
From: Andrew Morton @ 2016-02-26 22:28 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: adech.fo, cl, dvyukov, ryabinin.a.a, rostedt, iamjoonsoo.kim,
	js1304, kcc, kasan-dev, linux-kernel, linux-mm

On Fri, 26 Feb 2016 17:48:40 +0100 Alexander Potapenko <glider@google.com> wrote:

> This patch set implements SLAB support for KASAN

That's a lot of code and I'm not seeing much review activity from
folks.  There was one ack against an earlier version of [4/7] from
Steven Rostedt but that ack wasn't maintained (bad!).

I scanned over these and my plan was to queue them for -next testing
and to await more review/test before proceeding further.  But alas,
there are significant collisions with pending slab patches (all in
linux-next) so could you please take a look at those?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections
  2016-02-26 16:48 ` [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections Alexander Potapenko
@ 2016-02-27  1:44   ` kbuild test robot
  2016-03-02 17:41   ` Steven Rostedt
  1 sibling, 0 replies; 30+ messages in thread
From: kbuild test robot @ 2016-02-27  1:44 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: kbuild-all, adech.fo, cl, dvyukov, akpm, ryabinin.a.a, rostedt,
	iamjoonsoo.kim, js1304, kcc, kasan-dev, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 2158 bytes --]

Hi Alexander,

[auto build test ERROR on tip/x86/core]
[also build test ERROR on v4.5-rc5]
[cannot apply to next-20160226]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url:    https://github.com/0day-ci/linux/commits/Alexander-Potapenko/SLAB-support-for-KASAN/20160227-005318
config: arm-pxa_defconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm 

All errors (new ones prefixed by >>):

>> arch/arm/kernel/traps.c:476:39: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'handle_fiq_as_nmi'
    asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
                                          ^

vim +476 arch/arm/kernel/traps.c

c0e7f7ee Daniel Thompson 2014-09-17  470   * circumstances where non-maskability improves robustness, such as
c0e7f7ee Daniel Thompson 2014-09-17  471   * watchdog or debug logic.
c0e7f7ee Daniel Thompson 2014-09-17  472   *
c0e7f7ee Daniel Thompson 2014-09-17  473   * This handler is not appropriate for general purpose use in drivers
c0e7f7ee Daniel Thompson 2014-09-17  474   * platform code and can be overrideen using set_fiq_handler.
c0e7f7ee Daniel Thompson 2014-09-17  475   */
c0e7f7ee Daniel Thompson 2014-09-17 @476  asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
c0e7f7ee Daniel Thompson 2014-09-17  477  {
c0e7f7ee Daniel Thompson 2014-09-17  478  	struct pt_regs *old_regs = set_irq_regs(regs);
c0e7f7ee Daniel Thompson 2014-09-17  479  

:::::: The code at line 476 was first introduced by commit
:::::: c0e7f7ee717e2b4c5791e7422424c96b5008c39e ARM: 8150/3: fiq: Replace default FIQ handler

:::::: TO: Daniel Thompson <daniel.thompson@linaro.org>
:::::: CC: Russell King <rmk+kernel@arm.linux.org.uk>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 30012 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/7] mm, kasan: SLAB support
  2016-02-26 16:48 ` [PATCH v4 2/7] mm, kasan: SLAB support Alexander Potapenko
@ 2016-02-29 15:10   ` Andrey Ryabinin
  2016-02-29 18:28     ` Alexander Potapenko
  0 siblings, 1 reply; 30+ messages in thread
From: Andrey Ryabinin @ 2016-02-29 15:10 UTC (permalink / raw)
  To: Alexander Potapenko, adech.fo, cl, dvyukov, akpm, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm



On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
> Add KASAN hooks to SLAB allocator.
> 
> This patch is based on the "mm: kasan: unified support for SLUB and
> SLAB allocators" patch originally prepared by Dmitry Chernenkov.
> 
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
> v3: - minor description changes
>     - store deallocation info in kasan_slab_free()
> 
> v4: - fix kbuild compile-time warnings in print_track()
> ---


> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> index bc0a8d8..d26ffb4 100644
> --- a/mm/kasan/kasan.c
> +++ b/mm/kasan/kasan.c
> @@ -314,6 +314,59 @@ void kasan_free_pages(struct page *page, unsigned int order)
>  				KASAN_FREE_PAGE);
>  }
>  
> +#ifdef CONFIG_SLAB
> +/*
> + * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
> + * For larger allocations larger redzones are used.
> + */
> +static size_t optimal_redzone(size_t object_size)
> +{
> +	int rz =
> +		object_size <= 64        - 16   ? 16 :
> +		object_size <= 128       - 32   ? 32 :
> +		object_size <= 512       - 64   ? 64 :
> +		object_size <= 4096      - 128  ? 128 :
> +		object_size <= (1 << 14) - 256  ? 256 :
> +		object_size <= (1 << 15) - 512  ? 512 :
> +		object_size <= (1 << 16) - 1024 ? 1024 : 2048;
> +	return rz;
> +}
> +
> +void kasan_cache_create(struct kmem_cache *cache, size_t *size,
> +			unsigned long *flags)
> +{
> +	int redzone_adjust;
> +	/* Make sure the adjusted size is still less than
> +	 * KMALLOC_MAX_CACHE_SIZE.
> +	 * TODO: this check is only useful for SLAB, but not SLUB. We'll need
> +	 * to skip it for SLUB when it starts using kasan_cache_create().
> +	 */
> +	if (*size > KMALLOC_MAX_CACHE_SIZE -
> +	    sizeof(struct kasan_alloc_meta) -
> +	    sizeof(struct kasan_free_meta))
> +		return;
> +	*flags |= SLAB_KASAN;
> +	/* Add alloc meta. */
> +	cache->kasan_info.alloc_meta_offset = *size;
> +	*size += sizeof(struct kasan_alloc_meta);
> +
> +	/* Add free meta. */
> +	if (cache->flags & SLAB_DESTROY_BY_RCU || cache->ctor ||
> +	    cache->object_size < sizeof(struct kasan_free_meta)) {
> +		cache->kasan_info.free_meta_offset = *size;
> +		*size += sizeof(struct kasan_free_meta);
> +	}
> +	redzone_adjust = optimal_redzone(cache->object_size) -
> +		(*size - cache->object_size);
> +	if (redzone_adjust > 0)
> +		*size += redzone_adjust;
> +	*size = min(KMALLOC_MAX_CACHE_SIZE,
> +		    max(*size,
> +			cache->object_size +
> +			optimal_redzone(cache->object_size)));
> +}
> +#endif
> +




>  void kasan_poison_slab(struct page *page)
>  {
>  	kasan_poison_shadow(page_address(page),
> @@ -331,8 +384,36 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
>  	kasan_poison_shadow(object,
>  			round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
>  			KASAN_KMALLOC_REDZONE);
> +#ifdef CONFIG_SLAB
> +	if (cache->flags & SLAB_KASAN) {
> +		struct kasan_alloc_meta *alloc_info =
> +			get_alloc_info(cache, object);
> +		alloc_info->state = KASAN_STATE_INIT;
> +	}
> +#endif
> +}
> +
> +static inline void set_track(struct kasan_track *track)
> +{
> +	track->cpu = raw_smp_processor_id();
> +	track->pid = current->pid;
> +	track->when = jiffies;
>  }
>  
> +#ifdef CONFIG_SLAB
> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
> +					const void *object)
> +{
> +	return (void *)object + cache->kasan_info.alloc_meta_offset;
> +}
> +
> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
> +				      const void *object)
> +{
> +	return (void *)object + cache->kasan_info.free_meta_offset;
> +}
> +#endif
> +
>  void kasan_slab_alloc(struct kmem_cache *cache, void *object)
>  {
>  	kasan_kmalloc(cache, object, cache->object_size);
> @@ -347,6 +428,17 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
>  	if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
>  		return;
>  
> +#ifdef CONFIG_SLAB
> +	if (cache->flags & SLAB_KASAN) {
> +		struct kasan_free_meta *free_info =
> +			get_free_info(cache, object);
> +		struct kasan_alloc_meta *alloc_info =
> +			get_alloc_info(cache, object);
> +		alloc_info->state = KASAN_STATE_FREE;
> +		set_track(&free_info->track);
> +	}
> +#endif
> +
>  	kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
>  }
>  
> @@ -366,6 +458,16 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
>  	kasan_unpoison_shadow(object, size);
>  	kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
>  		KASAN_KMALLOC_REDZONE);
> +#ifdef CONFIG_SLAB
> +	if (cache->flags & SLAB_KASAN) {
> +		struct kasan_alloc_meta *alloc_info =
> +			get_alloc_info(cache, object);
> +
> +		alloc_info->state = KASAN_STATE_ALLOC;
> +		alloc_info->alloc_size = size;
> +		set_track(&alloc_info->track);
> +	}
> +#endif
>  }
>  EXPORT_SYMBOL(kasan_kmalloc);
>  
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 4f6c62e..7b9e4ab9 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -54,6 +54,40 @@ struct kasan_global {
>  #endif
>  };
>  
> +/**
> + * Structures to keep alloc and free tracks *
> + */
> +
> +enum kasan_state {
> +	KASAN_STATE_INIT,
> +	KASAN_STATE_ALLOC,
> +	KASAN_STATE_FREE
> +};
> +
> +struct kasan_track {
> +	u64 cpu : 6;			/* for NR_CPUS = 64 */
> +	u64 pid : 16;			/* 65536 processes */
> +	u64 when : 42;			/* ~140 years */
> +};
> +
> +struct kasan_alloc_meta {
> +	u32 state : 2;	/* enum kasan_state */
> +	u32 alloc_size : 30;
> +	struct kasan_track track;
> +};
> +
> +struct kasan_free_meta {
> +	/* Allocator freelist pointer, unused by KASAN. */
> +	void **freelist;
> +	struct kasan_track track;
> +};
> +
> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
> +					const void *object);
> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
> +					const void *object);
> +
> +

Basically, all this big pile of code above is an implementation of yet another SLAB_STORE_USER and SLAB_RED_ZONE
exclusively for KASAN. It would be so much better to alter the existing code to satisfy all your needs.

>  static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
>  {
>  	return (void *)(((unsigned long)shadow_addr - KASAN_SHADOW_OFFSET)
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index 12f222d..2c1407f 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -115,6 +115,44 @@ static inline bool init_task_stack_addr(const void *addr)
>  			sizeof(init_thread_union.stack));
>  }
>  
> +#ifdef CONFIG_SLAB
> +static void print_track(struct kasan_track *track)
> +{
> +	pr_err("PID = %u, CPU = %u, timestamp = %lu\n", track->pid,
> +	       track->cpu, (unsigned long)track->when);
> +}
> +
> +static void print_object(struct kmem_cache *cache, void *object)
> +{
> +	struct kasan_alloc_meta *alloc_info = get_alloc_info(cache, object);
> +	struct kasan_free_meta *free_info;
> +
> +	pr_err("Object at %p, in cache %s\n", object, cache->name);
> +	if (!(cache->flags & SLAB_KASAN))
> +		return;
> +	switch (alloc_info->state) {

'->state' seems useless. It's used only here, but the object's state could be determined from the shadow value.
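
E.g. something like this (untested sketch, object_is_freed() is just an
illustrative name; kasan_mem_to_shadow() and the KASAN_KMALLOC_FREE poison
value already exist):

/* Infer "freed" from the shadow instead of kasan_alloc_meta. */
static bool object_is_freed(const void *object)
{
	u8 shadow = *(u8 *)kasan_mem_to_shadow(object);

	return shadow == KASAN_KMALLOC_FREE;
}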

> +	case KASAN_STATE_INIT:
> +		pr_err("Object not allocated yet\n");
> +		break;
> +	case KASAN_STATE_ALLOC:
> +		pr_err("Object allocated with size %u bytes.\n",
> +		       alloc_info->alloc_size);
> +		pr_err("Allocation:\n");
> +		print_track(&alloc_info->track);
> +		break;
> +	case KASAN_STATE_FREE:
> +		pr_err("Object freed, allocated with size %u bytes\n",
> +		       alloc_info->alloc_size);
> +		free_info = get_free_info(cache, object);
> +		pr_err("Allocation:\n");
> +		print_track(&alloc_info->track);
> +		pr_err("Deallocation:\n");
> +		print_track(&free_info->track);
> +		break;
> +	}
> +}
> +#endif
> +
>  static void print_address_description(struct kasan_access_info *info)
>  {
>  	const void *addr = info->access_addr;
> @@ -126,17 +164,14 @@ static void print_address_description(struct kasan_access_info *info)
>  		if (PageSlab(page)) {
>  			void *object;
>  			struct kmem_cache *cache = page->slab_cache;
> -			void *last_object;
> -
> -			object = virt_to_obj(cache, page_address(page), addr);
> -			last_object = page_address(page) +
> -				page->objects * cache->size;
> -
> -			if (unlikely(object > last_object))
> -				object = last_object; /* we hit into padding */
> -
> +			object = nearest_obj(cache, page,
> +						(void *)info->access_addr);
> +#ifdef CONFIG_SLAB
> +			print_object(cache, object);
> +#else

Instead of these ifdefs, please make a universal API for printing the object's information.

>  			object_err(cache, page, object,
> -				"kasan: bad access detected");
> +					"kasan: bad access detected");
> +#endif
>  			return;
>  		}
>  		dump_page(page, "kasan: bad access detected");
> @@ -146,8 +181,9 @@ static void print_address_description(struct kasan_access_info *info)
>  		if (!init_task_stack_addr(addr))
>  			pr_err("Address belongs to variable %pS\n", addr);
>  	}
> -
> +#ifdef CONFIG_SLUB

???

>  	dump_stack();
> +#endif
>  }
>  
>  static bool row_is_guilty(const void *row, const void *guilty)
> @@ -233,6 +269,9 @@ static void kasan_report_error(struct kasan_access_info *info)
>  		dump_stack();
>  	} else {
>  		print_error_description(info);
> +#ifdef CONFIG_SLAB

I'm lost here. What's the point of reordering dump_stack() for CONFIG_SLAB=y? 

> +		dump_stack();
> +#endif
>  		print_address_description(info);
>  		print_shadow_for_address(info->first_bad_addr);
>  	}
> diff --git a/mm/slab.c b/mm/slab.c
> index 621fbcb..805b39b 100644



>  
>  	if (gfpflags_allow_blocking(local_flags))
> @@ -3364,7 +3374,10 @@ free_done:
>  static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>  				unsigned long caller)
>  {
> -	struct array_cache *ac = cpu_cache_get(cachep);
> +	struct array_cache *ac;
> +
> +	kasan_slab_free(cachep, objp);
> +	ac = cpu_cache_get(cachep);

Why was cpu_cache_get() moved? Looks like an unnecessary change.

>  
>  	check_irq_off();
>  	kmemleak_free_recursive(objp, cachep->flags);
> @@ -3403,6 +3416,8 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>  void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
>  {
>  	void *ret = slab_alloc(cachep, flags, _RET_IP_);
> +	if (ret)

kasan_slab_alloc() should deal fine with ret == NULL.

> +		kasan_slab_alloc(cachep, ret);
>  
>  	trace_kmem_cache_alloc(_RET_IP_, ret,
>  			       cachep->object_size, cachep->size, flags);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-02-26 16:48 ` [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB Alexander Potapenko
@ 2016-02-29 16:29   ` Andrey Ryabinin
  2016-02-29 17:12     ` Dmitry Vyukov
  2016-03-08 11:30     ` Alexander Potapenko
  0 siblings, 2 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-02-29 16:29 UTC (permalink / raw)
  To: Alexander Potapenko, adech.fo, cl, dvyukov, akpm, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm



On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
> Stack depot will allow KASAN to store allocation/deallocation stack traces
> for memory chunks. The stack traces are stored in a hash table and
> referenced by handles which reside in the kasan_alloc_meta and
> kasan_free_meta structures in the allocated memory chunks.
> 
> IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
> duplication.
> 
> Right now stackdepot support is only enabled in the SLAB allocator.
> Once KASAN features in SLAB are on par with those in SLUB we can switch
> SLUB to stackdepot as well, thus removing the dependency on SLUB stack
> bookkeeping, which wastes a lot of memory.
> 
> This patch is based on the "mm: kasan: stack depots" patch originally
> prepared by Dmitry Chernenkov.
> 
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
> v2: - per request from Joonsoo Kim, moved the stackdepot implementation to
> lib/, as there's a plan to use it for page owner
>     - added copyright comments
>     - added comments about smp_load_acquire()/smp_store_release()
> 
> v3: - minor description changes
> ---



> diff --git a/lib/Makefile b/lib/Makefile
> index a7c26a4..10a4ae3 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>  
> +ifeq ($(CONFIG_KASAN),y)
> +ifeq ($(CONFIG_SLAB),y)

Just try to imagine that another subsystem wants to use stackdepot. What is this going to look like?

We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
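
E.g. something along these lines (just a sketch, exact wording/placement up to
you):

	# lib/Kconfig
	config STACKDEPOT
		bool
		select STACKTRACE

	# lib/Kconfig.kasan
	config KASAN
		...
		select STACKDEPOT if SLAB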

> +	obj-y	+= stackdepot.o
> +	KASAN_SANITIZE_slub.o := n
> +endif
> +endif
> +
>  libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
>  	       fdt_empty_tree.o
>  $(foreach file, $(libfdt_files), \
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> new file mode 100644
> index 0000000..f09b0da
> --- /dev/null
> +++ b/lib/stackdepot.c


> +/* Allocation of a new stack in raw storage */
> +static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
> +		u32 hash, void **prealloc, gfp_t alloc_flags)
> +{


> +
> +	stack->hash = hash;
> +	stack->size = size;
> +	stack->handle.slabindex = depot_index;
> +	stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
> +	__memcpy(stack->entries, entries, size * sizeof(unsigned long));

s/__memcpy/memcpy/

> +	depot_offset += required_size;
> +
> +	return stack;
> +}
> +


> +/*
> + * depot_save_stack - save stack in a stack depot.
> + * @trace - the stacktrace to save.
> + * @alloc_flags - flags for allocating additional memory if required.
> + *
> + * Returns the handle of the stack struct stored in depot.
> + */
> +depot_stack_handle depot_save_stack(struct stack_trace *trace,
> +				    gfp_t alloc_flags)
> +{
> +	u32 hash;
> +	depot_stack_handle retval = 0;
> +	struct stack_record *found = NULL, **bucket;
> +	unsigned long flags;
> +	struct page *page = NULL;
> +	void *prealloc = NULL;
> +
> +	if (unlikely(trace->nr_entries == 0))
> +		goto exit;
> +
> +	hash = hash_stack(trace->entries, trace->nr_entries);
> +	/* Bad luck, we won't store this stack. */
> +	if (hash == 0)
> +		goto exit;
> +
> +	bucket = &stack_table[hash & STACK_HASH_MASK];
> +
> +	/* Fast path: look the stack trace up without locking.
> +	 *
> +	 * The smp_load_acquire() here pairs with smp_store_release() to
> +	 * |bucket| below.
> +	 */
> +	found = find_stack(smp_load_acquire(bucket), trace->entries,
> +			   trace->nr_entries, hash);
> +	if (found)
> +		goto exit;
> +
> +	/* Check if the current or the next stack slab need to be initialized.
> +	 * If so, allocate the memory - we won't be able to do that under the
> +	 * lock.
> +	 *
> +	 * The smp_load_acquire() here pairs with smp_store_release() to
> +	 * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
> +	 */
> +	if (unlikely(!smp_load_acquire(&next_slab_inited))) {
> +		if (!preempt_count() && !in_irq()) {

If you're trying to detect atomic context here, then this doesn't work. E.g. you can't know
about held spinlocks in a non-preemptible kernel.
And I'm not sure why we need this. You know the gfp flags here, so allocation in atomic context shouldn't be a problem.
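Something like this should be enough (just a sketch; gfpflags_allow_blocking() is what mm/slab.c already uses to answer the same question):

	/* sketch: let the caller's gfp mask decide instead of guessing the context */
	if (gfpflags_allow_blocking(alloc_flags))
		page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);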



> +			alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
> +				__GFP_NOWARN | __GFP_NORETRY |
> +				__GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);

I think a blacklist approach would be better here.

> +			page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);

STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?


> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
> index a61460d..32bd73a 100644
> --- a/mm/kasan/Makefile
> +++ b/mm/kasan/Makefile
> @@ -7,3 +7,4 @@ CFLAGS_REMOVE_kasan.o = -pg
>  CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
>  
>  obj-y := kasan.o report.o kasan_init.o
> +

Extra newline.


> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 7b9e4ab9..b4e5942 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -2,6 +2,7 @@
>  #define __MM_KASAN_KASAN_H
>  
>  #include <linux/kasan.h>
> +#include <linux/stackdepot.h>
>  
>  #define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
>  #define KASAN_SHADOW_MASK       (KASAN_SHADOW_SCALE_SIZE - 1)
> @@ -64,10 +65,13 @@ enum kasan_state {
>  	KASAN_STATE_FREE
>  };
>  
> +#define KASAN_STACK_DEPTH 64

I think you can reduce this (32, perhaps?). Kernel stacks are usually not that deep.

> +
>  struct kasan_track {
>  	u64 cpu : 6;			/* for NR_CPUS = 64 */
>  	u64 pid : 16;			/* 65536 processes */
>  	u64 when : 42;			/* ~140 years */
> +	depot_stack_handle stack : sizeof(depot_stack_handle);
>  };
>  

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2
  2016-02-26 16:48 ` [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2 Alexander Potapenko
@ 2016-02-29 16:31   ` Andrey Ryabinin
  0 siblings, 0 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-02-29 16:31 UTC (permalink / raw)
  To: Alexander Potapenko, adech.fo, cl, dvyukov, akpm, rostedt,
	iamjoonsoo.kim, js1304, kcc
  Cc: kasan-dev, linux-kernel, linux-mm



On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
>  lib/test_kasan.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>

> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> index 90ad74f..82169fb 100644
> --- a/lib/test_kasan.c
> +++ b/lib/test_kasan.c
> @@ -294,6 +294,8 @@ static noinline void __init kmalloc_uaf2(void)
>  	}
>  
>  	ptr1[40] = 'x';
> +	if (ptr1 == ptr2)
> +		pr_err("Could not detect use-after-free: ptr1 == ptr2\n");
>  	kfree(ptr2);
>  }
>  
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-02-29 16:29   ` Andrey Ryabinin
@ 2016-02-29 17:12     ` Dmitry Vyukov
  2016-03-01 11:57       ` Andrey Ryabinin
  2016-03-08 11:30     ` Alexander Potapenko
  1 sibling, 1 reply; 30+ messages in thread
From: Dmitry Vyukov @ 2016-02-29 17:12 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Alexander Potapenko, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Mon, Feb 29, 2016 at 5:29 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>
>
> On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
>> Stack depot will allow KASAN store allocation/deallocation stack traces
>> for memory chunks. The stack traces are stored in a hash table and
>> referenced by handles which reside in the kasan_alloc_meta and
>> kasan_free_meta structures in the allocated memory chunks.
>>
>> IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
>> duplication.
>>
>> Right now stackdepot support is only enabled in SLAB allocator.
>> Once KASAN features in SLAB are on par with those in SLUB we can switch
>> SLUB to stackdepot as well, thus removing the dependency on SLUB stack
>> bookkeeping, which wastes a lot of memory.
>>
>> This patch is based on the "mm: kasan: stack depots" patch originally
>> prepared by Dmitry Chernenkov.
>>
>> Signed-off-by: Alexander Potapenko <glider@google.com>
>> ---
>> v2: - per request from Joonsoo Kim, moved the stackdepot implementation to
>> lib/, as there's a plan to use it for page owner
>>     - added copyright comments
>>     - added comments about smp_load_acquire()/smp_store_release()
>>
>> v3: - minor description changes
>> ---
>
>
>
>> diff --git a/lib/Makefile b/lib/Makefile
>> index a7c26a4..10a4ae3 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>
>> +ifeq ($(CONFIG_KASAN),y)
>> +ifeq ($(CONFIG_SLAB),y)
>
> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>
> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>
>> +     obj-y   += stackdepot.o
>> +     KASAN_SANITIZE_slub.o := n
>> +endif
>> +endif
>> +
>>  libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
>>              fdt_empty_tree.o
>>  $(foreach file, $(libfdt_files), \
>> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
>> new file mode 100644
>> index 0000000..f09b0da
>> --- /dev/null
>> +++ b/lib/stackdepot.c
>
>
>> +/* Allocation of a new stack in raw storage */
>> +static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
>> +             u32 hash, void **prealloc, gfp_t alloc_flags)
>> +{
>
>
>> +
>> +     stack->hash = hash;
>> +     stack->size = size;
>> +     stack->handle.slabindex = depot_index;
>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>
> s/__memcpy/memcpy/

memcpy should be instrumented by asan/tsan, and we would like to avoid
that instrumentation here.
It's plain unnecessary and for tsan it can create recursion issues
(double entry into runtime).


>> +     depot_offset += required_size;
>> +
>> +     return stack;
>> +}
>> +
>
>
>> +/*
>> + * depot_save_stack - save stack in a stack depot.
>> + * @trace - the stacktrace to save.
>> + * @alloc_flags - flags for allocating additional memory if required.
>> + *
>> + * Returns the handle of the stack struct stored in depot.
>> + */
>> +depot_stack_handle depot_save_stack(struct stack_trace *trace,
>> +                                 gfp_t alloc_flags)
>> +{
>> +     u32 hash;
>> +     depot_stack_handle retval = 0;
>> +     struct stack_record *found = NULL, **bucket;
>> +     unsigned long flags;
>> +     struct page *page = NULL;
>> +     void *prealloc = NULL;
>> +
>> +     if (unlikely(trace->nr_entries == 0))
>> +             goto exit;
>> +
>> +     hash = hash_stack(trace->entries, trace->nr_entries);
>> +     /* Bad luck, we won't store this stack. */
>> +     if (hash == 0)
>> +             goto exit;
>> +
>> +     bucket = &stack_table[hash & STACK_HASH_MASK];
>> +
>> +     /* Fast path: look the stack trace up without locking.
>> +      *
>> +      * The smp_load_acquire() here pairs with smp_store_release() to
>> +      * |bucket| below.
>> +      */
>> +     found = find_stack(smp_load_acquire(bucket), trace->entries,
>> +                        trace->nr_entries, hash);
>> +     if (found)
>> +             goto exit;
>> +
>> +     /* Check if the current or the next stack slab need to be initialized.
>> +      * If so, allocate the memory - we won't be able to do that under the
>> +      * lock.
>> +      *
>> +      * The smp_load_acquire() here pairs with smp_store_release() to
>> +      * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
>> +      */
>> +     if (unlikely(!smp_load_acquire(&next_slab_inited))) {
>> +             if (!preempt_count() && !in_irq()) {
>
> If you trying to detect atomic context here, than this doesn't work. E.g. you can't know
> about held spinlocks in non-preemptible kernel.
> And I'm not sure why need this. You know gfp flags here, so allocation in atomic context shouldn't be problem.


We don't have gfp flags for kfree.
I wonder how CONFIG_DEBUG_ATOMIC_SLEEP handles this. Maybe it has the answer.
Alternatively, we can always assume that we are in atomic context in kfree.
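I.e. something like this on the free path (just a sketch; GFP_NOWAIT picked as a conservative non-sleeping choice):

	/* sketch: kfree gives us no gfp context, so never try to sleep here */
	depot_save_stack(&trace, GFP_NOWAIT);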



>> +                     alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
>> +                             __GFP_NOWARN | __GFP_NORETRY |
>> +                             __GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
>
> I think blacklist approach would be better here.
>
>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>
> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?

Part of the issue is the atomic context above. When we can't allocate
memory we still want to save the stack trace. When we have less than
one STACK_ALLOC_ORDER slab of memory left, we try to preallocate another
STACK_ALLOC_ORDER slab in advance. So in the worst case we have one
STACK_ALLOC_ORDER slab, and that should be enough to handle all
kmalloc/kfree in atomic context. One page does not look like enough. I
think Alex did some measuring of the failure rate (when we are out of
memory and can't allocate more).
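For scale (assuming 4K pages, so this is only a rough estimate):

	one slab   = (1 << STACK_ALLOC_ORDER) pages = 16 * 4096 bytes = 64KB
	one record ~ a few dozen bytes of header + depth * sizeof(unsigned long)
	           ~ 150 bytes for a typical stack of ~15 frames

so a single slab holds on the order of several hundred typical stacks.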


>> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
>> index a61460d..32bd73a 100644
>> --- a/mm/kasan/Makefile
>> +++ b/mm/kasan/Makefile
>> @@ -7,3 +7,4 @@ CFLAGS_REMOVE_kasan.o = -pg
>>  CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
>>
>>  obj-y := kasan.o report.o kasan_init.o
>> +
>
> Extra newline.
>
>
>> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
>> index 7b9e4ab9..b4e5942 100644
>> --- a/mm/kasan/kasan.h
>> +++ b/mm/kasan/kasan.h
>> @@ -2,6 +2,7 @@
>>  #define __MM_KASAN_KASAN_H
>>
>>  #include <linux/kasan.h>
>> +#include <linux/stackdepot.h>
>>
>>  #define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
>>  #define KASAN_SHADOW_MASK       (KASAN_SHADOW_SCALE_SIZE - 1)
>> @@ -64,10 +65,13 @@ enum kasan_state {
>>       KASAN_STATE_FREE
>>  };
>>
>> +#define KASAN_STACK_DEPTH 64
>
> I think, you can reduce this (32 perhaps?). Kernel stacks are not so deep usually.

We don't allocate the whole KASAN_STACK_DEPTH for each stack; we allocate
only the memory required for the particular stack at hand, so a large
value here is not an issue. If most stacks are shallow, then we will
allocate just that small amount of memory.
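Roughly what depot_alloc_stack() computes per record (a sketch, modulo alignment):

	/* only the frames that are actually present get stored */
	required_size = offsetof(struct stack_record, entries) +
			size * sizeof(unsigned long);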



>> +
>>  struct kasan_track {
>>       u64 cpu : 6;                    /* for NR_CPUS = 64 */
>>       u64 pid : 16;                   /* 65536 processes */
>>       u64 when : 42;                  /* ~140 years */
>> +     depot_stack_handle stack : sizeof(depot_stack_handle);
>>  };
>>
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/7] mm, kasan: SLAB support
  2016-02-29 15:10   ` Andrey Ryabinin
@ 2016-02-29 18:28     ` Alexander Potapenko
  2016-02-29 18:33       ` Alexander Potapenko
  2016-03-01 14:34       ` Andrey Ryabinin
  0 siblings, 2 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-29 18:28 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrey Konovalov, Christoph Lameter, Dmitriy Vyukov,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, Linux Memory Management List

On Mon, Feb 29, 2016 at 4:10 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>
>
> On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
>> Add KASAN hooks to SLAB allocator.
>>
>> This patch is based on the "mm: kasan: unified support for SLUB and
>> SLAB allocators" patch originally prepared by Dmitry Chernenkov.
>>
>> Signed-off-by: Alexander Potapenko <glider@google.com>
>> ---
>> v3: - minor description changes
>>     - store deallocation info in kasan_slab_free()
>>
>> v4: - fix kbuild compile-time warnings in print_track()
>> ---
>
>
>> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
>> index bc0a8d8..d26ffb4 100644
>> --- a/mm/kasan/kasan.c
>> +++ b/mm/kasan/kasan.c
>> @@ -314,6 +314,59 @@ void kasan_free_pages(struct page *page, unsigned int order)
>>                               KASAN_FREE_PAGE);
>>  }
>>
>> +#ifdef CONFIG_SLAB
>> +/*
>> + * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
>> + * For larger allocations larger redzones are used.
>> + */
>> +static size_t optimal_redzone(size_t object_size)
>> +{
>> +     int rz =
>> +             object_size <= 64        - 16   ? 16 :
>> +             object_size <= 128       - 32   ? 32 :
>> +             object_size <= 512       - 64   ? 64 :
>> +             object_size <= 4096      - 128  ? 128 :
>> +             object_size <= (1 << 14) - 256  ? 256 :
>> +             object_size <= (1 << 15) - 512  ? 512 :
>> +             object_size <= (1 << 16) - 1024 ? 1024 : 2048;
>> +     return rz;
>> +}
>> +
>> +void kasan_cache_create(struct kmem_cache *cache, size_t *size,
>> +                     unsigned long *flags)
>> +{
>> +     int redzone_adjust;
>> +     /* Make sure the adjusted size is still less than
>> +      * KMALLOC_MAX_CACHE_SIZE.
>> +      * TODO: this check is only useful for SLAB, but not SLUB. We'll need
>> +      * to skip it for SLUB when it starts using kasan_cache_create().
>> +      */
>> +     if (*size > KMALLOC_MAX_CACHE_SIZE -
>> +         sizeof(struct kasan_alloc_meta) -
>> +         sizeof(struct kasan_free_meta))
>> +             return;
>> +     *flags |= SLAB_KASAN;
>> +     /* Add alloc meta. */
>> +     cache->kasan_info.alloc_meta_offset = *size;
>> +     *size += sizeof(struct kasan_alloc_meta);
>> +
>> +     /* Add free meta. */
>> +     if (cache->flags & SLAB_DESTROY_BY_RCU || cache->ctor ||
>> +         cache->object_size < sizeof(struct kasan_free_meta)) {
>> +             cache->kasan_info.free_meta_offset = *size;
>> +             *size += sizeof(struct kasan_free_meta);
>> +     }
>> +     redzone_adjust = optimal_redzone(cache->object_size) -
>> +             (*size - cache->object_size);
>> +     if (redzone_adjust > 0)
>> +             *size += redzone_adjust;
>> +     *size = min(KMALLOC_MAX_CACHE_SIZE,
>> +                 max(*size,
>> +                     cache->object_size +
>> +                     optimal_redzone(cache->object_size)));
>> +}
>> +#endif
>> +
>
>
>
>
>>  void kasan_poison_slab(struct page *page)
>>  {
>>       kasan_poison_shadow(page_address(page),
>> @@ -331,8 +384,36 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
>>       kasan_poison_shadow(object,
>>                       round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
>>                       KASAN_KMALLOC_REDZONE);
>> +#ifdef CONFIG_SLAB
>> +     if (cache->flags & SLAB_KASAN) {
>> +             struct kasan_alloc_meta *alloc_info =
>> +                     get_alloc_info(cache, object);
>> +             alloc_info->state = KASAN_STATE_INIT;
>> +     }
>> +#endif
>> +}
>> +
>> +static inline void set_track(struct kasan_track *track)
>> +{
>> +     track->cpu = raw_smp_processor_id();
>> +     track->pid = current->pid;
>> +     track->when = jiffies;
>>  }
>>
>> +#ifdef CONFIG_SLAB
>> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
>> +                                     const void *object)
>> +{
>> +     return (void *)object + cache->kasan_info.alloc_meta_offset;
>> +}
>> +
>> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
>> +                                   const void *object)
>> +{
>> +     return (void *)object + cache->kasan_info.free_meta_offset;
>> +}
>> +#endif
>> +
>>  void kasan_slab_alloc(struct kmem_cache *cache, void *object)
>>  {
>>       kasan_kmalloc(cache, object, cache->object_size);
>> @@ -347,6 +428,17 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
>>       if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
>>               return;
>>
>> +#ifdef CONFIG_SLAB
>> +     if (cache->flags & SLAB_KASAN) {
>> +             struct kasan_free_meta *free_info =
>> +                     get_free_info(cache, object);
>> +             struct kasan_alloc_meta *alloc_info =
>> +                     get_alloc_info(cache, object);
>> +             alloc_info->state = KASAN_STATE_FREE;
>> +             set_track(&free_info->track);
>> +     }
>> +#endif
>> +
>>       kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
>>  }
>>
>> @@ -366,6 +458,16 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
>>       kasan_unpoison_shadow(object, size);
>>       kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
>>               KASAN_KMALLOC_REDZONE);
>> +#ifdef CONFIG_SLAB
>> +     if (cache->flags & SLAB_KASAN) {
>> +             struct kasan_alloc_meta *alloc_info =
>> +                     get_alloc_info(cache, object);
>> +
>> +             alloc_info->state = KASAN_STATE_ALLOC;
>> +             alloc_info->alloc_size = size;
>> +             set_track(&alloc_info->track);
>> +     }
>> +#endif
>>  }
>>  EXPORT_SYMBOL(kasan_kmalloc);
>>
>> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
>> index 4f6c62e..7b9e4ab9 100644
>> --- a/mm/kasan/kasan.h
>> +++ b/mm/kasan/kasan.h
>> @@ -54,6 +54,40 @@ struct kasan_global {
>>  #endif
>>  };
>>
>> +/**
>> + * Structures to keep alloc and free tracks *
>> + */
>> +
>> +enum kasan_state {
>> +     KASAN_STATE_INIT,
>> +     KASAN_STATE_ALLOC,
>> +     KASAN_STATE_FREE
>> +};
>> +
>> +struct kasan_track {
>> +     u64 cpu : 6;                    /* for NR_CPUS = 64 */
>> +     u64 pid : 16;                   /* 65536 processes */
>> +     u64 when : 42;                  /* ~140 years */
>> +};
>> +
>> +struct kasan_alloc_meta {
>> +     u32 state : 2;  /* enum kasan_state */
>> +     u32 alloc_size : 30;
>> +     struct kasan_track track;
>> +};
>> +
>> +struct kasan_free_meta {
>> +     /* Allocator freelist pointer, unused by KASAN. */
>> +     void **freelist;
>> +     struct kasan_track track;
>> +};
>> +
>> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
>> +                                     const void *object);
>> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
>> +                                     const void *object);
>> +
>> +
>
> Basically, all this big pile of code above is implementation of yet another SLAB_STORE_USER and SLAB_RED_ZONE
> exclusively for KASAN. It would be so much better to alter existing code to satisfy all you needs.
Thanks for the suggestion, I will take a look.


>>  static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
>>  {
>>       return (void *)(((unsigned long)shadow_addr - KASAN_SHADOW_OFFSET)
>> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
>> index 12f222d..2c1407f 100644
>> --- a/mm/kasan/report.c
>> +++ b/mm/kasan/report.c
>> @@ -115,6 +115,44 @@ static inline bool init_task_stack_addr(const void *addr)
>>                       sizeof(init_thread_union.stack));
>>  }
>>
>> +#ifdef CONFIG_SLAB
>> +static void print_track(struct kasan_track *track)
>> +{
>> +     pr_err("PID = %u, CPU = %u, timestamp = %lu\n", track->pid,
>> +            track->cpu, (unsigned long)track->when);
>> +}
>> +
>> +static void print_object(struct kmem_cache *cache, void *object)
>> +{
>> +     struct kasan_alloc_meta *alloc_info = get_alloc_info(cache, object);
>> +     struct kasan_free_meta *free_info;
>> +
>> +     pr_err("Object at %p, in cache %s\n", object, cache->name);
>> +     if (!(cache->flags & SLAB_KASAN))
>> +             return;
>> +     switch (alloc_info->state) {
>
> '->state' seems useless. It's used only here, but object's state could be determined by shadow value.
>
>> +     case KASAN_STATE_INIT:
>> +             pr_err("Object not allocated yet\n");
>> +             break;
>> +     case KASAN_STATE_ALLOC:
>> +             pr_err("Object allocated with size %u bytes.\n",
>> +                    alloc_info->alloc_size);
>> +             pr_err("Allocation:\n");
>> +             print_track(&alloc_info->track);
>> +             break;
>> +     case KASAN_STATE_FREE:
>> +             pr_err("Object freed, allocated with size %u bytes\n",
>> +                    alloc_info->alloc_size);
>> +             free_info = get_free_info(cache, object);
>> +             pr_err("Allocation:\n");
>> +             print_track(&alloc_info->track);
>> +             pr_err("Deallocation:\n");
>> +             print_track(&free_info->track);
>> +             break;
>> +     }
>> +}
>> +#endif
>> +
>>  static void print_address_description(struct kasan_access_info *info)
>>  {
>>       const void *addr = info->access_addr;
>> @@ -126,17 +164,14 @@ static void print_address_description(struct kasan_access_info *info)
>>               if (PageSlab(page)) {
>>                       void *object;
>>                       struct kmem_cache *cache = page->slab_cache;
>> -                     void *last_object;
>> -
>> -                     object = virt_to_obj(cache, page_address(page), addr);
>> -                     last_object = page_address(page) +
>> -                             page->objects * cache->size;
>> -
>> -                     if (unlikely(object > last_object))
>> -                             object = last_object; /* we hit into padding */
>> -
>> +                     object = nearest_obj(cache, page,
>> +                                             (void *)info->access_addr);
>> +#ifdef CONFIG_SLAB
>> +                     print_object(cache, object);
>> +#else
>
> Instead of these ifdefs, please, make universal API for printing object's information.
My intention here was to touch the SLUB functionality as little as
possible to avoid the mess and feature regressions.
I'll be happy to refactor the code in the upcoming patches once this
one is landed.

>>                       object_err(cache, page, object,
>> -                             "kasan: bad access detected");
>> +                                     "kasan: bad access detected");
>> +#endif
>>                       return;
>>               }
>>               dump_page(page, "kasan: bad access detected");
>> @@ -146,8 +181,9 @@ static void print_address_description(struct kasan_access_info *info)
>>               if (!init_task_stack_addr(addr))
>>                       pr_err("Address belongs to variable %pS\n", addr);
>>       }
>> -
>> +#ifdef CONFIG_SLUB
>
> ???
Not sure what you meant here; I assume this comment is related to
the next one.
>
>>       dump_stack();
>> +#endif
>>  }
>>
>>  static bool row_is_guilty(const void *row, const void *guilty)
>> @@ -233,6 +269,9 @@ static void kasan_report_error(struct kasan_access_info *info)
>>               dump_stack();
>>       } else {
>>               print_error_description(info);
>> +#ifdef CONFIG_SLAB
>
> I'm lost here. What's the point of reordering dump_stack() for CONFIG_SLAB=y?
I should have documented this in the patch description properly.
My intention is to make the KASAN reports look more like those in the
userspace AddressSanitizer, so I'm moving the memory access stack to
the top of the report.
Having seen hundreds and hundreds of ASan reports, we believe that
important information must go at the beginning of the error report.
First, people usually do not need to read further once they see the
access stack.
Second, the whole report may simply not make it to the log (e.g. in
the case of a premature shutdown or remote log collection).
As said before, I wasn't going to touch the SLUB output format in this
patch set, but that also needs to be fixed (I'd also remove some
unnecessary info, e.g. the memory dump).

>> +             dump_stack();
>> +#endif
>>               print_address_description(info);
>>               print_shadow_for_address(info->first_bad_addr);
>>       }
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 621fbcb..805b39b 100644
>
>
>
>>
>>       if (gfpflags_allow_blocking(local_flags))
>> @@ -3364,7 +3374,10 @@ free_done:
>>  static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>                               unsigned long caller)
>>  {
>> -     struct array_cache *ac = cpu_cache_get(cachep);
>> +     struct array_cache *ac;
>> +
>> +     kasan_slab_free(cachep, objp);
>> +     ac = cpu_cache_get(cachep);
>
> Why cpu_cache_get() was moved? Looks like unnecessary change.

Agreed.

>>
>>       check_irq_off();
>>       kmemleak_free_recursive(objp, cachep->flags);
>> @@ -3403,6 +3416,8 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>  void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
>>  {
>>       void *ret = slab_alloc(cachep, flags, _RET_IP_);
>> +     if (ret)
>
> kasan_slab_alloc() should deal fine with ret == NULL.
And it actually does. I'll remove this code in the updated patch set.
>
>> +             kasan_slab_alloc(cachep, ret);
>>
>>       trace_kmem_cache_alloc(_RET_IP_, ret,
>>                              cachep->object_size, cachep->size, flags);



-- 
Alexander Potapenko
Software Engineer


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/7] mm, kasan: SLAB support
  2016-02-29 18:28     ` Alexander Potapenko
@ 2016-02-29 18:33       ` Alexander Potapenko
  2016-03-01 14:34       ` Andrey Ryabinin
  1 sibling, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-02-29 18:33 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrey Konovalov, Christoph Lameter, Dmitriy Vyukov,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, Linux Memory Management List

On Mon, Feb 29, 2016 at 7:28 PM, Alexander Potapenko <glider@google.com> wrote:
> On Mon, Feb 29, 2016 at 4:10 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>
>>
>> On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
>>> Add KASAN hooks to SLAB allocator.
>>>
>>> This patch is based on the "mm: kasan: unified support for SLUB and
>>> SLAB allocators" patch originally prepared by Dmitry Chernenkov.
>>>
>>> Signed-off-by: Alexander Potapenko <glider@google.com>
>>> ---
>>> v3: - minor description changes
>>>     - store deallocation info in kasan_slab_free()
>>>
>>> v4: - fix kbuild compile-time warnings in print_track()
>>> ---
>>
>>
>>> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
>>> index bc0a8d8..d26ffb4 100644
>>> --- a/mm/kasan/kasan.c
>>> +++ b/mm/kasan/kasan.c
>>> @@ -314,6 +314,59 @@ void kasan_free_pages(struct page *page, unsigned int order)
>>>                               KASAN_FREE_PAGE);
>>>  }
>>>
>>> +#ifdef CONFIG_SLAB
>>> +/*
>>> + * Adaptive redzone policy taken from the userspace AddressSanitizer runtime.
>>> + * For larger allocations larger redzones are used.
>>> + */
>>> +static size_t optimal_redzone(size_t object_size)
>>> +{
>>> +     int rz =
>>> +             object_size <= 64        - 16   ? 16 :
>>> +             object_size <= 128       - 32   ? 32 :
>>> +             object_size <= 512       - 64   ? 64 :
>>> +             object_size <= 4096      - 128  ? 128 :
>>> +             object_size <= (1 << 14) - 256  ? 256 :
>>> +             object_size <= (1 << 15) - 512  ? 512 :
>>> +             object_size <= (1 << 16) - 1024 ? 1024 : 2048;
>>> +     return rz;
>>> +}
>>> +
>>> +void kasan_cache_create(struct kmem_cache *cache, size_t *size,
>>> +                     unsigned long *flags)
>>> +{
>>> +     int redzone_adjust;
>>> +     /* Make sure the adjusted size is still less than
>>> +      * KMALLOC_MAX_CACHE_SIZE.
>>> +      * TODO: this check is only useful for SLAB, but not SLUB. We'll need
>>> +      * to skip it for SLUB when it starts using kasan_cache_create().
>>> +      */
>>> +     if (*size > KMALLOC_MAX_CACHE_SIZE -
>>> +         sizeof(struct kasan_alloc_meta) -
>>> +         sizeof(struct kasan_free_meta))
>>> +             return;
>>> +     *flags |= SLAB_KASAN;
>>> +     /* Add alloc meta. */
>>> +     cache->kasan_info.alloc_meta_offset = *size;
>>> +     *size += sizeof(struct kasan_alloc_meta);
>>> +
>>> +     /* Add free meta. */
>>> +     if (cache->flags & SLAB_DESTROY_BY_RCU || cache->ctor ||
>>> +         cache->object_size < sizeof(struct kasan_free_meta)) {
>>> +             cache->kasan_info.free_meta_offset = *size;
>>> +             *size += sizeof(struct kasan_free_meta);
>>> +     }
>>> +     redzone_adjust = optimal_redzone(cache->object_size) -
>>> +             (*size - cache->object_size);
>>> +     if (redzone_adjust > 0)
>>> +             *size += redzone_adjust;
>>> +     *size = min(KMALLOC_MAX_CACHE_SIZE,
>>> +                 max(*size,
>>> +                     cache->object_size +
>>> +                     optimal_redzone(cache->object_size)));
>>> +}
>>> +#endif
>>> +
>>
>>
>>
>>
>>>  void kasan_poison_slab(struct page *page)
>>>  {
>>>       kasan_poison_shadow(page_address(page),
>>> @@ -331,8 +384,36 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object)
>>>       kasan_poison_shadow(object,
>>>                       round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE),
>>>                       KASAN_KMALLOC_REDZONE);
>>> +#ifdef CONFIG_SLAB
>>> +     if (cache->flags & SLAB_KASAN) {
>>> +             struct kasan_alloc_meta *alloc_info =
>>> +                     get_alloc_info(cache, object);
>>> +             alloc_info->state = KASAN_STATE_INIT;
>>> +     }
>>> +#endif
>>> +}
>>> +
>>> +static inline void set_track(struct kasan_track *track)
>>> +{
>>> +     track->cpu = raw_smp_processor_id();
>>> +     track->pid = current->pid;
>>> +     track->when = jiffies;
>>>  }
>>>
>>> +#ifdef CONFIG_SLAB
>>> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
>>> +                                     const void *object)
>>> +{
>>> +     return (void *)object + cache->kasan_info.alloc_meta_offset;
>>> +}
>>> +
>>> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
>>> +                                   const void *object)
>>> +{
>>> +     return (void *)object + cache->kasan_info.free_meta_offset;
>>> +}
>>> +#endif
>>> +
>>>  void kasan_slab_alloc(struct kmem_cache *cache, void *object)
>>>  {
>>>       kasan_kmalloc(cache, object, cache->object_size);
>>> @@ -347,6 +428,17 @@ void kasan_slab_free(struct kmem_cache *cache, void *object)
>>>       if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
>>>               return;
>>>
>>> +#ifdef CONFIG_SLAB
>>> +     if (cache->flags & SLAB_KASAN) {
>>> +             struct kasan_free_meta *free_info =
>>> +                     get_free_info(cache, object);
>>> +             struct kasan_alloc_meta *alloc_info =
>>> +                     get_alloc_info(cache, object);
>>> +             alloc_info->state = KASAN_STATE_FREE;
>>> +             set_track(&free_info->track);
>>> +     }
>>> +#endif
>>> +
>>>       kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE);
>>>  }
>>>
>>> @@ -366,6 +458,16 @@ void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size)
>>>       kasan_unpoison_shadow(object, size);
>>>       kasan_poison_shadow((void *)redzone_start, redzone_end - redzone_start,
>>>               KASAN_KMALLOC_REDZONE);
>>> +#ifdef CONFIG_SLAB
>>> +     if (cache->flags & SLAB_KASAN) {
>>> +             struct kasan_alloc_meta *alloc_info =
>>> +                     get_alloc_info(cache, object);
>>> +
>>> +             alloc_info->state = KASAN_STATE_ALLOC;
>>> +             alloc_info->alloc_size = size;
>>> +             set_track(&alloc_info->track);
>>> +     }
>>> +#endif
>>>  }
>>>  EXPORT_SYMBOL(kasan_kmalloc);
>>>
>>> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
>>> index 4f6c62e..7b9e4ab9 100644
>>> --- a/mm/kasan/kasan.h
>>> +++ b/mm/kasan/kasan.h
>>> @@ -54,6 +54,40 @@ struct kasan_global {
>>>  #endif
>>>  };
>>>
>>> +/**
>>> + * Structures to keep alloc and free tracks *
>>> + */
>>> +
>>> +enum kasan_state {
>>> +     KASAN_STATE_INIT,
>>> +     KASAN_STATE_ALLOC,
>>> +     KASAN_STATE_FREE
>>> +};
>>> +
>>> +struct kasan_track {
>>> +     u64 cpu : 6;                    /* for NR_CPUS = 64 */
>>> +     u64 pid : 16;                   /* 65536 processes */
>>> +     u64 when : 42;                  /* ~140 years */
>>> +};
>>> +
>>> +struct kasan_alloc_meta {
>>> +     u32 state : 2;  /* enum kasan_state */
>>> +     u32 alloc_size : 30;
>>> +     struct kasan_track track;
>>> +};
>>> +
>>> +struct kasan_free_meta {
>>> +     /* Allocator freelist pointer, unused by KASAN. */
>>> +     void **freelist;
>>> +     struct kasan_track track;
>>> +};
>>> +
>>> +struct kasan_alloc_meta *get_alloc_info(struct kmem_cache *cache,
>>> +                                     const void *object);
>>> +struct kasan_free_meta *get_free_info(struct kmem_cache *cache,
>>> +                                     const void *object);
>>> +
>>> +
>>
>> Basically, all this big pile of code above is implementation of yet another SLAB_STORE_USER and SLAB_RED_ZONE
>> exclusively for KASAN. It would be so much better to alter existing code to satisfy all you needs.
> Thanks for the suggestion, I will take a look.
>
>
>>>  static inline const void *kasan_shadow_to_mem(const void *shadow_addr)
>>>  {
>>>       return (void *)(((unsigned long)shadow_addr - KASAN_SHADOW_OFFSET)
>>> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
>>> index 12f222d..2c1407f 100644
>>> --- a/mm/kasan/report.c
>>> +++ b/mm/kasan/report.c
>>> @@ -115,6 +115,44 @@ static inline bool init_task_stack_addr(const void *addr)
>>>                       sizeof(init_thread_union.stack));
>>>  }
>>>
>>> +#ifdef CONFIG_SLAB
>>> +static void print_track(struct kasan_track *track)
>>> +{
>>> +     pr_err("PID = %u, CPU = %u, timestamp = %lu\n", track->pid,
>>> +            track->cpu, (unsigned long)track->when);
>>> +}
>>> +
>>> +static void print_object(struct kmem_cache *cache, void *object)
>>> +{
>>> +     struct kasan_alloc_meta *alloc_info = get_alloc_info(cache, object);
>>> +     struct kasan_free_meta *free_info;
>>> +
>>> +     pr_err("Object at %p, in cache %s\n", object, cache->name);
>>> +     if (!(cache->flags & SLAB_KASAN))
>>> +             return;
>>> +     switch (alloc_info->state) {
>>
>> '->state' seems useless. It's used only here, but object's state could be determined by shadow value.
I don't think it's a good idea to rely on the shadow values here; the
state is a different concept.
For example, shadow values don't distinguish between freed memory
and memory sitting in the quarantine.
In the userspace tool it's also possible for the user to manually
poison and unpoison parts of the memory (and I can imagine having the
same mechanism in KASAN), but that does not necessarily change the
object's state.
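For example, once the quarantine patch from this series lands, the state machine gets a value that the shadow simply cannot distinguish from KASAN_STATE_FREE (a sketch; the exact name is illustrative):

	enum kasan_state {
		KASAN_STATE_INIT,
		KASAN_STATE_ALLOC,
		KASAN_STATE_QUARANTINE,	/* freed, but still held in the quarantine */
		KASAN_STATE_FREE	/* actually returned to the allocator */
	};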
>>> +     case KASAN_STATE_INIT:
>>> +             pr_err("Object not allocated yet\n");
>>> +             break;
>>> +     case KASAN_STATE_ALLOC:
>>> +             pr_err("Object allocated with size %u bytes.\n",
>>> +                    alloc_info->alloc_size);
>>> +             pr_err("Allocation:\n");
>>> +             print_track(&alloc_info->track);
>>> +             break;
>>> +     case KASAN_STATE_FREE:
>>> +             pr_err("Object freed, allocated with size %u bytes\n",
>>> +                    alloc_info->alloc_size);
>>> +             free_info = get_free_info(cache, object);
>>> +             pr_err("Allocation:\n");
>>> +             print_track(&alloc_info->track);
>>> +             pr_err("Deallocation:\n");
>>> +             print_track(&free_info->track);
>>> +             break;
>>> +     }
>>> +}
>>> +#endif
>>> +
>>>  static void print_address_description(struct kasan_access_info *info)
>>>  {
>>>       const void *addr = info->access_addr;
>>> @@ -126,17 +164,14 @@ static void print_address_description(struct kasan_access_info *info)
>>>               if (PageSlab(page)) {
>>>                       void *object;
>>>                       struct kmem_cache *cache = page->slab_cache;
>>> -                     void *last_object;
>>> -
>>> -                     object = virt_to_obj(cache, page_address(page), addr);
>>> -                     last_object = page_address(page) +
>>> -                             page->objects * cache->size;
>>> -
>>> -                     if (unlikely(object > last_object))
>>> -                             object = last_object; /* we hit into padding */
>>> -
>>> +                     object = nearest_obj(cache, page,
>>> +                                             (void *)info->access_addr);
>>> +#ifdef CONFIG_SLAB
>>> +                     print_object(cache, object);
>>> +#else
>>
>> Instead of these ifdefs, please, make universal API for printing object's information.
> My intention here was to touch the SLUB functionality as little as
> possible to avoid the mess and feature regressions.
> I'll be happy to refactor the code in the upcoming patches once this
> one is landed.
>
>>>                       object_err(cache, page, object,
>>> -                             "kasan: bad access detected");
>>> +                                     "kasan: bad access detected");
>>> +#endif
>>>                       return;
>>>               }
>>>               dump_page(page, "kasan: bad access detected");
>>> @@ -146,8 +181,9 @@ static void print_address_description(struct kasan_access_info *info)
>>>               if (!init_task_stack_addr(addr))
>>>                       pr_err("Address belongs to variable %pS\n", addr);
>>>       }
>>> -
>>> +#ifdef CONFIG_SLUB
>>
>> ???
> Not sure what did you mean here, assuming this comment is related to
> the next one.
>>
>>>       dump_stack();
>>> +#endif
>>>  }
>>>
>>>  static bool row_is_guilty(const void *row, const void *guilty)
>>> @@ -233,6 +269,9 @@ static void kasan_report_error(struct kasan_access_info *info)
>>>               dump_stack();
>>>       } else {
>>>               print_error_description(info);
>>> +#ifdef CONFIG_SLAB
>>
>> I'm lost here. What's the point of reordering dump_stack() for CONFIG_SLAB=y?
> I should have documented this in the patch description properly.
> My intention is to make the KASAN reports look more like those in the
> userspace AddressSanitizer, so I'm moving the memory access stack to
> the top of the report.
> Having seen hundreds and hundreds of ASan reports, we believe that
> important information must go at the beginning of the error report.
> First, people usually do not need to read further once they see the
> access stack.
> Second, the whole report may simply not make it to the log (e.g. in
> the case of a premature shutdown or remote log collection).
> As said before, I wasn't going to touch the SLUB output format in this
> patch set, but that also needs to be fixed (I'd also remove some
> unnecessary info, e.g. the memory dump).
>
>>> +             dump_stack();
>>> +#endif
>>>               print_address_description(info);
>>>               print_shadow_for_address(info->first_bad_addr);
>>>       }
>>> diff --git a/mm/slab.c b/mm/slab.c
>>> index 621fbcb..805b39b 100644
>>
>>
>>
>>>
>>>       if (gfpflags_allow_blocking(local_flags))
>>> @@ -3364,7 +3374,10 @@ free_done:
>>>  static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>>                               unsigned long caller)
>>>  {
>>> -     struct array_cache *ac = cpu_cache_get(cachep);
>>> +     struct array_cache *ac;
>>> +
>>> +     kasan_slab_free(cachep, objp);
>>> +     ac = cpu_cache_get(cachep);
>>
>> Why cpu_cache_get() was moved? Looks like unnecessary change.
>
> Agreed.
>
>>>
>>>       check_irq_off();
>>>       kmemleak_free_recursive(objp, cachep->flags);
>>> @@ -3403,6 +3416,8 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>>  void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
>>>  {
>>>       void *ret = slab_alloc(cachep, flags, _RET_IP_);
>>> +     if (ret)
>>
>> kasan_slab_alloc() should deal fine with ret == NULL.
> And it actually does. I'll remove this code in the updated patch set.
>>
>>> +             kasan_slab_alloc(cachep, ret);
>>>
>>>       trace_kmem_cache_alloc(_RET_IP_, ret,
>>>                              cachep->object_size, cachep->size, flags);
>
>
>
> --
> Alexander Potapenko
> Software Engineer
>



-- 
Alexander Potapenko
Software Engineer


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-02-29 17:12     ` Dmitry Vyukov
@ 2016-03-01 11:57       ` Andrey Ryabinin
  2016-03-04 14:52         ` Alexander Potapenko
  2016-03-08 11:42         ` Alexander Potapenko
  0 siblings, 2 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-01 11:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexander Potapenko, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm



On 02/29/2016 08:12 PM, Dmitry Vyukov wrote:

>>> diff --git a/lib/Makefile b/lib/Makefile
>>> index a7c26a4..10a4ae3 100644
>>> --- a/lib/Makefile
>>> +++ b/lib/Makefile
>>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>>
>>> +ifeq ($(CONFIG_KASAN),y)
>>> +ifeq ($(CONFIG_SLAB),y)
>>
>> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>>
>> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
>> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>>
>>> +     obj-y   += stackdepot.o
>>> +     KASAN_SANITIZE_slub.o := n
	                ^ should be KASAN_SANITIZE_stackdepot.o := n (it's stackdepot.o that must not be instrumented)


>>
>>> +
>>> +     stack->hash = hash;
>>> +     stack->size = size;
>>> +     stack->handle.slabindex = depot_index;
>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>
>> s/__memcpy/memcpy/
> 
> memcpy should be instrumented by asan/tsan, and we would like to avoid
> that instrumentation here.

KASAN_SANITIZE_* := n already takes care of this.
__memcpy() is a special thing solely for kasan internals and some assembly code,
and it's not generally available.


>>> +     if (unlikely(!smp_load_acquire(&next_slab_inited))) {
>>> +             if (!preempt_count() && !in_irq()) {
>>
>> If you trying to detect atomic context here, than this doesn't work. E.g. you can't know
>> about held spinlocks in non-preemptible kernel.
>> And I'm not sure why need this. You know gfp flags here, so allocation in atomic context shouldn't be problem.
> 
> 
> We don't have gfp flags for kfree.
> I wonder how CONFIG_DEBUG_ATOMIC_SLEEP handles this. Maybe it has the answer.

It hasn't. It doesn't guarantee that atomic context will always be detected.

> Alternatively, we can always assume that we are in atomic context in kfree.
> 

Or do this allocation in a separate context, e.g. put it on a workqueue.
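Something along these lines (a rough sketch; all names here are made up):

	static void *depot_spare_page;	/* consumed by the depot under its lock */

	static void depot_refill_workfn(struct work_struct *work)
	{
		void *page = (void *)__get_free_pages(GFP_KERNEL, STACK_ALLOC_ORDER);

		/* keep at most one spare; drop the page if someone beat us to it */
		if (page && cmpxchg(&depot_spare_page, NULL, page))
			free_pages((unsigned long)page, STACK_ALLOC_ORDER);
	}
	static DECLARE_WORK(depot_refill_work, depot_refill_workfn);

	/* from atomic context, just kick the refill: */
	schedule_work(&depot_refill_work);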

> 
> 
>>> +                     alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
>>> +                             __GFP_NOWARN | __GFP_NORETRY |
>>> +                             __GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
>>
>> I think blacklist approach would be better here.
>>
>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>
>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
> 
> Part of the issue the atomic context above. When we can't allocate
> memory we still want to save the stack trace. When we have less than
> STACK_ALLOC_ORDER memory, we try to preallocate another
> STACK_ALLOC_ORDER in advance. So in the worst case, we have
> STACK_ALLOC_ORDER memory and that should be enough to handle all
> kmalloc/kfree in the atomic context. 1 page does not look enough. I
> think Alex did some measuring of the failure race (when we are out of
> memory and can't allocate more).
> 

A lot of order-4 pages will lead to high fragmentation. You don't need physically contiguous memory here,
so try to use vmalloc(). It is slower, but fragmentation won't be a problem.

And one more thing: take a look at mempool, because it's generally used to solve exactly the problem you have here
(guaranteed allocation in atomic context).
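E.g. (a sketch; the pool size is illustrative):

	static mempool_t *depot_pool;

	/* init time, process context */
	depot_pool = mempool_create_page_pool(4, STACK_ALLOC_ORDER);

	/* later, even from atomic context */
	struct page *page = mempool_alloc(depot_pool, GFP_ATOMIC);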

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 2/7] mm, kasan: SLAB support
  2016-02-29 18:28     ` Alexander Potapenko
  2016-02-29 18:33       ` Alexander Potapenko
@ 2016-03-01 14:34       ` Andrey Ryabinin
  1 sibling, 0 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-01 14:34 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Andrey Konovalov, Christoph Lameter, Dmitriy Vyukov,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, Linux Memory Management List



On 02/29/2016 09:28 PM, Alexander Potapenko wrote:

>>>  static void print_address_description(struct kasan_access_info *info)
>>>  {
>>>       const void *addr = info->access_addr;
>>> @@ -126,17 +164,14 @@ static void print_address_description(struct kasan_access_info *info)
>>>               if (PageSlab(page)) {
>>>                       void *object;
>>>                       struct kmem_cache *cache = page->slab_cache;
>>> -                     void *last_object;
>>> -
>>> -                     object = virt_to_obj(cache, page_address(page), addr);
>>> -                     last_object = page_address(page) +
>>> -                             page->objects * cache->size;
>>> -
>>> -                     if (unlikely(object > last_object))
>>> -                             object = last_object; /* we hit into padding */
>>> -
>>> +                     object = nearest_obj(cache, page,
>>> +                                             (void *)info->access_addr);
>>> +#ifdef CONFIG_SLAB
>>> +                     print_object(cache, object);
>>> +#else
>>
>> Instead of these ifdefs, please, make universal API for printing object's information.
> My intention here was to touch the SLUB functionality as little as
> possible to avoid the mess and feature regressions.
> I'll be happy to refactor the code in the upcoming patches once this
> one is landed.
> 

Avoid mess? You create one.
Although I don't understand this "don't touch SLUB" argument, you can just
have an object_err(cache, page, str) for SLAB without touching SLUB.
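Something like this (only a sketch; where it lives and the exact signature are up to you):

	#ifdef CONFIG_SLAB
	/* sketch: could sit next to print_object() in mm/kasan/report.c */
	void object_err(struct kmem_cache *cache, struct page *page,
			void *object, char *reason)
	{
		pr_err("%s: object %p in cache %s\n", reason, object, cache->name);
		print_object(cache, object);
	}
	#endif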

>>>                       object_err(cache, page, object,
>>> -                             "kasan: bad access detected");
>>> +                                     "kasan: bad access detected");
>>> +#endif
>>>                       return;
>>>               }
>>>               dump_page(page, "kasan: bad access detected");
>>> @@ -146,8 +181,9 @@ static void print_address_description(struct kasan_access_info *info)
>>>               if (!init_task_stack_addr(addr))
>>>                       pr_err("Address belongs to variable %pS\n", addr);
>>>       }
>>> -
>>> +#ifdef CONFIG_SLUB
>>
>> ???
> Not sure what did you mean here, assuming this comment is related to
> the next one.
>>
>>>       dump_stack();
>>> +#endif
>>>  }
>>>
>>>  static bool row_is_guilty(const void *row, const void *guilty)
>>> @@ -233,6 +269,9 @@ static void kasan_report_error(struct kasan_access_info *info)
>>>               dump_stack();
>>>       } else {
>>>               print_error_description(info);
>>> +#ifdef CONFIG_SLAB
>>
>> I'm lost here. What's the point of reordering dump_stack() for CONFIG_SLAB=y?
> I should have documented this in the patch description properly.
> My intention is to make the KASAN reports look more like those in the
> userspace AddressSanitizer, so I'm moving the memory access stack to
> the top of the report.
> Having seen hundreds and hundreds of ASan reports, we believe that
> important information must go at the beginning of the error report.
> First, people usually do not need to read further once they see the
> access stack.
> Second, the whole report may simply not make it to the log (e.g. in
> the case of a premature shutdown or remote log collection).
>
> As said before, I wasn't going to touch the SLUB output format in this
> patch set, but that also needs to be fixed (I'd also remove some
> unnecessary info, e.g. the memory dump).
>

That all sounds fine, but this doesn't explain:
a) how this change is related to this patch (the answer is: it isn't);
b) why the output of non-sl[a,u]b bugs should depend on CONFIG_SL[A,U]B.

So, in SLAB's print_object() you can print the stacks in whatever order you like.
That's it; don't change anything else here.

If you are not satisfied with the current output format, change it, but in separate patch[es],
with the reasoning described in the changelog and without weird config dependencies.




>>> +             dump_stack();
>>> +#endif
>>>               print_address_description(info);
>>>               print_shadow_for_address(info->first_bad_addr);
>>>       }
>>> diff --git a/mm/slab.c b/mm/slab.c
>>> index 621fbcb..805b39b 100644
>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections
  2016-02-26 16:48 ` [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections Alexander Potapenko
  2016-02-27  1:44   ` kbuild test robot
@ 2016-03-02 17:41   ` Steven Rostedt
  1 sibling, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2016-03-02 17:41 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: adech.fo, cl, dvyukov, akpm, ryabinin.a.a, iamjoonsoo.kim,
	js1304, kcc, kasan-dev, linux-kernel, linux-mm

On Fri, 26 Feb 2016 17:48:44 +0100
Alexander Potapenko <glider@google.com> wrote:

> KASAN needs to know whether the allocation happens in an IRQ handler.
> This lets us strip everything below the IRQ entry point to reduce the
> number of unique stack traces needed to be stored.
> 
> Move the definition of __irq_entry to <linux/interrupt.h> so that the
> users don't need to pull in <linux/ftrace.h>. Also introduce the
> __softirq_entry macro which is similar to __irq_entry, but puts the
> corresponding functions to the .softirqentry.text section.
> 
> Signed-off-by: Alexander Potapenko <glider@google.com>

Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> ---
> v2: - per request from Steven Rostedt, moved the declarations of __softirq_entry
> and __irq_entry to <linux/interrupt.h>
> 
> v3: - minor description changes
> ---
>  arch/arm/kernel/vmlinux.lds.S        |  1 +
>  arch/arm64/kernel/vmlinux.lds.S      |  1 +
>  arch/blackfin/kernel/vmlinux.lds.S   |  1 +
>  arch/c6x/kernel/vmlinux.lds.S        |  1 +
>  arch/metag/kernel/vmlinux.lds.S      |  1 +
>  arch/microblaze/kernel/vmlinux.lds.S |  1 +
>  arch/mips/kernel/vmlinux.lds.S       |  1 +
>  arch/nios2/kernel/vmlinux.lds.S      |  1 +
>  arch/openrisc/kernel/vmlinux.lds.S   |  1 +
>  arch/parisc/kernel/vmlinux.lds.S     |  1 +
>  arch/powerpc/kernel/vmlinux.lds.S    |  1 +
>  arch/s390/kernel/vmlinux.lds.S       |  1 +
>  arch/sh/kernel/vmlinux.lds.S         |  1 +
>  arch/sparc/kernel/vmlinux.lds.S      |  1 +
>  arch/tile/kernel/vmlinux.lds.S       |  1 +
>  arch/x86/kernel/vmlinux.lds.S        |  1 +
>  include/asm-generic/vmlinux.lds.h    | 12 +++++++++++-
>  include/linux/ftrace.h               | 11 -----------
>  include/linux/interrupt.h            | 20 ++++++++++++++++++++
>  kernel/softirq.c                     |  2 +-
>  kernel/trace/trace_functions_graph.c |  1 +
>  21 files changed, 49 insertions(+), 13 deletions(-)
> 
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-01 11:57       ` Andrey Ryabinin
@ 2016-03-04 14:52         ` Alexander Potapenko
  2016-03-04 15:01           ` Andrey Ryabinin
  2016-03-08 11:42         ` Alexander Potapenko
  1 sibling, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-04 14:52 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>
>
> On 02/29/2016 08:12 PM, Dmitry Vyukov wrote:
>
>>>> diff --git a/lib/Makefile b/lib/Makefile
>>>> index a7c26a4..10a4ae3 100644
>>>> --- a/lib/Makefile
>>>> +++ b/lib/Makefile
>>>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>>>
>>>> +ifeq ($(CONFIG_KASAN),y)
>>>> +ifeq ($(CONFIG_SLAB),y)
>>>
>>> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>>>
>>> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
>>> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>>>
>>>> +     obj-y   += stackdepot.o
>>>> +     KASAN_SANITIZE_slub.o := n
>                         _stackdepot.o
>
>
>>>
>>>> +
>>>> +     stack->hash = hash;
>>>> +     stack->size = size;
>>>> +     stack->handle.slabindex = depot_index;
>>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>>
>>> s/__memcpy/memcpy/
>>
>> memcpy should be instrumented by asan/tsan, and we would like to avoid
>> that instrumentation here.
>
> KASAN_SANITIZE_* := n already takes care about this.
> __memcpy() is a special thing solely for kasan internals and some assembly code.
> And it's not available generally.
As far as I can see, KASAN_SANITIZE_*:=n does not guarantee it.
It just removes the KASAN flags from the GCC command line; it does not
necessarily replace memcpy() calls with a non-instrumented memcpy().

We see two possible ways to deal with this problem:
1. Define "memcpy" to "__memcpy" in lib/stackdepot.c under CONFIG_KASAN
(a rough sketch of this is below).
2. Create a mm/kasan/kasan_stackdepot.c stub that includes
lib/stackdepot.c and defines "memcpy" to "__memcpy" in that file.
This way we'll be able to instrument the original stackdepot.c and
won't miss reports from it if someone starts using it somewhere else.
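
For option 1, a minimal sketch would look something like the following
(note that it relies on __memcpy(), which as you point out is an
x86/KASAN-internal thing, so it wouldn't be portable as-is):

/* lib/stackdepot.c, near the top, before any memcpy() user: */
#ifdef CONFIG_KASAN
/*
 * stackdepot is called from the KASAN runtime itself, so avoid the
 * instrumented memcpy() and fall back to the uninstrumented __memcpy().
 */
#undef memcpy
#define memcpy(dst, src, len) __memcpy(dst, src, len)
#endif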

>>>> +     if (unlikely(!smp_load_acquire(&next_slab_inited))) {
>>>> +             if (!preempt_count() && !in_irq()) {
>>>
>>> If you trying to detect atomic context here, than this doesn't work. E.g. you can't know
>>> about held spinlocks in non-preemptible kernel.
>>> And I'm not sure why need this. You know gfp flags here, so allocation in atomic context shouldn't be problem.
>>
>>
>> We don't have gfp flags for kfree.
>> I wonder how CONFIG_DEBUG_ATOMIC_SLEEP handles this. Maybe it has the answer.
>
> It hasn't. It doesn't guarantee that atomic context always will be detected.
>
>> Alternatively, we can always assume that we are in atomic context in kfree.
>>
>
> Or do this allocation in separate context, put in work queue.
>
>>
>>
>>>> +                     alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
>>>> +                             __GFP_NOWARN | __GFP_NORETRY |
>>>> +                             __GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
>>>
>>> I think blacklist approach would be better here.
>>>
>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>
>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>
>> Part of the issue the atomic context above. When we can't allocate
>> memory we still want to save the stack trace. When we have less than
>> STACK_ALLOC_ORDER memory, we try to preallocate another
>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>> think Alex did some measuring of the failure race (when we are out of
>> memory and can't allocate more).
>>
>
> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
>
> And one more thing. Take a look at mempool, because it's generally used to solve the problem you have here
> (guaranteed allocation in atomic context).
>
>



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-04 14:52         ` Alexander Potapenko
@ 2016-03-04 15:01           ` Andrey Ryabinin
  2016-03-04 15:06             ` Alexander Potapenko
  0 siblings, 1 reply; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-04 15:01 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

2016-03-04 17:52 GMT+03:00 Alexander Potapenko <glider@google.com>:
> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>
>>
>> On 02/29/2016 08:12 PM, Dmitry Vyukov wrote:
>>
>>>>> diff --git a/lib/Makefile b/lib/Makefile
>>>>> index a7c26a4..10a4ae3 100644
>>>>> --- a/lib/Makefile
>>>>> +++ b/lib/Makefile
>>>>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>>>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>>>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>>>>
>>>>> +ifeq ($(CONFIG_KASAN),y)
>>>>> +ifeq ($(CONFIG_SLAB),y)
>>>>
>>>> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>>>>
>>>> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
>>>> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>>>>
>>>>> +     obj-y   += stackdepot.o
>>>>> +     KASAN_SANITIZE_slub.o := n
>>                         _stackdepot.o
>>
>>
>>>>
>>>>> +
>>>>> +     stack->hash = hash;
>>>>> +     stack->size = size;
>>>>> +     stack->handle.slabindex = depot_index;
>>>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>>>
>>>> s/__memcpy/memcpy/
>>>
>>> memcpy should be instrumented by asan/tsan, and we would like to avoid
>>> that instrumentation here.
>>
>> KASAN_SANITIZE_* := n already takes care about this.
>> __memcpy() is a special thing solely for kasan internals and some assembly code.
>> And it's not available generally.
> As far as I can see, KASAN_SANITIZE_*:=n does not guarantee it.
> It just removes KASAN flags from GCC command line, it does not
> necessarily replace memcpy() calls with some kind of a
> non-instrumented memcpy().
>

With the kasan cflags removed, '__SANITIZE_ADDRESS__' is not defined,
which enables the following defines from arch/x86/include/asm/string_64.h:

#if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)

/*
 * For files that not instrumented (e.g. mm/slub.c) we
 * should use not instrumented version of mem* functions.
 */

#undef memcpy
#define memcpy(dst, src, len) __memcpy(dst, src, len)
#define memmove(dst, src, len) __memmove(dst, src, len)
#define memset(s, c, n) __memset(s, c, n)
#endif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-04 15:01           ` Andrey Ryabinin
@ 2016-03-04 15:06             ` Alexander Potapenko
  2016-03-04 16:30               ` Andrey Ryabinin
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-04 15:06 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Fri, Mar 4, 2016 at 4:01 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
> 2016-03-04 17:52 GMT+03:00 Alexander Potapenko <glider@google.com>:
>> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>
>>>
>>> On 02/29/2016 08:12 PM, Dmitry Vyukov wrote:
>>>
>>>>>> diff --git a/lib/Makefile b/lib/Makefile
>>>>>> index a7c26a4..10a4ae3 100644
>>>>>> --- a/lib/Makefile
>>>>>> +++ b/lib/Makefile
>>>>>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>>>>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>>>>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>>>>>
>>>>>> +ifeq ($(CONFIG_KASAN),y)
>>>>>> +ifeq ($(CONFIG_SLAB),y)
>>>>>
>>>>> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>>>>>
>>>>> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
>>>>> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>>>>>
>>>>>> +     obj-y   += stackdepot.o
>>>>>> +     KASAN_SANITIZE_slub.o := n
>>>                         _stackdepot.o
>>>
>>>
>>>>>
>>>>>> +
>>>>>> +     stack->hash = hash;
>>>>>> +     stack->size = size;
>>>>>> +     stack->handle.slabindex = depot_index;
>>>>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>>>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>>>>
>>>>> s/__memcpy/memcpy/
>>>>
>>>> memcpy should be instrumented by asan/tsan, and we would like to avoid
>>>> that instrumentation here.
>>>
>>> KASAN_SANITIZE_* := n already takes care about this.
>>> __memcpy() is a special thing solely for kasan internals and some assembly code.
>>> And it's not available generally.
>> As far as I can see, KASAN_SANITIZE_*:=n does not guarantee it.
>> It just removes KASAN flags from GCC command line, it does not
>> necessarily replace memcpy() calls with some kind of a
>> non-instrumented memcpy().
>>
>
> With removed kasan cflags '__SANITIZE_ADDRESS__' is not defined,
> hence enable the following defines from arch/x86/include/asm/string_64.h:
>
> #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>
> /*
>  * For files that not instrumented (e.g. mm/slub.c) we
>  * should use not instrumented version of mem* functions.
>  */
>
> #undef memcpy
> #define memcpy(dst, src, len) __memcpy(dst, src, len)
> #define memmove(dst, src, len) __memmove(dst, src, len)
> #define memset(s, c, n) __memset(s, c, n)
> #endif
Nice!
What do you think about providing stub .c files to decouple the shared
code used by the KASAN runtime from the rest of the kernel (along the
lines of the sketch below)?
(This is a completely different story though and can be done separately).
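
Something like this, just as a sketch (it would still need a way to avoid
clashing with the symbols of the regular lib/stackdepot.o):

/* mm/kasan/kasan_stackdepot.c - KASAN-private build of stackdepot */
#define memcpy(dst, src, len) __memcpy(dst, src, len)
#include "../../lib/stackdepot.c"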


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-04 15:06             ` Alexander Potapenko
@ 2016-03-04 16:30               ` Andrey Ryabinin
  0 siblings, 0 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-04 16:30 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

2016-03-04 18:06 GMT+03:00 Alexander Potapenko <glider@google.com>:
> On Fri, Mar 4, 2016 at 4:01 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>> 2016-03-04 17:52 GMT+03:00 Alexander Potapenko <glider@google.com>:
>>> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>>>>> +
>>>>>>> +     stack->hash = hash;
>>>>>>> +     stack->size = size;
>>>>>>> +     stack->handle.slabindex = depot_index;
>>>>>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>>>>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>>>>>
>>>>>> s/__memcpy/memcpy/
>>>>>
>>>>> memcpy should be instrumented by asan/tsan, and we would like to avoid
>>>>> that instrumentation here.
>>>>
>>>> KASAN_SANITIZE_* := n already takes care about this.
>>>> __memcpy() is a special thing solely for kasan internals and some assembly code.
>>>> And it's not available generally.
>>> As far as I can see, KASAN_SANITIZE_*:=n does not guarantee it.
>>> It just removes KASAN flags from GCC command line, it does not
>>> necessarily replace memcpy() calls with some kind of a
>>> non-instrumented memcpy().
>>>
>>
>> With removed kasan cflags '__SANITIZE_ADDRESS__' is not defined,
>> hence enable the following defines from arch/x86/include/asm/string_64.h:
>>
>> #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
>>
>> /*
>>  * For files that not instrumented (e.g. mm/slub.c) we
>>  * should use not instrumented version of mem* functions.
>>  */
>>
>> #undef memcpy
>> #define memcpy(dst, src, len) __memcpy(dst, src, len)
>> #define memmove(dst, src, len) __memmove(dst, src, len)
>> #define memset(s, c, n) __memset(s, c, n)
>> #endif
> Nice!
> What do you think about providing stub .c files to decouple the shared
> code used by KASAN runtime from the rest of kernel?

Actually, I don't quite understand why you need that at all, but your
idea will not link due to multiple definitions of the same functions.
The link problem should be easy to work around with 'objcopy
--prefix-symbols=' though.
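
E.g. something along these lines (a hypothetical invocation, file names
made up):

# build a symbol-prefixed, uninstrumented copy of stackdepot.o so it can
# be linked next to the regular one without multiple-definition errors:
objcopy --prefix-symbols=kasan_ lib/stackdepot.o mm/kasan/kasan_stackdepot.o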

> (This is a completely different story though and can be done separately).
>
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-02-29 16:29   ` Andrey Ryabinin
  2016-02-29 17:12     ` Dmitry Vyukov
@ 2016-03-08 11:30     ` Alexander Potapenko
  1 sibling, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-08 11:30 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrey Konovalov, Christoph Lameter, Dmitriy Vyukov,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, Linux Memory Management List

On Mon, Feb 29, 2016 at 5:29 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>
>
> On 02/26/2016 07:48 PM, Alexander Potapenko wrote:
>> Stack depot will allow KASAN store allocation/deallocation stack traces
>> for memory chunks. The stack traces are stored in a hash table and
>> referenced by handles which reside in the kasan_alloc_meta and
>> kasan_free_meta structures in the allocated memory chunks.
>>
>> IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
>> duplication.
>>
>> Right now stackdepot support is only enabled in SLAB allocator.
>> Once KASAN features in SLAB are on par with those in SLUB we can switch
>> SLUB to stackdepot as well, thus removing the dependency on SLUB stack
>> bookkeeping, which wastes a lot of memory.
>>
>> This patch is based on the "mm: kasan: stack depots" patch originally
>> prepared by Dmitry Chernenkov.
>>
>> Signed-off-by: Alexander Potapenko <glider@google.com>
>> ---
>> v2: - per request from Joonsoo Kim, moved the stackdepot implementation to
>> lib/, as there's a plan to use it for page owner
>>     - added copyright comments
>>     - added comments about smp_load_acquire()/smp_store_release()
>>
>> v3: - minor description changes
>> ---
>
>
>
>> diff --git a/lib/Makefile b/lib/Makefile
>> index a7c26a4..10a4ae3 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>
>> +ifeq ($(CONFIG_KASAN),y)
>> +ifeq ($(CONFIG_SLAB),y)
>
> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>
> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
Agreed. Will fix this in the updated patch.
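
For reference, a rough sketch of the Kconfig-based variant (option name
and exact placement are tentative):

# lib/Kconfig
config STACKDEPOT
        bool

# lib/Kconfig.kasan, inside the KASAN entry
        select STACKDEPOT if SLAB

# lib/Makefile
obj-$(CONFIG_STACKDEPOT) += stackdepot.o
KASAN_SANITIZE_stackdepot.o := n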

>> +     obj-y   += stackdepot.o
>> +     KASAN_SANITIZE_slub.o := n
>> +endif
>> +endif
>> +
>>  libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
>>              fdt_empty_tree.o
>>  $(foreach file, $(libfdt_files), \
>> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
>> new file mode 100644
>> index 0000000..f09b0da
>> --- /dev/null
>> +++ b/lib/stackdepot.c
>
>
>> +/* Allocation of a new stack in raw storage */
>> +static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
>> +             u32 hash, void **prealloc, gfp_t alloc_flags)
>> +{
>
>
>> +
>> +     stack->hash = hash;
>> +     stack->size = size;
>> +     stack->handle.slabindex = depot_index;
>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>
> s/__memcpy/memcpy
Ack.
>> +     depot_offset += required_size;
>> +
>> +     return stack;
>> +}
>> +
>
>
>> +/*
>> + * depot_save_stack - save stack in a stack depot.
>> + * @trace - the stacktrace to save.
>> + * @alloc_flags - flags for allocating additional memory if required.
>> + *
>> + * Returns the handle of the stack struct stored in depot.
>> + */
>> +depot_stack_handle depot_save_stack(struct stack_trace *trace,
>> +                                 gfp_t alloc_flags)
>> +{
>> +     u32 hash;
>> +     depot_stack_handle retval = 0;
>> +     struct stack_record *found = NULL, **bucket;
>> +     unsigned long flags;
>> +     struct page *page = NULL;
>> +     void *prealloc = NULL;
>> +
>> +     if (unlikely(trace->nr_entries == 0))
>> +             goto exit;
>> +
>> +     hash = hash_stack(trace->entries, trace->nr_entries);
>> +     /* Bad luck, we won't store this stack. */
>> +     if (hash == 0)
>> +             goto exit;
>> +
>> +     bucket = &stack_table[hash & STACK_HASH_MASK];
>> +
>> +     /* Fast path: look the stack trace up without locking.
>> +      *
>> +      * The smp_load_acquire() here pairs with smp_store_release() to
>> +      * |bucket| below.
>> +      */
>> +     found = find_stack(smp_load_acquire(bucket), trace->entries,
>> +                        trace->nr_entries, hash);
>> +     if (found)
>> +             goto exit;
>> +
>> +     /* Check if the current or the next stack slab need to be initialized.
>> +      * If so, allocate the memory - we won't be able to do that under the
>> +      * lock.
>> +      *
>> +      * The smp_load_acquire() here pairs with smp_store_release() to
>> +      * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
>> +      */
>> +     if (unlikely(!smp_load_acquire(&next_slab_inited))) {
>> +             if (!preempt_count() && !in_irq()) {
>
> If you trying to detect atomic context here, than this doesn't work. E.g. you can't know
> about held spinlocks in non-preemptible kernel.
> And I'm not sure why need this. You know gfp flags here, so allocation in atomic context shouldn't be problem.
Yeah, we can just remove these checks. As discussed before, this will
eliminate allocations from kfree(), but that's very unlikely to become
a problem.

>
>> +                     alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
>> +                             __GFP_NOWARN | __GFP_NORETRY |
>> +                             __GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
>
> I think blacklist approach would be better here.
Perhaps we don't need to change the mask at all.

>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>
> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
Well, this is not "that" much, actually. The allocation happens only
~150 times within three hours under Trinity, which amounts to roughly 9
megabytes (~150 allocations x 16 pages x 4 KB).
At around 250 allocations the stack depot saturates and new stacks become
very rare.
We can probably drop the order to 3 or 2, which will increase the
number of allocations by just a factor of 2 to 4, but will be better
from the page fragmentation point of view.
>
>> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
>> index a61460d..32bd73a 100644
>> --- a/mm/kasan/Makefile
>> +++ b/mm/kasan/Makefile
>> @@ -7,3 +7,4 @@ CFLAGS_REMOVE_kasan.o = -pg
>>  CFLAGS_kasan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
>>
>>  obj-y := kasan.o report.o kasan_init.o
>> +
>
> Extra newline.
Ack.
>
>> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
>> index 7b9e4ab9..b4e5942 100644
>> --- a/mm/kasan/kasan.h
>> +++ b/mm/kasan/kasan.h
>> @@ -2,6 +2,7 @@
>>  #define __MM_KASAN_KASAN_H
>>
>>  #include <linux/kasan.h>
>> +#include <linux/stackdepot.h>
>>
>>  #define KASAN_SHADOW_SCALE_SIZE (1UL << KASAN_SHADOW_SCALE_SHIFT)
>>  #define KASAN_SHADOW_MASK       (KASAN_SHADOW_SCALE_SIZE - 1)
>> @@ -64,10 +65,13 @@ enum kasan_state {
>>       KASAN_STATE_FREE
>>  };
>>
>> +#define KASAN_STACK_DEPTH 64
>
> I think, you can reduce this (32 perhaps?). Kernel stacks are not so deep usually.
>
>> +
>>  struct kasan_track {
>>       u64 cpu : 6;                    /* for NR_CPUS = 64 */
>>       u64 pid : 16;                   /* 65536 processes */
>>       u64 when : 42;                  /* ~140 years */
>> +     depot_stack_handle stack : sizeof(depot_stack_handle);
>>  };
>>
>



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-01 11:57       ` Andrey Ryabinin
  2016-03-04 14:52         ` Alexander Potapenko
@ 2016-03-08 11:42         ` Alexander Potapenko
  2016-03-10 16:58           ` Andrey Ryabinin
  1 sibling, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-08 11:42 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>
>
> On 02/29/2016 08:12 PM, Dmitry Vyukov wrote:
>
>>>> diff --git a/lib/Makefile b/lib/Makefile
>>>> index a7c26a4..10a4ae3 100644
>>>> --- a/lib/Makefile
>>>> +++ b/lib/Makefile
>>>> @@ -167,6 +167,13 @@ obj-$(CONFIG_SG_SPLIT) += sg_split.o
>>>>  obj-$(CONFIG_STMP_DEVICE) += stmp_device.o
>>>>  obj-$(CONFIG_IRQ_POLL) += irq_poll.o
>>>>
>>>> +ifeq ($(CONFIG_KASAN),y)
>>>> +ifeq ($(CONFIG_SLAB),y)
>>>
>>> Just try to imagine that another subsystem wants to use stackdepot. How this gonna look like?
>>>
>>> We have Kconfig to describe dependencies. So, this should be under CONFIG_STACKDEPOT.
>>> So any user of this feature can just do 'select STACKDEPOT' in Kconfig.
>>>
>>>> +     obj-y   += stackdepot.o
>>>> +     KASAN_SANITIZE_slub.o := n
>                         _stackdepot.o
>
>
>>>
>>>> +
>>>> +     stack->hash = hash;
>>>> +     stack->size = size;
>>>> +     stack->handle.slabindex = depot_index;
>>>> +     stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>>>> +     __memcpy(stack->entries, entries, size * sizeof(unsigned long));
>>>
>>> s/__memcpy/memcpy/
>>
>> memcpy should be instrumented by asan/tsan, and we would like to avoid
>> that instrumentation here.
>
> KASAN_SANITIZE_* := n already takes care about this.
> __memcpy() is a special thing solely for kasan internals and some assembly code.
> And it's not available generally.
>
>
>>>> +     if (unlikely(!smp_load_acquire(&next_slab_inited))) {
>>>> +             if (!preempt_count() && !in_irq()) {
>>>
>>> If you trying to detect atomic context here, than this doesn't work. E.g. you can't know
>>> about held spinlocks in non-preemptible kernel.
>>> And I'm not sure why need this. You know gfp flags here, so allocation in atomic context shouldn't be problem.
>>
>>
>> We don't have gfp flags for kfree.
>> I wonder how CONFIG_DEBUG_ATOMIC_SLEEP handles this. Maybe it has the answer.
>
> It hasn't. It doesn't guarantee that atomic context always will be detected.
>
>> Alternatively, we can always assume that we are in atomic context in kfree.
>>
>
> Or do this allocation in separate context, put in work queue.
>
>>
>>
>>>> +                     alloc_flags &= (__GFP_RECLAIM | __GFP_IO | __GFP_FS |
>>>> +                             __GFP_NOWARN | __GFP_NORETRY |
>>>> +                             __GFP_NOMEMALLOC | __GFP_DIRECT_RECLAIM);
>>>
>>> I think blacklist approach would be better here.
>>>
>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>
>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>
>> Part of the issue the atomic context above. When we can't allocate
>> memory we still want to save the stack trace. When we have less than
>> STACK_ALLOC_ORDER memory, we try to preallocate another
>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>> think Alex did some measuring of the failure race (when we are out of
>> memory and can't allocate more).
>>
>
> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
I've tried using vmalloc(), but it turned out it calls the KASAN hooks
again. Dealing with reentrancy in this case sounds like overkill.
Given that we only require ~9 MB most of the time, is allocating
physical pages still a problem?

> And one more thing. Take a look at mempool, because it's generally used to solve the problem you have here
> (guaranteed allocation in atomic context).
As far as I understood the docs, mempools have the drawback of
allocating too much memory which won't be available for any other use.
O'Reilly's "Linux Device Drivers" even suggests not using mempools
whenever it's easier to deal with allocation failures (that
advice is for device drivers; not sure whether it holds for other
subsystems though).


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-08 11:42         ` Alexander Potapenko
@ 2016-03-10 16:58           ` Andrey Ryabinin
  2016-03-11 11:18             ` Alexander Potapenko
  0 siblings, 1 reply; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-10 16:58 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Steven Rostedt, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

2016-03-08 14:42 GMT+03:00 Alexander Potapenko <glider@google.com>:
> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>>
>>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>>
>>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>>
>>> Part of the issue the atomic context above. When we can't allocate
>>> memory we still want to save the stack trace. When we have less than
>>> STACK_ALLOC_ORDER memory, we try to preallocate another
>>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>>> think Alex did some measuring of the failure race (when we are out of
>>> memory and can't allocate more).
>>>
>>
>> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
>> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
> I've tried using vmalloc(), but turned out it's calling KASAN hooks
> again. Dealing with reentrancy in this case sounds like an overkill.

We'll have to deal with recursion eventually. Using stackdepot for
page owner will cause recursion.

> Given that we only require 9 Mb most of the time, is allocating
> physical pages still a problem?
>

This is not about size, this is about fragmentation. vmalloc() allows us
to utilize available low-order pages,
hence reducing fragmentation.

>> And one more thing. Take a look at mempool, because it's generally used to solve the problem you have here
>> (guaranteed allocation in atomic context).
> As far as I understood the docs, mempools have a drawback of
> allocating too much memory which won't be available for any other use.

As far as I understood your code, it has a drawback of
allocating too much memory which won't be available for any other use ;)

However, now I think that mempool doesn't fit here. We never free
memory => never return it to the pool.
And this will cause 5-second delays between allocation retries in mempool_alloc().


> O'Reily's "Linux Device Drivers" even suggests not using mempools in
> any case when it's easier to deal with allocation failures (that
> advice is for device drivers, not sure if that stands for other
> subsystems though).
>
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-10 16:58           ` Andrey Ryabinin
@ 2016-03-11 11:18             ` Alexander Potapenko
  2016-03-11 11:43               ` Andrey Ryabinin
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-11 11:18 UTC (permalink / raw)
  To: Andrey Ryabinin, Steven Rostedt
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Joonsoo Kim, JoonSoo Kim, Kostya Serebryany,
	kasan-dev, LKML, linux-mm

On Thu, Mar 10, 2016 at 5:58 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
> 2016-03-08 14:42 GMT+03:00 Alexander Potapenko <glider@google.com>:
>> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>>>
>>>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>>>
>>>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>>>
>>>> Part of the issue the atomic context above. When we can't allocate
>>>> memory we still want to save the stack trace. When we have less than
>>>> STACK_ALLOC_ORDER memory, we try to preallocate another
>>>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>>>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>>>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>>>> think Alex did some measuring of the failure race (when we are out of
>>>> memory and can't allocate more).
>>>>
>>>
>>> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
>>> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
>> I've tried using vmalloc(), but turned out it's calling KASAN hooks
>> again. Dealing with reentrancy in this case sounds like an overkill.
>
> We'll have to deal with recursion eventually. Using stackdepot for
> page owner will cause recursion.
>
>> Given that we only require 9 Mb most of the time, is allocating
>> physical pages still a problem?
>>
>
> This is not about size, this about fragmentation. vmalloc allows to
> utilize available low-order pages,
> hence reduce the fragmentation.
I've attempted to add __vmalloc(STACK_ALLOC_SIZE, alloc_flags,
PAGE_KERNEL) (also tried vmalloc(STACK_ALLOC_SIZE)) instead of
alloc_pages() and am now getting a crash in
kmem_cache_alloc_node_trace() in mm/slab.c, because it doesn't allow
the kmem_cache pointer to be NULL (it's dereferenced when calling
trace_kmalloc_node()).

Steven, do you know if this is because my code violates some contract
(e.g. I'm calling vmalloc() too early, when kmalloc_caches[] haven't
been initialized), or is this a bug in kmem_cache_alloc_node_trace()
itself?

>>> And one more thing. Take a look at mempool, because it's generally used to solve the problem you have here
>>> (guaranteed allocation in atomic context).
>> As far as I understood the docs, mempools have a drawback of
>> allocating too much memory which won't be available for any other use.
>
> As far as I understood your code, it has a drawback of
> allocating too much memory which won't be available for any other use ;)
>
> However, now I think that mempool doesn't fit here. We never free
> memory => never return it to pool.
> And this will cause 5sec delays between allocation retries in mempool_alloc().
>
>
>> O'Reily's "Linux Device Drivers" even suggests not using mempools in
>> any case when it's easier to deal with allocation failures (that
>> advice is for device drivers, not sure if that stands for other
>> subsystems though).
>>
>>
>> --
>> Alexander Potapenko
>> Software Engineer
>>
>> Google Germany GmbH
>> Erika-Mann-Straße, 33
>> 80636 München
>>
>> Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
>> Registergericht und -nummer: Hamburg, HRB 86891
>> Sitz der Gesellschaft: Hamburg



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-11 11:18             ` Alexander Potapenko
@ 2016-03-11 11:43               ` Andrey Ryabinin
  2016-03-11 14:49                 ` Alexander Potapenko
  2016-03-11 16:10                 ` Steven Rostedt
  0 siblings, 2 replies; 30+ messages in thread
From: Andrey Ryabinin @ 2016-03-11 11:43 UTC (permalink / raw)
  To: Alexander Potapenko, Steven Rostedt
  Cc: Dmitry Vyukov, Andrey Konovalov, Christoph Lameter,
	Andrew Morton, Joonsoo Kim, JoonSoo Kim, Kostya Serebryany,
	kasan-dev, LKML, linux-mm



On 03/11/2016 02:18 PM, Alexander Potapenko wrote:
> On Thu, Mar 10, 2016 at 5:58 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>> 2016-03-08 14:42 GMT+03:00 Alexander Potapenko <glider@google.com>:
>>> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>>>>
>>>>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>>>>
>>>>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>>>>
>>>>> Part of the issue the atomic context above. When we can't allocate
>>>>> memory we still want to save the stack trace. When we have less than
>>>>> STACK_ALLOC_ORDER memory, we try to preallocate another
>>>>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>>>>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>>>>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>>>>> think Alex did some measuring of the failure race (when we are out of
>>>>> memory and can't allocate more).
>>>>>
>>>>
>>>> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
>>>> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
>>> I've tried using vmalloc(), but turned out it's calling KASAN hooks
>>> again. Dealing with reentrancy in this case sounds like an overkill.
>>
>> We'll have to deal with recursion eventually. Using stackdepot for
>> page owner will cause recursion.
>>
>>> Given that we only require 9 Mb most of the time, is allocating
>>> physical pages still a problem?
>>>
>>
>> This is not about size, this about fragmentation. vmalloc allows to
>> utilize available low-order pages,
>> hence reduce the fragmentation.
> I've attempted to add __vmalloc(STACK_ALLOC_SIZE, alloc_flags,
> PAGE_KERNEL) (also tried vmalloc(STACK_ALLOC_SIZE)) instead of
> page_alloc() and am now getting a crash in
> kmem_cache_alloc_node_trace() in mm/slab.c, because it doesn't allow
> the kmem_cache pointer to be NULL (it's dereferenced when calling
> trace_kmalloc_node()).
> 
> Steven, do you know if this because of my code violating some contract
> (e.g. I'm calling vmalloc() too early, when kmalloc_caches[] haven't
> been initialized), 

Probably. kmem_cache_init() goes before vmalloc_init().
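
For reference, the relevant ordering in init/main.c is roughly
(paraphrasing, only the two calls of interest shown):

static void __init mm_init(void)
{
	/* ... */
	kmem_cache_init();	/* slab allocator becomes usable here */
	/* ... */
	vmalloc_init();		/* vmalloc() only becomes usable here */
	/* ... */
}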


> or is this a bug in kmem_cache_alloc_node_trace()
> itself?
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-11 11:43               ` Andrey Ryabinin
@ 2016-03-11 14:49                 ` Alexander Potapenko
  2016-03-11 16:10                 ` Steven Rostedt
  1 sibling, 0 replies; 30+ messages in thread
From: Alexander Potapenko @ 2016-03-11 14:49 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Steven Rostedt, Dmitry Vyukov, Andrey Konovalov,
	Christoph Lameter, Andrew Morton, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Fri, Mar 11, 2016 at 12:43 PM, Andrey Ryabinin
<ryabinin.a.a@gmail.com> wrote:
>
>
> On 03/11/2016 02:18 PM, Alexander Potapenko wrote:
>> On Thu, Mar 10, 2016 at 5:58 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>> 2016-03-08 14:42 GMT+03:00 Alexander Potapenko <glider@google.com>:
>>>> On Tue, Mar 1, 2016 at 12:57 PM, Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:
>>>>>>>
>>>>>>>> +                     page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
>>>>>>>
>>>>>>> STACK_ALLOC_ORDER = 4 - that's a lot. Do you really need that much?
>>>>>>
>>>>>> Part of the issue the atomic context above. When we can't allocate
>>>>>> memory we still want to save the stack trace. When we have less than
>>>>>> STACK_ALLOC_ORDER memory, we try to preallocate another
>>>>>> STACK_ALLOC_ORDER in advance. So in the worst case, we have
>>>>>> STACK_ALLOC_ORDER memory and that should be enough to handle all
>>>>>> kmalloc/kfree in the atomic context. 1 page does not look enough. I
>>>>>> think Alex did some measuring of the failure race (when we are out of
>>>>>> memory and can't allocate more).
>>>>>>
>>>>>
>>>>> A lot of 4-order pages will lead to high fragmentation. You don't need physically contiguous memory here,
>>>>> so try to use vmalloc(). It is slower, but fragmentation won't be problem.
>>>> I've tried using vmalloc(), but turned out it's calling KASAN hooks
>>>> again. Dealing with reentrancy in this case sounds like an overkill.
>>>
>>> We'll have to deal with recursion eventually. Using stackdepot for
>>> page owner will cause recursion.
>>>
>>>> Given that we only require 9 Mb most of the time, is allocating
>>>> physical pages still a problem?
>>>>
>>>
>>> This is not about size, this about fragmentation. vmalloc allows to
>>> utilize available low-order pages,
>>> hence reduce the fragmentation.
>> I've attempted to add __vmalloc(STACK_ALLOC_SIZE, alloc_flags,
>> PAGE_KERNEL) (also tried vmalloc(STACK_ALLOC_SIZE)) instead of
>> page_alloc() and am now getting a crash in
>> kmem_cache_alloc_node_trace() in mm/slab.c, because it doesn't allow
>> the kmem_cache pointer to be NULL (it's dereferenced when calling
>> trace_kmalloc_node()).
>>
>> Steven, do you know if this because of my code violating some contract
>> (e.g. I'm calling vmalloc() too early, when kmalloc_caches[] haven't
>> been initialized),
>
> Probably. kmem_cache_init() goes before vmalloc_init().
The solution I'm currently testing is to introduce a per-CPU recursion
flag that depot_save_stack() checks, bailing out if it's already set.
In addition, I look at |kmalloc_caches[KMALLOC_SHIFT_HIGH]| and
in_interrupt() to see whether vmalloc() is available; if it is not,
I fall back to alloc_pages(). A rough sketch is below.

Right now (after 20 minutes of running Trinity) vmalloc() has been
called 490 times and alloc_pages() only 13 times,
so I hope it's now much better from the fragmentation point of view.
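
Untested sketch (names are not final, and the real code also needs to be
careful about preemption around the flag):

#include <linux/gfp.h>
#include <linux/hardirq.h>
#include <linux/mm.h>
#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/stackdepot.h>
#include <linux/stacktrace.h>
#include <linux/vmalloc.h>

static DEFINE_PER_CPU(int, depot_recursion);

/* STACK_ALLOC_SIZE/STACK_ALLOC_ORDER come from the existing stackdepot.c. */
static void *depot_alloc_pool(gfp_t alloc_flags)
{
	struct page *page;

	/* vmalloc() is only usable once the slab caches exist and we are
	 * not in interrupt context. */
	if (kmalloc_caches[KMALLOC_SHIFT_HIGH] && !in_interrupt())
		return __vmalloc(STACK_ALLOC_SIZE, alloc_flags, PAGE_KERNEL);

	/* Fall back to physically contiguous pages. */
	page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
	return page ? page_address(page) : NULL;
}

depot_stack_handle depot_save_stack(struct stack_trace *trace,
				    gfp_t alloc_flags)
{
	depot_stack_handle ret = 0;

	/* Bail out if this CPU is already in depot_save_stack(), e.g.
	 * via vmalloc() -> KASAN hook -> depot_save_stack(). */
	if (this_cpu_read(depot_recursion))
		return 0;
	this_cpu_write(depot_recursion, 1);
	/* ... existing hash lookup / depot_alloc_stack() logic here ... */
	this_cpu_write(depot_recursion, 0);
	return ret;
}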
>
>> or is this a bug in kmem_cache_alloc_node_trace()
>> itself?
>>



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB
  2016-03-11 11:43               ` Andrey Ryabinin
  2016-03-11 14:49                 ` Alexander Potapenko
@ 2016-03-11 16:10                 ` Steven Rostedt
  1 sibling, 0 replies; 30+ messages in thread
From: Steven Rostedt @ 2016-03-11 16:10 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov,
	Christoph Lameter, Andrew Morton, Joonsoo Kim, JoonSoo Kim,
	Kostya Serebryany, kasan-dev, LKML, linux-mm

On Fri, 11 Mar 2016 14:43:45 +0300
Andrey Ryabinin <ryabinin.a.a@gmail.com> wrote:


> >> This is not about size, this about fragmentation. vmalloc allows to
> >> utilize available low-order pages,
> >> hence reduce the fragmentation.  
> > I've attempted to add __vmalloc(STACK_ALLOC_SIZE, alloc_flags,
> > PAGE_KERNEL) (also tried vmalloc(STACK_ALLOC_SIZE)) instead of
> > page_alloc() and am now getting a crash in
> > kmem_cache_alloc_node_trace() in mm/slab.c, because it doesn't allow
> > the kmem_cache pointer to be NULL (it's dereferenced when calling
> > trace_kmalloc_node()).
> > 
> > Steven, do you know if this because of my code violating some contract
> > (e.g. I'm calling vmalloc() too early, when kmalloc_caches[] haven't
> > been initialized),   
> 
> Probably. kmem_cache_init() goes before vmalloc_init().

Agreed, that function cannot be called with a NULL cachep, nor can it be
called before kmem_cache is set up to point to kmem_cache_boot.

-- Steve

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2016-03-11 16:10 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-26 16:48 [PATCH v4 0/7] SLAB support for KASAN Alexander Potapenko
2016-02-26 16:48 ` [PATCH v4 1/7] kasan: Modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right() Alexander Potapenko
2016-02-26 16:48 ` [PATCH v4 2/7] mm, kasan: SLAB support Alexander Potapenko
2016-02-29 15:10   ` Andrey Ryabinin
2016-02-29 18:28     ` Alexander Potapenko
2016-02-29 18:33       ` Alexander Potapenko
2016-03-01 14:34       ` Andrey Ryabinin
2016-02-26 16:48 ` [PATCH v4 3/7] mm, kasan: Added GFP flags to KASAN API Alexander Potapenko
2016-02-26 16:48 ` [PATCH v4 4/7] arch, ftrace: For KASAN put hard/soft IRQ entries into separate sections Alexander Potapenko
2016-02-27  1:44   ` kbuild test robot
2016-03-02 17:41   ` Steven Rostedt
2016-02-26 16:48 ` [PATCH v4 5/7] mm, kasan: Stackdepot implementation. Enable stackdepot for SLAB Alexander Potapenko
2016-02-29 16:29   ` Andrey Ryabinin
2016-02-29 17:12     ` Dmitry Vyukov
2016-03-01 11:57       ` Andrey Ryabinin
2016-03-04 14:52         ` Alexander Potapenko
2016-03-04 15:01           ` Andrey Ryabinin
2016-03-04 15:06             ` Alexander Potapenko
2016-03-04 16:30               ` Andrey Ryabinin
2016-03-08 11:42         ` Alexander Potapenko
2016-03-10 16:58           ` Andrey Ryabinin
2016-03-11 11:18             ` Alexander Potapenko
2016-03-11 11:43               ` Andrey Ryabinin
2016-03-11 14:49                 ` Alexander Potapenko
2016-03-11 16:10                 ` Steven Rostedt
2016-03-08 11:30     ` Alexander Potapenko
2016-02-26 16:48 ` [PATCH v4 6/7] kasan: Test fix: Warn if the UAF could not be detected in kmalloc_uaf2 Alexander Potapenko
2016-02-29 16:31   ` Andrey Ryabinin
2016-02-26 16:48 ` [PATCH v4 7/7] mm: kasan: Initial memory quarantine implementation Alexander Potapenko
2016-02-26 22:28 ` [PATCH v4 0/7] SLAB support for KASAN Andrew Morton
