linux-mm.kvack.org archive mirror
* [PATCH 00/23] Hardened usercopy whitelisting
@ 2017-06-19 23:36 Kees Cook
  2017-06-19 23:36 ` [PATCH 01/23] usercopy: Prepare for " Kees Cook
                   ` (24 more replies)
  0 siblings, 25 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

This series is modified from Brad Spengler/PaX Team's PAX_USERCOPY code
in the last public patch of grsecurity/PaX based on our understanding
of the code. Changes or omissions from the original code are ours and
don't reflect the original grsecurity/PaX code.

David Windsor did the bulk of the porting, refactoring, splitting,
testing, etc.; I just did some extra tweaks, hunk moving, and small
extra patches.


This updates the slab allocator to add annotations (useroffset and
usersize) to define allowed usercopy regions. Currently, hardened
usercopy performs dynamic bounds checking on whole slab cache objects.
This is good, but still leaves a lot of kernel slab memory available to
be copied to/from userspace in the face of bugs. To further restrict
what memory is available for copying, this creates a way to whitelist
specific areas of a given slab cache object for copying to/from userspace,
allowing much finer granularity of access control. Slab caches that are
never exposed to userspace can declare no whitelist for their objects,
thereby keeping them unavailable to userspace via dynamic copy operations.
(Note, an implicit form of whitelisting is the use of constant sizes
in usercopy operations and get_user()/put_user(); these bypass hardened
usercopy checks since these sizes cannot change at runtime.)

Two good examples of how much more this protects are task_struct
(a huge structure that only needs two fields exposed to userspace)
and mm_struct (another large and sensitive structure that only needs
auxv exposed). Other places for whitelists are mostly VFS name related,
and some areas of network caches.

To support the whitelist annotation, usercopy region offset and size
members are added to struct kmem_cache. The slab allocator receives a
new function that creates a new cache with a usercopy region defined
(kmem_cache_create_usercopy()), suitable for storing objects that get
copied to/from userspace. The default cache creation function
(kmem_cache_create()) remains unchanged and defaults to having no
whitelist region.

Additionally, a way to split trivially size-controllable kmallocs away
from the general-purpose kmalloc is added.

Finally, a Kconfig is created to control slab_nomerge, since it
would be nice to make this build-time controllable.

-Kees (and David)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


* [PATCH 01/23] usercopy: Prepare for usercopy whitelisting
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 02/23] usercopy: Enforce slab cache usercopy region boundaries Kees Cook
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

This patch prepares the slab allocator to handle caches having annotations
(useroffset and usersize) defining usercopy regions.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on
my understanding of the code. Changes or omissions from the original
code are mine and don't reflect the original grsecurity/PaX code.

Currently, hardened usercopy performs dynamic bounds checking on slab
cache objects.  This is good, but still leaves a lot of kernel memory
available to be copied to/from userspace in the face of bugs.  To
further restrict what memory is available for copying, this creates a
way to whitelist specific areas of a given slab cache object for copying
to/from userspace, allowing much finer granularity of access control.
Slab caches that are never exposed to userspace can declare no whitelist
for their objects, thereby keeping them unavailable to userspace via
dynamic copy operations.  (Note, an implicit form of whitelisting is the
use of constant sizes in usercopy operations and get_user()/put_user();
these bypass hardened usercopy checks since these sizes cannot change at
runtime.)

To support this whitelist annotation, usercopy region offset and size
members are added to struct kmem_cache.  The slab allocator receives a
new function that creates a new cache with a usercopy region defined,
suitable for storing objects that get copied to/from userspace.

In this patch, the default kmem_cache_create() marks the entire allocation
as whitelisted. Once all whitelists have been added, this will be changed
back to a usersize of 0.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log, split out a few extra kmalloc hunks]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/slab.h     |  3 +++
 include/linux/slab_def.h |  3 +++
 include/linux/slub_def.h |  3 +++
 include/linux/stddef.h   |  2 ++
 mm/slab.c                |  2 +-
 mm/slab.h                |  5 ++++-
 mm/slab_common.c         | 42 ++++++++++++++++++++++++++++++++++--------
 mm/slub.c                | 11 +++++++++--
 8 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 04a7f7993e67..a48f54238273 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -129,6 +129,9 @@ bool slab_is_available(void);
 struct kmem_cache *kmem_cache_create(const char *, size_t, size_t,
 			unsigned long,
 			void (*)(void *));
+struct kmem_cache *kmem_cache_create_usercopy(const char *, size_t, size_t,
+			unsigned long, size_t, size_t,
+			void (*)(void *));
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 4ad2c5a26399..03eef0df8648 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -84,6 +84,9 @@ struct kmem_cache {
 	unsigned int *random_seq;
 #endif
 
+	size_t useroffset;		/* Usercopy region offset */
+	size_t usersize;		/* Usercopy region size */
+
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };
 
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 07ef550c6627..05b7343f69eb 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -108,6 +108,9 @@ struct kmem_cache {
 	struct kasan_cache kasan_info;
 #endif
 
+	size_t useroffset;		/* Usercopy region offset */
+	size_t usersize;		/* Usercopy region size */
+
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };
 
diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 9c61c7cda936..f00355086fb2 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -18,6 +18,8 @@ enum {
 #define offsetof(TYPE, MEMBER)	((size_t)&((TYPE *)0)->MEMBER)
 #endif
 
+#define sizeof_field(structure, field) sizeof((((structure *)0)->field))
+
 /**
  * offsetofend(TYPE, MEMBER)
  *
diff --git a/mm/slab.c b/mm/slab.c
index 2a31ee3c5814..cf77f1691588 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1281,7 +1281,7 @@ void __init kmem_cache_init(void)
 	create_boot_cache(kmem_cache, "kmem_cache",
 		offsetof(struct kmem_cache, node) +
 				  nr_node_ids * sizeof(struct kmem_cache_node *),
-				  SLAB_HWCACHE_ALIGN);
+				  SLAB_HWCACHE_ALIGN, 0, 0);
 	list_add(&kmem_cache->list, &slab_caches);
 	slab_state = PARTIAL;
 
diff --git a/mm/slab.h b/mm/slab.h
index 9cfcf099709c..92c0cedb296d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -21,6 +21,8 @@ struct kmem_cache {
 	unsigned int size;	/* The aligned/padded/added on size  */
 	unsigned int align;	/* Alignment as calculated */
 	unsigned long flags;	/* Active flags on the slab */
+	size_t useroffset;	/* Usercopy region offset */
+	size_t usersize;	/* Usercopy region size */
 	const char *name;	/* Slab name for sysfs */
 	int refcount;		/* Use counter */
 	void (*ctor)(void *);	/* Called on object slot creation */
@@ -96,7 +98,8 @@ extern int __kmem_cache_create(struct kmem_cache *, unsigned long flags);
 extern struct kmem_cache *create_kmalloc_cache(const char *name, size_t size,
 			unsigned long flags);
 extern void create_boot_cache(struct kmem_cache *, const char *name,
-			size_t size, unsigned long flags);
+			size_t size, unsigned long flags, size_t useroffset,
+			size_t usersize);
 
 int slab_unmergeable(struct kmem_cache *s);
 struct kmem_cache *find_mergeable(size_t size, size_t align,
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 01a0fe2eb332..af97465b99e6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -273,6 +273,9 @@ int slab_unmergeable(struct kmem_cache *s)
 	if (s->ctor)
 		return 1;
 
+	if (s->usersize)
+		return 1;
+
 	/*
 	 * We may have set a slab to be unmergeable during bootstrap.
 	 */
@@ -358,12 +361,15 @@ unsigned long calculate_alignment(unsigned long flags,
 
 static struct kmem_cache *create_cache(const char *name,
 		size_t object_size, size_t size, size_t align,
-		unsigned long flags, void (*ctor)(void *),
+		unsigned long flags, size_t useroffset,
+		size_t usersize, void (*ctor)(void *),
 		struct mem_cgroup *memcg, struct kmem_cache *root_cache)
 {
 	struct kmem_cache *s;
 	int err;
 
+	BUG_ON(useroffset + usersize > object_size);
+
 	err = -ENOMEM;
 	s = kmem_cache_zalloc(kmem_cache, GFP_KERNEL);
 	if (!s)
@@ -374,6 +380,8 @@ static struct kmem_cache *create_cache(const char *name,
 	s->size = size;
 	s->align = align;
 	s->ctor = ctor;
+	s->useroffset = useroffset;
+	s->usersize = usersize;
 
 	err = init_memcg_params(s, memcg, root_cache);
 	if (err)
@@ -398,11 +406,13 @@ static struct kmem_cache *create_cache(const char *name,
 }
 
 /*
- * kmem_cache_create - Create a cache.
+ * kmem_cache_create_usercopy - Create a cache.
  * @name: A string which is used in /proc/slabinfo to identify this cache.
  * @size: The size of objects to be created in this cache.
  * @align: The required alignment for the objects.
  * @flags: SLAB flags
+ * @useroffset: Usercopy region offset
+ * @usersize: Usercopy region size
  * @ctor: A constructor for the objects.
  *
  * Returns a ptr to the cache on success, NULL on failure.
@@ -422,8 +432,9 @@ static struct kmem_cache *create_cache(const char *name,
  * as davem.
  */
 struct kmem_cache *
-kmem_cache_create(const char *name, size_t size, size_t align,
-		  unsigned long flags, void (*ctor)(void *))
+kmem_cache_create_usercopy(const char *name, size_t size, size_t align,
+		  unsigned long flags, size_t useroffset, size_t usersize,
+		  void (*ctor)(void *))
 {
 	struct kmem_cache *s = NULL;
 	const char *cache_name;
@@ -454,7 +465,10 @@ kmem_cache_create(const char *name, size_t size, size_t align,
 	 */
 	flags &= CACHE_CREATE_MASK;
 
-	s = __kmem_cache_alias(name, size, align, flags, ctor);
+	BUG_ON(!usersize && useroffset);
+	BUG_ON(size < usersize || size - usersize < useroffset);
+	if (!usersize)
+		s = __kmem_cache_alias(name, size, align, flags, ctor);
 	if (s)
 		goto out_unlock;
 
@@ -466,7 +480,7 @@ kmem_cache_create(const char *name, size_t size, size_t align,
 
 	s = create_cache(cache_name, size, size,
 			 calculate_alignment(flags, align, size),
-			 flags, ctor, NULL, NULL);
+			 flags, useroffset, usersize, ctor, NULL, NULL);
 	if (IS_ERR(s)) {
 		err = PTR_ERR(s);
 		kfree_const(cache_name);
@@ -492,6 +506,15 @@ kmem_cache_create(const char *name, size_t size, size_t align,
 	}
 	return s;
 }
+EXPORT_SYMBOL(kmem_cache_create_usercopy);
+
+struct kmem_cache *
+kmem_cache_create(const char *name, size_t size, size_t align,
+		unsigned long flags, void (*ctor)(void *))
+{
+	return kmem_cache_create_usercopy(name, size, align, flags, 0, size,
+					  ctor);
+}
 EXPORT_SYMBOL(kmem_cache_create);
 
 static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work)
@@ -604,6 +627,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	s = create_cache(cache_name, root_cache->object_size,
 			 root_cache->size, root_cache->align,
 			 root_cache->flags & CACHE_CREATE_MASK,
+			 root_cache->useroffset, root_cache->usersize,
 			 root_cache->ctor, memcg, root_cache);
 	/*
 	 * If we could not create a memcg cache, do not complain, because
@@ -871,13 +895,15 @@ bool slab_is_available(void)
 #ifndef CONFIG_SLOB
 /* Create a cache during boot when no slab services are available yet */
 void __init create_boot_cache(struct kmem_cache *s, const char *name, size_t size,
-		unsigned long flags)
+		unsigned long flags, size_t useroffset, size_t usersize)
 {
 	int err;
 
 	s->name = name;
 	s->size = s->object_size = size;
 	s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
+	s->useroffset = useroffset;
+	s->usersize = usersize;
 
 	slab_init_memcg_params(s);
 
@@ -898,7 +924,7 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name, size_t size,
 	if (!s)
 		panic("Out of memory when creating slab %s\n", name);
 
-	create_boot_cache(s, name, size, flags);
+	create_boot_cache(s, name, size, flags, 0, size);
 	list_add(&s->list, &slab_caches);
 	memcg_link_cache(s);
 	s->refcount = 1;
diff --git a/mm/slub.c b/mm/slub.c
index 7449593fca72..b8cbbc31b005 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4164,7 +4164,7 @@ void __init kmem_cache_init(void)
 	kmem_cache = &boot_kmem_cache;
 
 	create_boot_cache(kmem_cache_node, "kmem_cache_node",
-		sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN);
+		sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN, 0, 0);
 
 	register_hotmemory_notifier(&slab_memory_callback_nb);
 
@@ -4174,7 +4174,7 @@ void __init kmem_cache_init(void)
 	create_boot_cache(kmem_cache, "kmem_cache",
 			offsetof(struct kmem_cache, node) +
 				nr_node_ids * sizeof(struct kmem_cache_node *),
-		       SLAB_HWCACHE_ALIGN);
+		       SLAB_HWCACHE_ALIGN, 0, 0);
 
 	kmem_cache = bootstrap(&boot_kmem_cache);
 
@@ -5040,6 +5040,12 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char *buf)
 SLAB_ATTR_RO(cache_dma);
 #endif
 
+static ssize_t usercopy_show(struct kmem_cache *s, char *buf)
+{
+	return sprintf(buf, "%d\n", !!s->usersize);
+}
+SLAB_ATTR_RO(usercopy);
+
 static ssize_t destroy_by_rcu_show(struct kmem_cache *s, char *buf)
 {
 	return sprintf(buf, "%d\n", !!(s->flags & SLAB_TYPESAFE_BY_RCU));
@@ -5414,6 +5420,7 @@ static struct attribute *slab_attrs[] = {
 #ifdef CONFIG_FAILSLAB
 	&failslab_attr.attr,
 #endif
+	&usercopy_attr.attr,
 
 	NULL
 };
-- 
2.7.4


* [PATCH 02/23] usercopy: Enforce slab cache usercopy region boundaries
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
  2017-06-19 23:36 ` [PATCH 01/23] usercopy: Prepare for " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 03/23] vfs: define usercopy region in names_cache slab caches Kees Cook
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

This patch adds the enforcement component of usercopy cache whitelisting,
and is modified from Brad Spengler/PaX Team's PAX_USERCOPY whitelisting
code in the last public patch of grsecurity/PaX based on my understanding
of the code. Changes or omissions from the original code are mine and
don't reflect the original grsecurity/PaX code.

The SLAB and SLUB allocators are modified to deny all copy operations
in which the kernel heap memory being modified falls outside of the cache's
defined usercopy region.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log and comments]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 mm/slab.c | 16 +++++++++++-----
 mm/slub.c | 18 +++++++++++-------
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index cf77f1691588..5c78830aeea0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4416,7 +4416,9 @@ module_init(slab_proc_init);
 
 #ifdef CONFIG_HARDENED_USERCOPY
 /*
- * Rejects objects that are incorrectly sized.
+ * Rejects incorrectly sized objects and objects that are to be copied
+ * to/from userspace but do not fall entirely within the containing slab
+ * cache's usercopy region.
  *
  * Returns NULL if check passes, otherwise const char * to name of cache
  * to indicate an error.
@@ -4436,11 +4438,15 @@ const char *__check_heap_object(const void *ptr, unsigned long n,
 	/* Find offset within object. */
 	offset = ptr - index_to_obj(cachep, page, objnr) - obj_offset(cachep);
 
-	/* Allow address range falling entirely within object size. */
-	if (offset <= cachep->object_size && n <= cachep->object_size - offset)
-		return NULL;
+	/* Make sure object falls entirely within cache's usercopy region. */
+	if (offset < cachep->useroffset)
+		return cachep->name;
+	if (offset - cachep->useroffset > cachep->usersize)
+		return cachep->name;
+	if (n > cachep->useroffset - offset + cachep->usersize)
+		return cachep->name;
 
-	return cachep->name;
+	return NULL;
 }
 #endif /* CONFIG_HARDENED_USERCOPY */
 
diff --git a/mm/slub.c b/mm/slub.c
index b8cbbc31b005..e12a2bfbca1e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3796,7 +3796,9 @@ EXPORT_SYMBOL(__kmalloc_node);
 
 #ifdef CONFIG_HARDENED_USERCOPY
 /*
- * Rejects objects that are incorrectly sized.
+ * Rejects incorrectly sized objects and objects that are to be copied
+ * to/from userspace but do not fall entirely within the containing slab
+ * cache's usercopy region.
  *
  * Returns NULL if check passes, otherwise const char * to name of cache
  * to indicate an error.
@@ -3806,11 +3808,9 @@ const char *__check_heap_object(const void *ptr, unsigned long n,
 {
 	struct kmem_cache *s;
 	unsigned long offset;
-	size_t object_size;
 
 	/* Find object and usable object size. */
 	s = page->slab_cache;
-	object_size = slab_ksize(s);
 
 	/* Reject impossible pointers. */
 	if (ptr < page_address(page))
@@ -3826,11 +3826,15 @@ const char *__check_heap_object(const void *ptr, unsigned long n,
 		offset -= s->red_left_pad;
 	}
 
-	/* Allow address range falling entirely within object size. */
-	if (offset <= object_size && n <= object_size - offset)
-		return NULL;
+	/* Make sure object falls entirely within cache's usercopy region. */
+	if (offset < s->useroffset)
+		return s->name;
+	if (offset - s->useroffset > s->usersize)
+		return s->name;
+	if (n > s->useroffset - offset + s->usersize)
+		return s->name;
 
-	return s->name;
+	return NULL;
 }
 #endif /* CONFIG_HARDENED_USERCOPY */
 
-- 
2.7.4


* [PATCH 03/23] vfs: define usercopy region in names_cache slab caches
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
  2017-06-19 23:36 ` [PATCH 01/23] usercopy: Prepare for " Kees Cook
  2017-06-19 23:36 ` [PATCH 02/23] usercopy: Enforce slab cache usercopy region boundaries Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 04/23] vfs: copy struct mount.mnt_id to userspace using put_user() Kees Cook
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

vfs pathnames stored internally in inodes and contained in
the names_cache slab cache need to be copied to/from userspace.

In support of usercopy hardening, this patch defines the entire
cache object in the names_cache slab cache as whitelisted, since
it holds name strings to be copied to userspace.

This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/dcache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index cddf39777835..f7f3c4114baa 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3616,8 +3616,8 @@ void __init vfs_caches_init_early(void)
 
 void __init vfs_caches_init(void)
 {
-	names_cachep = kmem_cache_create("names_cache", PATH_MAX, 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+	names_cachep = kmem_cache_create_usercopy("names_cache", PATH_MAX, 0,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC, 0, PATH_MAX, NULL);
 
 	dcache_init();
 	inode_init();
-- 
2.7.4


* [PATCH 04/23] vfs: copy struct mount.mnt_id to userspace using put_user()
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (2 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 03/23] vfs: define usercopy region in names_cache slab caches Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 05/23] befs: define usercopy region in befs_inode_cache slab cache Kees Cook
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The mnt_id field can be copied with put_user(), so there is no need to
use copy_to_user(). In both cases, hardened usercopy is being bypassed
since the size is constant, and not open to runtime manipulation.

This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/fhandle.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/fhandle.c b/fs/fhandle.c
index 58a61f55e0d0..46e00ccca8f0 100644
--- a/fs/fhandle.c
+++ b/fs/fhandle.c
@@ -68,8 +68,7 @@ static long do_sys_name_to_handle(struct path *path,
 	} else
 		retval = 0;
 	/* copy the mount id */
-	if (copy_to_user(mnt_id, &real_mount(path->mnt)->mnt_id,
-			 sizeof(*mnt_id)) ||
+	if (put_user(real_mount(path->mnt)->mnt_id, mnt_id) ||
 	    copy_to_user(ufh, handle,
 			 sizeof(struct file_handle) + handle_bytes))
 		retval = -EFAULT;
-- 
2.7.4


* [PATCH 05/23] befs: define usercopy region in befs_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (3 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 04/23] vfs: copy struct mount.mnt_id to userspace using put_user() Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 06/23] cifs: define usercopy region in cifs_request " Kees Cook
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The befs symlink pathnames, stored in struct befs_inode_info.i_data.symlink
and therefore contained in the befs_inode_cache slab cache, need to be
copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the befs_inode_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/befs/linuxvfs.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 63e7c4760bfb..893607591805 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -442,11 +442,15 @@ static struct inode *befs_iget(struct super_block *sb, unsigned long ino)
 static int __init
 befs_init_inodecache(void)
 {
-	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
-					      sizeof (struct befs_inode_info),
-					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
-					      init_once);
+	befs_inode_cachep = kmem_cache_create_usercopy("befs_inode_cache",
+				sizeof(struct befs_inode_info), 0,
+				(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+					SLAB_ACCOUNT),
+				offsetof(struct befs_inode_info,
+					i_data.symlink),
+				sizeof_field(struct befs_inode_info,
+					i_data.symlink),
+				init_once);
 	if (befs_inode_cachep == NULL)
 		return -ENOMEM;
 
-- 
2.7.4


* [PATCH 06/23] cifs: define usercopy region in cifs_request slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (4 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 05/23] befs: define usercopy region in befs_inode_cache slab cache Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 07/23] exofs: define usercopy region in exofs_inode_cache " Kees Cook
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

cifs request buffers, stored in the cifs_request slab cache,
need to be copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the cifs_request slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/cifs/cifsfs.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 9a1667e0e8d6..385c5cc8903e 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1234,9 +1234,11 @@ cifs_init_request_bufs(void)
 	cifs_dbg(VFS, "CIFSMaxBufSize %d 0x%x\n",
 		 CIFSMaxBufSize, CIFSMaxBufSize);
 */
-	cifs_req_cachep = kmem_cache_create("cifs_request",
+	cifs_req_cachep = kmem_cache_create_usercopy("cifs_request",
 					    CIFSMaxBufSize + max_hdr_size, 0,
-					    SLAB_HWCACHE_ALIGN, NULL);
+					    SLAB_HWCACHE_ALIGN, 0,
+					    CIFSMaxBufSize + max_hdr_size,
+					    NULL);
 	if (cifs_req_cachep == NULL)
 		return -ENOMEM;
 
@@ -1262,9 +1264,9 @@ cifs_init_request_bufs(void)
 	more SMBs to use small buffer alloc and is still much more
 	efficient to alloc 1 per page off the slab compared to 17K (5page)
 	alloc of large cifs buffers even when page debugging is on */
-	cifs_sm_req_cachep = kmem_cache_create("cifs_small_rq",
+	cifs_sm_req_cachep = kmem_cache_create_usercopy("cifs_small_rq",
 			MAX_CIFS_SMALL_BUFFER_SIZE, 0, SLAB_HWCACHE_ALIGN,
-			NULL);
+			0, MAX_CIFS_SMALL_BUFFER_SIZE, NULL);
 	if (cifs_sm_req_cachep == NULL) {
 		mempool_destroy(cifs_req_poolp);
 		kmem_cache_destroy(cifs_req_cachep);
-- 
2.7.4


* [PATCH 07/23] exofs: define usercopy region in exofs_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (5 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 06/23] cifs: define usercopy region in cifs_request " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 08/23] ext2: define usercopy region in ext2_inode_cache " Kees Cook
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

exofs short symlink names and device #'s, stored in struct
exofs_i_info.i_data and therefore contained in the exofs_inode_cache
slab cache, need to be copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the exofs_inode_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/exofs/super.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 819624cfc8da..e5c532875bb7 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -192,10 +192,13 @@ static void exofs_init_once(void *foo)
  */
 static int init_inodecache(void)
 {
-	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
+	exofs_inode_cachep = kmem_cache_create_usercopy("exofs_inode_cache",
 				sizeof(struct exofs_i_info), 0,
 				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
-				SLAB_ACCOUNT, exofs_init_once);
+				SLAB_ACCOUNT,
+				offsetof(struct exofs_i_info, i_data),
+				sizeof_field(struct exofs_i_info, i_data),
+				exofs_init_once);
 	if (exofs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

^ permalink raw reply related	[flat|nested] 45+ messages in thread


* [PATCH 08/23] ext2: define usercopy region in ext2_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (6 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 07/23] exofs: define usercopy region in exofs_inode_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 09/23] ext4: define usercopy region in ext4_inode_cache " Kees Cook
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The ext2 symlink pathnames, stored in struct ext2_inode_info.i_data
and therefore contained in the ext2_inode_cache slab cache, need to be
copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the ext2_inode_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/ext2/super.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 9c2028b50e5c..3e2aa21bc616 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -219,11 +219,13 @@ static void init_once(void *foo)
 
 static int __init init_inodecache(void)
 {
-	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
-					     sizeof(struct ext2_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
-					     init_once);
+	ext2_inode_cachep = kmem_cache_create_usercopy("ext2_inode_cache",
+				sizeof(struct ext2_inode_info), 0,
+				(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+					SLAB_ACCOUNT),
+				offsetof(struct ext2_inode_info, i_data),
+				sizeof_field(struct ext2_inode_info, i_data),
+				init_once);
 	if (ext2_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
-- 
2.7.4


* [PATCH 09/23] ext4: define usercopy region in ext4_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (7 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 08/23] ext2: define usercopy region in ext2_inode_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 10/23] vxfs: define usercopy region in vxfs_inode " Kees Cook
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The ext4 symlink pathnames, stored in struct ext4_inode_info.i_data
and therefore contained in the ext4_inode_cache slab cache, need
to be copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the ext4_inode_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/ext4/super.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d37c81f327e7..bd92123cf1fc 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1031,11 +1031,13 @@ static void init_once(void *foo)
 
 static int __init init_inodecache(void)
 {
-	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
-					     sizeof(struct ext4_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
-					     init_once);
+	ext4_inode_cachep = kmem_cache_create_usercopy("ext4_inode_cache",
+				sizeof(struct ext4_inode_info), 0,
+				(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+					SLAB_ACCOUNT),
+				offsetof(struct ext4_inode_info, i_data),
+				sizeof_field(struct ext4_inode_info, i_data),
+				init_once);
 	if (ext4_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
-- 
2.7.4


* [PATCH 10/23] vxfs: define usercopy region in vxfs_inode slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (8 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 09/23] ext4: define usercopy region in ext4_inode_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 11/23] jfs: define usercopy region in jfs_ip " Kees Cook
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The vxfs symlink pathnames, stored in struct vxfs_inode_info.vii_immed.vi_immed
and therefore contained in the vxfs_inode slab cache, need to be copied
to/from userspace.

In support of usercopy hardening, this patch defines a region in
the vxfs_inode slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/freevxfs/vxfs_super.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
index 455ce5b77e9b..c143e18d5a65 100644
--- a/fs/freevxfs/vxfs_super.c
+++ b/fs/freevxfs/vxfs_super.c
@@ -332,9 +332,13 @@ vxfs_init(void)
 {
 	int rv;
 
-	vxfs_inode_cachep = kmem_cache_create("vxfs_inode",
+	vxfs_inode_cachep = kmem_cache_create_usercopy("vxfs_inode",
 			sizeof(struct vxfs_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, NULL);
+			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			offsetof(struct vxfs_inode_info, vii_immed.vi_immed),
+			sizeof_field(struct vxfs_inode_info,
+				vii_immed.vi_immed),
+			NULL);
 	if (!vxfs_inode_cachep)
 		return -ENOMEM;
 	rv = register_filesystem(&vxfs_fs_type);
-- 
2.7.4


* [PATCH 11/23] jfs: define usercopy region in jfs_ip slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (9 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 10/23] vxfs: define usercopy region in vxfs_inode " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 12/23] orangefs: define usercopy region in orangefs_inode_cache " Kees Cook
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The jfs symlink pathnames, stored in struct jfs_inode_info.i_inline
and therefore contained in the jfs_ip slab cache, need to be copied to/from
userspace.

In support of usercopy hardening, this patch defines a region in
the jfs_ip slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/jfs/super.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index e8aad7d87b8c..10b958f49f57 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -972,9 +972,11 @@ static int __init init_jfs_fs(void)
 	int rc;
 
 	jfs_inode_cachep =
-	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
-			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
-			    init_once);
+	    kmem_cache_create_usercopy("jfs_ip", sizeof(struct jfs_inode_info),
+			0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
+			offsetof(struct jfs_inode_info, i_inline),
+			sizeof_field(struct jfs_inode_info, i_inline),
+			init_once);
 	if (jfs_inode_cachep == NULL)
 		return -ENOMEM;
 
-- 
2.7.4


* [PATCH 12/23] orangefs: define usercopy region in orangefs_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (10 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 11/23] jfs: define usercopy region in jfs_ip " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 13/23] ufs: define usercopy region in ufs_inode_cache " Kees Cook
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The orangefs symlink pathnames, stored in struct orangefs_inode_s.link_target
and therefore contained in the orangefs_inode_cache, need to be copied
to/from userspace.

In support of usercopy hardening, this patch defines a region in
the orangefs_inode_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/orangefs/super.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/orangefs/super.c b/fs/orangefs/super.c
index 5c7c273e17ec..0dddfc264aca 100644
--- a/fs/orangefs/super.c
+++ b/fs/orangefs/super.c
@@ -613,11 +613,16 @@ void orangefs_kill_sb(struct super_block *sb)
 
 int orangefs_inode_cache_initialize(void)
 {
-	orangefs_inode_cache = kmem_cache_create("orangefs_inode_cache",
-					      sizeof(struct orangefs_inode_s),
-					      0,
-					      ORANGEFS_CACHE_CREATE_FLAGS,
-					      orangefs_inode_cache_ctor);
+	orangefs_inode_cache = kmem_cache_create_usercopy(
+					"orangefs_inode_cache",
+					sizeof(struct orangefs_inode_s),
+					0,
+					ORANGEFS_CACHE_CREATE_FLAGS,
+					offsetof(struct orangefs_inode_s,
+						link_target),
+					sizeof_field(struct orangefs_inode_s,
+						link_target),
+					orangefs_inode_cache_ctor);
 
 	if (!orangefs_inode_cache) {
 		gossip_err("Cannot create orangefs_inode_cache\n");
-- 
2.7.4


* [PATCH 13/23] ufs: define usercopy region in ufs_inode_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (11 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 12/23] orangefs: define usercopy region in orangefs_inode_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 14/23] fork: define usercopy region in thread_stack, task_struct, mm_struct slab caches Kees Cook
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The ufs symlink pathnames, stored in struct ufs_inode_info.i_u1.i_symlink
and therefore contained in the ufs_inode_cache slab cache, need to be copied
to/from userspace.

In support of usercopy hardening, this patch defines a region in the
ufs_inode_cache slab cache in which userspace copy operations are allowed.

This region is known as the slab cache's usercopy region.  Slab caches can
now check that each copy operation involving cache-managed memory falls
entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/ufs/super.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 878cc6264f1a..fa001feed14a 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1441,11 +1441,14 @@ static void init_once(void *foo)
 
 static int __init init_inodecache(void)
 {
-	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
-					     sizeof(struct ufs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
-					     init_once);
+	ufs_inode_cachep = kmem_cache_create_usercopy("ufs_inode_cache",
+				sizeof(struct ufs_inode_info), 0,
+				(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+					SLAB_ACCOUNT),
+				offsetof(struct ufs_inode_info, i_u1.i_symlink),
+				sizeof_field(struct ufs_inode_info,
+					i_u1.i_symlink),
+				init_once);
 	if (ufs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
-- 
2.7.4


* [PATCH 14/23] fork: define usercopy region in thread_stack, task_struct, mm_struct slab caches
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (12 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 13/23] ufs: define usercopy region in ufs_inode_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 15/23] net: define usercopy region in struct proto slab cache Kees Cook
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

In support of usercopy hardening, this patch defines a region in
the thread_stack, task_struct and mm_struct slab caches in which
userspace copy operations are allowed. Since only a single whitelisted
buffer region is used, this moves the usercopyable fields next to each
other in task_struct so a single window can cover them.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/sched.h | 15 ++++++++++++---
 kernel/fork.c         | 18 +++++++++++++-----
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2b69fc650201..345db7983af1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -745,10 +745,19 @@ struct task_struct {
 	/* Signal handlers: */
 	struct signal_struct		*signal;
 	struct sighand_struct		*sighand;
-	sigset_t			blocked;
 	sigset_t			real_blocked;
-	/* Restored if set_restore_sigmask() was used: */
-	sigset_t			saved_sigmask;
+
+	/*
+	 * Usercopy slab whitelisting needs blocked, saved_sigmask
+	 * to be adjacent.
+	 */
+	struct {
+		sigset_t blocked;
+
+		/* Restored if set_restore_sigmask() was used */
+		sigset_t saved_sigmask;
+	};
+
 	struct sigpending		pending;
 	unsigned long			sas_ss_sp;
 	size_t				sas_ss_size;
diff --git a/kernel/fork.c b/kernel/fork.c
index e53770d2bf95..172df19baeb5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -282,8 +282,9 @@ static void free_thread_stack(struct task_struct *tsk)
 
 void thread_stack_cache_init(void)
 {
-	thread_stack_cache = kmem_cache_create("thread_stack", THREAD_SIZE,
-					      THREAD_SIZE, 0, NULL);
+	thread_stack_cache = kmem_cache_create_usercopy("thread_stack",
+					THREAD_SIZE, THREAD_SIZE, 0, 0,
+					THREAD_SIZE, NULL);
 	BUG_ON(thread_stack_cache == NULL);
 }
 # endif
@@ -467,9 +468,14 @@ void __init fork_init(void)
 	int align = max_t(int, L1_CACHE_BYTES, ARCH_MIN_TASKALIGN);
 
 	/* create a slab on which task_structs can be allocated */
-	task_struct_cachep = kmem_cache_create("task_struct",
+	task_struct_cachep = kmem_cache_create_usercopy("task_struct",
 			arch_task_struct_size, align,
-			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
+			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			offsetof(struct task_struct, blocked),
+			offsetof(struct task_struct, saved_sigmask) -
+				offsetof(struct task_struct, blocked) +
+				sizeof(init_task.saved_sigmask),
+			NULL);
 #endif
 
 	/* do the arch specific task caches init */
@@ -2208,9 +2214,11 @@ void __init proc_caches_init(void)
 	 * maximum number of CPU's we can ever have.  The cpumask_allocation
 	 * is at the end of the structure, exactly for that reason.
 	 */
-	mm_cachep = kmem_cache_create("mm_struct",
+	mm_cachep = kmem_cache_create_usercopy("mm_struct",
 			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			offsetof(struct mm_struct, saved_auxv),
+			sizeof_field(struct mm_struct, saved_auxv),
 			NULL);
 	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
 	mmap_init();
-- 
2.7.4


* [PATCH 15/23] net: define usercopy region in struct proto slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (13 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 14/23] fork: define usercopy region in thread_stack, task_struct, mm_struct slab caches Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 16/23] net: copy struct sctp_sock.autoclose to userspace using put_user() Kees Cook
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The following objects need to be copied to/from userspace:

  * sctp socket event notification subscription information
  * ICMP filters for IPv4 and IPv6 raw sockets
  * CAIF channel connection request parameters

These objects are stored in per-protocol slabs.

In support of usercopy hardening, this patch defines a region in
the struct proto slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/net/sock.h     | 2 ++
 net/caif/caif_socket.c | 2 ++
 net/core/sock.c        | 5 +++--
 net/ipv4/raw.c         | 2 ++
 net/ipv6/raw.c         | 2 ++
 net/sctp/socket.c      | 4 ++++
 6 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index f33e3d134e0b..9cc6052d3dac 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1091,6 +1091,8 @@ struct proto {
 	struct kmem_cache	*slab;
 	unsigned int		obj_size;
 	int			slab_flags;
+	size_t			useroffset;	/* Usercopy region offset */
+	size_t			usersize;	/* Usercopy region size */
 
 	struct percpu_counter	*orphan_count;
 
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index adcad344c843..73fa59d87c3b 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -1028,6 +1028,8 @@ static int caif_create(struct net *net, struct socket *sock, int protocol,
 	static struct proto prot = {.name = "PF_CAIF",
 		.owner = THIS_MODULE,
 		.obj_size = sizeof(struct caifsock),
+		.useroffset = offsetof(struct caifsock, conn_req.param),
+		.usersize = sizeof_field(struct caifsock, conn_req.param)
 	};
 
 	if (!capable(CAP_SYS_ADMIN) && !capable(CAP_NET_ADMIN))
diff --git a/net/core/sock.c b/net/core/sock.c
index 727f924b7f91..9e229874c785 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3049,9 +3049,10 @@ static int req_prot_init(const struct proto *prot)
 int proto_register(struct proto *prot, int alloc_slab)
 {
 	if (alloc_slab) {
-		prot->slab = kmem_cache_create(prot->name, prot->obj_size, 0,
+		prot->slab = kmem_cache_create_usercopy(prot->name,
+					prot->obj_size, 0,
 					SLAB_HWCACHE_ALIGN | prot->slab_flags,
-					NULL);
+					prot->useroffset, prot->usersize, NULL);
 
 		if (prot->slab == NULL) {
 			pr_crit("%s: Can't create sock SLAB cache!\n",
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index bdffad875691..336d555ad237 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -964,6 +964,8 @@ struct proto raw_prot = {
 	.hash		   = raw_hash_sk,
 	.unhash		   = raw_unhash_sk,
 	.obj_size	   = sizeof(struct raw_sock),
+	.useroffset	   = offsetof(struct raw_sock, filter),
+	.usersize	   = sizeof_field(struct raw_sock, filter),
 	.h.raw_hash	   = &raw_v4_hashinfo,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_raw_setsockopt,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 60be012fe708..27dd9a5f71c6 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1265,6 +1265,8 @@ struct proto rawv6_prot = {
 	.hash		   = raw_hash_sk,
 	.unhash		   = raw_unhash_sk,
 	.obj_size	   = sizeof(struct raw6_sock),
+	.useroffset	   = offsetof(struct raw6_sock, filter),
+	.usersize	   = sizeof_field(struct raw6_sock, filter),
 	.h.raw_hash	   = &raw_v6_hashinfo,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_rawv6_setsockopt,
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index f16c8d97b7f3..0defc0c76552 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -8178,6 +8178,10 @@ struct proto sctp_prot = {
 	.unhash      =	sctp_unhash,
 	.get_port    =	sctp_get_port,
 	.obj_size    =  sizeof(struct sctp_sock),
+	.useroffset  =  offsetof(struct sctp_sock, subscribe),
+	.usersize    =  offsetof(struct sctp_sock, initmsg) -
+				offsetof(struct sctp_sock, subscribe) +
+				sizeof_field(struct sctp_sock, initmsg),
 	.sysctl_mem  =  sysctl_sctp_mem,
 	.sysctl_rmem =  sysctl_sctp_rmem,
 	.sysctl_wmem =  sysctl_sctp_wmem,
-- 
2.7.4


* [PATCH 16/23] net: copy struct sctp_sock.autoclose to userspace using put_user()
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (14 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 15/23] net: define usercopy region in struct proto slab cache Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache Kees Cook
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

The autoclose field can be copied with put_user(), so there is no need to
use copy_to_user(). In both cases, hardened usercopy is being bypassed
since the size is constant, and not open to runtime manipulation.

This patch is verbatim from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 net/sctp/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 0defc0c76552..e94c141bcf82 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4883,7 +4883,7 @@ static int sctp_getsockopt_autoclose(struct sock *sk, int len, char __user *optv
 	len = sizeof(int);
 	if (put_user(len, optlen))
 		return -EFAULT;
-	if (copy_to_user(optval, &sctp_sk(sk)->autoclose, sizeof(int)))
+	if (put_user(sctp_sk(sk)->autoclose, (int __user *)optval))
 		return -EFAULT;
 	return 0;
 }
-- 
2.7.4


* [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (15 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 16/23] net: copy struct sctp_sock.autoclose to userspace using put_user() Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-20  4:08   ` [kernel-hardening] " Eric Biggers
  2017-06-19 23:36 ` [PATCH 18/23] scsi: define usercopy region in scsi_sense_cache " Kees Cook
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

When a dentry name is short enough, it can be stored directly in
the dentry itself.  These dentry short names, stored in struct
dentry.d_iname and therefore contained in the dentry_cache slab cache,
need to be copied to/from userspace.

In support of usercopy hardening, this patch defines a region in
the dentry_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust hunks for kmalloc-specific things moved later, adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/dcache.c          | 5 +++--
 include/linux/slab.h | 5 +++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index f7f3c4114baa..bae2e148946c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3580,8 +3580,9 @@ static void __init dcache_init(void)
 	 * but it is probably not worth it because of the cache nature
 	 * of the dcache. 
 	 */
-	dentry_cache = KMEM_CACHE(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
+	dentry_cache = KMEM_CACHE_USERCOPY(dentry,
+		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
+		d_iname);
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index a48f54238273..97f4a0117b3b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -151,6 +151,11 @@ void memcg_destroy_kmem_caches(struct mem_cgroup *);
 		sizeof(struct __struct), __alignof__(struct __struct),\
 		(__flags), NULL)
 
+#define KMEM_CACHE_USERCOPY(__struct, __flags, __field) kmem_cache_create_usercopy(#__struct,\
+		sizeof(struct __struct), __alignof__(struct __struct),\
+		(__flags), offsetof(struct __struct, __field),\
+		sizeof_field(struct __struct, __field), NULL)
+
 /*
  * Common kmalloc functions provided by all allocators
  */
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


* [PATCH 18/23] scsi: define usercopy region in scsi_sense_cache slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (16 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 19/23] xfs: define usercopy region in xfs_inode " Kees Cook
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

SCSI sense buffers, allocated from the scsi_sense_cache slab cache and
referenced by struct scsi_cmnd.sense_buffer, need to be copied to/from
contained in the scsi_sense_cache slab cache, need to be copied to/from
userspace.

In support of usercopy hardening, this patch defines a region in
the scsi_sense_cache slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

Signed-off-by: David Windsor <dave@nullcore.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 drivers/scsi/scsi_lib.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 99e16ac479e3..fc5052aded84 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -77,14 +77,15 @@ int scsi_init_sense_cache(struct Scsi_Host *shost)
 	if (shost->unchecked_isa_dma) {
 		scsi_sense_isadma_cache =
 			kmem_cache_create("scsi_sense_cache(DMA)",
-			SCSI_SENSE_BUFFERSIZE, 0,
-			SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA, NULL);
+				SCSI_SENSE_BUFFERSIZE, 0,
+				SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA, NULL);
 		if (!scsi_sense_isadma_cache)
 			ret = -ENOMEM;
 	} else {
 		scsi_sense_cache =
-			kmem_cache_create("scsi_sense_cache",
-			SCSI_SENSE_BUFFERSIZE, 0, SLAB_HWCACHE_ALIGN, NULL);
+			kmem_cache_create_usercopy("scsi_sense_cache",
+				SCSI_SENSE_BUFFERSIZE, 0, SLAB_HWCACHE_ALIGN,
+				0, SCSI_SENSE_BUFFERSIZE, NULL);
 		if (!scsi_sense_cache)
 			ret = -ENOMEM;
 	}
-- 
2.7.4


* [PATCH 19/23] xfs: define usercopy region in xfs_inode slab cache
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (17 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 18/23] scsi: define usercopy region in scsi_sense_cache " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 20/23] usercopy: convert kmalloc caches to usercopy caches Kees Cook
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

XFS inline inode data, stored in xfs_inode_t.i_df.if_u2.if_inline_data
and therefore contained in the xfs_inode slab cache, needs to be copied
to/from userspace.

In support of usercopy hardening, this patch defines a region in
the xfs_inode slab cache in which userspace copy operations
are allowed.

This region is known as the slab cache's usercopy region.  Slab
caches can now check that each copy operation involving cache-managed
memory falls entirely within the slab's usercopy region.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/xfs/kmem.h      | 10 ++++++++++
 fs/xfs/xfs_super.c |  7 +++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index d6ea520162b2..b3f02b6226b3 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -100,6 +100,16 @@ kmem_zone_init_flags(int size, char *zone_name, unsigned long flags,
 	return kmem_cache_create(zone_name, size, 0, flags, construct);
 }
 
+static inline kmem_zone_t *
+kmem_zone_init_flags_usercopy(int size, char *zone_name, unsigned long flags,
+				size_t useroffset, size_t usersize,
+				void (*construct)(void *))
+{
+	return kmem_cache_create_usercopy(zone_name, size, 0, flags,
+				useroffset, usersize, construct);
+}
+
+
 static inline void
 kmem_zone_free(kmem_zone_t *zone, void *ptr)
 {
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 455a575f101d..b6963baa3ac8 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1828,9 +1828,12 @@ xfs_init_zones(void)
 		goto out_destroy_efd_zone;
 
 	xfs_inode_zone =
-		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
+		kmem_zone_init_flags_usercopy(sizeof(xfs_inode_t), "xfs_inode",
 			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
-			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
+				KM_ZONE_ACCOUNT,
+			offsetof(xfs_inode_t, i_df.if_u2.if_inline_data),
+			sizeof_field(xfs_inode_t, i_df.if_u2.if_inline_data),
+			xfs_fs_inode_init_once);
 	if (!xfs_inode_zone)
 		goto out_destroy_efi_zone;
 
-- 
2.7.4


* [PATCH 20/23] usercopy: convert kmalloc caches to usercopy caches
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (18 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 19/23] xfs: define usercopy region in xfs_inode " Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-19 23:36 ` [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0 Kees Cook
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

Mark the kmalloc slab caches as entirely whitelisted.  These
caches are frequently used to fulfill kernel allocations that contain
data to be copied to/from userspace.  It is impossible to know
at build time what objects will be contained in these caches,
so the entire cache is whitelisted.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: merged in moved kmalloc hunks, adjust commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 mm/slab.c        |  3 ++-
 mm/slab.h        |  3 ++-
 mm/slab_common.c | 10 ++++++----
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 5c78830aeea0..4cafbe13f239 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1291,7 +1291,8 @@ void __init kmem_cache_init(void)
 	 */
 	kmalloc_caches[INDEX_NODE] = create_kmalloc_cache(
 				kmalloc_info[INDEX_NODE].name,
-				kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS);
+				kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
+				0, kmalloc_size(INDEX_NODE));
 	slab_state = PARTIAL_NODE;
 	setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab.h b/mm/slab.h
index 92c0cedb296d..4cdc8e64fdbd 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -96,7 +96,8 @@ struct kmem_cache *kmalloc_slab(size_t, gfp_t);
 extern int __kmem_cache_create(struct kmem_cache *, unsigned long flags);
 
 extern struct kmem_cache *create_kmalloc_cache(const char *name, size_t size,
-			unsigned long flags);
+			unsigned long flags, size_t useroffset,
+			size_t usersize);
 extern void create_boot_cache(struct kmem_cache *, const char *name,
 			size_t size, unsigned long flags, size_t useroffset,
 			size_t usersize);
diff --git a/mm/slab_common.c b/mm/slab_common.c
index af97465b99e6..685321a0d355 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -917,14 +917,15 @@ void __init create_boot_cache(struct kmem_cache *s, const char *name, size_t siz
 }
 
 struct kmem_cache *__init create_kmalloc_cache(const char *name, size_t size,
-				unsigned long flags)
+				unsigned long flags, size_t useroffset,
+				size_t usersize)
 {
 	struct kmem_cache *s = kmem_cache_zalloc(kmem_cache, GFP_NOWAIT);
 
 	if (!s)
 		panic("Out of memory when creating slab %s\n", name);
 
-	create_boot_cache(s, name, size, flags, 0, size);
+	create_boot_cache(s, name, size, flags, useroffset, usersize);
 	list_add(&s->list, &slab_caches);
 	memcg_link_cache(s);
 	s->refcount = 1;
@@ -1078,7 +1079,8 @@ void __init setup_kmalloc_cache_index_table(void)
 static void __init new_kmalloc_cache(int idx, unsigned long flags)
 {
 	kmalloc_caches[idx] = create_kmalloc_cache(kmalloc_info[idx].name,
-					kmalloc_info[idx].size, flags);
+					kmalloc_info[idx].size, flags, 0,
+					kmalloc_info[idx].size);
 }
 
 /*
@@ -1119,7 +1121,7 @@ void __init create_kmalloc_caches(unsigned long flags)
 
 			BUG_ON(!n);
 			kmalloc_dma_caches[i] = create_kmalloc_cache(n,
-				size, SLAB_CACHE_DMA | flags);
+				size, SLAB_CACHE_DMA | flags, 0, 0);
 		}
 	}
 #endif
-- 
2.7.4


* [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (19 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 20/23] usercopy: convert kmalloc caches to usercopy caches Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-20  4:04   ` [kernel-hardening] " Eric Biggers
  2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

With all known usercopied cache whitelists now defined in the kernel, switch
the default usercopy region of kmem_cache_create() to size 0. Any new caches
with usercopy regions will now need to use kmem_cache_create_usercopy()
instead of kmem_cache_create().

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
whitelisting code in the last public patch of grsecurity/PaX based on my
understanding of the code. Changes or omissions from the original code are
mine and don't reflect the original grsecurity/PaX code.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: David Windsor <dave@nullcore.net>
---
 mm/slab_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 685321a0d355..2365dd21623d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -512,7 +512,7 @@ struct kmem_cache *
 kmem_cache_create(const char *name, size_t size, size_t align,
 		unsigned long flags, void (*ctor)(void *))
 {
-	return kmem_cache_create_usercopy(name, size, align, flags, 0, size,
+	return kmem_cache_create_usercopy(name, size, align, flags, 0, 0,
 					  ctor);
 }
 EXPORT_SYMBOL(kmem_cache_create);
-- 
2.7.4


* [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (20 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0 Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-20  4:24   ` [kernel-hardening] " Eric Biggers
                     ` (2 more replies)
  2017-06-19 23:36 ` [PATCH 23/23] mm: Allow slab_nomerge to be set at build time Kees Cook
                   ` (2 subsequent siblings)
  24 siblings, 3 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

From: David Windsor <dave@nullcore.net>

Some userspace APIs (e.g. ipc, seq_file) provide precise control over
the size of kernel kmallocs, which provides a trivial way to perform
heap overflow attacks where the attacker must control neighboring
allocations of a specific size. Instead, move these APIs into their own
cache so they cannot interfere with standard kmallocs. This is enabled
with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.

This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
code in the last public patch of grsecurity/PaX based on my understanding
of the code. Changes or omissions from the original code are mine and
don't reflect the original grsecurity/PaX code.

Signed-off-by: David Windsor <dave@nullcore.net>
[kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/seq_file.c        |  2 +-
 include/linux/gfp.h  |  9 ++++++++-
 include/linux/slab.h | 12 ++++++++++++
 ipc/msgutil.c        |  5 +++--
 mm/slab.h            |  3 ++-
 mm/slab_common.c     | 29 ++++++++++++++++++++++++++++-
 security/Kconfig     | 12 ++++++++++++
 7 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index dc7c2be963ed..5caa58a19bdc 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -25,7 +25,7 @@ static void seq_set_overflow(struct seq_file *m)
 
 static void *seq_buf_alloc(unsigned long size)
 {
-	return kvmalloc(size, GFP_KERNEL);
+	return kvmalloc(size, GFP_KERNEL | GFP_USERCOPY);
 }
 
 /**
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index a89d37e8b387..ff4f4a698ad0 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -45,6 +45,7 @@ struct vm_area_struct;
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
+#define ___GFP_USERCOPY		0x4000000u
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */
 
 /*
@@ -83,12 +84,17 @@ struct vm_area_struct;
  *   node with no fallbacks or placement policy enforcements.
  *
  * __GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
+ *
+ * __GFP_USERCOPY indicates that the page will be explicitly copied to/from
+ *   userspace, and may be allocated from a separate kmalloc pool.
+ *
  */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
 #define __GFP_WRITE	((__force gfp_t)___GFP_WRITE)
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL)
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)
 #define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)
+#define __GFP_USERCOPY	((__force gfp_t)___GFP_USERCOPY)
 
 /*
  * Watermark modifiers -- controls access to emergency reserves
@@ -188,7 +194,7 @@ struct vm_area_struct;
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (26 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /*
@@ -268,6 +274,7 @@ struct vm_area_struct;
 #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
+#define GFP_USERCOPY	__GFP_USERCOPY
 
 /* Convert GFP flags to their corresponding migrate type */
 #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 97f4a0117b3b..7d9d7d838991 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -25,6 +25,7 @@
 #define SLAB_POISON		0x00000800UL	/* DEBUG: Poison objects */
 #define SLAB_HWCACHE_ALIGN	0x00002000UL	/* Align objs on cache lines */
 #define SLAB_CACHE_DMA		0x00004000UL	/* Use GFP_DMA memory */
+#define SLAB_NO_MERGE		0x00008000UL	/* Keep this cache unmerged */
 #define SLAB_STORE_USER		0x00010000UL	/* DEBUG: Store the last owner for bug hunting */
 #define SLAB_PANIC		0x00040000UL	/* Panic if kmem_cache_create() fails */
 /*
@@ -287,6 +288,17 @@ extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
 #endif
 
 /*
+ * Some userspace APIs (ipc, seq_file) provide precise control over
+ * the size of kernel kmallocs, which provides a trivial way to perform
+ * heap overflow attacks where the attacker must control neighboring
+ * allocations.  Instead, move these APIs into their own cache so they
+ * cannot interfere with standard kmallocs.
+ */
+#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
+extern struct kmem_cache *kmalloc_usersized_caches[KMALLOC_SHIFT_HIGH + 1];
+#endif
+
+/*
  * Figure out which kmalloc slab an allocation of a certain size
  * belongs to.
  * 0 = zero alloc
diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index bf74eaa5c39f..5ae33d50da26 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -53,7 +53,7 @@ static struct msg_msg *alloc_msg(size_t len)
 	size_t alen;
 
 	alen = min(len, DATALEN_MSG);
-	msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT);
+	msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT | GFP_USERCOPY);
 	if (msg == NULL)
 		return NULL;
 
@@ -65,7 +65,8 @@ static struct msg_msg *alloc_msg(size_t len)
 	while (len > 0) {
 		struct msg_msgseg *seg;
 		alen = min(len, DATALEN_SEG);
-		seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT);
+		seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT |
+			GFP_USERCOPY);
 		if (seg == NULL)
 			goto out_err;
 		*pseg = seg;
diff --git a/mm/slab.h b/mm/slab.h
index 4cdc8e64fdbd..874b755f278d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -130,7 +130,8 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
 #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
-			 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
+			 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS | \
+			 SLAB_NO_MERGE)
 
 #if defined(CONFIG_DEBUG_SLAB)
 #define SLAB_DEBUG_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2365dd21623d..6c14d765379f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -40,7 +40,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
  */
 #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 		SLAB_TRACE | SLAB_TYPESAFE_BY_RCU | SLAB_NOLEAKTRACE | \
-		SLAB_FAILSLAB | SLAB_KASAN)
+		SLAB_FAILSLAB | SLAB_KASAN | SLAB_NO_MERGE)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
 			 SLAB_NOTRACK | SLAB_ACCOUNT)
@@ -940,6 +940,11 @@ struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
 EXPORT_SYMBOL(kmalloc_dma_caches);
 #endif
 
+#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
+struct kmem_cache *kmalloc_usersized_caches[KMALLOC_SHIFT_HIGH + 1];
+EXPORT_SYMBOL(kmalloc_usersized_caches);
+#endif
+
 /*
  * Conversion table for small slabs sizes / 8 to the index in the
  * kmalloc array. This is necessary for slabs < 192 since we have non power
@@ -1004,6 +1009,12 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
 		return kmalloc_dma_caches[index];
 
 #endif
+
+#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
+	if (unlikely((flags & GFP_USERCOPY)))
+		return kmalloc_usersized_caches[index];
+#endif
+
 	return kmalloc_caches[index];
 }
 
@@ -1125,6 +1136,22 @@ void __init create_kmalloc_caches(unsigned long flags)
 		}
 	}
 #endif
+
+#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
+	for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+		struct kmem_cache *s = kmalloc_caches[i];
+
+		if (s) {
+			int size = kmalloc_size(i);
+			char *n = kasprintf(GFP_NOWAIT,
+				"usersized-kmalloc-%d", size);
+
+			BUG_ON(!n);
+			kmalloc_usersized_caches[i] = create_kmalloc_cache(n,
+				size, SLAB_NO_MERGE | flags, 0, size);
+		}
+	}
+#endif /* CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC */
 }
 #endif /* !CONFIG_SLOB */
 
diff --git a/security/Kconfig b/security/Kconfig
index 93027fdf47d1..0c181cebdb8a 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -154,6 +154,18 @@ config HARDENED_USERCOPY_PAGESPAN
 	  been removed. This config is intended to be used only while
 	  trying to find such users.
 
+config HARDENED_USERCOPY_SPLIT_KMALLOC
+	bool "Isolate kernel caches from user-controlled allocations"
+	default HARDENED_USERCOPY
+	help
+	  This option creates a separate set of kmalloc caches used to
+	  satisfy allocations from userspace APIs that allow for
+	  fine-grained control over the size of kernel allocations.
+	  Without this, it is much easier for attackers to precisely
+	  size and attack heap overflows.  If their allocations are
+	  confined to a separate cache, attackers must find other ways
+	  to prepare heap attacks that will be near their desired target.
+
 config STATIC_USERMODEHELPER
 	bool "Force all usermode helper calls through a single binary"
 	help
-- 
2.7.4


* [PATCH 23/23] mm: Allow slab_nomerge to be set at build time
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (21 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
@ 2017-06-19 23:36 ` Kees Cook
  2017-06-20  4:09   ` [kernel-hardening] " Daniel Micay
  2017-06-20  4:29   ` Eric Biggers
  2017-06-20 19:41 ` [kernel-hardening] [PATCH 00/23] Hardened usercopy whitelisting Rik van Riel
  2017-10-20 22:40 ` Paolo Bonzini
  24 siblings, 2 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-19 23:36 UTC (permalink / raw)
  To: kernel-hardening; +Cc: Kees Cook, David Windsor, linux-mm, linux-kernel

Some hardened environments want to build kernels with slab_nomerge
already set (so that they do not depend on remembering to set the kernel
command line option). This is desired to reduce the risk of kernel heap
overflows being able to overwrite objects from merged caches, increasing
the difficulty of these attacks. By keeping caches unmerged, these kinds
of exploits can usually only damage objects in the same cache (though the
risk to metadata exploitation is unchanged).

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 mm/slab_common.c |  5 ++---
 security/Kconfig | 13 +++++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 6c14d765379f..17a4c4b33283 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -47,13 +47,12 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
- * (Could be removed. This was introduced to pacify the merge skeptics.)
  */
-static int slab_nomerge;
+static bool slab_nomerge = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT);
 
 static int __init setup_slab_nomerge(char *str)
 {
-	slab_nomerge = 1;
+	slab_nomerge = true;
 	return 1;
 }
 
diff --git a/security/Kconfig b/security/Kconfig
index 0c181cebdb8a..e40bd2a260f8 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -166,6 +166,19 @@ config HARDENED_USERCOPY_SPLIT_KMALLOC
 	  confined to a separate cache, attackers must find other ways
 	  to prepare heap attacks that will be near their desired target.
 
+config SLAB_MERGE_DEFAULT
+	bool "Allow slab caches to be merged"
+	default y
+	help
+	  For reduced kernel memory fragmentation, slab caches can be
+	  merged when they share the same size and other characteristics.
+	  This carries a small risk of kernel heap overflows being able
+	  to overwrite objects from merged caches, which reduces the
+	  difficulty of such heap attacks. By keeping caches unmerged,
+	  these kinds of exploits can usually only damage objects in the
+	  same cache. To disable merging at runtime, "slab_nomerge" can be
+	  passed on the kernel command line.
+
 config STATIC_USERMODEHELPER
 	bool "Force all usermode helper calls through a single binary"
 	help
-- 
2.7.4


* Re: [kernel-hardening] [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0
  2017-06-19 23:36 ` [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0 Kees Cook
@ 2017-06-20  4:04   ` Eric Biggers
  2017-06-28 17:03     ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2017-06-20  4:04 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, linux-mm, linux-kernel

Hi David + Kees,

On Mon, Jun 19, 2017 at 04:36:35PM -0700, Kees Cook wrote:
> With all known usercopied cache whitelists now defined in the kernel, switch
> the default usercopy region of kmem_cache_create() to size 0. Any new caches
> with usercopy regions will now need to use kmem_cache_create_usercopy()
> instead of kmem_cache_create().
> 

While I'd certainly like to see the caches be whitelisted, it needs to be made
very clear that it's being done (the cover letter for this series falsely claims
that kmem_cache_create() is unchanged) and what the consequences are.  Is there
any specific plan for identifying caches that were missed?  If it's expected for
people to just fix them as they are found, then they need to be helped a little
--- at the very least by putting a big comment above report_usercopy() that
explains the possible reasons why the error might have triggered and what to do
about it.

- Eric


* Re: [kernel-hardening] [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache
  2017-06-19 23:36 ` [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache Kees Cook
@ 2017-06-20  4:08   ` Eric Biggers
  2017-06-28 16:44     ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2017-06-20  4:08 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, linux-mm, linux-kernel

On Mon, Jun 19, 2017 at 04:36:31PM -0700, Kees Cook wrote:
> From: David Windsor <dave@nullcore.net>
> 
> When a dentry name is short enough, it can be stored directly in
> the dentry itself.  These dentry short names, stored in struct
> dentry.d_iname and therefore contained in the dentry_cache slab cache,
> need to be copied to/from userspace.
> 
> In support of usercopy hardening, this patch defines a region in
> the dentry_cache slab cache in which userspace copy operations
> are allowed.
> 
> This region is known as the slab cache's usercopy region.  Slab
> caches can now check that each copy operation involving cache-managed
> memory falls entirely within the slab's usercopy region.
> 
> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
> whitelisting code in the last public patch of grsecurity/PaX based on my
> understanding of the code. Changes or omissions from the original code are
> mine and don't reflect the original grsecurity/PaX code.
> 

For all these patches please mention *where* the data is being copied to/from
userspace.

> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index a48f54238273..97f4a0117b3b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -151,6 +151,11 @@ void memcg_destroy_kmem_caches(struct mem_cgroup *);
>  		sizeof(struct __struct), __alignof__(struct __struct),\
>  		(__flags), NULL)
>  
> +#define KMEM_CACHE_USERCOPY(__struct, __flags, __field) kmem_cache_create_usercopy(#__struct,\
> +		sizeof(struct __struct), __alignof__(struct __struct),\
> +		(__flags), offsetof(struct __struct, __field),\
> +		sizeof_field(struct __struct, __field), NULL)
> +

This helper macro should be added in the patch which adds
kmem_cache_create_usercopy(), not in this one.

- Eric


* Re: [kernel-hardening] [PATCH 23/23] mm: Allow slab_nomerge to be set at build time
  2017-06-19 23:36 ` [PATCH 23/23] mm: Allow slab_nomerge to be set at build time Kees Cook
@ 2017-06-20  4:09   ` Daniel Micay
  2017-06-20 22:51     ` Kees Cook
  2017-06-20  4:29   ` Eric Biggers
  1 sibling, 1 reply; 45+ messages in thread
From: Daniel Micay @ 2017-06-20  4:09 UTC (permalink / raw)
  To: Kees Cook, kernel-hardening; +Cc: David Windsor, linux-mm, linux-kernel

On Mon, 2017-06-19 at 16:36 -0700, Kees Cook wrote:
> Some hardened environments want to build kernels with slab_nomerge
> already set (so that they do not depend on remembering to set the
> kernel
> command line option). This is desired to reduce the risk of kernel
> heap
> overflows being able to overwrite objects from merged caches,
> increasing
> the difficulty of these attacks. By keeping caches unmerged, these
> kinds
> of exploits can usually only damage objects in the same cache (though
> the
> risk to metadata exploitation is unchanged).

It also further fragments the ability to influence slab cache layout,
i.e. primitives to do things like filling up slabs to set things up for
an exploit might not be able to deal with the target slabs anymore. It
doesn't need to be mentioned but it's something to think about too. In
theory, disabling merging can make it *easier* to get the right layout
too if there was some annoyance that's now split away. It's definitely a
lot more good than bad for security though, but allocator changes have
subtle impact on exploitation. This can make caches more deterministic.


* Re: [kernel-hardening] [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
@ 2017-06-20  4:24   ` Eric Biggers
  2017-06-20  4:47   ` Eric Biggers
  2017-06-20 20:24   ` Laura Abbott
  2 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2017-06-20  4:24 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, linux-mm, linux-kernel

On Mon, Jun 19, 2017 at 04:36:36PM -0700, Kees Cook wrote:
> From: David Windsor <dave@nullcore.net>
> 
> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
> the size of kernel kmallocs, which provides a trivial way to perform
> heap overflow attacks where the attacker must control neighboring
> allocations of a specific size. Instead, move these APIs into their own
> cache so they cannot interfere with standard kmallocs. This is enabled
> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
> 

This is a logically separate change which IMO should be its own patch, not just
patch 22/23.

Also, is this really just about heap overflows?  I thought the main purpose of
separate heaps is to make it more difficult to exploit use-after-frees, since
anything allocating an object from heap A cannot overwrite freed memory in heap
B.  (At least, not at the SLAB level; it may still be done at the page level.)

> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index a89d37e8b387..ff4f4a698ad0 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -45,6 +45,7 @@ struct vm_area_struct;
>  #else
>  #define ___GFP_NOLOCKDEP	0
>  #endif
> +#define ___GFP_USERCOPY		0x4000000u
>  /* If the above are modified, __GFP_BITS_SHIFT may need updating */
>  
>  /*
> @@ -83,12 +84,17 @@ struct vm_area_struct;
>   *   node with no fallbacks or placement policy enforcements.
>   *
>   * __GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
> + *
> + * __GFP_USERCOPY indicates that the page will be explicitly copied to/from
> + *   userspace, and may be allocated from a separate kmalloc pool.
> + *
>   */

The "page", or the allocation?  It's only for slab objects, is it not?  More
importantly, the purpose of this needs to be clearly documented; otherwise
people won't know what this is and whether they should/need to use it or not.

- Eric


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [kernel-hardening] [PATCH 23/23] mm: Allow slab_nomerge to be set at build time
  2017-06-19 23:36 ` [PATCH 23/23] mm: Allow slab_nomerge to be set at build time Kees Cook
  2017-06-20  4:09   ` [kernel-hardening] " Daniel Micay
@ 2017-06-20  4:29   ` Eric Biggers
  2017-06-20 23:09     ` Kees Cook
  1 sibling, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2017-06-20  4:29 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, linux-mm, linux-kernel

On Mon, Jun 19, 2017 at 04:36:37PM -0700, Kees Cook wrote:
> Some hardened environments want to build kernels with slab_nomerge
> already set (so that they do not depend on remembering to set the kernel
> command line option). This is desired to reduce the risk of kernel heap
> overflows being able to overwrite objects from merged caches, increasing
> the difficulty of these attacks. By keeping caches unmerged, these kinds
> of exploits can usually only damage objects in the same cache (though the
> risk to metadata exploitation is unchanged).
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  mm/slab_common.c |  5 ++---
>  security/Kconfig | 13 +++++++++++++
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 6c14d765379f..17a4c4b33283 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -47,13 +47,12 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
>  
>  /*
>   * Merge control. If this is set then no merging of slab caches will occur.
> - * (Could be removed. This was introduced to pacify the merge skeptics.)
>   */
> -static int slab_nomerge;
> +static bool slab_nomerge = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT);
>  
>  static int __init setup_slab_nomerge(char *str)
>  {
> -	slab_nomerge = 1;
> +	slab_nomerge = true;
>  	return 1;
>  }
>  
> diff --git a/security/Kconfig b/security/Kconfig
> index 0c181cebdb8a..e40bd2a260f8 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -166,6 +166,19 @@ config HARDENED_USERCOPY_SPLIT_KMALLOC
>  	  confined to a separate cache, attackers must find other ways
>  	  to prepare heap attacks that will be near their desired target.
>  
> +config SLAB_MERGE_DEFAULT
> +	bool "Allow slab caches to be merged"
> +	default y
> +	help
> +	  For reduced kernel memory fragmentation, slab caches can be
> +	  merged when they share the same size and other characteristics.
> +	  This carries a small risk of kernel heap overflows being able
> +	  to overwrite objects from merged caches, which reduces the
> +	  difficulty of such heap attacks. By keeping caches unmerged,
> +	  these kinds of exploits can usually only damage objects in the
> +	  same cache. To disable merging at runtime, "slab_nomerge" can be
> +	  passed on the kernel command line.
> +

It's good to at least have this option, but again it's logically separate and
shouldn't just be hidden in patch 23/23.  And again, is it really just about
heap overflows?

Please also fix the documentation for slab_nomerge in
Documentation/admin-guide/kernel-parameters.txt.

- Eric


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [kernel-hardening] [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
  2017-06-20  4:24   ` [kernel-hardening] " Eric Biggers
@ 2017-06-20  4:47   ` Eric Biggers
  2017-06-20 22:27     ` Kees Cook
  2017-06-20 20:24   ` Laura Abbott
  2 siblings, 1 reply; 45+ messages in thread
From: Eric Biggers @ 2017-06-20  4:47 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, linux-mm, linux-kernel

On Mon, Jun 19, 2017 at 04:36:36PM -0700, Kees Cook wrote:
> From: David Windsor <dave@nullcore.net>
> 
> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
> the size of kernel kmallocs, which provides a trivial way to perform
> heap overflow attacks where the attacker must control neighboring
> allocations of a specific size. Instead, move these APIs into their own
> cache so they cannot interfere with standard kmallocs. This is enabled
> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
> 
> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
> code in the last public patch of grsecurity/PaX based on my understanding
> of the code. Changes or omissions from the original code are mine and
> don't reflect the original grsecurity/PaX code.
> 
> Signed-off-by: David Windsor <dave@nullcore.net>
> [kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  fs/seq_file.c        |  2 +-
>  include/linux/gfp.h  |  9 ++++++++-
>  include/linux/slab.h | 12 ++++++++++++
>  ipc/msgutil.c        |  5 +++--
>  mm/slab.h            |  3 ++-
>  mm/slab_common.c     | 29 ++++++++++++++++++++++++++++-
>  security/Kconfig     | 12 ++++++++++++
>  7 files changed, 66 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/seq_file.c b/fs/seq_file.c
> index dc7c2be963ed..5caa58a19bdc 100644
> --- a/fs/seq_file.c
> +++ b/fs/seq_file.c
> @@ -25,7 +25,7 @@ static void seq_set_overflow(struct seq_file *m)
>  
>  static void *seq_buf_alloc(unsigned long size)
>  {
> -	return kvmalloc(size, GFP_KERNEL);
> +	return kvmalloc(size, GFP_KERNEL | GFP_USERCOPY);
>  }
>  

Also forgot to mention the obvious: there are way more places where GFP_USERCOPY
would need to be (or should be) used.  Helper functions like memdup_user() and
memdup_user_nul() would be the obvious ones.  And just a random example, some of
the keyrings syscalls (callable with no privileges) do a kmalloc() with
user-controlled contents and size.

So I think this by itself needs its own patch series.

Eric


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [kernel-hardening] [PATCH 00/23] Hardened usercopy whitelisting
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (22 preceding siblings ...)
  2017-06-19 23:36 ` [PATCH 23/23] mm: Allow slab_nomerge to be set at build time Kees Cook
@ 2017-06-20 19:41 ` Rik van Riel
  2017-10-20 22:40 ` Paolo Bonzini
  24 siblings, 0 replies; 45+ messages in thread
From: Rik van Riel @ 2017-06-20 19:41 UTC (permalink / raw)
  To: Kees Cook, kernel-hardening; +Cc: David Windsor, linux-mm, linux-kernel


On Mon, 2017-06-19 at 16:36 -0700, Kees Cook wrote:
> This series is modified from Brad Spengler/PaX Team's PAX_USERCOPY
> code
> in the last public patch of grsecurity/PaX based on our understanding
> of the code. Changes or omissions from the original code are ours and
> don't reflect the original grsecurity/PaX code.
> 
> David Windsor did the bulk of the porting, refactoring, splitting,
> testing, etc; I just did some extra tweaks, hunk moving, and small
> extra patches.
> 
> 
> This updates the slab allocator to add annotations (useroffset and
> usersize) to define allowed usercopy regions.

This is a great improvement over the old system
of having a few whitelisted kmalloc caches, and
bounce buffering to copy data from caches that
are not whitelisted!

I like it.

-- 
All rights reversed


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
  2017-06-20  4:24   ` [kernel-hardening] " Eric Biggers
  2017-06-20  4:47   ` Eric Biggers
@ 2017-06-20 20:24   ` Laura Abbott
  2017-06-20 22:22     ` Kees Cook
  2 siblings, 1 reply; 45+ messages in thread
From: Laura Abbott @ 2017-06-20 20:24 UTC (permalink / raw)
  To: Kees Cook, kernel-hardening; +Cc: David Windsor, linux-mm, linux-kernel

On 06/19/2017 04:36 PM, Kees Cook wrote:
> From: David Windsor <dave@nullcore.net>
> 
> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
> the size of kernel kmallocs, which provides a trivial way to perform
> heap overflow attacks where the attacker must control neighboring
> allocations of a specific size. Instead, move these APIs into their own
> cache so they cannot interfere with standard kmallocs. This is enabled
> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
> 
> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
> code in the last public patch of grsecurity/PaX based on my understanding
> of the code. Changes or omissions from the original code are mine and
> don't reflect the original grsecurity/PaX code.
> 
> Signed-off-by: David Windsor <dave@nullcore.net>
> [kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
> Signed-off-by: Kees Cook <keescook@chromium.org>

I just did a quick test of kspp/usercopy-whitelist/latest on my arm64 machine and got some spew:

[   21.818719] Unexpected gfp: 0x4000000 (0x4000000). Fixing up to gfp: 0x14000c0 (GFP_KERNEL). Fix your code!
[   21.828427] CPU: 7 PID: 652 Comm: irqbalance Tainted: G        W       4.12.0-rc5-whitelist+ #236
[   21.837259] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
[   21.846955] Call trace:
[   21.849396] [<ffff000008089b18>] dump_backtrace+0x0/0x210
[   21.854770] [<ffff000008089d4c>] show_stack+0x24/0x30
[   21.859798] [<ffff00000845b7bc>] dump_stack+0x90/0xb4
[   21.864827] [<ffff00000826ff40>] new_slab+0x88/0x90
[   21.869681] [<ffff000008272218>] ___slab_alloc+0x428/0x6b0
[   21.875141] [<ffff0000082724f0>] __slab_alloc+0x50/0x68
[   21.880341] [<ffff000008273208>] __kmalloc_node+0xd0/0x348
[   21.885800] [<ffff000008223af0>] kvmalloc_node+0xa0/0xb8
[   21.891088] [<ffff0000082bb400>] single_open_size+0x40/0xb0
[   21.896636] [<ffff000008315a9c>] stat_open+0x54/0x60
[   21.901576] [<ffff00000830adf8>] proc_reg_open+0x90/0x168
[   21.906950] [<ffff00000828def4>] do_dentry_open+0x1c4/0x328
[   21.912496] [<ffff00000828f470>] vfs_open+0x58/0x88
[   21.917351] [<ffff0000082a1f14>] do_last+0x3d4/0x770
[   21.922292] [<ffff0000082a233c>] path_openat+0x8c/0x2e8
[   21.927492] [<ffff0000082a3888>] do_filp_open+0x70/0xe8
[   21.932692] [<ffff00000828f940>] do_sys_open+0x178/0x208
[   21.937977] [<ffff00000828fa54>] SyS_openat+0x3c/0x50
[   21.943005] [<ffff0000080835f0>] el0_svc_naked+0x24/0x28


I don't think 7e7844226f10 ("lockdep: allow to disable reclaim lockup detection")
is correct after new flags are added because we will still need space
for another bit even if lockdep is disabled. That might need to
be fixed separately.

I'm really not a fan of the GFP approach though since the flags tend
to be a little bit fragile to manage. If we're going to have to
add something to callsites anyway, maybe we could just have an
alternate function (kmalloc_user?) instead of a GFP flag.


> ---
>  fs/seq_file.c        |  2 +-
>  include/linux/gfp.h  |  9 ++++++++-
>  include/linux/slab.h | 12 ++++++++++++
>  ipc/msgutil.c        |  5 +++--
>  mm/slab.h            |  3 ++-
>  mm/slab_common.c     | 29 ++++++++++++++++++++++++++++-
>  security/Kconfig     | 12 ++++++++++++
>  7 files changed, 66 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/seq_file.c b/fs/seq_file.c
> index dc7c2be963ed..5caa58a19bdc 100644
> --- a/fs/seq_file.c
> +++ b/fs/seq_file.c
> @@ -25,7 +25,7 @@ static void seq_set_overflow(struct seq_file *m)
>  
>  static void *seq_buf_alloc(unsigned long size)
>  {
> -	return kvmalloc(size, GFP_KERNEL);
> +	return kvmalloc(size, GFP_KERNEL | GFP_USERCOPY);
>  }
>  
>  /**
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index a89d37e8b387..ff4f4a698ad0 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -45,6 +45,7 @@ struct vm_area_struct;
>  #else
>  #define ___GFP_NOLOCKDEP	0
>  #endif
> +#define ___GFP_USERCOPY		0x4000000u
>  /* If the above are modified, __GFP_BITS_SHIFT may need updating */
>  
>  /*
> @@ -83,12 +84,17 @@ struct vm_area_struct;
>   *   node with no fallbacks or placement policy enforcements.
>   *
>   * __GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
> + *
> + * __GFP_USERCOPY indicates that the page will be explicitly copied to/from
> + *   userspace, and may be allocated from a separate kmalloc pool.
> + *
>   */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
>  #define __GFP_WRITE	((__force gfp_t)___GFP_WRITE)
>  #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL)
>  #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)
>  #define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)
> +#define __GFP_USERCOPY	((__force gfp_t)___GFP_USERCOPY)
>  
>  /*
>   * Watermark modifiers -- controls access to emergency reserves
> @@ -188,7 +194,7 @@ struct vm_area_struct;
>  #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>  
>  /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (26 + IS_ENABLED(CONFIG_LOCKDEP))
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>  
>  /*
> @@ -268,6 +274,7 @@ struct vm_area_struct;
>  #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
>  			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
>  #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
> +#define GFP_USERCOPY	__GFP_USERCOPY
>  
>  /* Convert GFP flags to their corresponding migrate type */
>  #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 97f4a0117b3b..7d9d7d838991 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -25,6 +25,7 @@
>  #define SLAB_POISON		0x00000800UL	/* DEBUG: Poison objects */
>  #define SLAB_HWCACHE_ALIGN	0x00002000UL	/* Align objs on cache lines */
>  #define SLAB_CACHE_DMA		0x00004000UL	/* Use GFP_DMA memory */
> +#define SLAB_NO_MERGE		0x00008000UL	/* Keep this cache unmerged */
>  #define SLAB_STORE_USER		0x00010000UL	/* DEBUG: Store the last owner for bug hunting */
>  #define SLAB_PANIC		0x00040000UL	/* Panic if kmem_cache_create() fails */
>  /*
> @@ -287,6 +288,17 @@ extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
>  #endif
>  
>  /*
> + * Some userspace APIs (ipc, seq_file) provide precise control over
> + * the size of kernel kmallocs, which provides a trivial way to perform
> + * heap overflow attacks where the attacker must control neighboring
> + * allocations.  Instead, move these APIs into their own cache so they
> + * cannot interfere with standard kmallocs.
> + */
> +#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
> +extern struct kmem_cache *kmalloc_usersized_caches[KMALLOC_SHIFT_HIGH + 1];
> +#endif
> +
> +/*
>   * Figure out which kmalloc slab an allocation of a certain size
>   * belongs to.
>   * 0 = zero alloc
> diff --git a/ipc/msgutil.c b/ipc/msgutil.c
> index bf74eaa5c39f..5ae33d50da26 100644
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -53,7 +53,7 @@ static struct msg_msg *alloc_msg(size_t len)
>  	size_t alen;
>  
>  	alen = min(len, DATALEN_MSG);
> -	msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT);
> +	msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT | GFP_USERCOPY);
>  	if (msg == NULL)
>  		return NULL;
>  
> @@ -65,7 +65,8 @@ static struct msg_msg *alloc_msg(size_t len)
>  	while (len > 0) {
>  		struct msg_msgseg *seg;
>  		alen = min(len, DATALEN_SEG);
> -		seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT);
> +		seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT |
> +			GFP_USERCOPY);
>  		if (seg == NULL)
>  			goto out_err;
>  		*pseg = seg;
> diff --git a/mm/slab.h b/mm/slab.h
> index 4cdc8e64fdbd..874b755f278d 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -130,7 +130,8 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>  
>  /* Legal flag mask for kmem_cache_create(), for various configurations */
>  #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
> -			 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
> +			 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS | \
> +			 SLAB_NO_MERGE)
>  
>  #if defined(CONFIG_DEBUG_SLAB)
>  #define SLAB_DEBUG_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 2365dd21623d..6c14d765379f 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -40,7 +40,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
>   */
>  #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
>  		SLAB_TRACE | SLAB_TYPESAFE_BY_RCU | SLAB_NOLEAKTRACE | \
> -		SLAB_FAILSLAB | SLAB_KASAN)
> +		SLAB_FAILSLAB | SLAB_KASAN | SLAB_NO_MERGE)
>  
>  #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
>  			 SLAB_NOTRACK | SLAB_ACCOUNT)
> @@ -940,6 +940,11 @@ struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
>  EXPORT_SYMBOL(kmalloc_dma_caches);
>  #endif
>  
> +#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
> +struct kmem_cache *kmalloc_usersized_caches[KMALLOC_SHIFT_HIGH + 1];
> +EXPORT_SYMBOL(kmalloc_usersized_caches);
> +#endif
> +
>  /*
>   * Conversion table for small slabs sizes / 8 to the index in the
>   * kmalloc array. This is necessary for slabs < 192 since we have non power
> @@ -1004,6 +1009,12 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
>  		return kmalloc_dma_caches[index];
>  
>  #endif
> +
> +#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
> +	if (unlikely((flags & GFP_USERCOPY)))
> +		return kmalloc_usersized_caches[index];
> +#endif
> +
>  	return kmalloc_caches[index];
>  }
>  
> @@ -1125,6 +1136,22 @@ void __init create_kmalloc_caches(unsigned long flags)
>  		}
>  	}
>  #endif
> +
> +#ifdef CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
> +	for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
> +		struct kmem_cache *s = kmalloc_caches[i];
> +
> +		if (s) {
> +			int size = kmalloc_size(i);
> +			char *n = kasprintf(GFP_NOWAIT,
> +				"usersized-kmalloc-%d", size);
> +
> +			BUG_ON(!n);
> +			kmalloc_usersized_caches[i] = create_kmalloc_cache(n,
> +				size, SLAB_NO_MERGE | flags, 0, size);
> +		}
> +	}
> +#endif /* CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC */
>  }
>  #endif /* !CONFIG_SLOB */
>  
> diff --git a/security/Kconfig b/security/Kconfig
> index 93027fdf47d1..0c181cebdb8a 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -154,6 +154,18 @@ config HARDENED_USERCOPY_PAGESPAN
>  	  been removed. This config is intended to be used only while
>  	  trying to find such users.
>  
> +config HARDENED_USERCOPY_SPLIT_KMALLOC
> +	bool "Isolate kernel caches from user-controlled allocations"
> +	default HARDENED_USERCOPY
> +	help
> +	  This option creates a separate set of kmalloc caches used to
> +	  satisfy allocations from userspace APIs that allow for
> +	  fine-grained control over the size of kernel allocations.
> +	  Without this, it is much easier for attackers to precisely
> +	  size and attack heap overflows.  If their allocations are
> +	  confined to a separate cache, attackers must find other ways
> +	  to prepare heap attacks that will be near their desired target.
> +
>  config STATIC_USERMODEHELPER
>  	bool "Force all usermode helper calls through a single binary"
>  	help
> 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-20 20:24   ` Laura Abbott
@ 2017-06-20 22:22     ` Kees Cook
  2017-06-27  7:31       ` Michal Hocko
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2017-06-20 22:22 UTC (permalink / raw)
  To: Laura Abbott; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Tue, Jun 20, 2017 at 1:24 PM, Laura Abbott <labbott@redhat.com> wrote:
> On 06/19/2017 04:36 PM, Kees Cook wrote:
>> From: David Windsor <dave@nullcore.net>
>>
>> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
>> the size of kernel kmallocs, which provides a trivial way to perform
>> heap overflow attacks where the attacker must control neighboring
>> allocations of a specific size. Instead, move these APIs into their own
>> cache so they cannot interfere with standard kmallocs. This is enabled
>> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
>>
>> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
>> code in the last public patch of grsecurity/PaX based on my understanding
>> of the code. Changes or omissions from the original code are mine and
>> don't reflect the original grsecurity/PaX code.
>>
>> Signed-off-by: David Windsor <dave@nullcore.net>
>> [kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> I just did a quick test of kspp/usercopy-whitelist/latest on my arm64 machine and got some spew:
>
> [   21.818719] Unexpected gfp: 0x4000000 (0x4000000). Fixing up to gfp: 0x14000c0 (GFP_KERNEL). Fix your code!
> [   21.828427] CPU: 7 PID: 652 Comm: irqbalance Tainted: G        W       4.12.0-rc5-whitelist+ #236
> [   21.837259] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
> [   21.846955] Call trace:
> [   21.849396] [<ffff000008089b18>] dump_backtrace+0x0/0x210
> [   21.854770] [<ffff000008089d4c>] show_stack+0x24/0x30
> [   21.859798] [<ffff00000845b7bc>] dump_stack+0x90/0xb4
> [   21.864827] [<ffff00000826ff40>] new_slab+0x88/0x90
> [   21.869681] [<ffff000008272218>] ___slab_alloc+0x428/0x6b0
> [   21.875141] [<ffff0000082724f0>] __slab_alloc+0x50/0x68
> [   21.880341] [<ffff000008273208>] __kmalloc_node+0xd0/0x348
> [   21.885800] [<ffff000008223af0>] kvmalloc_node+0xa0/0xb8
> [   21.891088] [<ffff0000082bb400>] single_open_size+0x40/0xb0
> [   21.896636] [<ffff000008315a9c>] stat_open+0x54/0x60
> [   21.901576] [<ffff00000830adf8>] proc_reg_open+0x90/0x168
> [   21.906950] [<ffff00000828def4>] do_dentry_open+0x1c4/0x328
> [   21.912496] [<ffff00000828f470>] vfs_open+0x58/0x88
> [   21.917351] [<ffff0000082a1f14>] do_last+0x3d4/0x770
> [   21.922292] [<ffff0000082a233c>] path_openat+0x8c/0x2e8
> [   21.927492] [<ffff0000082a3888>] do_filp_open+0x70/0xe8
> [   21.932692] [<ffff00000828f940>] do_sys_open+0x178/0x208
> [   21.937977] [<ffff00000828fa54>] SyS_openat+0x3c/0x50
> [   21.943005] [<ffff0000080835f0>] el0_svc_naked+0x24/0x28
>
>
> I don't think 7e7844226f10 ("lockdep: allow to disable reclaim lockup detection")
> is correct after new flags are added because we will still need space
> for another bit even if lockdep is disabled. That might need to
> be fixed separately.

Err... that commit has "___GFP_NOLOCKDEP       0x4000000u", but my
tree shows it as 0x2000000u? Hmm, looks like 1bde33e05123
("include/linux/gfp.h: fix ___GFP_NOLOCKDEP value") fixed that? Oh, or
I have misread it. It looks like new GFP flags need to be added
_above_ GFP_NOLOCKDEP and have to bump GFP_NOLOCKDEP's value too? Like
this:

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index ff4f4a698ad0..deb8ac39fba5 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -40,12 +40,12 @@ struct vm_area_struct;
 #define ___GFP_DIRECT_RECLAIM  0x400000u
 #define ___GFP_WRITE           0x800000u
 #define ___GFP_KSWAPD_RECLAIM  0x1000000u
+#define ___GFP_USERCOPY                0x2000000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP       0x2000000u
+#define ___GFP_NOLOCKDEP       0x4000000u
 #else
 #define ___GFP_NOLOCKDEP       0
 #endif
-#define ___GFP_USERCOPY                0x4000000u
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */


> I'm really not a fan of the GFP approach though since the flags tend
> to be a little bit fragile to manage. If we're going to have to
> add something to callsites anyway, maybe we could just have an
> alternate function (kmalloc_user?) instead of a GFP flag.

This would mean building out *_user() versions for all the various
*alloc() functions, though. That gets kind of long/ugly.

The other reason to use a GFP flag is to be able to interrogate a
cache later, which will be handy for doing things like %p and kernel
symbol censorship (this is what grsecurity does with their HIDESYM
logic). "If this would write to a usercopy-whitelisted object, censor
it" etc. Though now that I go double-check, it looks like grsecurity
uses cache->usersize as an indicator of censorship-need on slab
caches, not the GFP flag, which is only used to use the split kmalloc
cache. (Though as far as flags go, there is also VM_USERCOPY for what
are now the kvmalloc*() cases.)

Perhaps this should be named GFP_USERSIZED or so?

-Kees

-- 
Kees Cook
Pixel Security


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [kernel-hardening] [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-20  4:47   ` Eric Biggers
@ 2017-06-20 22:27     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-20 22:27 UTC (permalink / raw)
  To: Eric Biggers; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Mon, Jun 19, 2017 at 9:47 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> On Mon, Jun 19, 2017 at 04:36:36PM -0700, Kees Cook wrote:
>> From: David Windsor <dave@nullcore.net>
>>
>> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
>> the size of kernel kmallocs, which provides a trivial way to perform
>> heap overflow attacks where the attacker must control neighboring
>> allocations of a specific size. Instead, move these APIs into their own
>> cache so they cannot interfere with standard kmallocs. This is enabled
>> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
>>
>> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
>> code in the last public patch of grsecurity/PaX based on my understanding
>> of the code. Changes or omissions from the original code are mine and
>> don't reflect the original grsecurity/PaX code.
>>
>> Signed-off-by: David Windsor <dave@nullcore.net>
>> [kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  fs/seq_file.c        |  2 +-
>>  include/linux/gfp.h  |  9 ++++++++-
>>  include/linux/slab.h | 12 ++++++++++++
>>  ipc/msgutil.c        |  5 +++--
>>  mm/slab.h            |  3 ++-
>>  mm/slab_common.c     | 29 ++++++++++++++++++++++++++++-
>>  security/Kconfig     | 12 ++++++++++++
>>  7 files changed, 66 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/seq_file.c b/fs/seq_file.c
>> index dc7c2be963ed..5caa58a19bdc 100644
>> --- a/fs/seq_file.c
>> +++ b/fs/seq_file.c
>> @@ -25,7 +25,7 @@ static void seq_set_overflow(struct seq_file *m)
>>
>>  static void *seq_buf_alloc(unsigned long size)
>>  {
>> -     return kvmalloc(size, GFP_KERNEL);
>> +     return kvmalloc(size, GFP_KERNEL | GFP_USERCOPY);
>>  }
>>
>
> Also forgot to mention the obvious: there are way more places where GFP_USERCOPY
> would need to be (or should be) used.  Helper functions like memdup_user() and
> memdup_user_nul() would be the obvious ones.  And just a random example, some of
> the keyrings syscalls (callable with no privileges) do a kmalloc() with
> user-controlled contents and size.

Looking again at how grsecurity uses it, they have some of those call
sites plus a couple more (keyctl, char/mem, kcore, memdup_user). Getting
the facility in place at all is a good first step, IMO.

>
> So I think this by itself needs its own patch series.

Sounds reasonable.

-Kees

-- 
Kees Cook
Pixel Security


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [kernel-hardening] [PATCH 23/23] mm: Allow slab_nomerge to be set at build time
  2017-06-20  4:09   ` [kernel-hardening] " Daniel Micay
@ 2017-06-20 22:51     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-20 22:51 UTC (permalink / raw)
  To: Daniel Micay; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Mon, Jun 19, 2017 at 9:09 PM, Daniel Micay <danielmicay@gmail.com> wrote:
> On Mon, 2017-06-19 at 16:36 -0700, Kees Cook wrote:
>> Some hardened environments want to build kernels with slab_nomerge
>> already set (so that they do not depend on remembering to set the
>> kernel
>> command line option). This is desired to reduce the risk of kernel
>> heap
>> overflows being able to overwrite objects from merged caches,
>> increasing
>> the difficulty of these attacks. By keeping caches unmerged, these
>> kinds
>> of exploits can usually only damage objects in the same cache (though
>> the
>> risk to metadata exploitation is unchanged).
>
> It also further fragments the ability to influence slab cache layout,
> i.e. primitives to do things like filling up slabs to set things up for
> an exploit might not be able to deal with the target slabs anymore. It
> doesn't need to be mentioned but it's something to think about too. In
> theory, disabling merging can make it *easier* to get the right layout
> too if there was some annoyance that's now split away. It's definitely a
> lot more good than bad for security though, but allocator changes have
> subtle impact on exploitation. This can make caches more deterministic.

Good point about changes to heap grooming; I'll adjust the commit log.

-Kees

-- 
Kees Cook
Pixel Security


* Re: [kernel-hardening] [PATCH 23/23] mm: Allow slab_nomerge to be set at build time
  2017-06-20  4:29   ` Eric Biggers
@ 2017-06-20 23:09     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-20 23:09 UTC (permalink / raw)
  To: Eric Biggers; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Mon, Jun 19, 2017 at 9:29 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> On Mon, Jun 19, 2017 at 04:36:37PM -0700, Kees Cook wrote:
>> Some hardened environments want to build kernels with slab_nomerge
>> already set (so that they do not depend on remembering to set the kernel
>> command line option). This is desired to reduce the risk of kernel heap
>> overflows being able to overwrite objects from merged caches, increasing
>> the difficulty of these attacks. By keeping caches unmerged, these kinds
>> of exploits can usually only damage objects in the same cache (though the
>> risk to metadata exploitation is unchanged).
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  mm/slab_common.c |  5 ++---
>>  security/Kconfig | 13 +++++++++++++
>>  2 files changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/slab_common.c b/mm/slab_common.c
>> index 6c14d765379f..17a4c4b33283 100644
>> --- a/mm/slab_common.c
>> +++ b/mm/slab_common.c
>> @@ -47,13 +47,12 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
>>
>>  /*
>>   * Merge control. If this is set then no merging of slab caches will occur.
>> - * (Could be removed. This was introduced to pacify the merge skeptics.)
>>   */
>> -static int slab_nomerge;
>> +static bool slab_nomerge = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT);
>>
>>  static int __init setup_slab_nomerge(char *str)
>>  {
>> -     slab_nomerge = 1;
>> +     slab_nomerge = true;
>>       return 1;
>>  }
>>
>> diff --git a/security/Kconfig b/security/Kconfig
>> index 0c181cebdb8a..e40bd2a260f8 100644
>> --- a/security/Kconfig
>> +++ b/security/Kconfig
>> @@ -166,6 +166,19 @@ config HARDENED_USERCOPY_SPLIT_KMALLOC
>>         confined to a separate cache, attackers must find other ways
>>         to prepare heap attacks that will be near their desired target.
>>
>> +config SLAB_MERGE_DEFAULT
>> +     bool "Allow slab caches to be merged"
>> +     default y
>> +     help
>> +       For reduced kernel memory fragmentation, slab caches can be
>> +       merged when they share the same size and other characteristics.
>> +       This carries a small risk of kernel heap overflows being able
>> +       to overwrite objects from merged caches, which reduces the
>> +       difficulty of such heap attacks. By keeping caches unmerged,
>> +       these kinds of exploits can usually only damage objects in the
>> +       same cache. To disable merging at runtime, "slab_nomerge" can be
>> +       passed on the kernel command line.
>> +
>
> It's good to at least have this option, but again it's logically separate and
> shouldn't just be hidden in patch 23/23.  And again, is it really just about
> heap overflows?
>
> Please also fix the documentation for slab_nomerge in
> Documentation/admin-guide/kernel-parameters.txt.

I've split it out and updated the docs, thanks!

-Kees

-- 
Kees Cook
Pixel Security


* Re: [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-20 22:22     ` Kees Cook
@ 2017-06-27  7:31       ` Michal Hocko
  2017-06-27 22:07         ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Michal Hocko @ 2017-06-27  7:31 UTC (permalink / raw)
  To: Kees Cook; +Cc: Laura Abbott, kernel-hardening, David Windsor, Linux-MM, LKML

On Tue 20-06-17 15:22:58, Kees Cook wrote:
> On Tue, Jun 20, 2017 at 1:24 PM, Laura Abbott <labbott@redhat.com> wrote:
> > On 06/19/2017 04:36 PM, Kees Cook wrote:
> >> From: David Windsor <dave@nullcore.net>
> >>
> >> Some userspace APIs (e.g. ipc, seq_file) provide precise control over
> >> the size of kernel kmallocs, which provides a trivial way to perform
> >> heap overflow attacks where the attacker must control neighboring
> >> allocations of a specific size. Instead, move these APIs into their own
> >> cache so they cannot interfere with standard kmallocs. This is enabled
> >> with CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC.
> >>
> >> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY_SLABS
> >> code in the last public patch of grsecurity/PaX based on my understanding
> >> of the code. Changes or omissions from the original code are mine and
> >> don't reflect the original grsecurity/PaX code.
> >>
> >> Signed-off-by: David Windsor <dave@nullcore.net>
> >> [kees: added SLAB_NO_MERGE flag to allow split of future no-merge Kconfig]
> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >
> > I just did a quick test of kspp/usercopy-whitelist/latest on my arm64 machine and got some spew:
> >
> > [   21.818719] Unexpected gfp: 0x4000000 (0x4000000). Fixing up to gfp: 0x14000c0 (GFP_KERNEL). Fix your code!
> > [   21.828427] CPU: 7 PID: 652 Comm: irqbalance Tainted: G        W       4.12.0-rc5-whitelist+ #236
> > [   21.837259] Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
> > [   21.846955] Call trace:
> > [   21.849396] [<ffff000008089b18>] dump_backtrace+0x0/0x210
> > [   21.854770] [<ffff000008089d4c>] show_stack+0x24/0x30
> > [   21.859798] [<ffff00000845b7bc>] dump_stack+0x90/0xb4
> > [   21.864827] [<ffff00000826ff40>] new_slab+0x88/0x90
> > [   21.869681] [<ffff000008272218>] ___slab_alloc+0x428/0x6b0
> > [   21.875141] [<ffff0000082724f0>] __slab_alloc+0x50/0x68
> > [   21.880341] [<ffff000008273208>] __kmalloc_node+0xd0/0x348
> > [   21.885800] [<ffff000008223af0>] kvmalloc_node+0xa0/0xb8
> > [   21.891088] [<ffff0000082bb400>] single_open_size+0x40/0xb0
> > [   21.896636] [<ffff000008315a9c>] stat_open+0x54/0x60
> > [   21.901576] [<ffff00000830adf8>] proc_reg_open+0x90/0x168
> > [   21.906950] [<ffff00000828def4>] do_dentry_open+0x1c4/0x328
> > [   21.912496] [<ffff00000828f470>] vfs_open+0x58/0x88
> > [   21.917351] [<ffff0000082a1f14>] do_last+0x3d4/0x770
> > [   21.922292] [<ffff0000082a233c>] path_openat+0x8c/0x2e8
> > [   21.927492] [<ffff0000082a3888>] do_filp_open+0x70/0xe8
> > [   21.932692] [<ffff00000828f940>] do_sys_open+0x178/0x208
> > [   21.937977] [<ffff00000828fa54>] SyS_openat+0x3c/0x50
> > [   21.943005] [<ffff0000080835f0>] el0_svc_naked+0x24/0x28
> >
> >
> > I don't think 7e7844226f10 ("lockdep: allow to disable reclaim lockup detection")
> > is correct after new flags are added because we will still need space
> > for another bit even if lockdep is disabled. That might need to
> > be fixed separately.
> 
> Err... that commit has "___GFP_NOLOCKDEP       0x4000000u", but my
> tree shows it as 0x2000000u? Hmm, looks like 1bde33e05123
> ("include/linux/gfp.h: fix ___GFP_NOLOCKDEP value") fixed that?

yes

> Oh, or
> I have misread it. It looks like new GFP flags need to be added
> _above_ GFP_NOLOCKDEP and have to bump GFP_NOLOCKDEP's value too? Like
> this:
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index ff4f4a698ad0..deb8ac39fba5 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -40,12 +40,12 @@ struct vm_area_struct;
>  #define ___GFP_DIRECT_RECLAIM  0x400000u
>  #define ___GFP_WRITE           0x800000u
>  #define ___GFP_KSWAPD_RECLAIM  0x1000000u
> +#define ___GFP_USERCOPY                0x2000000u
>  #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP       0x2000000u
> +#define ___GFP_NOLOCKDEP       0x4000000u
>  #else
>  #define ___GFP_NOLOCKDEP       0
>  #endif
> -#define ___GFP_USERCOPY                0x4000000u
>  /* If the above are modified, __GFP_BITS_SHIFT may need updating */

yes, this would work for your case. But...

> > I'm really not a fan of the GFP approach though, since the flags tend
> > to be a little bit fragile to manage. If we're going to have to
> > add something to callsites anyway, maybe we could just have an
> > alternate function (kmalloc_user?) instead of a GFP flag.

I agree. One additional concern I would have is that people tend to
propagate gfp masks down through different layers. What would happen if
such a flag got to a caller which doesn't expect it? Couldn't it defeat
the whole security point of this exercise?

But I am not really sure I understand the consequences of this patch.
What do those attacks look like? Do you have an example of a CVE which
would have been prevented by this measure?

> This would mean building out *_user() versions for all the various
> *alloc() functions, though. That gets kind of long/ugly.

Only prepare the ones which are really needed. It seems there are only
a handful of them in your patch.

> The other reason to use a GFP flag is to be able to interrogate a
> cache later, which will be handy for doing things like %p and kernel
> symbol censorship (this is what grsecurity does with their HIDESYM
> logic). "If this would write to a usercopy-whitelisted object, censor
> it" etc. Though now that I go double-check, it looks like grsecurity
> uses cache->usersize as an indicator of censorship-need on slab
> caches, not the GFP flag, which is only used to use the split kmalloc
> cache.

which sounds like a separate kmalloc API would do the trick without a new
gfp mask.

> (Though as far as flags go, there is also VM_USERCOPY for what
> are now the kvmalloc*() cases.)

OK, I was about to ask about vmalloc fallbacks. So this is not
implemented in your patch. Does it matter from the security point of
view?

> Perhaps this should be named GFP_USERSIZED or so?
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-27  7:31       ` Michal Hocko
@ 2017-06-27 22:07         ` Kees Cook
  2017-06-28  8:54           ` Michal Hocko
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2017-06-27 22:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Laura Abbott, kernel-hardening, David Windsor, Linux-MM, LKML

On Tue, Jun 27, 2017 at 12:31 AM, Michal Hocko <mhocko@kernel.org> wrote:
> But I am not really sure I understand the consequences of this patch.
> What do those attacks look like? Do you have an example of a CVE which
> would have been prevented by this measure?

It's a regular practice, especially for heap grooming. You can see an
example here:
http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
which even recognizes this as a common method, calling it "the standard
msgget() technique". Having the separate caches doesn't strictly
_stop_ some attacks, but it changes the nature of what the attacker
has to do. Instead of having a universal way to groom the heap, they
are forced into other paths. Generally speaking this can reduce
what's possible, making the attack either impossible, more expensive to
develop, or less reliable.

>> This would mean building out *_user() versions for all the various
>> *alloc() functions, though. That gets kind of long/ugly.
>
> Only prepare the ones which are really needed. It seems there are only
> a handful of them in your patch.

Okay, if that's the desired approach, we can do that.

> OK, I was about to ask about vmalloc fallbacks. So this is not
> implemented in your patch. Does it matter from the security point of
> view?

Right, the HIDESYM-like feature hasn't been ported by anyone yet, so
this hasn't happened. It would simply build on similar logic.

-Kees

-- 
Kees Cook
Pixel Security


* Re: [PATCH 22/23] usercopy: split user-controlled slabs to separate caches
  2017-06-27 22:07         ` Kees Cook
@ 2017-06-28  8:54           ` Michal Hocko
  0 siblings, 0 replies; 45+ messages in thread
From: Michal Hocko @ 2017-06-28  8:54 UTC (permalink / raw)
  To: Kees Cook; +Cc: Laura Abbott, kernel-hardening, David Windsor, Linux-MM, LKML

On Tue 27-06-17 15:07:17, Kees Cook wrote:
> On Tue, Jun 27, 2017 at 12:31 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > But I am not really sure I understand the consequences of this patch.
> > What do those attacks look like? Do you have an example of a CVE which
> > would have been prevented by this measure?
> 
> It's a regular practice, especially for heap grooming. You can see an
> example here:
> http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
> which even recognizes this as a common method, calling it "the standard
> msgget() technique". Having the separate caches doesn't strictly
> _stop_ some attacks, but it changes the nature of what the attacker
> has to do. Instead of having a universal way to groom the heap, they
> are forced into other paths. Generally speaking this can reduce
> what's possible, making the attack either impossible, more expensive to
> develop, or less reliable.

Thanks, that makes it clearer to me. I believe this would be useful
information in the changelog.

> >> This would mean building out *_user() versions for all the various
> >> *alloc() functions, though. That gets kind of long/ugly.
> >
> > Only prepare the ones which are really needed. It seems there are only
> > a handful of them in your patch.
> 
> Okay, if that's the desired approach, we can do that.

yes please
-- 
Michal Hocko
SUSE Labs


* Re: [kernel-hardening] [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache
  2017-06-20  4:08   ` [kernel-hardening] " Eric Biggers
@ 2017-06-28 16:44     ` Kees Cook
  2017-06-28 16:55       ` Eric Biggers
  0 siblings, 1 reply; 45+ messages in thread
From: Kees Cook @ 2017-06-28 16:44 UTC (permalink / raw)
  To: Eric Biggers; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Mon, Jun 19, 2017 at 9:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> On Mon, Jun 19, 2017 at 04:36:31PM -0700, Kees Cook wrote:
>> From: David Windsor <dave@nullcore.net>
>>
>> When a dentry name is short enough, it can be stored directly in
>> the dentry itself.  These dentry short names, stored in struct
>> dentry.d_iname and therefore contained in the dentry_cache slab cache,
>> need to be copied to/from userspace.
>>
>> In support of usercopy hardening, this patch defines a region in
>> the dentry_cache slab cache in which userspace copy operations
>> are allowed.
>>
>> This region is known as the slab cache's usercopy region.  Slab
>> caches can now check that each copy operation involving cache-managed
>> memory falls entirely within the slab's usercopy region.
>>
>> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
>> whitelisting code in the last public patch of grsecurity/PaX based on my
>> understanding of the code. Changes or omissions from the original code are
>> mine and don't reflect the original grsecurity/PaX code.
>>
>
> For all these patches please mention *where* the data is being copied to/from
> userspace.

Can you explain what you mean here? The field being copied is already
mentioned in the commit log; do you mean where in the kernel source
does the copy happen?

>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index a48f54238273..97f4a0117b3b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -151,6 +151,11 @@ void memcg_destroy_kmem_caches(struct mem_cgroup *);
>>               sizeof(struct __struct), __alignof__(struct __struct),\
>>               (__flags), NULL)
>>
>> +#define KMEM_CACHE_USERCOPY(__struct, __flags, __field) kmem_cache_create_usercopy(#__struct,\
>> +             sizeof(struct __struct), __alignof__(struct __struct),\
>> +             (__flags), offsetof(struct __struct, __field),\
>> +             sizeof_field(struct __struct, __field), NULL)
>> +
>
> This helper macro should be added in the patch which adds
> kmem_cache_create_usercopy(), not in this one.

It got moved here since this was the only user of this function and
there was already enough happening in the first patch. But yes,
probably it should stay with the first patch. It can be moved.

-Kees

-- 
Kees Cook
Pixel Security


* Re: [kernel-hardening] [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache
  2017-06-28 16:44     ` Kees Cook
@ 2017-06-28 16:55       ` Eric Biggers
  0 siblings, 0 replies; 45+ messages in thread
From: Eric Biggers @ 2017-06-28 16:55 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Wed, Jun 28, 2017 at 09:44:13AM -0700, Kees Cook wrote:
> On Mon, Jun 19, 2017 at 9:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> > On Mon, Jun 19, 2017 at 04:36:31PM -0700, Kees Cook wrote:
> >> From: David Windsor <dave@nullcore.net>
> >>
> >> When a dentry name is short enough, it can be stored directly in
> >> the dentry itself.  These dentry short names, stored in struct
> >> dentry.d_iname and therefore contained in the dentry_cache slab cache,
> >> need to be copied to/from userspace.
> >>
> >> In support of usercopy hardening, this patch defines a region in
> >> the dentry_cache slab cache in which userspace copy operations
> >> are allowed.
> >>
> >> This region is known as the slab cache's usercopy region.  Slab
> >> caches can now check that each copy operation involving cache-managed
> >> memory falls entirely within the slab's usercopy region.
> >>
> >> This patch is modified from Brad Spengler/PaX Team's PAX_USERCOPY
> >> whitelisting code in the last public patch of grsecurity/PaX based on my
> >> understanding of the code. Changes or omissions from the original code are
> >> mine and don't reflect the original grsecurity/PaX code.
> >>
> >
> > For all these patches please mention *where* the data is being copied to/from
> > userspace.
> 
> Can you explain what you mean here? The field being copied is already
> mentioned in the commit log; do you mean where in the kernel source
> does the copy happen?
> 

Yes, for the ones where it isn't obvious, mentioning a syscall or ioctl might be
sufficient.  Others may need more explanation.

Eric


* Re: [kernel-hardening] [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0
  2017-06-20  4:04   ` [kernel-hardening] " Eric Biggers
@ 2017-06-28 17:03     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-06-28 17:03 UTC (permalink / raw)
  To: Eric Biggers; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Mon, Jun 19, 2017 at 9:04 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> Hi David + Kees,
>
> On Mon, Jun 19, 2017 at 04:36:35PM -0700, Kees Cook wrote:
>> With all known usercopied cache whitelists now defined in the kernel, switch
>> the default usercopy region of kmem_cache_create() to size 0. Any new caches
>> with usercopy regions will now need to use kmem_cache_create_usercopy()
>> instead of kmem_cache_create().
>>
>
> While I'd certainly like to see the caches be whitelisted, it needs to be made
> very clear that it's being done (the cover letter for this series falsely claims
> that kmem_cache_create() is unchanged) and what the consequences are.  Is there

Well, in the first patch it is semantically unchanged: calls to
kmem_cache_create() after the first patch whitelist the entire cache
object. Only from this patch on does it change behavior to no longer
whitelist the object.

> any specific plan for identifying caches that were missed?  If it's expected for

The plan for finding caches needing whitelisting is mainly code audit
and operational testing. Encountering a missed cache is quite loud, in
that it BUGs the kernel during the hardened usercopy checks.

> people to just fix them as they are found, then they need to be helped a little
> --- at the very least by putting a big comment above report_usercopy() that
> explains the possible reasons why the error might have triggered and what to do
> about it.

That sounds reasonable. It should have a comment even for the existing
protections.

Thanks!

-Kees

-- 
Kees Cook
Pixel Security


* Re: [PATCH 00/23] Hardened usercopy whitelisting
  2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
                   ` (23 preceding siblings ...)
  2017-06-20 19:41 ` [kernel-hardening] [PATCH 00/23] Hardened usercopy whitelisting Rik van Riel
@ 2017-10-20 22:40 ` Paolo Bonzini
  2017-10-20 23:25   ` Paolo Bonzini
  24 siblings, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2017-10-20 22:40 UTC (permalink / raw)
  To: Kees Cook, kernel-hardening; +Cc: David Windsor, linux-mm, linux-kernel

On 20/06/2017 01:36, Kees Cook wrote:
> 
> This updates the slab allocator to add annotations (useroffset and
> usersize) to define allowed usercopy regions. Currently, hardened
> usercopy performs dynamic bounds checking on whole slab cache objects.
> This is good, but still leaves a lot of kernel slab memory available to
> be copied to/from userspace in the face of bugs. To further restrict
> what memory is available for copying, this creates a way to whitelist
> specific areas of a given slab cache object for copying to/from userspace,
> allowing much finer granularity of access control. Slab caches that are
> never exposed to userspace can declare no whitelist for their objects,
> thereby keeping them unavailable to userspace via dynamic copy operations.
> (Note, an implicit form of whitelisting is the use of constant sizes
> in usercopy operations and get_user()/put_user(); these bypass hardened
> usercopy checks since these sizes cannot change at runtime.)

This breaks KVM completely on x86, due to two ioctls
(KVM_GET/SET_CPUID2) accessing the cpuid_entries field of struct
kvm_vcpu_arch.

There's also another broken ioctl, KVM_XEN_HVM_CONFIG, but it is
obsolete and not a big deal at all.

I can post some patches, but probably not until the beginning of
November due to travelling.  Please do not send this too close to the
beginning of the merge window.

Paolo


* Re: [PATCH 00/23] Hardened usercopy whitelisting
  2017-10-20 22:40 ` Paolo Bonzini
@ 2017-10-20 23:25   ` Paolo Bonzini
  2017-10-21  3:04     ` Kees Cook
  0 siblings, 1 reply; 45+ messages in thread
From: Paolo Bonzini @ 2017-10-20 23:25 UTC (permalink / raw)
  To: Kees Cook, kernel-hardening; +Cc: David Windsor, linux-mm, linux-kernel

On 21/10/2017 00:40, Paolo Bonzini wrote:
> This breaks KVM completely on x86, due to two ioctls
> (KVM_GET/SET_CPUID2) accessing the cpuid_entries field of struct
> kvm_vcpu_arch.
> 
> There's also another broken ioctl, KVM_XEN_HVM_CONFIG, but it is
> obsolete and not a big deal at all.
> 
> I can post some patches, but probably not until the beginning of
> November due to travelling.  Please do not send this too close to the
> beginning of the merge window.

Sleeping is overrated, sending patches now...

Paolo


* Re: [PATCH 00/23] Hardened usercopy whitelisting
  2017-10-20 23:25   ` Paolo Bonzini
@ 2017-10-21  3:04     ` Kees Cook
  0 siblings, 0 replies; 45+ messages in thread
From: Kees Cook @ 2017-10-21  3:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kernel-hardening, David Windsor, Linux-MM, LKML

On Fri, Oct 20, 2017 at 4:25 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 21/10/2017 00:40, Paolo Bonzini wrote:
>> This breaks KVM completely on x86, due to two ioctls
>> (KVM_GET/SET_CPUID2) accessing the cpuid_entries field of struct
>> kvm_vcpu_arch.
>>
>> There's also another broken ioctl, KVM_XEN_HVM_CONFIG, but it is
>> obsolete and not a big deal at all.
>>
>> I can post some patches, but probably not until the beginning of
>> November due to travelling.  Please do not send this too close to the
>> beginning of the merge window.
>
> Sleeping is overrated, sending patches now...

Oh awesome, thank you very much for tracking this down and building fixes!

I'll insert these into the usercopy whitelisting series, and see if I
can find any similar cases.

Thanks!

-Kees

-- 
Kees Cook
Pixel Security


end of thread, other threads:[~2017-10-21  3:04 UTC | newest]

Thread overview: 45+ messages
2017-06-19 23:36 [PATCH 00/23] Hardened usercopy whitelisting Kees Cook
2017-06-19 23:36 ` [PATCH 01/23] usercopy: Prepare for " Kees Cook
2017-06-19 23:36 ` [PATCH 02/23] usercopy: Enforce slab cache usercopy region boundaries Kees Cook
2017-06-19 23:36 ` [PATCH 03/23] vfs: define usercopy region in names_cache slab caches Kees Cook
2017-06-19 23:36 ` [PATCH 04/23] vfs: copy struct mount.mnt_id to userspace using put_user() Kees Cook
2017-06-19 23:36 ` [PATCH 05/23] befs: define usercopy region in befs_inode_cache slab cache Kees Cook
2017-06-19 23:36 ` [PATCH 06/23] cifs: define usercopy region in cifs_request " Kees Cook
2017-06-19 23:36 ` [PATCH 07/23] exofs: define usercopy region in exofs_inode_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 08/23] ext2: define usercopy region in ext2_inode_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 09/23] ext4: define usercopy region in ext4_inode_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 10/23] vxfs: define usercopy region in vxfs_inode " Kees Cook
2017-06-19 23:36 ` [PATCH 11/23] jfs: define usercopy region in jfs_ip " Kees Cook
2017-06-19 23:36 ` [PATCH 12/23] orangefs: define usercopy region in orangefs_inode_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 13/23] ufs: define usercopy region in ufs_inode_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 14/23] fork: define usercopy region in thread_stack, task_struct, mm_struct slab caches Kees Cook
2017-06-19 23:36 ` [PATCH 15/23] net: define usercopy region in struct proto slab cache Kees Cook
2017-06-19 23:36 ` [PATCH 16/23] net: copy struct sctp_sock.autoclose to userspace using put_user() Kees Cook
2017-06-19 23:36 ` [PATCH 17/23] dcache: define usercopy region in dentry_cache slab cache Kees Cook
2017-06-20  4:08   ` [kernel-hardening] " Eric Biggers
2017-06-28 16:44     ` Kees Cook
2017-06-28 16:55       ` Eric Biggers
2017-06-19 23:36 ` [PATCH 18/23] scsi: define usercopy region in scsi_sense_cache " Kees Cook
2017-06-19 23:36 ` [PATCH 19/23] xfs: define usercopy region in xfs_inode " Kees Cook
2017-06-19 23:36 ` [PATCH 20/23] usercopy: convert kmalloc caches to usercopy caches Kees Cook
2017-06-19 23:36 ` [PATCH 21/23] usercopy: Restrict non-usercopy caches to size 0 Kees Cook
2017-06-20  4:04   ` [kernel-hardening] " Eric Biggers
2017-06-28 17:03     ` Kees Cook
2017-06-19 23:36 ` [PATCH 22/23] usercopy: split user-controlled slabs to separate caches Kees Cook
2017-06-20  4:24   ` [kernel-hardening] " Eric Biggers
2017-06-20  4:47   ` Eric Biggers
2017-06-20 22:27     ` Kees Cook
2017-06-20 20:24   ` Laura Abbott
2017-06-20 22:22     ` Kees Cook
2017-06-27  7:31       ` Michal Hocko
2017-06-27 22:07         ` Kees Cook
2017-06-28  8:54           ` Michal Hocko
2017-06-19 23:36 ` [PATCH 23/23] mm: Allow slab_nomerge to be set at build time Kees Cook
2017-06-20  4:09   ` [kernel-hardening] " Daniel Micay
2017-06-20 22:51     ` Kees Cook
2017-06-20  4:29   ` Eric Biggers
2017-06-20 23:09     ` Kees Cook
2017-06-20 19:41 ` [kernel-hardening] [PATCH 00/23] Hardened usercopy whitelisting Rik van Riel
2017-10-20 22:40 ` Paolo Bonzini
2017-10-20 23:25   ` Paolo Bonzini
2017-10-21  3:04     ` Kees Cook
