[PATCH] mm/mincore: allow for making sys_mincore() privileged

From: Jiri Kosina @ 2019-01-05 17:27 UTC
  To: Linus Torvalds, Andrew Morton, Greg KH, Peter Zijlstra, Michal Hocko
  Cc: linux-mm, linux-kernel, linux-api

From: Jiri Kosina <jkosina@suse.cz>

mincore() can be used [1] as a conveyor of side-channel information about
pagecache metadata.

Provide the vm.mincore_privileged sysctl, which makes mincore() return
-EPERM when it is invoked by a process lacking CAP_SYS_ADMIN.

The default behavior stays "mincore() can be used by anybody" in order to 
be conservative with respect to userspace behavior.

[1] https://www.theregister.co.uk/2019/01/05/boffins_beat_page_cache/

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 Documentation/sysctl/vm.txt | 9 +++++++++
 kernel/sysctl.c             | 8 ++++++++
 mm/mincore.c                | 5 +++++
 3 files changed, 22 insertions(+)
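
Not part of the patch: a minimal userspace sketch of how a mincore() caller
might cope with the new failure mode. It assumes a Linux/glibc environment;
the helper names count_resident_pages() and count_touched_pages() are
hypothetical and exist only for illustration. With vm.mincore_privileged=1
and no CAP_SYS_ADMIN, the first helper would return -EPERM, per the check
added to mm/mincore.c in this patch.

```c
#define _DEFAULT_SOURCE		/* for mincore() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the pages of [addr, addr + len) that mincore()
 * reports as resident. Returns the count, or -errno on failure. On a kernel
 * carrying this patch with the sysctl enabled, an unprivileged caller would
 * get -EPERM here.
 */
static long count_resident_pages(void *addr, size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	size_t pages, i;
	long resident = 0;

	assert(page_size > 0);
	pages = (len + (size_t)page_size - 1) / (size_t)page_size;

	/* mincore() writes one byte per page into vec. */
	vec = malloc(pages ? pages : 1);
	if (!vec)
		return -ENOMEM;

	if (mincore(addr, len, vec) != 0) {
		int err = errno;

		free(vec);
		return -err;
	}

	/* Only the least significant bit of each byte is defined. */
	for (i = 0; i < pages; i++)
		resident += vec[i] & 1;

	free(vec);
	return resident;
}

/*
 * Hypothetical helper: map npages of anonymous memory, write to every page
 * so that each one is faulted in, and return what count_resident_pages()
 * reports for the mapping.
 */
static long count_touched_pages(size_t npages)
{
	size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
	long ret;
	void *buf;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -errno;

	memset(buf, 0xaa, len);
	ret = count_resident_pages(buf, len);
	munmap(buf, len);
	return ret;
}
```

A caller that uses mincore() only as an optimization hint could treat a
return of -EPERM from count_resident_pages() as "residency unknown" and fall
back to its slow path instead of failing outright.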

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 187ce4f599a2..afb8635e925e 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -41,6 +41,7 @@ Currently, these files are in /proc/sys/vm:
 - min_free_kbytes
 - min_slab_ratio
 - min_unmapped_ratio
+- mincore_privileged
 - mmap_min_addr
 - mmap_rnd_bits
 - mmap_rnd_compat_bits
@@ -485,6 +486,14 @@ files and similar are considered.
 The default is 1 percent.
 
 ==============================================================
+mincore_privileged:
+
+mincore() could potentially be used to mount a side-channel attack against
+pagecache metadata. This sysctl gives system administrators a means to
+make it available only to processes that hold the CAP_SYS_ADMIN capability.
+
+The default is 0, which means mincore() can be used without restrictions.
+==============================================================
 
 mmap_min_addr
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 1825f712e73b..f03cb07c8dd4 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -114,6 +114,7 @@ extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
 #ifndef CONFIG_MMU
 extern int sysctl_nr_trim_pages;
 #endif
+extern int sysctl_mincore_privileged;
 
 /* Constants used for minimum and  maximum */
 #ifdef CONFIG_LOCKUP_DETECTOR
@@ -1684,6 +1685,13 @@ static struct ctl_table vm_table[] = {
 		.extra2		= (void *)&mmap_rnd_compat_bits_max,
 	},
 #endif
+	{
+		.procname	= "mincore_privileged",
+		.data		= &sysctl_mincore_privileged,
+		.maxlen		= sizeof(sysctl_mincore_privileged),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 	{ }
 };
 
diff --git a/mm/mincore.c b/mm/mincore.c
index 218099b5ed31..77d4928cdfaa 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -21,6 +21,8 @@
 #include <linux/uaccess.h>
 #include <asm/pgtable.h>
 
+int sysctl_mincore_privileged;
+
 static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr,
 			unsigned long end, struct mm_walk *walk)
 {
@@ -228,6 +230,9 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
 	unsigned long pages;
 	unsigned char *tmp;
 
+	if (sysctl_mincore_privileged && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
 	/* Check the start address: needs to be page-aligned.. */
 	if (start & ~PAGE_MASK)
 		return -EINVAL;
-- 
Jiri Kosina
SUSE Labs
