bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred
@ 2020-07-02 23:26 Kees Cook
  2020-07-02 23:26 ` [PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred Kees Cook
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, Jessica Yu, Linus Torvalds, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

Hi,

I'm not sure who should carry this tree (me? Greg? akpm? Linus?), but
it fixes a kernel address exposure bug reported by Dominik Czarnota,
where /sys/modules/*/sections/* contents were visible to uid-0 without
CAP_SYSLOG (e.g. in containers):

This is correct, with CAP_SYSLOG:
 # cat /sys/module/*/sections/.*text
 0xffffffffc0458000
 ...

This is broken:
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0xffffffffc0458000
 ...

Fixing this required refactoring of several internals, and in the process
uncovered other users of kallsyms_show_value() that were doing checks
during "read" context instead of "open" context. This fixes all of these
cases by plumbing the file->f_cred through to their ultimate checks via
kallsyms_show_value()'s new cred argument.

Testing, reviews, and acks appreciated. :)

Thanks!

-Kees


Kees Cook (5):
  kallsyms: Refactor kallsyms_show_value() to take cred
  module: Refactor section attr into bin attribute
  module: Do not expose section addresses to non-CAP_SYSLOG
  kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()

 include/linux/filter.h     |  4 +--
 include/linux/kallsyms.h   |  5 ++--
 kernel/bpf/syscall.c       | 37 +++++++++++++++------------
 kernel/kallsyms.c          | 17 ++++++++-----
 kernel/kprobes.c           |  4 +--
 kernel/module.c            | 51 ++++++++++++++++++++------------------
 net/core/sysctl_net_core.c |  2 +-
 7 files changed, 67 insertions(+), 53 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred
  2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
@ 2020-07-02 23:26 ` Kees Cook
  2020-07-02 23:26 ` [PATCH 2/5] module: Refactor section attr into bin attribute Kees Cook
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.

Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.

Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/filter.h   |  2 +-
 include/linux/kallsyms.h |  5 +++--
 kernel/kallsyms.c        | 17 +++++++++++------
 kernel/kprobes.c         |  4 ++--
 kernel/module.c          |  2 +-
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 259377723603..55104f6c78e8 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -889,7 +889,7 @@ static inline bool bpf_dump_raw_ok(void)
 	/* Reconstruction of call-sites is dependent on kallsyms,
 	 * thus make dump the same restriction.
 	 */
-	return kallsyms_show_value() == 1;
+	return kallsyms_show_value(current_cred());
 }
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 98338dc6b5d2..481273f0c72d 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -18,6 +18,7 @@
 #define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s]") + (KSYM_NAME_LEN - 1) + \
 			 2*(BITS_PER_LONG*3/10) + (MODULE_NAME_LEN - 1) + 1)
 
+struct cred;
 struct module;
 
 static inline int is_kernel_inittext(unsigned long addr)
@@ -98,7 +99,7 @@ int lookup_symbol_name(unsigned long addr, char *symname);
 int lookup_symbol_attrs(unsigned long addr, unsigned long *size, unsigned long *offset, char *modname, char *name);
 
 /* How and when do we show kallsyms values? */
-extern int kallsyms_show_value(void);
+extern bool kallsyms_show_value(const struct cred *cred);
 
 #else /* !CONFIG_KALLSYMS */
 
@@ -158,7 +159,7 @@ static inline int lookup_symbol_attrs(unsigned long addr, unsigned long *size, u
 	return -ERANGE;
 }
 
-static inline int kallsyms_show_value(void)
+static inline bool kallsyms_show_value(const struct cred *cred)
 {
 	return false;
 }
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 16c8c605f4b0..bb14e64f62a4 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -644,19 +644,20 @@ static inline int kallsyms_for_perf(void)
  * Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to
  * block even that).
  */
-int kallsyms_show_value(void)
+bool kallsyms_show_value(const struct cred *cred)
 {
 	switch (kptr_restrict) {
 	case 0:
 		if (kallsyms_for_perf())
-			return 1;
+			return true;
 	/* fallthrough */
 	case 1:
-		if (has_capability_noaudit(current, CAP_SYSLOG))
-			return 1;
+		if (security_capable(cred, &init_user_ns, CAP_SYSLOG,
+				     CAP_OPT_NOAUDIT) == 0)
+			return true;
 	/* fallthrough */
 	default:
-		return 0;
+		return false;
 	}
 }
 
@@ -673,7 +674,11 @@ static int kallsyms_open(struct inode *inode, struct file *file)
 		return -ENOMEM;
 	reset_iter(iter, 0);
 
-	iter->show_value = kallsyms_show_value();
+	/*
+	 * Instead of checking this on every s_show() call, cache
+	 * the result here at open time.
+	 */
+	iter->show_value = kallsyms_show_value(file->f_cred);
 	return 0;
 }
 
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4a904cc56d68..d4de217e4a91 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2448,7 +2448,7 @@ static void report_probe(struct seq_file *pi, struct kprobe *p,
 	else
 		kprobe_type = "k";
 
-	if (!kallsyms_show_value())
+	if (!kallsyms_show_value(current_cred()))
 		addr = NULL;
 
 	if (sym)
@@ -2540,7 +2540,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
 	 * If /proc/kallsyms is not showing kernel address, we won't
 	 * show them here either.
 	 */
-	if (!kallsyms_show_value())
+	if (!kallsyms_show_value(current_cred()))
 		seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
 			   (void *)ent->start_addr);
 	else
diff --git a/kernel/module.c b/kernel/module.c
index e8a198588f26..a5022ae84e50 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4377,7 +4377,7 @@ static int modules_open(struct inode *inode, struct file *file)
 
 	if (!err) {
 		struct seq_file *m = file->private_data;
-		m->private = kallsyms_show_value() ? NULL : (void *)8ul;
+		m->private = kallsyms_show_value(current_cred()) ? NULL : (void *)8ul;
 	}
 
 	return err;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] module: Refactor section attr into bin attribute
  2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
  2020-07-02 23:26 ` [PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred Kees Cook
@ 2020-07-02 23:26 ` Kees Cook
  2020-07-03  6:02   ` Greg Kroah-Hartman
  2020-07-08 16:10   ` Jessica Yu
  2020-07-02 23:26 ` [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.

Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 kernel/module.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index a5022ae84e50..9e2954519259 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1510,8 +1510,7 @@ static inline bool sect_empty(const Elf_Shdr *sect)
 }
 
 struct module_sect_attr {
-	struct module_attribute mattr;
-	char *name;
+	struct bin_attribute battr;
 	unsigned long address;
 };
 
@@ -1521,11 +1520,16 @@ struct module_sect_attrs {
 	struct module_sect_attr attrs[];
 };
 
-static ssize_t module_sect_show(struct module_attribute *mattr,
-				struct module_kobject *mk, char *buf)
+static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
+				struct bin_attribute *battr,
+				char *buf, loff_t pos, size_t count)
 {
 	struct module_sect_attr *sattr =
-		container_of(mattr, struct module_sect_attr, mattr);
+		container_of(battr, struct module_sect_attr, battr);
+
+	if (pos != 0)
+		return -EINVAL;
+
 	return sprintf(buf, "0x%px\n", kptr_restrict < 2 ?
 		       (void *)sattr->address : NULL);
 }
@@ -1535,7 +1539,7 @@ static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
 	unsigned int section;
 
 	for (section = 0; section < sect_attrs->nsections; section++)
-		kfree(sect_attrs->attrs[section].name);
+		kfree(sect_attrs->attrs[section].battr.attr.name);
 	kfree(sect_attrs);
 }
 
@@ -1544,42 +1548,41 @@ static void add_sect_attrs(struct module *mod, const struct load_info *info)
 	unsigned int nloaded = 0, i, size[2];
 	struct module_sect_attrs *sect_attrs;
 	struct module_sect_attr *sattr;
-	struct attribute **gattr;
+	struct bin_attribute **gattr;
 
 	/* Count loaded sections and allocate structures */
 	for (i = 0; i < info->hdr->e_shnum; i++)
 		if (!sect_empty(&info->sechdrs[i]))
 			nloaded++;
 	size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
-			sizeof(sect_attrs->grp.attrs[0]));
-	size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.attrs[0]);
+			sizeof(sect_attrs->grp.bin_attrs[0]));
+	size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
 	sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
 	if (sect_attrs == NULL)
 		return;
 
 	/* Setup section attributes. */
 	sect_attrs->grp.name = "sections";
-	sect_attrs->grp.attrs = (void *)sect_attrs + size[0];
+	sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
 
 	sect_attrs->nsections = 0;
 	sattr = &sect_attrs->attrs[0];
-	gattr = &sect_attrs->grp.attrs[0];
+	gattr = &sect_attrs->grp.bin_attrs[0];
 	for (i = 0; i < info->hdr->e_shnum; i++) {
 		Elf_Shdr *sec = &info->sechdrs[i];
 		if (sect_empty(sec))
 			continue;
+		sysfs_bin_attr_init(&sattr->battr);
 		sattr->address = sec->sh_addr;
-		sattr->name = kstrdup(info->secstrings + sec->sh_name,
-					GFP_KERNEL);
-		if (sattr->name == NULL)
+		sattr->battr.attr.name =
+			kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
+		if (sattr->battr.attr.name == NULL)
 			goto out;
 		sect_attrs->nsections++;
-		sysfs_attr_init(&sattr->mattr.attr);
-		sattr->mattr.show = module_sect_show;
-		sattr->mattr.store = NULL;
-		sattr->mattr.attr.name = sattr->name;
-		sattr->mattr.attr.mode = S_IRUSR;
-		*(gattr++) = &(sattr++)->mattr.attr;
+		sattr->battr.read = module_sect_read;
+		sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
+		sattr->battr.attr.mode = 0400;
+		*(gattr++) = &(sattr++)->battr;
 	}
 	*gattr = NULL;
 
@@ -1669,7 +1672,7 @@ static void add_notes_attrs(struct module *mod, const struct load_info *info)
 			continue;
 		if (info->sechdrs[i].sh_type == SHT_NOTE) {
 			sysfs_bin_attr_init(nattr);
-			nattr->attr.name = mod->sect_attrs->attrs[loaded].name;
+			nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
 			nattr->attr.mode = S_IRUGO;
 			nattr->size = info->sechdrs[i].sh_size;
 			nattr->private = (void *) info->sechdrs[i].sh_addr;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG
  2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
  2020-07-02 23:26 ` [PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred Kees Cook
  2020-07-02 23:26 ` [PATCH 2/5] module: Refactor section attr into bin attribute Kees Cook
@ 2020-07-02 23:26 ` Kees Cook
  2020-07-08 16:12   ` Jessica Yu
  2020-07-02 23:26 ` [PATCH 4/5] kprobes: Do not expose probe " Kees Cook
  2020-07-02 23:26 ` [PATCH 5/5] bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() Kees Cook
  4 siblings, 1 reply; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

The printing of section addresses in /sys/module/*/sections/* was not
using the correct credentials to evaluate visibility.

Before:

 # cat /sys/module/*/sections/.*text
 0xffffffffc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0xffffffffc0458000
 ...

After:

 # cat /sys/module/*/sections/*.text
 0xffffffffc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0x0000000000000000
 ...

Additionally replaces the existing (safe) /proc/modules check with
file->f_cred for consistency.

Cc: stable@vger.kernel.org
Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com>
Fixes: be71eda5383f ("module: Fix display of wrong module .text address")
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 kernel/module.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 9e2954519259..e6c7571092cb 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1530,8 +1530,8 @@ static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
 	if (pos != 0)
 		return -EINVAL;
 
-	return sprintf(buf, "0x%px\n", kptr_restrict < 2 ?
-		       (void *)sattr->address : NULL);
+	return sprintf(buf, "0x%px\n",
+		       kallsyms_show_value(file->f_cred) ? (void *)sattr->address : NULL);
 }
 
 static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
@@ -4380,7 +4380,7 @@ static int modules_open(struct inode *inode, struct file *file)
 
 	if (!err) {
 		struct seq_file *m = file->private_data;
-		m->private = kallsyms_show_value(current_cred()) ? NULL : (void *)8ul;
+		m->private = kallsyms_show_value(file->f_cred) ? NULL : (void *)8ul;
 	}
 
 	return err;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
                   ` (2 preceding siblings ...)
  2020-07-02 23:26 ` [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG Kees Cook
@ 2020-07-02 23:26 ` Kees Cook
  2020-07-03  1:00   ` Linus Torvalds
  2020-07-10 14:09   ` Masami Hiramatsu
  2020-07-02 23:26 ` [PATCH 5/5] bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() Kees Cook
  4 siblings, 2 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.

Cc: stable@vger.kernel.org
Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms does")
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 kernel/kprobes.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index d4de217e4a91..2e97febeef77 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2448,7 +2448,7 @@ static void report_probe(struct seq_file *pi, struct kprobe *p,
 	else
 		kprobe_type = "k";
 
-	if (!kallsyms_show_value(current_cred()))
+	if (!kallsyms_show_value(pi->file->f_cred))
 		addr = NULL;
 
 	if (sym)
@@ -2540,7 +2540,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
 	 * If /proc/kallsyms is not showing kernel address, we won't
 	 * show them here either.
 	 */
-	if (!kallsyms_show_value(current_cred()))
+	if (!kallsyms_show_value(m->file->f_cred))
 		seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
 			   (void *)ent->start_addr);
 	else
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()
  2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
                   ` (3 preceding siblings ...)
  2020-07-02 23:26 ` [PATCH 4/5] kprobes: Do not expose probe " Kees Cook
@ 2020-07-02 23:26 ` Kees Cook
  4 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-02 23:26 UTC (permalink / raw)
  To: Dominik Czarnota
  Cc: Kees Cook, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.

Cc: stable@vger.kernel.org
Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/filter.h     |  4 ++--
 kernel/bpf/syscall.c       | 37 +++++++++++++++++++++----------------
 net/core/sysctl_net_core.c |  2 +-
 3 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 55104f6c78e8..0b0144752d78 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -884,12 +884,12 @@ void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
 bool bpf_helper_changes_pkt_data(void *func);
 
-static inline bool bpf_dump_raw_ok(void)
+static inline bool bpf_dump_raw_ok(const struct cred *cred)
 {
 	/* Reconstruction of call-sites is dependent on kallsyms,
 	 * thus make dump the same restriction.
 	 */
-	return kallsyms_show_value(current_cred());
+	return kallsyms_show_value(cred);
 }
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8da159936bab..859053ddf05b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3139,7 +3139,8 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 	return NULL;
 }
 
-static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog)
+static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
+					      const struct cred *f_cred)
 {
 	const struct bpf_map *map;
 	struct bpf_insn *insns;
@@ -3165,7 +3166,7 @@ static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog)
 		    code == (BPF_JMP | BPF_CALL_ARGS)) {
 			if (code == (BPF_JMP | BPF_CALL_ARGS))
 				insns[i].code = BPF_JMP | BPF_CALL;
-			if (!bpf_dump_raw_ok())
+			if (!bpf_dump_raw_ok(f_cred))
 				insns[i].imm = 0;
 			continue;
 		}
@@ -3221,7 +3222,8 @@ static int set_info_rec_size(struct bpf_prog_info *info)
 	return 0;
 }
 
-static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
+static int bpf_prog_get_info_by_fd(struct file *file,
+				   struct bpf_prog *prog,
 				   const union bpf_attr *attr,
 				   union bpf_attr __user *uattr)
 {
@@ -3290,11 +3292,11 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 		struct bpf_insn *insns_sanitized;
 		bool fault;
 
-		if (prog->blinded && !bpf_dump_raw_ok()) {
+		if (prog->blinded && !bpf_dump_raw_ok(file->f_cred)) {
 			info.xlated_prog_insns = 0;
 			goto done;
 		}
-		insns_sanitized = bpf_insn_prepare_dump(prog);
+		insns_sanitized = bpf_insn_prepare_dump(prog, file->f_cred);
 		if (!insns_sanitized)
 			return -ENOMEM;
 		uinsns = u64_to_user_ptr(info.xlated_prog_insns);
@@ -3328,7 +3330,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	}
 
 	if (info.jited_prog_len && ulen) {
-		if (bpf_dump_raw_ok()) {
+		if (bpf_dump_raw_ok(file->f_cred)) {
 			uinsns = u64_to_user_ptr(info.jited_prog_insns);
 			ulen = min_t(u32, info.jited_prog_len, ulen);
 
@@ -3363,7 +3365,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	ulen = info.nr_jited_ksyms;
 	info.nr_jited_ksyms = prog->aux->func_cnt ? : 1;
 	if (ulen) {
-		if (bpf_dump_raw_ok()) {
+		if (bpf_dump_raw_ok(file->f_cred)) {
 			unsigned long ksym_addr;
 			u64 __user *user_ksyms;
 			u32 i;
@@ -3394,7 +3396,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	ulen = info.nr_jited_func_lens;
 	info.nr_jited_func_lens = prog->aux->func_cnt ? : 1;
 	if (ulen) {
-		if (bpf_dump_raw_ok()) {
+		if (bpf_dump_raw_ok(file->f_cred)) {
 			u32 __user *user_lens;
 			u32 func_len, i;
 
@@ -3451,7 +3453,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	else
 		info.nr_jited_line_info = 0;
 	if (info.nr_jited_line_info && ulen) {
-		if (bpf_dump_raw_ok()) {
+		if (bpf_dump_raw_ok(file->f_cred)) {
 			__u64 __user *user_linfo;
 			u32 i;
 
@@ -3497,7 +3499,8 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	return 0;
 }
 
-static int bpf_map_get_info_by_fd(struct bpf_map *map,
+static int bpf_map_get_info_by_fd(struct file *file,
+				  struct bpf_map *map,
 				  const union bpf_attr *attr,
 				  union bpf_attr __user *uattr)
 {
@@ -3540,7 +3543,8 @@ static int bpf_map_get_info_by_fd(struct bpf_map *map,
 	return 0;
 }
 
-static int bpf_btf_get_info_by_fd(struct btf *btf,
+static int bpf_btf_get_info_by_fd(struct file *file,
+				  struct btf *btf,
 				  const union bpf_attr *attr,
 				  union bpf_attr __user *uattr)
 {
@@ -3555,7 +3559,8 @@ static int bpf_btf_get_info_by_fd(struct btf *btf,
 	return btf_get_info_by_fd(btf, attr, uattr);
 }
 
-static int bpf_link_get_info_by_fd(struct bpf_link *link,
+static int bpf_link_get_info_by_fd(struct file *file,
+				  struct bpf_link *link,
 				  const union bpf_attr *attr,
 				  union bpf_attr __user *uattr)
 {
@@ -3608,15 +3613,15 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
 		return -EBADFD;
 
 	if (f.file->f_op == &bpf_prog_fops)
-		err = bpf_prog_get_info_by_fd(f.file->private_data, attr,
+		err = bpf_prog_get_info_by_fd(f.file, f.file->private_data, attr,
 					      uattr);
 	else if (f.file->f_op == &bpf_map_fops)
-		err = bpf_map_get_info_by_fd(f.file->private_data, attr,
+		err = bpf_map_get_info_by_fd(f.file, f.file->private_data, attr,
 					     uattr);
 	else if (f.file->f_op == &btf_fops)
-		err = bpf_btf_get_info_by_fd(f.file->private_data, attr, uattr);
+		err = bpf_btf_get_info_by_fd(f.file, f.file->private_data, attr, uattr);
 	else if (f.file->f_op == &bpf_link_fops)
-		err = bpf_link_get_info_by_fd(f.file->private_data,
+		err = bpf_link_get_info_by_fd(f.file, f.file->private_data,
 					      attr, uattr);
 	else
 		err = -EINVAL;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index f93f8ace6c56..6ada114bbcca 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -274,7 +274,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
 	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 	if (write && !ret) {
 		if (jit_enable < 2 ||
-		    (jit_enable == 2 && bpf_dump_raw_ok())) {
+		    (jit_enable == 2 && bpf_dump_raw_ok(current_cred()))) {
 			*(int *)table->data = jit_enable;
 			if (jit_enable == 2)
 				pr_warn("bpf_jit_enable = 2 was set! NEVER use this in production, only for JIT debugging!\n");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-02 23:26 ` [PATCH 4/5] kprobes: Do not expose probe " Kees Cook
@ 2020-07-03  1:00   ` Linus Torvalds
  2020-07-03 15:13     ` Kees Cook
  2020-07-03 15:50     ` Kees Cook
  2020-07-10 14:09   ` Masami Hiramatsu
  1 sibling, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2020-07-03  1:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Jessica Yu, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	Netdev, bpf, Linux Kernel Mailing List

On Thu, Jul 2, 2020 at 4:26 PM Kees Cook <keescook@chromium.org> wrote:
>
> The kprobe show() functions were using "current"'s creds instead
> of the file opener's creds for kallsyms visibility. Fix to use
> seq_file->file->f_cred.

Side note: I have a distinct - but despite that possibly quite
incorrect - memory that I've discussed with somebody several years ago
about making "current_cred()" simply warn in any IO context.

IOW, we could have read and write just increment/decrement a
per-thread counter, and have current_cred() do a WARN_ON_ONCE() if
it's called with that counter incremented.

The issue of ioctl's is a bit less obvious - there are reasons to
argue those should also use open-time credentials, but on the other
hand I think it's reasonable to pass a file descriptor to a suid app
in order for that app to do things that the normal user cannot.

But read/write are dangerous because of the "it's so easy to fool suid
apps to read/write stdin/stdout".

So pread/pwrite/ioctl/splice etc are things that suid applications
very much do on purpose to affect a file descriptor. But plain
read/write are things that might be accidental and used by attack
vectors.

If somebody is interested in looking into things like that, it might
be a good idea to have kernel threads with that counter incremented by
default.

Just throwing this idea out in case somebody wants to try it. It's not
just "current_cred", of course. It's all the current_cred_xxx() users
too. But it may be that there are a ton of false positives because
maybe some code on purpose ends up doing things like just *comparing*
current_cred with file->f_cred and then that would warn too.

              Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] module: Refactor section attr into bin attribute
  2020-07-02 23:26 ` [PATCH 2/5] module: Refactor section attr into bin attribute Kees Cook
@ 2020-07-03  6:02   ` Greg Kroah-Hartman
  2020-07-03 15:29     ` Kees Cook
  2020-07-08 16:10   ` Jessica Yu
  1 sibling, 1 reply; 16+ messages in thread
From: Greg Kroah-Hartman @ 2020-07-03  6:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Jessica Yu, Linus Torvalds,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

On Thu, Jul 02, 2020 at 04:26:35PM -0700, Kees Cook wrote:
> In order to gain access to the open file's f_cred for kallsym visibility
> permission checks, refactor the module section attributes to use the
> bin_attribute instead of attribute interface. Additionally removes the
> redundant "name" struct member.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  kernel/module.c | 45 ++++++++++++++++++++++++---------------------
>  1 file changed, 24 insertions(+), 21 deletions(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index a5022ae84e50..9e2954519259 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -1510,8 +1510,7 @@ static inline bool sect_empty(const Elf_Shdr *sect)
>  }
>  
>  struct module_sect_attr {
> -	struct module_attribute mattr;
> -	char *name;
> +	struct bin_attribute battr;
>  	unsigned long address;
>  };
>  
> @@ -1521,11 +1520,16 @@ struct module_sect_attrs {
>  	struct module_sect_attr attrs[];
>  };
>  
> -static ssize_t module_sect_show(struct module_attribute *mattr,
> -				struct module_kobject *mk, char *buf)
> +static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
> +				struct bin_attribute *battr,
> +				char *buf, loff_t pos, size_t count)
>  {
>  	struct module_sect_attr *sattr =
> -		container_of(mattr, struct module_sect_attr, mattr);
> +		container_of(battr, struct module_sect_attr, battr);
> +
> +	if (pos != 0)
> +		return -EINVAL;
> +
>  	return sprintf(buf, "0x%px\n", kptr_restrict < 2 ?
>  		       (void *)sattr->address : NULL);
>  }
> @@ -1535,7 +1539,7 @@ static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
>  	unsigned int section;
>  
>  	for (section = 0; section < sect_attrs->nsections; section++)
> -		kfree(sect_attrs->attrs[section].name);
> +		kfree(sect_attrs->attrs[section].battr.attr.name);
>  	kfree(sect_attrs);
>  }
>  
> @@ -1544,42 +1548,41 @@ static void add_sect_attrs(struct module *mod, const struct load_info *info)
>  	unsigned int nloaded = 0, i, size[2];
>  	struct module_sect_attrs *sect_attrs;
>  	struct module_sect_attr *sattr;
> -	struct attribute **gattr;
> +	struct bin_attribute **gattr;
>  
>  	/* Count loaded sections and allocate structures */
>  	for (i = 0; i < info->hdr->e_shnum; i++)
>  		if (!sect_empty(&info->sechdrs[i]))
>  			nloaded++;
>  	size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
> -			sizeof(sect_attrs->grp.attrs[0]));
> -	size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.attrs[0]);
> +			sizeof(sect_attrs->grp.bin_attrs[0]));
> +	size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.bin_attrs[0]);
>  	sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
>  	if (sect_attrs == NULL)
>  		return;
>  
>  	/* Setup section attributes. */
>  	sect_attrs->grp.name = "sections";
> -	sect_attrs->grp.attrs = (void *)sect_attrs + size[0];
> +	sect_attrs->grp.bin_attrs = (void *)sect_attrs + size[0];
>  
>  	sect_attrs->nsections = 0;
>  	sattr = &sect_attrs->attrs[0];
> -	gattr = &sect_attrs->grp.attrs[0];
> +	gattr = &sect_attrs->grp.bin_attrs[0];
>  	for (i = 0; i < info->hdr->e_shnum; i++) {
>  		Elf_Shdr *sec = &info->sechdrs[i];
>  		if (sect_empty(sec))
>  			continue;
> +		sysfs_bin_attr_init(&sattr->battr);
>  		sattr->address = sec->sh_addr;
> -		sattr->name = kstrdup(info->secstrings + sec->sh_name,
> -					GFP_KERNEL);
> -		if (sattr->name == NULL)
> +		sattr->battr.attr.name =
> +			kstrdup(info->secstrings + sec->sh_name, GFP_KERNEL);
> +		if (sattr->battr.attr.name == NULL)
>  			goto out;
>  		sect_attrs->nsections++;
> -		sysfs_attr_init(&sattr->mattr.attr);
> -		sattr->mattr.show = module_sect_show;
> -		sattr->mattr.store = NULL;
> -		sattr->mattr.attr.name = sattr->name;
> -		sattr->mattr.attr.mode = S_IRUSR;
> -		*(gattr++) = &(sattr++)->mattr.attr;
> +		sattr->battr.read = module_sect_read;
> +		sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
> +		sattr->battr.attr.mode = 0400;
> +		*(gattr++) = &(sattr++)->battr;
>  	}
>  	*gattr = NULL;
>  
> @@ -1669,7 +1672,7 @@ static void add_notes_attrs(struct module *mod, const struct load_info *info)
>  			continue;
>  		if (info->sechdrs[i].sh_type == SHT_NOTE) {
>  			sysfs_bin_attr_init(nattr);
> -			nattr->attr.name = mod->sect_attrs->attrs[loaded].name;
> +			nattr->attr.name = mod->sect_attrs->attrs[loaded].battr.attr.name;
>  			nattr->attr.mode = S_IRUGO;
>  			nattr->size = info->sechdrs[i].sh_size;
>  			nattr->private = (void *) info->sechdrs[i].sh_addr;
> -- 
> 2.25.1
> 

They get a correct "size" value now, nice!

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-03  1:00   ` Linus Torvalds
@ 2020-07-03 15:13     ` Kees Cook
  2020-07-03 15:50     ` Kees Cook
  1 sibling, 0 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-03 15:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dominik Czarnota, stable, Jessica Yu, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	Netdev, bpf, Linux Kernel Mailing List

On Thu, Jul 02, 2020 at 06:00:17PM -0700, Linus Torvalds wrote:
> On Thu, Jul 2, 2020 at 4:26 PM Kees Cook <keescook@chromium.org> wrote:
> >
> > The kprobe show() functions were using "current"'s creds instead
> > of the file opener's creds for kallsyms visibility. Fix to use
> > seq_file->file->f_cred.
> 
> Side note: I have a distinct - but despite that possibly quite
> incorrect - memory that I've discussed with somebody several years ago
> about making "current_cred()" simply warn in any IO context.
> 
> IOW, we could have read and write just increment/decrement a
> per-thread counter, and have current_cred() do a WARN_ON_ONCE() if
> it's called with that counter incremented.

That does sound familiar. I can't find a thread on it, but my search
abilities are poor. :) So an increment/decrement in all the IO-related
syscalls, or were you thinking of some other location?

> The issue of ioctl's is a bit less obvious - there are reasons to
> argue those should also use open-time credentials, but on the other
> hand I think it's reasonable to pass a file descriptor to a suid app
> in order for that app to do things that the normal user cannot.
>
> But read/write are dangerous because of the "it's so easy to fool suid
> apps to read/write stdin/stdout".
>
> So pread/pwrite/ioctl/splice etc are things that suid applications
> very much do on purpose to affect a file descriptor. But plain
> read/write are things that might be accidental and used by attack
> vectors.

So probably just start with read/write and tighten it over time, if we
find other clear places, leaving ioctl/pread/pwrite/splice alone.

> If somebody is interested in looking into things like that, it might
> be a good idea to have kernel threads with that counter incremented by
> 
> Just throwing this idea out in case somebody wants to try it. It's not
> just "current_cred", of course. It's all the current_cred_xxx() users
> too. But it may be that there are a ton of false positives because
> maybe some code on purpose ends up doing things like just *comparing*
> current_cred with file->f_cred and then that would warn too.

Yeah ... and I think the kthread test should answer that question.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] module: Refactor section attr into bin attribute
  2020-07-03  6:02   ` Greg Kroah-Hartman
@ 2020-07-03 15:29     ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-03 15:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Dominik Czarnota, stable, Jessica Yu, Linus Torvalds,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

On Fri, Jul 03, 2020 at 08:02:07AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 02, 2020 at 04:26:35PM -0700, Kees Cook wrote:
> > +		sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
> 
> They get a correct "size" value now, nice!

Yeah, though I do have some concerns that switching to a bin attribute
changes the userspace behavior a bit. With seq_file-based "show", we
get a 4096 size, and seeking isn't possible (lseek to non-0 location
will fail). With the raw "read", we get the right size, but lseek()
is allowed (but I've got the "read" handler refuse reads starting from
non-zero). When I reviewed[1] potential readers (elftutils, systemtap,
kmod), they all seem to do normal things (fopen/fscanf/fclose), so I'm
hoping this won't be a problem in practice.

> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Thanks!

[1] https://codesearch.debian.net/search?q=%2Fsys%2Fmodule.*sections&literal=0

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-03  1:00   ` Linus Torvalds
  2020-07-03 15:13     ` Kees Cook
@ 2020-07-03 15:50     ` Kees Cook
  2020-07-05 20:10       ` Linus Torvalds
  1 sibling, 1 reply; 16+ messages in thread
From: Kees Cook @ 2020-07-03 15:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dominik Czarnota, stable, Jessica Yu, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Thomas Richter, Ingo Molnar, Netdev, bpf,
	Linux Kernel Mailing List

On Thu, Jul 02, 2020 at 06:00:17PM -0700, Linus Torvalds wrote:
> If somebody is interested in looking into things like that, it might
> be a good idea to have kernel threads with that counter incremented by
> default.

With 67 kthreads on a booted system, this patch does not immediately
blow up... And it likely needs some beautification. (Note that
current_cred_*() calls current_cred() under the hood, so AFAICT, only
current_cred() needs coverage.)

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 18639c069263..a624847cb0ce 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -295,7 +295,10 @@ static inline void put_cred(const struct cred *_cred)
  * since nobody else can modify it.
  */
 #define current_cred() \
-	rcu_dereference_protected(current->cred, 1)
+({							\
+	WARN_ON_ONCE(current->warn_on_current_cred);	\
+	rcu_dereference_protected(current->cred, 1);	\
+})
 
 /**
  * current_real_cred - Access the current task's objective credentials
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f0..21ab1b81aa40 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -652,6 +652,7 @@ struct task_struct {
 	/* Per task flags (PF_*), defined further below: */
 	unsigned int			flags;
 	unsigned int			ptrace;
+	unsigned int			warn_on_current_cred;
 
 #ifdef CONFIG_SMP
 	struct llist_node		wake_entry;
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..2e181b9bfd3f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2527,8 +2527,12 @@ pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
 		.stack		= (unsigned long)fn,
 		.stack_size	= (unsigned long)arg,
 	};
+	pid_t pid;
 
-	return _do_fork(&args);
+	pid = _do_fork(&args);
+	if (pid == 0)
+		current->warn_on_current_cred = 1;
+	return pid;
 }
 
 #ifdef __ARCH_WANT_SYS_FORK


-- 
Kees Cook

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-03 15:50     ` Kees Cook
@ 2020-07-05 20:10       ` Linus Torvalds
  2020-07-05 20:19         ` Kees Cook
  0 siblings, 1 reply; 16+ messages in thread
From: Linus Torvalds @ 2020-07-05 20:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Jessica Yu, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Thomas Richter, Ingo Molnar, Netdev, bpf,
	Linux Kernel Mailing List

On Fri, Jul 3, 2020 at 8:50 AM Kees Cook <keescook@chromium.org> wrote:
>
> With 67 kthreads on a booted system, this patch does not immediately
> blow up...

Did you try making read/write inc/dec that thing too? Or does that
just blow up with tons of warnings?

                 Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-05 20:10       ` Linus Torvalds
@ 2020-07-05 20:19         ` Kees Cook
  0 siblings, 0 replies; 16+ messages in thread
From: Kees Cook @ 2020-07-05 20:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dominik Czarnota, stable, Jessica Yu, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Thomas Richter, Ingo Molnar, Netdev, bpf,
	Linux Kernel Mailing List

On Sun, Jul 05, 2020 at 01:10:54PM -0700, Linus Torvalds wrote:
> On Fri, Jul 3, 2020 at 8:50 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > With 67 kthreads on a booted system, this patch does not immediately
> > blow up...
> 
> Did you try making read/write inc/dec that thing too? Or does that
> just blow up with tons of warnings?

I hadn't gotten to that yet. I need to catch up on other stuff first,
and I thought I'd give people time to scream if they tested this
themselves. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] module: Refactor section attr into bin attribute
  2020-07-02 23:26 ` [PATCH 2/5] module: Refactor section attr into bin attribute Kees Cook
  2020-07-03  6:02   ` Greg Kroah-Hartman
@ 2020-07-08 16:10   ` Jessica Yu
  1 sibling, 0 replies; 16+ messages in thread
From: Jessica Yu @ 2020-07-08 16:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Linus Torvalds, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

+++ Kees Cook [02/07/20 16:26 -0700]:
>In order to gain access to the open file's f_cred for kallsym visibility
>permission checks, refactor the module section attributes to use the
>bin_attribute instead of attribute interface. Additionally removes the
>redundant "name" struct member.
>
>Cc: stable@vger.kernel.org
>Signed-off-by: Kees Cook <keescook@chromium.org>

Hi Kees,

This looks good to me:

Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>

Thanks!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG
  2020-07-02 23:26 ` [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG Kees Cook
@ 2020-07-08 16:12   ` Jessica Yu
  0 siblings, 0 replies; 16+ messages in thread
From: Jessica Yu @ 2020-07-08 16:12 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Linus Torvalds, Greg Kroah-Hartman,
	Andrew Morton, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jakub Kicinski, Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

+++ Kees Cook [02/07/20 16:26 -0700]:
>The printing of section addresses in /sys/module/*/sections/* was not
>using the correct credentials to evaluate visibility.
>
>Before:
>
> # cat /sys/module/*/sections/.*text
> 0xffffffffc0458000
> ...
> # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
> 0xffffffffc0458000
> ...
>
>After:
>
> # cat /sys/module/*/sections/*.text
> 0xffffffffc0458000
> ...
> # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
> 0x0000000000000000
> ...
>
>Additionally replaces the existing (safe) /proc/modules check with
>file->f_cred for consistency.
>
>Cc: stable@vger.kernel.org
>Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com>
>Fixes: be71eda5383f ("module: Fix display of wrong module .text address")
>Signed-off-by: Kees Cook <keescook@chromium.org>

Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] kprobes: Do not expose probe addresses to non-CAP_SYSLOG
  2020-07-02 23:26 ` [PATCH 4/5] kprobes: Do not expose probe " Kees Cook
  2020-07-03  1:00   ` Linus Torvalds
@ 2020-07-10 14:09   ` Masami Hiramatsu
  1 sibling, 0 replies; 16+ messages in thread
From: Masami Hiramatsu @ 2020-07-10 14:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dominik Czarnota, stable, Jessica Yu, Linus Torvalds,
	Greg Kroah-Hartman, Andrew Morton, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, KP Singh, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jakub Kicinski,
	Steven Rostedt (VMware),
	Dmitry Safonov, Will Deacon, Alexey Dobriyan, Marc Zyngier,
	Masahiro Yamada, Al Viro, Matteo Croce, Edward Cree,
	Nicolas Dichtel, Alexander Lobakin, Thomas Richter, Ingo Molnar,
	netdev, bpf, linux-kernel

On Thu,  2 Jul 2020 16:26:37 -0700
Kees Cook <keescook@chromium.org> wrote:

> The kprobe show() functions were using "current"'s creds instead
> of the file opener's creds for kallsyms visibility. Fix to use
> seq_file->file->f_cred.

This looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks!

> 
> Cc: stable@vger.kernel.org
> Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does")
> Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms does")
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  kernel/kprobes.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d4de217e4a91..2e97febeef77 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -2448,7 +2448,7 @@ static void report_probe(struct seq_file *pi, struct kprobe *p,
>  	else
>  		kprobe_type = "k";
>  
> -	if (!kallsyms_show_value(current_cred()))
> +	if (!kallsyms_show_value(pi->file->f_cred))
>  		addr = NULL;
>  
>  	if (sym)
> @@ -2540,7 +2540,7 @@ static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
>  	 * If /proc/kallsyms is not showing kernel address, we won't
>  	 * show them here either.
>  	 */
> -	if (!kallsyms_show_value(current_cred()))
> +	if (!kallsyms_show_value(m->file->f_cred))
>  		seq_printf(m, "0x%px-0x%px\t%ps\n", NULL, NULL,
>  			   (void *)ent->start_addr);
>  	else
> -- 
> 2.25.1
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-07-10 14:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-02 23:26 [PATCH 0/5] Refactor kallsyms_show_value() users for correct cred Kees Cook
2020-07-02 23:26 ` [PATCH 1/5] kallsyms: Refactor kallsyms_show_value() to take cred Kees Cook
2020-07-02 23:26 ` [PATCH 2/5] module: Refactor section attr into bin attribute Kees Cook
2020-07-03  6:02   ` Greg Kroah-Hartman
2020-07-03 15:29     ` Kees Cook
2020-07-08 16:10   ` Jessica Yu
2020-07-02 23:26 ` [PATCH 3/5] module: Do not expose section addresses to non-CAP_SYSLOG Kees Cook
2020-07-08 16:12   ` Jessica Yu
2020-07-02 23:26 ` [PATCH 4/5] kprobes: Do not expose probe " Kees Cook
2020-07-03  1:00   ` Linus Torvalds
2020-07-03 15:13     ` Kees Cook
2020-07-03 15:50     ` Kees Cook
2020-07-05 20:10       ` Linus Torvalds
2020-07-05 20:19         ` Kees Cook
2020-07-10 14:09   ` Masami Hiramatsu
2020-07-02 23:26 ` [PATCH 5/5] bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).