bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability
@ 2019-12-18  9:16 Alexey Budankov
  2019-12-18  9:24 ` [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space Alexey Budankov
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:16 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Currently access to perf_events, i915_perf and other performance monitoring and
observability subsystems of the kernel is open for a privileged process [1] with
CAP_SYS_ADMIN capability enabled in the process effective set [2].

This patch set introduces CAP_SYS_PERFMON capability devoted to secure system
performance monitoring and observability operations so that CAP_SYS_PERFMON would
assist CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
and other performance monitoring and observability subsystems of the kernel.

CAP_SYS_PERFMON intends to meet the demand to secure system performance monitoring
and observability operations in security sensitive, restricted, production
environments (e.g. HPC clusters, cloud and virtual compute environments) where root
or CAP_SYS_ADMIN credentials are not available to mass users of a system because
of security considerations.

CAP_SYS_PERFMON intends to harden system security and integrity during system
performance monitoring and observability operations by decreasing attack surface
that is available to CAP_SYS_ADMIN privileged processes [2].

CAP_SYS_PERFMON intends to take over CAP_SYS_ADMIN credentials related to system
performance monitoring and observability operations and balance amount of
CAP_SYS_ADMIN credentials following the recommendations in the capabilities man
page [2] for CAP_SYS_ADMIN: "Note: this capability is overloaded; see Notes to
kernel developers, below."

For backward compatibility reasons access to system performance monitoring and
observability subsystems of the kernel remains open for CAP_SYS_ADMIN privileged
processes but CAP_SYS_ADMIN capability usage for secure system performance
monitoring and observability operations is discouraged with respect to the
introduced CAP_SYS_PERFMON capability.

The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6

---
Changes in v4:
- converted perfmon_capable() into an inline function
- made perf_events kprobes, uprobes, hw breakpoints and namespaces data available
  to CAP_SYS_PERFMON privileged processes
- applied perfmon_capable() to drivers/perf and drivers/oprofile
- extended __cmd_ftrace() with support of CAP_SYS_PERFMON
Changes in v3:
- implemented perfmon_capable() macros aggregating required capabilities checks
Changes in v2:
- made perf_events trace points available to CAP_SYS_PERFMON privileged processes
- made perf_event_paranoid_check() treat CAP_SYS_PERFMON equally to CAP_SYS_ADMIN
- applied CAP_SYS_PERFMON to i915_perf, bpf_trace, powerpc and parisc system
  performance monitoring and observability related subsystems

---
Alexey Budankov (9):
  capabilities: introduce CAP_SYS_PERFMON to kernel and user space
  perf/core: open access for CAP_SYS_PERFMON privileged process
  perf tool: extend Perf tool with CAP_SYS_PERFMON capability support
  drm/i915/perf: open access for CAP_SYS_PERFMON privileged process
  trace/bpf_trace: open access for CAP_SYS_PERFMON privileged process
  powerpc/perf: open access for CAP_SYS_PERFMON privileged process
  parisc/perf: open access for CAP_SYS_PERFMON privileged process
  drivers/perf: open access for CAP_SYS_PERFMON privileged process
  drivers/oprofile: open access for CAP_SYS_PERFMON privileged process

 arch/parisc/kernel/perf.c           |  2 +-
 arch/powerpc/perf/imc-pmu.c         |  4 ++--
 drivers/gpu/drm/i915/i915_perf.c    | 13 ++++++-------
 drivers/oprofile/event_buffer.c     |  2 +-
 drivers/perf/arm_spe_pmu.c          |  4 ++--
 include/linux/capability.h          |  4 ++++
 include/linux/perf_event.h          |  6 +++---
 include/uapi/linux/capability.h     |  8 +++++++-
 kernel/events/core.c                |  6 +++---
 kernel/trace/bpf_trace.c            |  2 +-
 security/selinux/include/classmap.h |  4 ++--
 tools/perf/builtin-ftrace.c         |  5 +++--
 tools/perf/design.txt               |  3 ++-
 tools/perf/util/cap.h               |  4 ++++
 tools/perf/util/evsel.c             | 10 +++++-----
 tools/perf/util/util.c              |  1 +
 16 files changed, 47 insertions(+), 31 deletions(-)

---
Testing and validation (Intel Skylake, 8 cores, Fedora 29, 5.4.0-rc8+, x86_64):

libcap library [3], [4] and Perf tool can be used to apply CAP_SYS_PERFMON 
capability for secure system performance monitoring and observability beyond the
scope permitted by the system wide perf_event_paranoid kernel setting [5] and
below are the steps for evaluation:

  - patch, build and boot the kernel
  - patch, build Perf tool e.g. to /home/user/perf
  ...
  # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
  # pushd libcap
  # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1]
  # make
  # pushd progs
  # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
  # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
  /home/user/perf: OK
  # ./getcap /home/user/perf
  /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
  # echo 2 > /proc/sys/kernel/perf_event_paranoid
  # cat /proc/sys/kernel/perf_event_paranoid 
  2
  ...
  $ /home/user/perf top
    ... works as expected ...
  $ cat /proc/`pidof perf`/status
  Name:	perf
  Umask:	0002
  State:	S (sleeping)
  Tgid:	2958
  Ngid:	0
  Pid:	2958
  PPid:	9847
  TracerPid:	0
  Uid:	500	500	500	500
  Gid:	500	500	500	500
  FDSize:	256
  ...
  CapInh:	0000000000000000
  CapPrm:	0000004400080000
  CapEff:	0000004400080000 => 01000100 00000000 00001000 00000000 00000000
                                     cap_sys_perfmon,cap_sys_ptrace,cap_syslog
  CapBnd:	0000007fffffffff
  CapAmb:	0000000000000000
  NoNewPrivs:	0
  Seccomp:	0
  Speculation_Store_Bypass:	thread vulnerable
  Cpus_allowed:	ff
  Cpus_allowed_list:	0-7
  ...

Usage of cap_sys_perfmon effectively avoids unused credentials excess:

- with cap_sys_admin:
  CapEff:	0000007fffffffff => 01111111 11111111 11111111 11111111 11111111

- with cap_sys_perfmon:
  CapEff:	0000004400080000 => 01000100 00000000 00001000 00000000 00000000
                                    38   34               19
                           sys_perfmon   syslog           sys_ptrace

---

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
[2] http://man7.org/linux/man-pages/man7/capabilities.7.html
[3] http://man7.org/linux/man-pages/man8/setcap.8.html
[4] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
[5] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
[6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf

-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
@ 2019-12-18  9:24 ` Alexey Budankov
  2019-12-18 19:56   ` Stephen Smalley
  2019-12-18  9:25 ` [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:24 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Introduce CAP_SYS_PERFMON capability devoted to secure system performance
monitoring and observability operations so that CAP_SYS_PERFMON would
assist CAP_SYS_ADMIN capability in its governing role for perf_events,
i915_perf and other subsystems of the kernel.

CAP_SYS_PERFMON intends to harden system security and integrity during
system performance monitoring and observability operations by decreasing
attack surface that is available to CAP_SYS_ADMIN privileged processes.

CAP_SYS_PERFMON intends to take over CAP_SYS_ADMIN credentials related
to system performance monitoring and observability operations and balance
amount of CAP_SYS_ADMIN credentials in accordance with the recommendations
provided in the man page for CAP_SYS_ADMIN [1]: "Note: this capability
is overloaded; see Notes to kernel developers, below."

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 include/linux/capability.h          | 4 ++++
 include/uapi/linux/capability.h     | 8 +++++++-
 security/selinux/include/classmap.h | 4 ++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..883c879baa4b 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,10 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
 extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+	return capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN);
+}
 
 /* audit system wants to get cap info from files as well */
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..98e03cc76c7c 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,14 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_SYS_PERFMON		38
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_LAST_CAP         CAP_SYS_PERFMON
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..bae602c623b0 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
 	    "audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-		"wake_alarm", "block_suspend", "audit_read"
+		"wake_alarm", "block_suspend", "audit_read", "sys_perfmon"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_SYS_PERFMON
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
  2019-12-18  9:24 ` [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space Alexey Budankov
@ 2019-12-18  9:25 ` Alexey Budankov
  2020-01-08 16:07   ` Peter Zijlstra
  2019-12-18  9:26 ` [PATCH v4 3/9] perf tool: extend Perf tool with CAP_SYS_PERFMON capability support Alexey Budankov
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:25 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to perf_events monitoring for CAP_SYS_PERFMON privileged
processes. For backward compatibility reasons access to perf_events
subsystem remains open for CAP_SYS_ADMIN privileged processes but
CAP_SYS_ADMIN usage for secure perf_events monitoring is discouraged
with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 include/linux/perf_event.h | 6 +++---
 kernel/events/core.c       | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 34c7c6910026..f46acd69425f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1285,7 +1285,7 @@ static inline int perf_is_paranoid(void)
 
 static inline int perf_allow_kernel(struct perf_event_attr *attr)
 {
-	if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+	if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
 		return -EACCES;
 
 	return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
@@ -1293,7 +1293,7 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)
 
 static inline int perf_allow_cpu(struct perf_event_attr *attr)
 {
-	if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+	if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
 		return -EACCES;
 
 	return security_perf_event_open(attr, PERF_SECURITY_CPU);
@@ -1301,7 +1301,7 @@ static inline int perf_allow_cpu(struct perf_event_attr *attr)
 
 static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
 {
-	if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
+	if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
 		return -EPERM;
 
 	return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 059ee7116008..d9db414f2197 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
 	if (event->attr.type != perf_kprobe.type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EACCES;
 
 	/*
@@ -9116,7 +9116,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
 	if (event->attr.type != perf_uprobe.type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EACCES;
 
 	/*
@@ -11157,7 +11157,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	if (attr.namespaces) {
-		if (!capable(CAP_SYS_ADMIN))
+		if (!perfmon_capable())
 			return -EACCES;
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/9] perf tool: extend Perf tool with CAP_SYS_PERFMON capability support
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
  2019-12-18  9:24 ` [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space Alexey Budankov
  2019-12-18  9:25 ` [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
@ 2019-12-18  9:26 ` Alexey Budankov
  2019-12-18  9:27 ` [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:26 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Extend error messages to mention CAP_SYS_PERFMON capability as an option
to substitute CAP_SYS_ADMIN capability for secure system performance
monitoring and observability operations. Make perf_event_paranoid_check()
and __cmd_ftrace() to be aware of CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-ftrace.c |  5 +++--
 tools/perf/design.txt       |  3 ++-
 tools/perf/util/cap.h       |  4 ++++
 tools/perf/util/evsel.c     | 10 +++++-----
 tools/perf/util/util.c      |  1 +
 5 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index d5adc417a4ca..8096e9b5f4f9 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -284,10 +284,11 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace, int argc, const char **argv)
 		.events = POLLIN,
 	};
 
-	if (!perf_cap__capable(CAP_SYS_ADMIN)) {
+	if (!(perf_cap__capable(CAP_SYS_PERFMON) ||
+	      perf_cap__capable(CAP_SYS_ADMIN))) {
 		pr_err("ftrace only works for %s!\n",
 #ifdef HAVE_LIBCAP_SUPPORT
-		"users with the SYS_ADMIN capability"
+		"users with the CAP_SYS_PERFMON or CAP_SYS_ADMIN capability"
 #else
 		"root"
 #endif
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 0453ba26cdbd..71755b3e1303 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for
 their own tasks.
 
 A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_SYS_PERFMON or
+CAP_SYS_ADMIN privilege.
 
 The 'flags' parameter is currently unused and must be zero.
 
diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h
index 051dc590ceee..0f79fbf6638b 100644
--- a/tools/perf/util/cap.h
+++ b/tools/perf/util/cap.h
@@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused)
 #define CAP_SYSLOG	34
 #endif
 
+#ifndef CAP_SYS_PERFMON
+#define CAP_SYS_PERFMON 38
+#endif
+
 #endif /* __PERF_CAP_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f4dea055b080..3a46325e3702 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2468,14 +2468,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
 		 "You may not have permission to collect %sstats.\n\n"
 		 "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
 		 "which controls use of the performance events system by\n"
-		 "unprivileged users (without CAP_SYS_ADMIN).\n\n"
+		 "unprivileged users (without CAP_SYS_PERFMON or CAP_SYS_ADMIN).\n\n"
 		 "The current value is %d:\n\n"
 		 "  -1: Allow use of (almost) all events by all users\n"
 		 "      Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
-		 ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
-		 "      Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
-		 ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
-		 ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+		 ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+		 "      Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+		 ">= 1: Disallow CPU event access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+		 ">= 2: Disallow kernel profiling by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n\n"
 		 "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
 		 "	kernel.perf_event_paranoid = -1\n" ,
 				 target->system_wide ? "system-wide " : "",
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 969ae560dad9..9981db0d8d09 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -272,6 +272,7 @@ int perf_event_paranoid(void)
 bool perf_event_paranoid_check(int max_level)
 {
 	return perf_cap__capable(CAP_SYS_ADMIN) ||
+			perf_cap__capable(CAP_SYS_PERFMON) ||
 			perf_event_paranoid() <= max_level;
 }
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (2 preceding siblings ...)
  2019-12-18  9:26 ` [PATCH v4 3/9] perf tool: extend Perf tool with CAP_SYS_PERFMON capability support Alexey Budankov
@ 2019-12-18  9:27 ` Alexey Budankov
  2019-12-19  9:10   ` Lionel Landwerlin
  2019-12-18  9:28 ` [PATCH v4 5/9] trace/bpf_trace: " Alexey Budankov
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:27 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to i915_perf monitoring for CAP_SYS_PERFMON privileged
processes. For backward compatibility reasons access to i915_perf
subsystem remains open for CAP_SYS_ADMIN privileged processes but
CAP_SYS_ADMIN usage for secure i915_perf monitoring is discouraged
with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index e42b86827d6b..e2697f8d04de 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2748,10 +2748,10 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 	 * we check a dev.i915.perf_stream_paranoid sysctl option
 	 * to determine if it's ok to access system wide OA counters
-	 * without CAP_SYS_ADMIN privileges.
+	 * without CAP_SYS_PERFMON or CAP_SYS_ADMIN privileges.
 	 */
 	if (privileged_op &&
-	    i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+	    i915_perf_stream_paranoid && !perfmon_capable()) {
 		DRM_DEBUG("Insufficient privileges to open system-wide i915 perf stream\n");
 		ret = -EACCES;
 		goto err_ctx;
@@ -2939,9 +2939,8 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 			} else
 				oa_freq_hz = 0;
 
-			if (oa_freq_hz > i915_oa_max_sample_rate &&
-			    !capable(CAP_SYS_ADMIN)) {
-				DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",
+			if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
+				DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_SYS_PERFMON or CAP_SYS_ADMIN privileges\n",
 					  i915_oa_max_sample_rate);
 				return -EACCES;
 			}
@@ -3328,7 +3327,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
-	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+	if (i915_perf_stream_paranoid && !perfmon_capable()) {
 		DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
 		return -EACCES;
 	}
@@ -3474,7 +3473,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 		return -ENOTSUPP;
 	}
 
-	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+	if (i915_perf_stream_paranoid && !perfmon_capable()) {
 		DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
 		return -EACCES;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 5/9] trace/bpf_trace: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (3 preceding siblings ...)
  2019-12-18  9:27 ` [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
@ 2019-12-18  9:28 ` Alexey Budankov
  2019-12-18  9:28 ` [PATCH v4 6/9] powerpc/perf: " Alexey Budankov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:28 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to bpf_trace monitoring for CAP_SYS_PERFMON privileged
processes. For backward compatibility reasons access to bpf_trace
monitoring remains open for CAP_SYS_ADMIN privileged processes but
CAP_SYS_ADMIN usage for secure bpf_trace monitoring is discouraged
with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 44bd08f2443b..bafe21ac6d92 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1272,7 +1272,7 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info)
 	u32 *ids, prog_cnt, ids_len;
 	int ret;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EPERM;
 	if (event->attr.type != PERF_TYPE_TRACEPOINT)
 		return -EINVAL;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 6/9] powerpc/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (4 preceding siblings ...)
  2019-12-18  9:28 ` [PATCH v4 5/9] trace/bpf_trace: " Alexey Budankov
@ 2019-12-18  9:28 ` Alexey Budankov
  2019-12-18  9:29 ` [PATCH v4 7/9] parisc/perf: " Alexey Budankov
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:28 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to monitoring for CAP_SYS_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for secure
monitoring is discouraged with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 arch/powerpc/perf/imc-pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index cb50a9e1fd2d..e837717492e4 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -898,7 +898,7 @@ static int thread_imc_event_init(struct perf_event *event)
 	if (event->attr.type != event->pmu->type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EACCES;
 
 	/* Sampling not supported */
@@ -1307,7 +1307,7 @@ static int trace_imc_event_init(struct perf_event *event)
 	if (event->attr.type != event->pmu->type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EACCES;
 
 	/* Return if this is a couting event */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 7/9] parisc/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (5 preceding siblings ...)
  2019-12-18  9:28 ` [PATCH v4 6/9] powerpc/perf: " Alexey Budankov
@ 2019-12-18  9:29 ` Alexey Budankov
  2020-01-27  8:52   ` Helge Deller
  2019-12-18  9:30 ` [PATCH v4 8/9] drivers/perf: " Alexey Budankov
  2019-12-18  9:31 ` [PATCH v4 9/9] drivers/oprofile: " Alexey Budankov
  8 siblings, 1 reply; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:29 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to monitoring for CAP_SYS_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for secure
monitoring is discouraged with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 arch/parisc/kernel/perf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/perf.c b/arch/parisc/kernel/perf.c
index 676683641d00..c4208d027794 100644
--- a/arch/parisc/kernel/perf.c
+++ b/arch/parisc/kernel/perf.c
@@ -300,7 +300,7 @@ static ssize_t perf_write(struct file *file, const char __user *buf,
 	else
 		return -EFAULT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EACCES;
 
 	if (count != sizeof(uint32_t))
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 8/9] drivers/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (6 preceding siblings ...)
  2019-12-18  9:29 ` [PATCH v4 7/9] parisc/perf: " Alexey Budankov
@ 2019-12-18  9:30 ` Alexey Budankov
  2019-12-18  9:31 ` [PATCH v4 9/9] drivers/oprofile: " Alexey Budankov
  8 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:30 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to monitoring for CAP_SYS_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for secure
monitoring is discouraged with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 drivers/perf/arm_spe_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 4e4984a55cd1..5dff81bc3324 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -274,7 +274,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	if (!attr->exclude_kernel)
 		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && capable(CAP_SYS_ADMIN))
+	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
 	return reg;
@@ -700,7 +700,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
 		return -EOPNOTSUPP;
 
 	reg = arm_spe_event_to_pmscr(event);
-	if (!capable(CAP_SYS_ADMIN) &&
+	if (!perfmon_capable() &&
 	    (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
 		    BIT(SYS_PMSCR_EL1_CX_SHIFT) |
 		    BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 9/9] drivers/oprofile: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
                   ` (7 preceding siblings ...)
  2019-12-18  9:30 ` [PATCH v4 8/9] drivers/perf: " Alexey Budankov
@ 2019-12-18  9:31 ` Alexey Budankov
  8 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2019-12-18  9:31 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	jani.nikula, joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list


Open access to monitoring for CAP_SYS_PERFMON privileged processes.
For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for secure
monitoring is discouraged with respect to CAP_SYS_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 drivers/oprofile/event_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 12ea4a4ad607..6c9edc8bbc95 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -113,7 +113,7 @@ static int event_buffer_open(struct inode *inode, struct file *file)
 {
 	int err = -EPERM;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!perfmon_capable())
 		return -EPERM;
 
 	if (test_and_set_bit_lock(0, &buffer_opened))
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space
  2019-12-18  9:24 ` [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space Alexey Budankov
@ 2019-12-18 19:56   ` Stephen Smalley
  0 siblings, 0 replies; 15+ messages in thread
From: Stephen Smalley @ 2019-12-18 19:56 UTC (permalink / raw)
  To: Alexey Budankov, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, jani.nikula, joonas.lahtinen, rodrigo.vivi,
	Alexei Starovoitov, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, james.bottomley, Serge Hallyn, James Morris,
	Will Deacon, Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list

On 12/18/19 4:24 AM, Alexey Budankov wrote:
> 
> Introduce CAP_SYS_PERFMON capability devoted to secure system performance
> monitoring and observability operations so that CAP_SYS_PERFMON would
> assist CAP_SYS_ADMIN capability in its governing role for perf_events,
> i915_perf and other subsystems of the kernel.
> 
> CAP_SYS_PERFMON intends to harden system security and integrity during
> system performance monitoring and observability operations by decreasing
> attack surface that is available to CAP_SYS_ADMIN privileged processes.
> 
> CAP_SYS_PERFMON intends to take over CAP_SYS_ADMIN credentials related
> to system performance monitoring and observability operations and balance
> amount of CAP_SYS_ADMIN credentials in accordance with the recommendations
> provided in the man page for CAP_SYS_ADMIN [1]: "Note: this capability
> is overloaded; see Notes to kernel developers, below."
> 
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>

Note for selinux developers: we will need to update the 
selinux-testsuite tests for perf_event when/if this change lands upstream.

> ---
>   include/linux/capability.h          | 4 ++++
>   include/uapi/linux/capability.h     | 8 +++++++-
>   security/selinux/include/classmap.h | 4 ++--
>   3 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index ecce0f43c73a..883c879baa4b 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -251,6 +251,10 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
>   extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
>   extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
>   extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
> +static inline bool perfmon_capable(void)
> +{
> +	return capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN);
> +}
>   
>   /* audit system wants to get cap info from files as well */
>   extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
> diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
> index 240fdb9a60f6..98e03cc76c7c 100644
> --- a/include/uapi/linux/capability.h
> +++ b/include/uapi/linux/capability.h
> @@ -366,8 +366,14 @@ struct vfs_ns_cap_data {
>   
>   #define CAP_AUDIT_READ		37
>   
> +/*
> + * Allow system performance and observability privileged operations
> + * using perf_events, i915_perf and other kernel subsystems
> + */
> +
> +#define CAP_SYS_PERFMON		38
>   
> -#define CAP_LAST_CAP         CAP_AUDIT_READ
> +#define CAP_LAST_CAP         CAP_SYS_PERFMON
>   
>   #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>   
> diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
> index 7db24855e12d..bae602c623b0 100644
> --- a/security/selinux/include/classmap.h
> +++ b/security/selinux/include/classmap.h
> @@ -27,9 +27,9 @@
>   	    "audit_control", "setfcap"
>   
>   #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
> -		"wake_alarm", "block_suspend", "audit_read"
> +		"wake_alarm", "block_suspend", "audit_read", "sys_perfmon"
>   
> -#if CAP_LAST_CAP > CAP_AUDIT_READ
> +#if CAP_LAST_CAP > CAP_SYS_PERFMON
>   #error New capability defined, please update COMMON_CAP2_PERMS.
>   #endif
>   
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:27 ` [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
@ 2019-12-19  9:10   ` Lionel Landwerlin
  0 siblings, 0 replies; 15+ messages in thread
From: Lionel Landwerlin @ 2019-12-19  9:10 UTC (permalink / raw)
  To: Alexey Budankov, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, jani.nikula, joonas.lahtinen, rodrigo.vivi,
	Alexei Starovoitov, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, james.bottomley, Serge Hallyn, James Morris,
	Will Deacon, Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Song Liu, linux-kernel,
	linux-security-module, selinux, intel-gfx, bpf, linux-parisc,
	linuxppc-dev, linux-perf-users, linux-arm-kernel, oprofile-list

On 18/12/2019 11:27, Alexey Budankov wrote:
> Open access to i915_perf monitoring for CAP_SYS_PERFMON privileged
> processes. For backward compatibility reasons access to i915_perf
> subsystem remains open for CAP_SYS_ADMIN privileged processes but
> CAP_SYS_ADMIN usage for secure i915_perf monitoring is discouraged
> with respect to CAP_SYS_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

> ---
>   drivers/gpu/drm/i915/i915_perf.c | 13 ++++++-------
>   1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index e42b86827d6b..e2697f8d04de 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -2748,10 +2748,10 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>   	/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
>   	 * we check a dev.i915.perf_stream_paranoid sysctl option
>   	 * to determine if it's ok to access system wide OA counters
> -	 * without CAP_SYS_ADMIN privileges.
> +	 * without CAP_SYS_PERFMON or CAP_SYS_ADMIN privileges.
>   	 */
>   	if (privileged_op &&
> -	    i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
> +	    i915_perf_stream_paranoid && !perfmon_capable()) {
>   		DRM_DEBUG("Insufficient privileges to open system-wide i915 perf stream\n");
>   		ret = -EACCES;
>   		goto err_ctx;
> @@ -2939,9 +2939,8 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>   			} else
>   				oa_freq_hz = 0;
>   
> -			if (oa_freq_hz > i915_oa_max_sample_rate &&
> -			    !capable(CAP_SYS_ADMIN)) {
> -				DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",
> +			if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
> +				DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_SYS_PERFMON or CAP_SYS_ADMIN privileges\n",
>   					  i915_oa_max_sample_rate);
>   				return -EACCES;
>   			}
> @@ -3328,7 +3327,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
>   		return -EINVAL;
>   	}
>   
> -	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
> +	if (i915_perf_stream_paranoid && !perfmon_capable()) {
>   		DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
>   		return -EACCES;
>   	}
> @@ -3474,7 +3473,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
>   		return -ENOTSUPP;
>   	}
>   
> -	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
> +	if (i915_perf_stream_paranoid && !perfmon_capable()) {
>   		DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
>   		return -EACCES;
>   	}



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:25 ` [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
@ 2020-01-08 16:07   ` Peter Zijlstra
  2020-01-09 11:36     ` Alexey Budankov
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2020-01-08 16:07 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, jani.nikula,
	joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter, Jiri Olsa,
	Andi Kleen, Stephane Eranian, Igor Lubashev, Alexander Shishkin,
	Namhyung Kim, Kees Cook, Jann Horn, Thomas Gleixner,
	Tvrtko Ursulin, Lionel Landwerlin, Song Liu, linux-kernel,
	linux-security-module, selinux, intel-gfx, bpf, linux-parisc,
	linuxppc-dev, linux-perf-users, linux-arm-kernel, oprofile-list

On Wed, Dec 18, 2019 at 12:25:35PM +0300, Alexey Budankov wrote:
> 
> Open access to perf_events monitoring for CAP_SYS_PERFMON privileged
> processes. For backward compatibility reasons access to perf_events
> subsystem remains open for CAP_SYS_ADMIN privileged processes but
> CAP_SYS_ADMIN usage for secure perf_events monitoring is discouraged
> with respect to CAP_SYS_PERFMON capability.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
> ---
>  include/linux/perf_event.h | 6 +++---
>  kernel/events/core.c       | 6 +++---
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 34c7c6910026..f46acd69425f 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1285,7 +1285,7 @@ static inline int perf_is_paranoid(void)
>  
>  static inline int perf_allow_kernel(struct perf_event_attr *attr)
>  {
> -	if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
> +	if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
>  		return -EACCES;
>  
>  	return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
> @@ -1293,7 +1293,7 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)
>  
>  static inline int perf_allow_cpu(struct perf_event_attr *attr)
>  {
> -	if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
> +	if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
>  		return -EACCES;
>  
>  	return security_perf_event_open(attr, PERF_SECURITY_CPU);
> @@ -1301,7 +1301,7 @@ static inline int perf_allow_cpu(struct perf_event_attr *attr)
>  
>  static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
>  {
> -	if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
> +	if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
>  		return -EPERM;
>  
>  	return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);

These are OK I suppose.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 059ee7116008..d9db414f2197 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
>  	if (event->attr.type != perf_kprobe.type)
>  		return -ENOENT;
>  
> -	if (!capable(CAP_SYS_ADMIN))
> +	if (!perfmon_capable())
>  		return -EACCES;
>  
>  	/*

This one only allows attaching to already extant kprobes, right? It does
not allow creation of kprobes.

> @@ -9116,7 +9116,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
>  	if (event->attr.type != perf_uprobe.type)
>  		return -ENOENT;
>  
> -	if (!capable(CAP_SYS_ADMIN))
> +	if (!perfmon_capable())
>  		return -EACCES;
>  
>  	/*

Idem, I presume.

> @@ -11157,7 +11157,7 @@ SYSCALL_DEFINE5(perf_event_open,
>  	}
>  
>  	if (attr.namespaces) {
> -		if (!capable(CAP_SYS_ADMIN))
> +		if (!perfmon_capable())
>  			return -EACCES;
>  	}

And given we basically make the entire kernel observable with this CAP,
busting namespaces shoulnd't be a problem either.

So yeah, I suppose that works.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process
  2020-01-08 16:07   ` Peter Zijlstra
@ 2020-01-09 11:36     ` Alexey Budankov
  0 siblings, 0 replies; 15+ messages in thread
From: Alexey Budankov @ 2020-01-09 11:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, jani.nikula,
	joonas.lahtinen, rodrigo.vivi, Alexei Starovoitov,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	james.bottomley, Serge Hallyn, James Morris, Will Deacon,
	Mark Rutland, Casey Schaufler, Robert Richter, Jiri Olsa,
	Andi Kleen, Stephane Eranian, Igor Lubashev, Alexander Shishkin,
	Namhyung Kim, Kees Cook, Jann Horn, Thomas Gleixner,
	Tvrtko Ursulin, Lionel Landwerlin, Song Liu, linux-kernel,
	linux-security-module, selinux, intel-gfx, bpf, linux-parisc,
	linuxppc-dev, linux-perf-users, linux-arm-kernel, oprofile-list


On 08.01.2020 19:07, Peter Zijlstra wrote:
> On Wed, Dec 18, 2019 at 12:25:35PM +0300, Alexey Budankov wrote:
>>
>> Open access to perf_events monitoring for CAP_SYS_PERFMON privileged
>> processes. For backward compatibility reasons access to perf_events
>> subsystem remains open for CAP_SYS_ADMIN privileged processes but
>> CAP_SYS_ADMIN usage for secure perf_events monitoring is discouraged
>> with respect to CAP_SYS_PERFMON capability.
>>
>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>> ---
>>  include/linux/perf_event.h | 6 +++---
>>  kernel/events/core.c       | 6 +++---
>>  2 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 34c7c6910026..f46acd69425f 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -1285,7 +1285,7 @@ static inline int perf_is_paranoid(void)
>>  
>>  static inline int perf_allow_kernel(struct perf_event_attr *attr)
>>  {
>> -	if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
>> +	if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
>>  		return -EACCES;
>>  
>>  	return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
>> @@ -1293,7 +1293,7 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)
>>  
>>  static inline int perf_allow_cpu(struct perf_event_attr *attr)
>>  {
>> -	if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
>> +	if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
>>  		return -EACCES;
>>  
>>  	return security_perf_event_open(attr, PERF_SECURITY_CPU);
>> @@ -1301,7 +1301,7 @@ static inline int perf_allow_cpu(struct perf_event_attr *attr)
>>  
>>  static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
>>  {
>> -	if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
>> +	if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
>>  		return -EPERM;
>>  
>>  	return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
> 
> These are OK I suppose.
> 
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 059ee7116008..d9db414f2197 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
>>  	if (event->attr.type != perf_kprobe.type)
>>  		return -ENOENT;
>>  
>> -	if (!capable(CAP_SYS_ADMIN))
>> +	if (!perfmon_capable())
>>  		return -EACCES;
>>  
>>  	/*
> 
> This one only allows attaching to already extant kprobes, right? It does
> not allow creation of kprobes.

This unblocks creation of local trace kprobes and uprobes by CAP_SYS_PERFMON 
privileged process, exactly the same as for CAP_SYS_ADMIN privileged process.

> 
>> @@ -9116,7 +9116,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
>>  	if (event->attr.type != perf_uprobe.type)
>>  		return -ENOENT;
>>  
>> -	if (!capable(CAP_SYS_ADMIN))
>> +	if (!perfmon_capable())
>>  		return -EACCES;
>>  
>>  	/*
> 
> Idem, I presume.
> 
>> @@ -11157,7 +11157,7 @@ SYSCALL_DEFINE5(perf_event_open,
>>  	}
>>  
>>  	if (attr.namespaces) {
>> -		if (!capable(CAP_SYS_ADMIN))
>> +		if (!perfmon_capable())
>>  			return -EACCES;
>>  	}
> 
> And given we basically make the entire kernel observable with this CAP,
> busting namespaces shoulnd't be a problem either.
> 
> So yeah, I suppose that works.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 7/9] parisc/perf: open access for CAP_SYS_PERFMON privileged process
  2019-12-18  9:29 ` [PATCH v4 7/9] parisc/perf: " Alexey Budankov
@ 2020-01-27  8:52   ` Helge Deller
  0 siblings, 0 replies; 15+ messages in thread
From: Helge Deller @ 2020-01-27  8:52 UTC (permalink / raw)
  To: Alexey Budankov, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Ingo Molnar, jani.nikula, joonas.lahtinen, rodrigo.vivi,
	Alexei Starovoitov, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, james.bottomley, Serge Hallyn, James Morris,
	Will Deacon, Mark Rutland, Casey Schaufler, Robert Richter
  Cc: Jiri Olsa, Andi Kleen, Stephane Eranian, Igor Lubashev,
	Alexander Shishkin, Namhyung Kim, Kees Cook, Jann Horn,
	Thomas Gleixner, Tvrtko Ursulin, Lionel Landwerlin, Song Liu,
	linux-kernel, linux-security-module, selinux, intel-gfx, bpf,
	linux-parisc, linuxppc-dev, linux-perf-users, linux-arm-kernel,
	oprofile-list

On 18.12.19 10:29, Alexey Budankov wrote:
>
> Open access to monitoring for CAP_SYS_PERFMON privileged processes.
> For backward compatibility reasons access to the monitoring remains open
> for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for secure
> monitoring is discouraged with respect to CAP_SYS_PERFMON capability.
>
> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>

Acked-by: Helge Deller <deller@gmx.de>

> ---
>  arch/parisc/kernel/perf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/parisc/kernel/perf.c b/arch/parisc/kernel/perf.c
> index 676683641d00..c4208d027794 100644
> --- a/arch/parisc/kernel/perf.c
> +++ b/arch/parisc/kernel/perf.c
> @@ -300,7 +300,7 @@ static ssize_t perf_write(struct file *file, const char __user *buf,
>  	else
>  		return -EFAULT;
>
> -	if (!capable(CAP_SYS_ADMIN))
> +	if (!perfmon_capable())
>  		return -EACCES;
>
>  	if (count != sizeof(uint32_t))
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-01-27  8:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-18  9:16 [PATCH v4 0/7] Introduce CAP_SYS_PERFMON to secure system performance monitoring and observability Alexey Budankov
2019-12-18  9:24 ` [PATCH v4 1/9] capabilities: introduce CAP_SYS_PERFMON to kernel and user space Alexey Budankov
2019-12-18 19:56   ` Stephen Smalley
2019-12-18  9:25 ` [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
2020-01-08 16:07   ` Peter Zijlstra
2020-01-09 11:36     ` Alexey Budankov
2019-12-18  9:26 ` [PATCH v4 3/9] perf tool: extend Perf tool with CAP_SYS_PERFMON capability support Alexey Budankov
2019-12-18  9:27 ` [PATCH v4 4/9] drm/i915/perf: open access for CAP_SYS_PERFMON privileged process Alexey Budankov
2019-12-19  9:10   ` Lionel Landwerlin
2019-12-18  9:28 ` [PATCH v4 5/9] trace/bpf_trace: " Alexey Budankov
2019-12-18  9:28 ` [PATCH v4 6/9] powerpc/perf: " Alexey Budankov
2019-12-18  9:29 ` [PATCH v4 7/9] parisc/perf: " Alexey Budankov
2020-01-27  8:52   ` Helge Deller
2019-12-18  9:30 ` [PATCH v4 8/9] drivers/perf: " Alexey Budankov
2019-12-18  9:31 ` [PATCH v4 9/9] drivers/oprofile: " Alexey Budankov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).