* [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users
@ 2019-02-07 13:23 Alexey Budankov
2019-02-07 13:29 ` [PATCH v2 1/4] perf-security: document perf_events/Perf resource control Alexey Budankov
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Alexey Budankov @ 2019-02-07 13:23 UTC
To: Jonathan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
The patch set extends the first version of perf-security.rst documentation
file [1], [2], [3] with the following topics:
1) perf_events/Perf resource limits and control management that describes
RLIMIT_NOFILE and perf_event_mlock_kb settings for processes conducting
performance monitoring;
2) categories of system and performance data that can be captured by
perf_events/Perf with explicit designation of process sensitive data;
3) possible steps to create groups of perf_events/Perf privileged users
for the current implementations of perf_events syscall API [4] and Perf tool.
---
Alexey Budankov (4):
perf-security: document perf_events/Perf resource control
perf-security: document collected perf_events/Perf data categories
perf-security: elaborate on perf_events/Perf privileged users
perf-security: wrap paragraphs on 72 columns
Documentation/admin-guide/perf-security.rst | 247 +++++++++++++++-----
1 file changed, 187 insertions(+), 60 deletions(-)
---
Changes in v2:
- addressed comments for v1
- added fourth patch implementing 72 columns paragraph width
---
[1] https://marc.info/?l=linux-kernel&m=153736008310781&w=2
[2] https://lkml.org/lkml/2018/5/21/156
[3] https://lkml.org/lkml/2018/11/27/604
[4] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
* [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
2019-02-07 13:23 [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users Alexey Budankov
@ 2019-02-07 13:29 ` Alexey Budankov
2019-02-10 22:34 ` Thomas Gleixner
2019-02-07 13:30 ` [PATCH v2 2/4] perf-security: document collected perf_events/Perf data categories Alexey Budankov
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Alexey Budankov @ 2019-02-07 13:29 UTC
To: Jonathan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
Extend perf-security.rst file with perf_events/Perf resource control
section describing RLIMIT_NOFILE and perf_event_mlock_kb settings for
performance monitoring user processes.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v2:
- applied comments on v1
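
For anyone who wants to sanity-check the arithmetic in the new section,
here is a minimal Python sketch. It is not part of the patch. The helper
names (min_perf_fds, mlock_budget_kib, even_share_kib) are made up for
illustration, and the sketch assumes the model described in the text:
one perf_event file descriptor per event per monitored CPU, and a
perf_event_mlock_kb allowance that scales with the number of CPUs.

```python
import resource


def min_perf_fds(n_events: int, n_cpus: int) -> int:
    """Lower bound on perf_event fds for a sampling session:
    one descriptor per monitored event per monitored CPU."""
    return n_events * n_cpus


def mlock_budget_kib(perf_event_mlock_kb: int, n_cpus: int) -> int:
    """Memory above RLIMIT_MEMLOCK available for perf_event mmap
    buffers, following the per-cpu scaling described in the text."""
    return perf_event_mlock_kb * n_cpus


def even_share_kib(budget_kib: int, n_procs: int, page_kib: int = 4) -> int:
    """Illustrative even split of the budget between monitoring
    processes, rounded down to whole pages (4 KiB pages assumed)."""
    share = budget_kib // n_procs
    return share - share % page_kib


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = min_perf_fds(n_events=20, n_cpus=64)
    if soft != resource.RLIM_INFINITY and needed > soft:
        print(f"need at least {needed} fds, soft limit is {soft}")
    budget = mlock_budget_kib(516, 8)
    print(f"budget {budget} KiB, {even_share_kib(budget, 2)} KiB per process")
```

The per-process share still has to be translated into a --mmap-pages
value by hand; the sketch only covers the budget arithmetic.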
---
Documentation/admin-guide/perf-security.rst | 36 +++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index f73ebfe9bfe2..3915f07b9dea 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -84,6 +84,40 @@ governed by perf_event_paranoid [2]_ setting:
locking limit is imposed but ignored for unprivileged processes with
CAP_IPC_LOCK capability.
+perf_events/Perf resource control
+---------------------------------
+
+The perf_events system call API [2]_ allocates file descriptors for every configured
+PMU event. Open file descriptors are a per-process accountable resource governed
+by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
+shell process. When configuring Perf collection for a long list of events on a
+large server system, this limit can be easily hit preventing required monitoring
+configuration. RLIMIT_NOFILE limit can be increased on per-user basis modifying
+content of the limits.conf file [12]_ on some systems. Ordinarily, a Perf sampling session
+(perf record) requires an amount of open perf_event file descriptors that is not
+less than a number of monitored events multiplied by a number of monitored CPUs.
+
+An amount of memory available to user processes for capturing performance monitoring
+data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
+resource setting defines overall per-cpu limits of memory allowed for mapping
+by the user processes to execute performance monitoring. The setting essentially
+extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
+for capturing monitored performance events and related data.
+
+For example, if a machine has eight cores and perf_event_mlock_kb limit is set
+to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
+above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
+this means that, if the user wants to start two or more performance monitoring
+processes, the user is required to manually distribute available 4128 KiB between the
+monitoring processes, for example, using the --mmap-pages Perf record mode option.
+Otherwise, the first started performance monitoring process allocates all available
+4128 KiB and the other processes will fail to proceed due to the lack of memory.
+
+RLIMIT_MEMLOCK and perf_event_mlock_kb resource constraints are ignored for
+processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
+can be provided with memory above the constraints for perf_events/Perf performance
+monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
+
Bibliography
------------
@@ -94,4 +128,6 @@ Bibliography
.. [5] `<https://www.kernel.org/doc/html/latest/security/credentials.html>`_
.. [6] `<http://man7.org/linux/man-pages/man7/capabilities.7.html>`_
.. [7] `<http://man7.org/linux/man-pages/man2/ptrace.2.html>`_
+.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
+.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
* [PATCH v2 2/4] perf-security: document collected perf_events/Perf data categories
2019-02-07 13:23 [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users Alexey Budankov
2019-02-07 13:29 ` [PATCH v2 1/4] perf-security: document perf_events/Perf resource control Alexey Budankov
@ 2019-02-07 13:30 ` Alexey Budankov
2019-02-07 13:31 ` [PATCH v2 3/4] perf-security: elaborate on perf_events/Perf privileged users Alexey Budankov
2019-02-07 13:32 ` [PATCH v2 4/4] perf-security: wrap paragraphs on 72 columns Alexey Budankov
3 siblings, 0 replies; 9+ messages in thread
From: Alexey Budankov @ 2019-02-07 13:30 UTC
To: Jonathan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
Document the system and performance data that perf_events/Perf can
capture, split it into categories, and explicitly indicate the category
that can contain process sensitive data.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v2:
- applied comments on v1
---
Documentation/admin-guide/perf-security.rst | 32 +++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index 3915f07b9dea..e6eb7e1ee5ad 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -11,8 +11,34 @@ impose a considerable risk of leaking sensitive data accessed by monitored
processes. The data leakage is possible both in scenarios of direct usage of
perf_events system call API [2]_ and over data files generated by Perf tool user
mode utility (Perf) [3]_ , [4]_ . The risk depends on the nature of data that
-perf_events performance monitoring units (PMU) [2]_ collect and expose for
-performance analysis. Having that said perf_events/Perf performance monitoring
+perf_events performance monitoring units (PMU) [2]_ and Perf collect and expose
+for performance analysis. Collected system and performance data may be split into
+several categories:
+
+1. System hardware and software configuration data, for example: a CPU model and
+ its cache configuration, an amount of available memory and its topology, used
+ kernel and Perf versions, performance monitoring setup including experiment
+ time, events configuration, Perf command line parameters, etc.
+
+2. User and kernel module paths and their load addresses with sizes, process and
+ thread names with their PIDs and TIDs, timestamps for captured hardware and
+ software events.
+
+3. Content of kernel software counters (e.g., for context switches, page faults,
+ CPU migrations), architectural hardware performance counters (PMC) [8]_ and
+ machine specific registers (MSR) [9]_ that provide execution metrics for
+ various monitored parts of the system (e.g., memory controller (IMC), interconnect
+ (QPI/UPI) or peripheral (PCIe) uncore counters) without direct attribution to any
+ execution context state.
+
+4. Content of architectural execution context registers (e.g., RIP, RSP, RBP on
+ x86_64), process user and kernel space memory addresses and data, content of
+ various architectural MSRs that capture data from this category.
+
+Data that belong to the fourth category can potentially contain sensitive process
+data. If PMUs in some monitoring modes capture values of execution context registers
+or data from process memory then access to such monitoring capabilities requires
+to be ordered and secured properly. So, perf_events/Perf performance monitoring
is the subject for security access control management [5]_ .
perf_events/Perf access control
@@ -128,6 +154,8 @@ Bibliography
.. [5] `<https://www.kernel.org/doc/html/latest/security/credentials.html>`_
.. [6] `<http://man7.org/linux/man-pages/man7/capabilities.7.html>`_
.. [7] `<http://man7.org/linux/man-pages/man2/ptrace.2.html>`_
+.. [8] `<https://en.wikipedia.org/wiki/Hardware_performance_counter>`_
+.. [9] `<https://en.wikipedia.org/wiki/Model-specific_register>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
* [PATCH v2 3/4] perf-security: elaborate on perf_events/Perf privileged users
2019-02-07 13:23 [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users Alexey Budankov
2019-02-07 13:29 ` [PATCH v2 1/4] perf-security: document perf_events/Perf resource control Alexey Budankov
2019-02-07 13:30 ` [PATCH v2 2/4] perf-security: document collected perf_events/Perf data categories Alexey Budankov
@ 2019-02-07 13:31 ` Alexey Budankov
2019-02-07 13:32 ` [PATCH v2 4/4] perf-security: wrap paragraphs on 72 columns Alexey Budankov
3 siblings, 0 replies; 9+ messages in thread
From: Alexey Budankov @ 2019-02-07 13:31 UTC
To: Jonathan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
Elaborate on possible groups of perf_events/Perf privileged users
and document the steps for creating such groups.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
Changes in v2:
- applied comments on v1
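
The outcome of step 1 in the new section (group ownership plus no access
for other users) can also be checked programmatically. Below is a
minimal Python sketch, not part of the patch. The function name
restricted_to_group and the path ./perf are hypothetical; the
perf_users group name is taken from the documented steps. The sketch
does not inspect file capabilities, which getcap reports in step 2.

```python
import grp
import os
import stat


def restricted_to_group(path: str, gid: int) -> bool:
    """Return True if `path` belongs to group `gid`, is executable by
    that group, and grants no permissions at all to other users."""
    st = os.stat(path)
    no_other_access = (st.st_mode & stat.S_IRWXO) == 0
    group_can_exec = bool(st.st_mode & stat.S_IXGRP)
    return st.st_gid == gid and group_can_exec and no_other_access


if __name__ == "__main__":
    perf_path = "./perf"  # hypothetical location of the Perf executable
    try:
        gid = grp.getgrnam("perf_users").gr_gid
        print(restricted_to_group(perf_path, gid))
    except (KeyError, OSError) as err:
        # Group or file is missing on this system.
        print(f"cannot check {perf_path}: {err}")
```

For the -rwxr-x--- root:perf_users listing shown in the patch, the
function is expected to return True.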
---
Documentation/admin-guide/perf-security.rst | 43 +++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index e6eb7e1ee5ad..f27a62805651 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -73,6 +73,48 @@ enable capturing of additional data required for later performance analysis of
monitored processes or a system. For example, CAP_SYSLOG capability permits
reading kernel space memory addresses from /proc/kallsyms file.
+perf_events/Perf privileged users
+---------------------------------
+
+Mechanisms of capabilities, privileged capability-dumb files [6]_ and file system
+ACLs [10]_ can be used to create a dedicated group of perf_events/Perf privileged
+users who are permitted to execute performance monitoring without scope limits.
+The following steps can be taken to create such a group of privileged Perf users.
+
+1. Create perf_users group of privileged Perf users, assign perf_users group to
+ Perf tool executable and limit access to the executable for other users in the
+ system who are not in the perf_users group:
+
+::
+
+ # groupadd perf_users
+ # ls -alhF
+ -rwxr-xr-x 2 root root 11M Oct 19 15:12 perf
+ # chgrp perf_users perf
+ # ls -alhF
+ -rwxr-xr-x 2 root perf_users 11M Oct 19 15:12 perf
+ # chmod o-rwx perf
+ # ls -alhF
+ -rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf
+
+2. Assign the required capabilities to the Perf tool executable file and enable
+ members of perf_users group with performance monitoring privileges [6]_ :
+
+::
+
+ # setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap -v "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ perf: OK
+ # getcap perf
+ perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
+
+As a result, members of perf_users group are capable of conducting performance
+monitoring by using functionality of the configured Perf tool executable that,
+when executed, passes perf_events subsystem scope checks.
+
+This specific access control management is only available to superuser or root
+running processes with CAP_SETPCAP, CAP_SETFCAP [6]_ capabilities.
+
perf_events/Perf unprivileged users
-----------------------------------
@@ -156,6 +198,7 @@ Bibliography
.. [7] `<http://man7.org/linux/man-pages/man2/ptrace.2.html>`_
.. [8] `<https://en.wikipedia.org/wiki/Hardware_performance_counter>`_
.. [9] `<https://en.wikipedia.org/wiki/Model-specific_register>`_
+.. [10] `<http://man7.org/linux/man-pages/man5/acl.5.html>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
* [PATCH v2 4/4] perf-security: wrap paragraphs on 72 columns
2019-02-07 13:23 [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users Alexey Budankov
` (2 preceding siblings ...)
2019-02-07 13:31 ` [PATCH v2 3/4] perf-security: elaborate on perf_events/Perf privileged users Alexey Budankov
@ 2019-02-07 13:32 ` Alexey Budankov
3 siblings, 0 replies; 9+ messages in thread
From: Alexey Budankov @ 2019-02-07 13:32 UTC
To: Jonathan Corbet, Kees Cook, Thomas Gleixner, Ingo Molnar, Peter Zijlstra
Cc: Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
Wrap paragraphs so that no line is wider than 72 columns.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
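
As a reading aid for the perf_event_paranoid levels that are re-wrapped
below, here is a minimal Python sketch, not part of the patch. It
encodes the levels as the document describes them; the function name
unprivileged_scope and the dictionary keys are made up for
illustration, and the kernel's actual checks are more detailed.

```python
def unprivileged_scope(paranoid: int) -> dict:
    """Summarise what an unprivileged process may monitor at a given
    perf_event_paranoid level, following the documented levels."""
    return {
        # -1 only: raw tracepoints and ftrace function tracepoints
        "raw_tracepoints": paranoid < 0,
        # below 1: system wide monitoring is in scope
        "system_wide": paranoid < 1,
        # below 2: events in kernel space can be captured
        "kernel_space": paranoid < 2,
        # 0 and above: perf_event_mlock_kb limit is imposed
        "mlock_limit_imposed": paranoid >= 0,
    }


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/perf_event_paranoid") as f:
            level = int(f.read().strip())
    except (OSError, ValueError):
        level = 2  # assumed fallback when the setting cannot be read
    print(level, unprivileged_scope(level))
```

Per-process monitoring is in scope at every level, so it is not listed
as a separate key.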
Documentation/admin-guide/perf-security.rst | 276 +++++++++++---------
1 file changed, 148 insertions(+), 128 deletions(-)
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index f27a62805651..6a26067f5fd6 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -6,84 +6,94 @@ Perf Events and tool security
Overview
--------
-Usage of Performance Counters for Linux (perf_events) [1]_ , [2]_ , [3]_ can
-impose a considerable risk of leaking sensitive data accessed by monitored
-processes. The data leakage is possible both in scenarios of direct usage of
-perf_events system call API [2]_ and over data files generated by Perf tool user
-mode utility (Perf) [3]_ , [4]_ . The risk depends on the nature of data that
-perf_events performance monitoring units (PMU) [2]_ and Perf collect and expose
-for performance analysis. Collected system and performance data may be split into
-several categories:
-
-1. System hardware and software configuration data, for example: a CPU model and
- its cache configuration, an amount of available memory and its topology, used
- kernel and Perf versions, performance monitoring setup including experiment
- time, events configuration, Perf command line parameters, etc.
-
-2. User and kernel module paths and their load addresses with sizes, process and
- thread names with their PIDs and TIDs, timestamps for captured hardware and
- software events.
-
-3. Content of kernel software counters (e.g., for context switches, page faults,
- CPU migrations), architectural hardware performance counters (PMC) [8]_ and
- machine specific registers (MSR) [9]_ that provide execution metrics for
- various monitored parts of the system (e.g., memory controller (IMC), interconnect
- (QPI/UPI) or peripheral (PCIe) uncore counters) without direct attribution to any
- execution context state.
-
-4. Content of architectural execution context registers (e.g., RIP, RSP, RBP on
- x86_64), process user and kernel space memory addresses and data, content of
- various architectural MSRs that capture data from this category.
-
-Data that belong to the fourth category can potentially contain sensitive process
-data. If PMUs in some monitoring modes capture values of execution context registers
-or data from process memory then access to such monitoring capabilities requires
-to be ordered and secured properly. So, perf_events/Perf performance monitoring
-is the subject for security access control management [5]_ .
+Usage of Performance Counters for Linux (perf_events) [1]_ , [2]_ , [3]_
+can impose a considerable risk of leaking sensitive data accessed by
+monitored processes. The data leakage is possible both in scenarios of
+direct usage of perf_events system call API [2]_ and over data files
+generated by Perf tool user mode utility (Perf) [3]_ , [4]_ . The risk
+depends on the nature of data that perf_events performance monitoring
+units (PMU) [2]_ and Perf collect and expose for performance analysis.
+Collected system and performance data may be split into several
+categories:
+
+1. System hardware and software configuration data, for example: a CPU
+ model and its cache configuration, an amount of available memory and
+ its topology, used kernel and Perf versions, performance monitoring
+ setup including experiment time, events configuration, Perf command
+ line parameters, etc.
+
+2. User and kernel module paths and their load addresses with sizes,
+ process and thread names with their PIDs and TIDs, timestamps for
+ captured hardware and software events.
+
+3. Content of kernel software counters (e.g., for context switches, page
+ faults, CPU migrations), architectural hardware performance counters
+ (PMC) [8]_ and machine specific registers (MSR) [9]_ that provide
+ execution metrics for various monitored parts of the system (e.g.,
+ memory controller (IMC), interconnect (QPI/UPI) or peripheral (PCIe)
+ uncore counters) without direct attribution to any execution context
+ state.
+
+4. Content of architectural execution context registers (e.g., RIP, RSP,
+ RBP on x86_64), process user and kernel space memory addresses and
+ data, content of various architectural MSRs that capture data from
+ this category.
+
+Data that belong to the fourth category can potentially contain
+sensitive process data. If PMUs in some monitoring modes capture values
+of execution context registers or data from process memory then access
+to such monitoring capabilities requires to be ordered and secured
+properly. So, perf_events/Perf performance monitoring is the subject for
+security access control management [5]_ .
perf_events/Perf access control
-------------------------------
-To perform security checks, the Linux implementation splits processes into two
-categories [6]_ : a) privileged processes (whose effective user ID is 0, referred
-to as superuser or root), and b) unprivileged processes (whose effective UID is
-nonzero). Privileged processes bypass all kernel security permission checks so
-perf_events performance monitoring is fully available to privileged processes
-without access, scope and resource restrictions.
-
-Unprivileged processes are subject to a full security permission check based on
-the process's credentials [5]_ (usually: effective UID, effective GID, and
-supplementary group list).
-
-Linux divides the privileges traditionally associated with superuser into
-distinct units, known as capabilities [6]_ , which can be independently enabled
-and disabled on per-thread basis for processes and files of unprivileged users.
-
-Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated as
-privileged processes with respect to perf_events performance monitoring and
-bypass *scope* permissions checks in the kernel.
-
-Unprivileged processes using perf_events system call API is also subject for
-PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose outcome
-determines whether monitoring is permitted. So unprivileged processes provided
-with CAP_SYS_PTRACE capability are effectively permitted to pass the check.
-
-Other capabilities being granted to unprivileged processes can effectively
-enable capturing of additional data required for later performance analysis of
-monitored processes or a system. For example, CAP_SYSLOG capability permits
-reading kernel space memory addresses from /proc/kallsyms file.
+To perform security checks, the Linux implementation splits processes
+into two categories [6]_ : a) privileged processes (whose effective user
+ID is 0, referred to as superuser or root), and b) unprivileged
+processes (whose effective UID is nonzero). Privileged processes bypass
+all kernel security permission checks so perf_events performance
+monitoring is fully available to privileged processes without access,
+scope and resource restrictions.
+
+Unprivileged processes are subject to a full security permission check
+based on the process's credentials [5]_ (usually: effective UID,
+effective GID, and supplementary group list).
+
+Linux divides the privileges traditionally associated with superuser
+into distinct units, known as capabilities [6]_ , which can be
+independently enabled and disabled on per-thread basis for processes and
+files of unprivileged users.
+
+Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated
+as privileged processes with respect to perf_events performance
+monitoring and bypass *scope* permissions checks in the kernel.
+
+Unprivileged processes using perf_events system call API is also subject
+for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
+outcome determines whether monitoring is permitted. So unprivileged
+processes provided with CAP_SYS_PTRACE capability are effectively
+permitted to pass the check.
+
+Other capabilities being granted to unprivileged processes can
+effectively enable capturing of additional data required for later
+performance analysis of monitored processes or a system. For example,
+CAP_SYSLOG capability permits reading kernel space memory addresses from
+/proc/kallsyms file.
perf_events/Perf privileged users
---------------------------------
-Mechanisms of capabilities, privileged capability-dumb files [6]_ and file system
-ACLs [10]_ can be used to create a dedicated group of perf_events/Perf privileged
-users who are permitted to execute performance monitoring without scope limits.
-The following steps can be taken to create such a group of privileged Perf users.
+Mechanisms of capabilities, privileged capability-dumb files [6]_ and
+file system ACLs [10]_ can be used to create a dedicated group of
+perf_events/Perf privileged users who are permitted to execute
+performance monitoring without scope limits. The following steps can be
+taken to create such a group of privileged Perf users.
-1. Create perf_users group of privileged Perf users, assign perf_users group to
- Perf tool executable and limit access to the executable for other users in the
- system who are not in the perf_users group:
+1. Create perf_users group of privileged Perf users, assign perf_users
+ group to Perf tool executable and limit access to the executable for
+ other users in the system who are not in the perf_users group:
::
@@ -97,8 +107,9 @@ The following steps can be taken to create such a group of privileged Perf users
# ls -alhF
-rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf
-2. Assign the required capabilities to the Perf tool executable file and enable
- members of perf_users group with performance monitoring privileges [6]_ :
+2. Assign the required capabilities to the Perf tool executable file and
+ enable members of perf_users group with performance monitoring
+ privileges [6]_ :
::
@@ -108,83 +119,92 @@ The following steps can be taken to create such a group of privileged Perf users
# getcap perf
perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
-As a result, members of perf_users group are capable of conducting performance
-monitoring by using functionality of the configured Perf tool executable that,
-when executed, passes perf_events subsystem scope checks.
+As a result, members of perf_users group are capable of conducting
+performance monitoring by using functionality of the configured Perf
+tool executable that, when executed, passes perf_events subsystem scope
+checks.
-This specific access control management is only available to superuser or root
-running processes with CAP_SETPCAP, CAP_SETFCAP [6]_ capabilities.
+This specific access control management is only available to superuser
+or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
+capabilities.
perf_events/Perf unprivileged users
-----------------------------------
-perf_events/Perf *scope* and *access* control for unprivileged processes is
-governed by perf_event_paranoid [2]_ setting:
+perf_events/Perf *scope* and *access* control for unprivileged processes
+is governed by perf_event_paranoid [2]_ setting:
-1:
- Impose no *scope* and *access* restrictions on using perf_events performance
- monitoring. Per-user per-cpu perf_event_mlock_kb [2]_ locking limit is
- ignored when allocating memory buffers for storing performance data.
- This is the least secure mode since allowed monitored *scope* is
- maximized and no perf_events specific limits are imposed on *resources*
- allocated for performance monitoring.
+ Impose no *scope* and *access* restrictions on using perf_events
+ performance monitoring. Per-user per-cpu perf_event_mlock_kb [2]_
+ locking limit is ignored when allocating memory buffers for storing
+ performance data. This is the least secure mode since allowed
+ monitored *scope* is maximized and no perf_events specific limits
+ are imposed on *resources* allocated for performance monitoring.
>=0:
*scope* includes per-process and system wide performance monitoring
- but excludes raw tracepoints and ftrace function tracepoints monitoring.
- CPU and system events happened when executing either in user or
- in kernel space can be monitored and captured for later analysis.
- Per-user per-cpu perf_event_mlock_kb locking limit is imposed but
- ignored for unprivileged processes with CAP_IPC_LOCK [6]_ capability.
+ but excludes raw tracepoints and ftrace function tracepoints
+ monitoring. CPU and system events happened when executing either in
+ user or in kernel space can be monitored and captured for later
+ analysis. Per-user per-cpu perf_event_mlock_kb locking limit is
+ imposed but ignored for unprivileged processes with CAP_IPC_LOCK
+ [6]_ capability.
>=1:
- *scope* includes per-process performance monitoring only and excludes
- system wide performance monitoring. CPU and system events happened when
- executing either in user or in kernel space can be monitored and
- captured for later analysis. Per-user per-cpu perf_event_mlock_kb
- locking limit is imposed but ignored for unprivileged processes with
- CAP_IPC_LOCK capability.
+ *scope* includes per-process performance monitoring only and
+ excludes system wide performance monitoring. CPU and system events
+ happened when executing either in user or in kernel space can be
+ monitored and captured for later analysis. Per-user per-cpu
+ perf_event_mlock_kb locking limit is imposed but ignored for
+ unprivileged processes with CAP_IPC_LOCK capability.
>=2:
- *scope* includes per-process performance monitoring only. CPU and system
- events happened when executing in user space only can be monitored and
- captured for later analysis. Per-user per-cpu perf_event_mlock_kb
- locking limit is imposed but ignored for unprivileged processes with
- CAP_IPC_LOCK capability.
+ *scope* includes per-process performance monitoring only. CPU and
+ system events happened when executing in user space only can be
+ monitored and captured for later analysis. Per-user per-cpu
+ perf_event_mlock_kb locking limit is imposed but ignored for
+ unprivileged processes with CAP_IPC_LOCK capability.
perf_events/Perf resource control
---------------------------------
-The perf_events system call API [2]_ allocates file descriptors for every configured
-PMU event. Open file descriptors are a per-process accountable resource governed
-by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
-shell process. When configuring Perf collection for a long list of events on a
-large server system, this limit can be easily hit preventing required monitoring
-configuration. RLIMIT_NOFILE limit can be increased on per-user basis modifying
-content of the limits.conf file [12]_ on some systems. Ordinarily, a Perf sampling session
-(perf record) requires an amount of open perf_event file descriptors that is not
-less than a number of monitored events multiplied by a number of monitored CPUs.
-
-An amount of memory available to user processes for capturing performance monitoring
-data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
-resource setting defines overall per-cpu limits of memory allowed for mapping
-by the user processes to execute performance monitoring. The setting essentially
-extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
+The perf_events system call API [2]_ allocates file descriptors for
+every configured PMU event. Open file descriptors are a per-process
+accountable resource governed by the RLIMIT_NOFILE [11]_ limit
+(ulimit -n), which is usually derived from the login shell process. When
+configuring Perf collection for a long list of events on a large server
+system, this limit can be easily hit preventing required monitoring
+configuration. RLIMIT_NOFILE limit can be increased on per-user basis
+modifying content of the limits.conf file [12]_ on some systems.
+Ordinarily, a Perf sampling session (perf record) requires an amount of
+open perf_event file descriptors that is not less than a number of
+monitored events multiplied by a number of monitored CPUs.
+
+An amount of memory available to user processes for capturing
+performance monitoring data is governed by the perf_event_mlock_kb [2]_
+setting. This perf_event specific resource setting defines overall
+per-cpu limits of memory allowed for mapping by the user processes to
+execute performance monitoring. The setting essentially extends the
+RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
for capturing monitored performance events and related data.
-For example, if a machine has eight cores and perf_event_mlock_kb limit is set
-to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
-above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
-this means that, if the user wants to start two or more performance monitoring
-processes, the user is required to manually distribute available 4128 KiB between the
-monitoring processes, for example, using the --mmap-pages Perf record mode option.
-Otherwise, the first started performance monitoring process allocates all available
-4128 KiB and the other processes will fail to proceed due to the lack of memory.
-
-RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for
-processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
-can be provided with memory above the constraints for perf_events/Perf performance
-monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
+For example, if a machine has eight cores and perf_event_mlock_kb limit
+is set to 516 KiB, then a user process is provided with 516 KiB * 8 =
+4128 KiB of memory above the RLIMIT_MEMLOCK limit (ulimit -l) for
+perf_event mmap buffers. In particular, this means that, if the user
+wants to start two or more performance monitoring processes, the user is
+required to manually distribute available 4128 KiB between the
+monitoring processes, for example, using the --mmap-pages Perf record
+mode option. Otherwise, the first started performance monitoring process
+allocates all available 4128 KiB and the other processes will fail to
+proceed due to the lack of memory.
+
+RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored
+for processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf
+privileged users can be provided with memory above the constraints for
+perf_events/Perf performance monitoring purpose by providing the Perf
+executable with CAP_IPC_LOCK capability.
Bibliography
------------
* Re: [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
2019-02-07 13:29 ` [PATCH v2 1/4] perf-security: document perf_events/Perf resource control Alexey Budankov
@ 2019-02-10 22:34 ` Thomas Gleixner
2019-02-11 12:46 ` Alexey Budankov
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-02-10 22:34 UTC (permalink / raw)
To: Alexey Budankov
Cc: Jonatan Corbet, Kees Cook, Ingo Molnar, Peter Zijlstra,
Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
On Thu, 7 Feb 2019, Alexey Budankov wrote:
General note: Please stay in the 80 char limit for all of the text.
> +The perf_events system call API [2]_ allocates file descriptors for every configured
> +PMU event. Open file descriptors are a per-process accountable resource governed
> +by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
> +shell process. When configuring Perf collection for a long list of events on a
> +large server system, this limit can be easily hit preventing required monitoring
> +configuration.
I'd move this sentence into a different paragraph and keep those related to
RLIMIT_NOFILE together.
> ... RLIMIT_NOFILE limit can be increased on per-user basis modifying
> +content of the limits.conf file [12]_ on some systems.
On some systems?
> Ordinarily, a Perf sampling session
> +(perf record) requires an amount of open perf_event file descriptors that is not
> +less than a number of monitored events multiplied by a number of monitored CPUs.
s/a number of/the number of/
The ordinary use case is:
perf CMD pile-of-events PROCESS
which does not specify the monitored CPUs at all. Then the number of file
descriptors is NR_EVENTS * NR_ONLINE_CPUS.
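As a rough illustration of that estimate, the Python sketch below compares
NR_EVENTS * NR_ONLINE_CPUS with the soft RLIMIT_NOFILE of the current
process. The function names and the event count are made up for this
example, and descriptors the process already holds open are ignored, so the
result is a lower bound rather than an exact check:

```python
import os
import resource


def required_perf_fds(n_events, n_cpus):
    """Lower bound on perf_event fds: one per event per monitored CPU."""
    return n_events * n_cpus


def fits_nofile_limit(n_events, n_cpus, soft_limit):
    """True if the estimate stays within the given RLIMIT_NOFILE soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return required_perf_fds(n_events, n_cpus) <= soft_limit


if __name__ == "__main__":
    # Assumption: os.cpu_count() matches the set of CPUs perf opens
    # events on when no CPU list is given on the command line.
    cpus = os.cpu_count() or 1
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    events = 40  # hypothetical length of the event list
    print(required_perf_fds(events, cpus), "fds needed, soft limit", soft)
    print("fits:", fits_nofile_limit(events, cpus, soft))
```

With 40 events on a 64-CPU machine the estimate is 2560 descriptors, which
already exceeds a soft limit of 1024.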
> +An amount of memory available to user processes for capturing performance monitoring
The amount ...
> +data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
> +resource setting defines overall per-cpu limits of memory allowed for mapping
> +by the user processes to execute performance monitoring. The setting essentially
> +extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
s/specially/specifically/
> +for capturing monitored performance events and related data.
> +
> +For example, if a machine has eight cores and perf_event_mlock_kb limit is set
> +to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
> +above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
> +this means that, if the user wants to start two or more performance monitoring
> +processes, the user is required to manually distribute available 4128 KiB between the
distribute the available
> +monitoring processes, for example, using the --mmap-pages Perf record mode option.
> +Otherwise, the first started performance monitoring process allocates all available
> +4128 KiB and the other processes will fail to proceed due to the lack of memory.
> +
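The arithmetic in the quoted example can be sketched in Python as follows.
This is only an illustration under stated assumptions: 4 KiB pages, every
session monitoring all CPUs, each per-CPU buffer consisting of a
power-of-two number of data pages plus one header page (the 1 + 2^n layout
described in perf_event_open(2)), and no use of the RLIMIT_MEMLOCK
allowance itself. The function names are hypothetical:

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def extra_lockable_kib(mlock_kb, n_cpus):
    """Memory allowed above RLIMIT_MEMLOCK for perf_event mmap buffers."""
    return mlock_kb * n_cpus


def max_mmap_pages(mlock_kb, n_sessions, page_kib=PAGE_KIB):
    """Largest power-of-two data page count per CPU buffer that lets
    n_sessions concurrent sessions share the per-CPU allowance."""
    pages_per_cpu = mlock_kb // page_kib
    share = pages_per_cpu // n_sessions  # pages per session, per CPU
    data_pages = share - 1               # one page goes to the header
    if data_pages < 1:
        return 0
    # Round down to a power of two, as --mmap-pages expects.
    return 1 << (data_pages.bit_length() - 1)


if __name__ == "__main__":
    print(extra_lockable_kib(516, 8), "KiB above RLIMIT_MEMLOCK")
    print("--mmap-pages", max_mmap_pages(516, 2), "for two sessions")
```

Under these assumptions a single session can use 128 data pages per CPU,
while two concurrent sessions would each have to be started with
--mmap-pages 32, because 2 * (64 + 1) pages no longer fit into the 129
pages that 516 KiB provides per CPU.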
> +RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for
constraints.
> +processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
What does perf_events/Perf mean?
> +can be provided with memory above the constraints for perf_events/Perf performance
> +monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
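Whether a running process actually holds CAP_IPC_LOCK can be checked from
its effective capability mask. The Python sketch below is an assumption-
laden illustration, not part of the patch: it parses the CapEff line of
/proc/self/status and tests bit 14, the CAP_IPC_LOCK number from
<linux/capability.h>. Run as is, it reports on the Python interpreter, not
on the Perf executable; the helper names are made up for this example:

```python
CAP_IPC_LOCK = 14  # bit number from <linux/capability.h>


def has_capability(cap_mask_hex, cap_bit):
    """True if cap_bit is set in a hexadecimal capability mask."""
    return bool(int(cap_mask_hex, 16) & (1 << cap_bit))


def effective_caps(status_path="/proc/self/status"):
    """Return the CapEff mask of a process as a hex string."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    return "0"


if __name__ == "__main__":
    mask = effective_caps()
    print("CAP_IPC_LOCK effective:", has_capability(mask, CAP_IPC_LOCK))
```

Passing /proc/<pid>/status of a running perf record process as status_path
would check that process instead.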
Thanks,
tglx
* Re: [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
2019-02-10 22:34 ` Thomas Gleixner
@ 2019-02-11 12:46 ` Alexey Budankov
2019-02-11 14:15 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Alexey Budankov @ 2019-02-11 12:46 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Jonatan Corbet, Kees Cook, Ingo Molnar, Peter Zijlstra,
Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
On 11.02.2019 1:34, Thomas Gleixner wrote:
> On Thu, 7 Feb 2019, Alexey Budankov wrote:
>
> General note: Please stay in the 80 char limit for all of the text.
Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.
>
>> +The perf_events system call API [2]_ allocates file descriptors for every configured
>> +PMU event. Open file descriptors are a per-process accountable resource governed
>> +by the RLIMIT_NOFILE [11]_ limit (ulimit -n), which is usually derived from the login
>> +shell process. When configuring Perf collection for a long list of events on a
>> +large server system, this limit can be easily hit preventing required monitoring
>> +configuration.
>
> I'd move this sentence into a different paragraph and keep those related to
> RLIMIT_NOFILE together.
Makes sense. Let's have these two paragraphs:
Open file descriptors
+++++++++++++++++++++
Memory allocation
+++++++++++++++++
>
>> ... RLIMIT_NOFILE limit can be increased on per-user basis modifying
>> +content of the limits.conf file [12]_ on some systems.
>
> On some systems?
Well, let's avoid this subtlety and word it as:
'The RLIMIT_NOFILE limit can be increased on a per-user basis
by modifying the content of the limits.conf file [12]_.'
>
>> Ordinarily, a Perf sampling session
>> +(perf record) requires an amount of open perf_event file descriptors that is not
>> +less than a number of monitored events multiplied by a number of monitored CPUs.
>
> s/a number of/the number of/
Accepted.
>
> The ordinary use case is:
>
> perf CMD pile-of-events PROCESS
>
> which does not specify the monitored CPUs at all. Then the number of file
> descriptors is NR_EVENTS * NR_ONLINE_CPUS.
>
>> +An amount of memory available to user processes for capturing performance monitoring
>
> The amount ...
Accepted.
>
>> +data is governed by the perf_event_mlock_kb [2]_ setting. This perf_event specific
>> +resource setting defines overall per-cpu limits of memory allowed for mapping
>> +by the user processes to execute performance monitoring. The setting essentially
>> +extends the RLIMIT_MEMLOCK [11]_ limit, but only for memory regions mapped specially
>
> s/specially/specifically/
Accepted.
>
>> +for capturing monitored performance events and related data.
>> +
>> +For example, if a machine has eight cores and perf_event_mlock_kb limit is set
>> +to 516 KiB, then a user process is provided with 516 KiB * 8 = 4128 KiB of memory
>> +above the RLIMIT_MEMLOCK limit (ulimit -l) for perf_event mmap buffers. In particular,
>> +this means that, if the user wants to start two or more performance monitoring
>> +processes, the user is required to manually distribute available 4128 KiB between the
>
> distribute the available
Accepted.
>
>> +monitoring processes, for example, using the --mmap-pages Perf record mode option.
>> +Otherwise, the first started performance monitoring process allocates all available
>> +4128 KiB and the other processes will fail to proceed due to the lack of memory.
>> +
>> +RLIMIT_MEMLOCK and perf_event_mlock_kb resource costraints are ignored for
>
> constraints.
Accepted.
>
>> +processes with the CAP_IPC_LOCK capability. Thus, perf_events/Perf privileged users
>
> What does perf_events/Perf mean?
'perf_events/Perf privileged users' refers to the paragraph about privileged
users. 'perf_events/Perf' means the specific combination of the kernel
subsystem (perf_events) and the privileged Perf tool executable (Perf) that
gives a certain group of users performance monitoring capabilities without
scope limits.
>
>> +can be provided with memory above the constraints for perf_events/Perf performance
>> +monitoring purpose by providing the Perf executable with CAP_IPC_LOCK capability.
>
> Thanks,
>
> tglx
>
Thanks,
Alexey
* Re: [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
2019-02-11 12:46 ` Alexey Budankov
@ 2019-02-11 14:15 ` Thomas Gleixner
2019-02-11 14:22 ` Alexey Budankov
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-02-11 14:15 UTC (permalink / raw)
To: Alexey Budankov
Cc: Jonatan Corbet, Kees Cook, Ingo Molnar, Peter Zijlstra,
Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
On Mon, 11 Feb 2019, Alexey Budankov wrote:
> On 11.02.2019 1:34, Thomas Gleixner wrote:
> > On Thu, 7 Feb 2019, Alexey Budankov wrote:
> >
> > General note: Please stay in the 80 char limit for all of the text.
>
> Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.
So you provide crappy formatted stuff first, just to reformat it at the
end. I'm missing the logic behind that.
Thanks,
tglx
* Re: [PATCH v2 1/4] perf-security: document perf_events/Perf resource control
2019-02-11 14:15 ` Thomas Gleixner
@ 2019-02-11 14:22 ` Alexey Budankov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Budankov @ 2019-02-11 14:22 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Jonatan Corbet, Kees Cook, Ingo Molnar, Peter Zijlstra,
Jann Horn, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
Alexander Shishkin, Andi Kleen, Mark Rutland, Tvrtko Ursulin,
kernel-hardening, linux-doc, linux-kernel
On 11.02.2019 17:15, Thomas Gleixner wrote:
> On Mon, 11 Feb 2019, Alexey Budankov wrote:
>> On 11.02.2019 1:34, Thomas Gleixner wrote:
>>> On Thu, 7 Feb 2019, Alexey Budankov wrote:
>>>
>>> General note: Please stay in the 80 char limit for all of the text.
>>
>> Yes, sure. [PATCH v2 4/4] implements wrapping at 72 columns.
>
> So you provide crappy formatted stuff first, just to reformat it at the
> end. I'm missing the logic behind that.
The logic is to keep the review of the new content separate from the
reformatting of the whole document, which is done at the end.
Thanks,
Alexey
>
> Thanks,
>
> tglx
>
Thread overview: 9+ messages
2019-02-07 13:23 [PATCH v2 0/3] admin-guide: extend perf-security with resource control, data categories and privileged users Alexey Budankov
2019-02-07 13:29 ` [PATCH v2 1/4] perf-security: document perf_events/Perf resource control Alexey Budankov
2019-02-10 22:34 ` Thomas Gleixner
2019-02-11 12:46 ` Alexey Budankov
2019-02-11 14:15 ` Thomas Gleixner
2019-02-11 14:22 ` Alexey Budankov
2019-02-07 13:30 ` [PATCH v2 2/4] perf-security: document collected perf_events/Perf data categories Alexey Budankov
2019-02-07 13:31 ` [PATCH v2 3/4] perf-security: elaborate on perf_events/Perf privileged users Alexey Budankov
2019-02-07 13:32 ` [PATCH v2 4/4] perf-security: wrap paragraphs on 72 columns Alexey Budankov