BPF Archive on lore.kernel.org
 help / color / Atom feed
From: Alexei Starovoitov <ast@kernel.org>
To: <davem@davemloft.net>
Cc: <daniel@iogearbox.net>, <peterz@infradead.org>,
	<luto@amacapital.net>, <netdev@vger.kernel.org>,
	<bpf@vger.kernel.org>, <kernel-team@fb.com>,
	<linux-api@vger.kernel.org>
Subject: [PATCH v4 bpf-next 1/4] capability: introduce CAP_BPF and CAP_TRACING
Date: Fri, 6 Sep 2019 16:10:50 -0700
Message-ID: <20190906231053.1276792-2-ast@kernel.org> (raw)
In-Reply-To: <20190906231053.1276792-1-ast@kernel.org>

Split BPF and perf/tracing operations that are allowed under
CAP_SYS_ADMIN into corresponding CAP_BPF and CAP_TRACING.
For backward compatibility include them in CAP_SYS_ADMIN as well.

The end result provides simple safety model for applications that use BPF:
- for tracing program types
  BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
  use CAP_BPF and CAP_TRACING
- for networking program types
  BPF_PROG_TYPE_{SCHED_CLS, XDP, CGROUP_SKB, SK_SKB, etc}
  use CAP_BPF and CAP_NET_ADMIN

There are few exceptions from this simple rule:
- bpf_trace_printk() is allowed in networking programs, but it's using
  ftrace mechanism, hence this helper needs additional CAP_TRACING.
- cpumap is used by XDP programs. Currently it's kept under CAP_SYS_ADMIN,
  but could be relaxed to CAP_NET_ADMIN in the future.
- BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
  to discourage production use.
- BPF HW offload is allowed under CAP_SYS_ADMIN.
- cg_sysctl, cg_device, lirc program types are neither networking nor tracing.
  They can be loaded under CAP_BPF, but attach is allowed under CAP_NET_ADMIN.
  This will be cleaned up in the future.

userid=nobody + (CAP_TRACING | CAP_NET_ADMIN) + CAP_BPF is safer than
typical setup with userid=root and sudo by existing bpf applications.
It's not secure, since these capabilities:
- allow bpf progs access arbitrary memory
- let tasks access any bpf map
- let tasks attach/detach any bpf prog

bpftool, bpftrace, bcc tools binaries should not be installed with
cap_bpf+cap_tracing, since unpriv users will be able to read kernel secrets.

CAP_BPF, CAP_NET_ADMIN, CAP_TRACING are roughly equal in terms of
damage they can make to the system.
Example:
CAP_NET_ADMIN can stop network traffic. CAP_BPF can write into map
and if that map is used by firewall-like bpf prog the network traffic
may stop.
CAP_BPF allows many bpf prog_load commands in parallel. The verifier
may consume large amount of memory and significantly slow down the system.
CAP_TRACING allows many kprobes that can slow down the system.

In the future more fine-grained bpf permissions may be added.

Existing unprivileged BPF operations are not affected.
In particular unprivileged users are allowed to load socket_filter and cg_skb
program types and to create array, hash, prog_array, map-in-map map types.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/capability.h          | 18 +++++++++++
 include/uapi/linux/capability.h     | 49 ++++++++++++++++++++++++++++-
 security/selinux/include/classmap.h |  4 +--
 3 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..13eb49c75797 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -247,6 +247,24 @@ static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
 	return true;
 }
 #endif /* CONFIG_MULTIUSER */
+
+static inline bool capable_bpf(void)
+{
+	return capable(CAP_SYS_ADMIN) || capable(CAP_BPF);
+}
+static inline bool capable_tracing(void)
+{
+	return capable(CAP_SYS_ADMIN) || capable(CAP_TRACING);
+}
+static inline bool capable_bpf_tracing(void)
+{
+	return capable(CAP_SYS_ADMIN) || (capable(CAP_BPF) && capable(CAP_TRACING));
+}
+static inline bool capable_bpf_net_admin(void)
+{
+	return (capable(CAP_SYS_ADMIN) || capable(CAP_BPF)) && capable(CAP_NET_ADMIN);
+}
+
 extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..fe01d8235e1e 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -274,6 +274,7 @@ struct vfs_ns_cap_data {
    arbitrary SCSI commands */
 /* Allow setting encryption key on loopback filesystem */
 /* Allow setting zone reclaim policy */
+/* Allow everything under CAP_BPF and CAP_TRACING for backward compatibility */
 
 #define CAP_SYS_ADMIN        21
 
@@ -366,8 +367,54 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/*
+ * CAP_BPF allows the following BPF operations:
+ * - Loading all types of BPF programs
+ * - Creating all types of BPF maps except:
+ *    - stackmap that needs CAP_TRACING
+ *    - devmap that needs CAP_NET_ADMIN
+ *    - cpumap that needs CAP_SYS_ADMIN
+ * - Advanced verifier features
+ *   - Indirect variable access
+ *   - Bounded loops
+ *   - BPF to BPF function calls
+ *   - Scalar precision tracking
+ *   - Larger complexity limits
+ *   - Dead code elimination
+ *   - And potentially other features
+ * - Use of pointer-to-integer conversions in BPF programs
+ * - Bypassing of speculation attack hardening measures
+ * - Loading BPF Type Format (BTF) data
+ * - Iterate system wide loaded programs, maps, BTF objects
+ * - Retrieve xlated and JITed code of BPF programs
+ * - Access maps and programs via id
+ * - Use bpf_spin_lock() helper
+ *
+ * CAP_BPF and CAP_TRACING together allow the following:
+ * - bpf_probe_read to read arbitrary kernel memory
+ * - bpf_trace_printk to print data to ftrace ring buffer
+ * - Attach to raw_tracepoint
+ * - Query association between kprobe/tracepoint and bpf program
+ *
+ * CAP_BPF and CAP_NET_ADMIN together allow the following:
+ * - Attach to cgroup-bpf hooks and query
+ * - skb, xdp, flow_dissector test_run command
+ *
+ * CAP_NET_ADMIN allows:
+ * - Attach networking bpf programs to xdp, tc, lwt, flow dissector
+ */
+#define CAP_BPF			38
+
+/*
+ * CAP_TRACING allows:
+ * - Full use of perf_event_open(), similarly to the effect of
+ *   kernel.perf_event_paranoid == -1
+ * - Creation of [ku][ret]probe
+ * - Attach tracing bpf programs to perf events
+ */
+#define CAP_TRACING		39
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_LAST_CAP         CAP_TRACING
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 201f7e588a29..0b364e245163 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -26,9 +26,9 @@
 	    "audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-		"wake_alarm", "block_suspend", "audit_read"
+		"wake_alarm", "block_suspend", "audit_read", "bpf", "tracing"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_TRACING
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
 
-- 
2.20.0


  reply index

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-06 23:10 [PATCH v4 bpf-next 0/4] " Alexei Starovoitov
2019-09-06 23:10 ` Alexei Starovoitov [this message]
2019-09-09 22:52   ` [PATCH v4 bpf-next 1/4] capability: introduce " Andy Lutomirski
2019-09-06 23:10 ` [PATCH v4 bpf-next 2/4] bpf: implement CAP_BPF Alexei Starovoitov
2019-09-06 23:10 ` [PATCH v4 bpf-next 3/4] perf: implement CAP_TRACING Alexei Starovoitov
2019-09-06 23:10 ` [PATCH v4 bpf-next 4/4] selftests/bpf: use CAP_BPF and CAP_TRACING in tests Alexei Starovoitov

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190906231053.1276792-2-ast@kernel.org \
    --to=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=linux-api@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git