BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3 bpf-next 1/3] capability: introduce CAP_BPF and CAP_TRACING
@ 2019-09-04 18:43 Alexei Starovoitov
  2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
  2019-09-04 18:43 ` [PATCH v3 bpf-next 3/3] perf: implement CAP_TRACING Alexei Starovoitov
  0 siblings, 2 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-04 18:43 UTC (permalink / raw)
  To: davem; +Cc: daniel, peterz, luto, netdev, bpf, kernel-team, linux-api

Split BPF and perf/tracing operations that are allowed under
CAP_SYS_ADMIN into corresponding CAP_BPF and CAP_TRACING.
For backward compatibility include them in CAP_SYS_ADMIN as well.

The end result provides simple safety model for applications that use BPF:
- for tracing program types
  BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
  use CAP_BPF and CAP_TRACING
- for networking program types
  BPF_PROG_TYPE_{SCHED_CLS, XDP, CGROUP_SKB, SK_SKB, etc}
  use CAP_BPF and CAP_NET_ADMIN

There are few exceptions from this simple rule:
- bpf_trace_printk() is allowed in networking programs, but it's using
  ftrace mechanism, hence this helper needs additional CAP_TRACING.
- cpumap is used by XDP programs. Currently it's kept under CAP_SYS_ADMIN,
  but could be relaxed to CAP_NET_ADMIN in the future.
- BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
  to discourage production use.
- BPF HW offload is allowed under CAP_SYS_ADMIN.
- cg_sysctl, cg_device, lirc program types are neither networking nor tracing.
  They can be loaded under CAP_BPF, but attach is allowed under CAP_NET_ADMIN.
  This will be cleaned up in the future.

userid=nobody + (CAP_TRACING | CAP_NET_ADMIN) + CAP_BPF is safer than
typical setup with userid=root and sudo by existing bpf applications.
It's not secure, since these capabilities:
- allow bpf progs access arbitrary memory
- let tasks access any bpf map
- let tasks attach/detach any bpf prog

bpftool, bpftrace, bcc tools binaries should not be installed with
cap_bpf+cap_tracing, since unpriv users will be able to read kernel secrets.

CAP_BPF, CAP_NET_ADMIN, CAP_TRACING are roughly equal in terms of
damage they can make to the system.
Example:
CAP_NET_ADMIN can stop network traffic. CAP_BPF can write into map
and if that map is used by firewall-like bpf prog the network traffic
may stop.
CAP_BPF allows many bpf prog_load commands in parallel. The verifier
may consume large amount of memory and significantly slow down the system.
CAP_TRACING allows many kprobes that can slow down the system.

In the future more fine-grained bpf permissions may be added.

v2->v3:
- dropped ftrace and kallsyms from CAP_TRACING description.
  In the future these mechanisms can start using it too.
- added CAP_SYS_ADMIN backward compatibility.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/capability.h          | 18 +++++++++++
 include/uapi/linux/capability.h     | 48 ++++++++++++++++++++++++++++-
 security/selinux/include/classmap.h |  4 +--
 3 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..13eb49c75797 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -247,6 +247,24 @@ static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
 	return true;
 }
 #endif /* CONFIG_MULTIUSER */
+
+static inline bool capable_bpf(void)
+{
+	return capable(CAP_SYS_ADMIN) || capable(CAP_BPF);
+}
+static inline bool capable_tracing(void)
+{
+	return capable(CAP_SYS_ADMIN) || capable(CAP_TRACING);
+}
+static inline bool capable_bpf_tracing(void)
+{
+	return capable(CAP_SYS_ADMIN) || (capable(CAP_BPF) && capable(CAP_TRACING));
+}
+static inline bool capable_bpf_net_admin(void)
+{
+	return (capable(CAP_SYS_ADMIN) || capable(CAP_BPF)) && capable(CAP_NET_ADMIN);
+}
+
 extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct inode *inode);
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..fb71dee0ac2b 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,54 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/*
+ * CAP_BPF allows the following BPF operations:
+ * - Loading all types of BPF programs
+ * - Creating all types of BPF maps except:
+ *    - stackmap that needs CAP_TRACING
+ *    - devmap that needs CAP_NET_ADMIN
+ *    - cpumap that needs CAP_SYS_ADMIN
+ * - Advanced verifier features
+ *   - Indirect variable access
+ *   - Bounded loops
+ *   - BPF to BPF function calls
+ *   - Scalar precision tracking
+ *   - Larger complexity limits
+ *   - Dead code elimination
+ *   - And potentially other features
+ * - Use of pointer-to-integer conversions in BPF programs
+ * - Bypassing of speculation attack hardening measures
+ * - Loading BPF Type Format (BTF) data
+ * - Iterate system wide loaded programs, maps, BTF objects
+ * - Retrieve xlated and JITed code of BPF programs
+ * - Access maps and programs via id
+ * - Use bpf_spin_lock() helper
+ *
+ * CAP_BPF and CAP_TRACING together allow the following:
+ * - bpf_probe_read to read arbitrary kernel memory
+ * - bpf_trace_printk to print data to ftrace ring buffer
+ * - Attach to raw_tracepoint
+ * - Query association between kprobe/tracepoint and bpf program
+ *
+ * CAP_BPF and CAP_NET_ADMIN together allow the following:
+ * - Attach to cgroup-bpf hooks and query
+ * - skb, xdp, flow_dissector test_run command
+ *
+ * CAP_NET_ADMIN allows:
+ * - Attach networking bpf programs to xdp, tc, lwt, flow dissector
+ */
+#define CAP_BPF			38
+
+/*
+ * CAP_TRACING allows:
+ * - Full use of perf_event_open(), similarly to the effect of
+ *   kernel.perf_event_paranoid == -1
+ * - Creation of [ku][ret]probe
+ * - Attach tracing bpf programs to perf events
+ */
+#define CAP_TRACING		39
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_LAST_CAP         CAP_TRACING
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 201f7e588a29..0b364e245163 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -26,9 +26,9 @@
 	    "audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-		"wake_alarm", "block_suspend", "audit_read"
+		"wake_alarm", "block_suspend", "audit_read", "bpf", "tracing"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_TRACING
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
 
-- 
2.20.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-04 18:43 [PATCH v3 bpf-next 1/3] capability: introduce CAP_BPF and CAP_TRACING Alexei Starovoitov
@ 2019-09-04 18:43 ` Alexei Starovoitov
  2019-09-05  0:34   ` Song Liu
                     ` (2 more replies)
  2019-09-04 18:43 ` [PATCH v3 bpf-next 3/3] perf: implement CAP_TRACING Alexei Starovoitov
  1 sibling, 3 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-04 18:43 UTC (permalink / raw)
  To: davem; +Cc: daniel, peterz, luto, netdev, bpf, kernel-team, linux-api

Implement permissions as stated in uapi/linux/capability.h

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/arraymap.c                       |  2 +-
 kernel/bpf/cgroup.c                         |  2 +-
 kernel/bpf/core.c                           |  4 +-
 kernel/bpf/hashtab.c                        |  4 +-
 kernel/bpf/lpm_trie.c                       |  2 +-
 kernel/bpf/queue_stack_maps.c               |  2 +-
 kernel/bpf/reuseport_array.c                |  2 +-
 kernel/bpf/stackmap.c                       |  2 +-
 kernel/bpf/syscall.c                        | 32 ++++++++------
 kernel/bpf/verifier.c                       |  4 +-
 kernel/trace/bpf_trace.c                    |  2 +-
 net/core/bpf_sk_storage.c                   |  2 +-
 net/core/filter.c                           | 10 +++--
 tools/testing/selftests/bpf/test_verifier.c | 46 +++++++++++++++++----
 14 files changed, 77 insertions(+), 39 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 1c65ce0098a9..149f868a02dc 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -73,7 +73,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
 	int ret, numa_node = bpf_map_attr_numa_node(attr);
 	u32 elem_size, index_mask, max_entries;
-	bool unpriv = !capable(CAP_SYS_ADMIN);
+	bool unpriv = !capable_bpf();
 	u64 cost, array_size, mask64;
 	struct bpf_map_memory mem;
 	struct bpf_array *array;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 6a6a154cfa7b..9c659ba5c146 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -795,7 +795,7 @@ cgroup_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	case BPF_FUNC_get_current_cgroup_id:
 		return &bpf_get_current_cgroup_id_proto;
 	case BPF_FUNC_trace_printk:
-		if (capable(CAP_SYS_ADMIN))
+		if (capable_bpf_tracing())
 			return bpf_get_trace_printk_proto();
 		/* fall through */
 	default:
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 8191a7db2777..6b53c064e8e6 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -646,7 +646,7 @@ static bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
 void bpf_prog_kallsyms_add(struct bpf_prog *fp)
 {
 	if (!bpf_prog_kallsyms_candidate(fp) ||
-	    !capable(CAP_SYS_ADMIN))
+	    !capable_bpf())
 		return;
 
 	spin_lock_bh(&bpf_lock);
@@ -768,7 +768,7 @@ static int bpf_jit_charge_modmem(u32 pages)
 {
 	if (atomic_long_add_return(pages, &bpf_jit_current) >
 	    (bpf_jit_limit >> PAGE_SHIFT)) {
-		if (!capable(CAP_SYS_ADMIN)) {
+		if (!capable_bpf()) {
 			atomic_long_sub(pages, &bpf_jit_current);
 			return -EPERM;
 		}
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c9..0fae5c45f425 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -244,9 +244,9 @@ static int htab_map_alloc_check(union bpf_attr *attr)
 	BUILD_BUG_ON(offsetof(struct htab_elem, fnode.next) !=
 		     offsetof(struct htab_elem, hash_node.pprev));
 
-	if (lru && !capable(CAP_SYS_ADMIN))
+	if (lru && !capable_bpf())
 		/* LRU implementation is much complicated than other
-		 * maps.  Hence, limit to CAP_SYS_ADMIN for now.
+		 * maps.  Hence, limit to CAP_BPF.
 		 */
 		return -EPERM;
 
diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 56e6c75d354d..11da3be8a4e5 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -543,7 +543,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 	u64 cost = sizeof(*trie), cost_per_node;
 	int ret;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return ERR_PTR(-EPERM);
 
 	/* check sanity of attributes */
diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
index f697647ceb54..d83afac32863 100644
--- a/kernel/bpf/queue_stack_maps.c
+++ b/kernel/bpf/queue_stack_maps.c
@@ -45,7 +45,7 @@ static bool queue_stack_map_is_full(struct bpf_queue_stack *qs)
 /* Called from syscall */
 static int queue_stack_map_alloc_check(union bpf_attr *attr)
 {
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	/* check sanity of attributes */
diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
index 50c083ba978c..b268fe4b2972 100644
--- a/kernel/bpf/reuseport_array.c
+++ b/kernel/bpf/reuseport_array.c
@@ -154,7 +154,7 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
 	struct bpf_map_memory mem;
 	u64 array_size;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return ERR_PTR(-EPERM);
 
 	array_size = sizeof(*array);
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 052580c33d26..477063c63b27 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -90,7 +90,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 	u64 cost, n_buckets;
 	int err;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_tracing())
 		return ERR_PTR(-EPERM);
 
 	if (attr->map_flags & ~STACK_CREATE_FLAG_MASK)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ca60eafa6922..2b832eeafda9 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1176,7 +1176,7 @@ static int map_freeze(const union bpf_attr *attr)
 		err = -EBUSY;
 		goto err_put;
 	}
-	if (!capable(CAP_SYS_ADMIN)) {
+	if (!capable_bpf()) {
 		err = -EPERM;
 		goto err_put;
 	}
@@ -1635,7 +1635,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
 	    (attr->prog_flags & BPF_F_ANY_ALIGNMENT) &&
-	    !capable(CAP_SYS_ADMIN))
+	    !capable_bpf())
 		return -EPERM;
 
 	/* copy eBPF program license from user space */
@@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	is_gpl = license_is_gpl_compatible(license);
 
 	if (attr->insn_cnt == 0 ||
-	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
+	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
 		return -E2BIG;
 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
-	    !capable(CAP_SYS_ADMIN))
+	    !capable_bpf())
 		return -EPERM;
 
 	bpf_prog_load_fixup_attach_type(attr);
@@ -1803,6 +1803,9 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 	char tp_name[128];
 	int tp_fd, err;
 
+	if (!capable_bpf_tracing())
+		return -EPERM;
+
 	if (strncpy_from_user(tp_name, u64_to_user_ptr(attr->raw_tracepoint.name),
 			      sizeof(tp_name) - 1) < 0)
 		return -EFAULT;
@@ -2081,7 +2084,10 @@ static int bpf_prog_test_run(const union bpf_attr *attr,
 	struct bpf_prog *prog;
 	int ret = -ENOTSUPP;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf_net_admin)
+		/* test_run callback is available for networking progs only.
+		 * Add capable_bpf_tracing() above when tracing progs become runable.
+		 */
 		return -EPERM;
 	if (CHECK_ATTR(BPF_PROG_TEST_RUN))
 		return -EINVAL;
@@ -2118,7 +2124,7 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr,
 	if (CHECK_ATTR(BPF_OBJ_GET_NEXT_ID) || next_id >= INT_MAX)
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	next_id++;
@@ -2144,7 +2150,7 @@ static int bpf_prog_get_fd_by_id(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_PROG_GET_FD_BY_ID))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	spin_lock_bh(&prog_idr_lock);
@@ -2178,7 +2184,7 @@ static int bpf_map_get_fd_by_id(const union bpf_attr *attr)
 	    attr->open_flags & ~BPF_OBJ_FLAG_MASK)
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	f_flags = bpf_get_file_flag(attr->open_flags);
@@ -2353,7 +2359,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog,
 	info.run_time_ns = stats.nsecs;
 	info.run_cnt = stats.cnt;
 
-	if (!capable(CAP_SYS_ADMIN)) {
+	if (!capable_bpf()) {
 		info.jited_prog_len = 0;
 		info.xlated_prog_len = 0;
 		info.nr_jited_ksyms = 0;
@@ -2671,7 +2677,7 @@ static int bpf_btf_load(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_BTF_LOAD))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	return btf_new_fd(attr);
@@ -2684,7 +2690,7 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_BTF_GET_FD_BY_ID))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	return btf_get_fd_by_id(attr->btf_id);
@@ -2753,7 +2759,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
 	if (CHECK_ATTR(BPF_TASK_FD_QUERY))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf_tracing())
 		return -EPERM;
 
 	if (attr->task_fd_query.flags != 0)
@@ -2821,7 +2827,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	union bpf_attr attr = {};
 	int err;
 
-	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
+	if (sysctl_unprivileged_bpf_disabled && !capable_bpf())
 		return -EPERM;
 
 	err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f340cfe53c6e..b2a229557602 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -987,7 +987,7 @@ static void __mark_reg_unbounded(struct bpf_reg_state *reg)
 	reg->umax_value = U64_MAX;
 
 	/* constant backtracking is enabled for root only for now */
-	reg->precise = capable(CAP_SYS_ADMIN) ? false : true;
+	reg->precise = capable_bpf() ? false : true;
 }
 
 /* Mark a register as having a completely unknown (scalar) value. */
@@ -9233,7 +9233,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
 	env->ops = bpf_verifier_ops[env->prog->type];
-	is_priv = capable(CAP_SYS_ADMIN);
+	is_priv = capable_bpf();
 
 	/* grab the mutex to protect few globals used by verifier */
 	if (!is_priv)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ca1255d14576..cdf8d6c8a430 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1246,7 +1246,7 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info)
 	u32 *ids, prog_cnt, ids_len;
 	int ret;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf_tracing())
 		return -EPERM;
 	if (event->attr.type != PERF_TYPE_TRACEPOINT)
 		return -EINVAL;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index da5639a5bd3b..aa74be21f5b6 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -616,7 +616,7 @@ static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
 	    !attr->btf_key_type_id || !attr->btf_value_type_id)
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return -EPERM;
 
 	if (attr->value_size >= KMALLOC_MAX_SIZE -
diff --git a/net/core/filter.c b/net/core/filter.c
index 17bc9af8f156..fae09b4d4da4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5990,7 +5990,7 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		break;
 	}
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_bpf())
 		return NULL;
 
 	switch (func_id) {
@@ -5999,7 +5999,9 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 	case BPF_FUNC_spin_unlock:
 		return &bpf_spin_unlock_proto;
 	case BPF_FUNC_trace_printk:
-		return bpf_get_trace_printk_proto();
+		if (capable_bpf_tracing())
+			return bpf_get_trace_printk_proto();
+		/* fall through */
 	default:
 		return NULL;
 	}
@@ -6563,7 +6565,7 @@ static bool cg_skb_is_valid_access(int off, int size,
 		return false;
 	case bpf_ctx_range(struct __sk_buff, data):
 	case bpf_ctx_range(struct __sk_buff, data_end):
-		if (!capable(CAP_SYS_ADMIN))
+		if (!capable_bpf())
 			return false;
 		break;
 	}
@@ -6575,7 +6577,7 @@ static bool cg_skb_is_valid_access(int off, int size,
 		case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]):
 			break;
 		case bpf_ctx_range(struct __sk_buff, tstamp):
-			if (!capable(CAP_SYS_ADMIN))
+			if (!capable_bpf())
 				return false;
 			break;
 		default:
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index d27fd929abb9..0d5567962c4e 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -807,10 +807,20 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type,
 	}
 }
 
+struct libcap {
+	struct __user_cap_header_struct hdr;
+	struct __user_cap_data_struct data[2];
+};
+
 static int set_admin(bool admin)
 {
 	cap_t caps;
-	const cap_value_t cap_val = CAP_SYS_ADMIN;
+	/* need CAP_BPF to load progs and CAP_NET_ADMIN to run networking progs,
+	 * and CAP_TRACING to create stackmap
+	 */
+	const cap_value_t cap_net_admin = CAP_NET_ADMIN;
+	const cap_value_t cap_sys_admin = CAP_SYS_ADMIN;
+	struct libcap *cap;
 	int ret = -1;
 
 	caps = cap_get_proc();
@@ -818,11 +828,26 @@ static int set_admin(bool admin)
 		perror("cap_get_proc");
 		return -1;
 	}
-	if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val,
+	cap = (struct libcap *)caps;
+	if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_sys_admin, CAP_CLEAR)) {
+		perror("cap_set_flag clear admin");
+		goto out;
+	}
+	if (cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_net_admin,
 				admin ? CAP_SET : CAP_CLEAR)) {
-		perror("cap_set_flag");
+		perror("cap_set_flag set_or_clear net");
 		goto out;
 	}
+	/* libcap is likely old and simply ignores CAP_BPF and CAP_TRACING,
+	 * so update effective bits manually
+	 */
+	if (admin) {
+		cap->data[1].effective |= 1 << (38 /* CAP_BPF */ - 32);
+		cap->data[1].effective |= 1 << (39 /* CAP_TRACING */ - 32);
+	} else {
+		cap->data[1].effective &= ~(1 << (38 - 32));
+		cap->data[1].effective &= ~(1 << (39 - 32));
+	}
 	if (cap_set_proc(caps)) {
 		perror("cap_set_proc");
 		goto out;
@@ -1051,9 +1076,11 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
 
 static bool is_admin(void)
 {
+	cap_flag_value_t net_priv = CAP_CLEAR;
+	bool tracing_priv = false;
+	bool bpf_priv = false;
+	struct libcap *cap;
 	cap_t caps;
-	cap_flag_value_t sysadmin = CAP_CLEAR;
-	const cap_value_t cap_val = CAP_SYS_ADMIN;
 
 #ifdef CAP_IS_SUPPORTED
 	if (!CAP_IS_SUPPORTED(CAP_SETFCAP)) {
@@ -1066,11 +1093,14 @@ static bool is_admin(void)
 		perror("cap_get_proc");
 		return false;
 	}
-	if (cap_get_flag(caps, cap_val, CAP_EFFECTIVE, &sysadmin))
-		perror("cap_get_flag");
+	cap = (struct libcap *)caps;
+	bpf_priv = cap->data[1].effective & (1 << (38/* CAP_BPF */ - 32));
+	tracing_priv = cap->data[1].effective & (1 << (39/* CAP_TRACING */ - 32));
+	if (cap_get_flag(caps, CAP_NET_ADMIN, CAP_EFFECTIVE, &net_priv))
+		perror("cap_get_flag NET");
 	if (cap_free(caps))
 		perror("cap_free");
-	return (sysadmin == CAP_SET);
+	return bpf_priv && tracing_priv && net_priv == CAP_SET;
 }
 
 static void get_unpriv_disabled()
-- 
2.20.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 bpf-next 3/3] perf: implement CAP_TRACING
  2019-09-04 18:43 [PATCH v3 bpf-next 1/3] capability: introduce CAP_BPF and CAP_TRACING Alexei Starovoitov
  2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
@ 2019-09-04 18:43 ` Alexei Starovoitov
  1 sibling, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-04 18:43 UTC (permalink / raw)
  To: davem; +Cc: daniel, peterz, luto, netdev, bpf, kernel-team, linux-api

Implement permissions as stated in uapi/linux/capability.h

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/powerpc/perf/core-book3s.c |  4 ++--
 arch/x86/events/intel/bts.c     |  2 +-
 arch/x86/events/intel/core.c    |  2 +-
 arch/x86/events/intel/p4.c      |  2 +-
 kernel/events/core.c            | 14 +++++++-------
 kernel/events/hw_breakpoint.c   |  2 +-
 kernel/trace/trace_event_perf.c |  4 ++--
 7 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index ca92e01d0bd1..a204a3c6c68b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -204,7 +204,7 @@ static inline void perf_get_data_addr(struct pt_regs *regs, u64 *addrp)
 	if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
 		*addrp = mfspr(SPRN_SDAR);
 
-	if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN) &&
+	if (perf_paranoid_kernel() && !capable_tracing() &&
 		is_kernel_addr(mfspr(SPRN_SDAR)))
 		*addrp = 0;
 }
@@ -472,7 +472,7 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 			 * exporting it to userspace (avoid exposure of regions
 			 * where we could have speculative execution)
 			 */
-			if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN) &&
+			if (perf_paranoid_kernel() && !capable_tracing() &&
 				is_kernel_addr(addr))
 				continue;
 
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 5ee3fed881d3..bd713b2dd7c2 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -550,7 +550,7 @@ static int bts_event_init(struct perf_event *event)
 	 * users to profile the kernel.
 	 */
 	if (event->attr.exclude_kernel && perf_paranoid_kernel() &&
-	    !capable(CAP_SYS_ADMIN))
+	    !capable_tracing())
 		return -EACCES;
 
 	if (x86_add_exclusive(x86_lbr_exclusive_bts))
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 648260b5f367..277b12c054fa 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3307,7 +3307,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	if (x86_pmu.version < 3)
 		return -EINVAL;
 
-	if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
+	if (perf_paranoid_cpu() && !capable_tracing())
 		return -EACCES;
 
 	event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index dee579efb2b2..f379a358c9cb 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -776,7 +776,7 @@ static int p4_validate_raw_event(struct perf_event *event)
 	 * the user needs special permissions to be able to use it
 	 */
 	if (p4_ht_active() && p4_event_bind_map[v].shared) {
-		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
+		if (perf_paranoid_cpu() && !capable_tracing())
 			return -EACCES;
 	}
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0463c1151bae..eaba102e5d91 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4134,7 +4134,7 @@ find_get_context(struct pmu *pmu, struct task_struct *task,
 
 	if (!task) {
 		/* Must be root to operate on a CPU event: */
-		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
+		if (perf_paranoid_cpu() && !capable_tracing())
 			return ERR_PTR(-EACCES);
 
 		cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
@@ -8741,7 +8741,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
 	if (event->attr.type != perf_kprobe.type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_tracing())
 		return -EACCES;
 
 	/*
@@ -8801,7 +8801,7 @@ static int perf_uprobe_event_init(struct perf_event *event)
 	if (event->attr.type != perf_uprobe.type)
 		return -ENOENT;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable_tracing())
 		return -EACCES;
 
 	/*
@@ -10588,7 +10588,7 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 		}
 		/* privileged levels capture (kernel, hv): check permissions */
 		if ((mask & PERF_SAMPLE_BRANCH_PERM_PLM)
-		    && perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+		    && perf_paranoid_kernel() && !capable_tracing())
 			return -EACCES;
 	}
 
@@ -10807,12 +10807,12 @@ SYSCALL_DEFINE5(perf_event_open,
 		return err;
 
 	if (!attr.exclude_kernel) {
-		if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+		if (perf_paranoid_kernel() && !capable_tracing())
 			return -EACCES;
 	}
 
 	if (attr.namespaces) {
-		if (!capable(CAP_SYS_ADMIN))
+		if (!capable_tracing())
 			return -EACCES;
 	}
 
@@ -10826,7 +10826,7 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	/* Only privileged users can get physical addresses */
 	if ((attr.sample_type & PERF_SAMPLE_PHYS_ADDR) &&
-	    perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+	    perf_paranoid_kernel() && !capable_tracing())
 		return -EACCES;
 
 	/*
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index c5cd852fe86b..8bc4d7d8c913 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -404,7 +404,7 @@ static int hw_breakpoint_parse(struct perf_event *bp,
 		 * Don't let unprivileged users set a breakpoint in the trap
 		 * path to avoid trap recursion attacks.
 		 */
-		if (!capable(CAP_SYS_ADMIN))
+		if (!capable_tracing())
 			return -EPERM;
 	}
 
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 0892e38ed6fb..6861307f14d6 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -46,7 +46,7 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event)) {
-		if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
+		if (perf_paranoid_tracepoint_raw() && !capable_tracing())
 			return -EPERM;
 
 		if (!is_sampling_event(p_event))
@@ -82,7 +82,7 @@ static int perf_trace_event_perm(struct trace_event_call *tp_event,
 	 * ...otherwise raw tracepoint data can be a severe data leak,
 	 * only allow root to have these.
 	 */
-	if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
+	if (perf_paranoid_tracepoint_raw() && !capable_tracing())
 		return -EPERM;
 
 	return 0;
-- 
2.20.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
@ 2019-09-05  0:34   ` Song Liu
  2019-09-05  1:32     ` Alexei Starovoitov
  2019-09-06 16:20   ` kbuild test robot
  2019-09-07  4:09   ` kbuild test robot
  2 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-09-05  0:34 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S . Miller, Daniel Borkmann, Peter Zijlstra,
	Andy Lutomirski, Networking, bpf, Kernel Team, linux-api



> On Sep 4, 2019, at 11:43 AM, Alexei Starovoitov <ast@kernel.org> wrote:
> 
> Implement permissions as stated in uapi/linux/capability.h
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> 

[...]

> @@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
> 	is_gpl = license_is_gpl_compatible(license);
> 
> 	if (attr->insn_cnt == 0 ||
> -	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> +	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> 		return -E2BIG;
> 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
> 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
> -	    !capable(CAP_SYS_ADMIN))
> +	    !capable_bpf())
> 		return -EPERM;

Do we allow load BPF_PROG_TYPE_SOCKET_FILTER and BPF_PROG_TYPE_CGROUP_SKB
without CAP_BPF? If so, maybe highlight in the header?

Thanks,
Song


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-05  0:34   ` Song Liu
@ 2019-09-05  1:32     ` Alexei Starovoitov
  2019-09-05  2:51       ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-05  1:32 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Peter Zijlstra, Andy Lutomirski, Networking, bpf, Kernel Team,
	linux-api

On Thu, Sep 05, 2019 at 12:34:36AM +0000, Song Liu wrote:
> 
> 
> > On Sep 4, 2019, at 11:43 AM, Alexei Starovoitov <ast@kernel.org> wrote:
> > 
> > Implement permissions as stated in uapi/linux/capability.h
> > 
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > 
> 
> [...]
> 
> > @@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
> > 	is_gpl = license_is_gpl_compatible(license);
> > 
> > 	if (attr->insn_cnt == 0 ||
> > -	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> > +	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> > 		return -E2BIG;
> > 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
> > 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
> > -	    !capable(CAP_SYS_ADMIN))
> > +	    !capable_bpf())
> > 		return -EPERM;
> 
> Do we allow load BPF_PROG_TYPE_SOCKET_FILTER and BPF_PROG_TYPE_CGROUP_SKB
> without CAP_BPF? If so, maybe highlight in the header?

of course. there is no change in behavior.
'highlight in the header'?
you mean in commit log?
I think it's a bit weird to describe things in commit that patch
is _not_ changing vs things that patch does actually change.
This type of comment would be great in a doc though.
The doc will be coming separately in the follow up assuming
the whole thing lands. I'll remember to note that bit.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-05  1:32     ` Alexei Starovoitov
@ 2019-09-05  2:51       ` Song Liu
  2019-09-05  3:15         ` Alexei Starovoitov
  0 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-09-05  2:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Peter Zijlstra, Andy Lutomirski, Networking, bpf, Kernel Team,
	linux-api



> On Sep 4, 2019, at 6:32 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Thu, Sep 05, 2019 at 12:34:36AM +0000, Song Liu wrote:
>> 
>> 
>>> On Sep 4, 2019, at 11:43 AM, Alexei Starovoitov <ast@kernel.org> wrote:
>>> 
>>> Implement permissions as stated in uapi/linux/capability.h
>>> 
>>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>> 
>> 
>> [...]
>> 
>>> @@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
>>> 	is_gpl = license_is_gpl_compatible(license);
>>> 
>>> 	if (attr->insn_cnt == 0 ||
>>> -	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
>>> +	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
>>> 		return -E2BIG;
>>> 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
>>> 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
>>> -	    !capable(CAP_SYS_ADMIN))
>>> +	    !capable_bpf())
>>> 		return -EPERM;
>> 
>> Do we allow load BPF_PROG_TYPE_SOCKET_FILTER and BPF_PROG_TYPE_CGROUP_SKB
>> without CAP_BPF? If so, maybe highlight in the header?
> 
> of course. there is no change in behavior.
> 'highlight in the header'?
> you mean in commit log?
> I think it's a bit weird to describe things in commit that patch
> is _not_ changing vs things that patch does actually change.
> This type of comment would be great in a doc though.
> The doc will be coming separately in the follow up assuming
> the whole thing lands. I'll remember to note that bit.

I meant capability.h:

+ * CAP_BPF allows the following BPF operations:
+ * - Loading all types of BPF programs

But CAP_BPF is not required to load all types of programs. 

On a second thought, I am not sure whether we will keep capability.h
up to date with all features. So this is probably OK. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-05  2:51       ` Song Liu
@ 2019-09-05  3:15         ` Alexei Starovoitov
  2019-09-05 16:08           ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-05  3:15 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Peter Zijlstra, Andy Lutomirski, Networking, bpf, Kernel Team,
	linux-api

On Thu, Sep 05, 2019 at 02:51:51AM +0000, Song Liu wrote:
> 
> 
> > On Sep 4, 2019, at 6:32 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > 
> > On Thu, Sep 05, 2019 at 12:34:36AM +0000, Song Liu wrote:
> >> 
> >> 
> >>> On Sep 4, 2019, at 11:43 AM, Alexei Starovoitov <ast@kernel.org> wrote:
> >>> 
> >>> Implement permissions as stated in uapi/linux/capability.h
> >>> 
> >>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >>> 
> >> 
> >> [...]
> >> 
> >>> @@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
> >>> 	is_gpl = license_is_gpl_compatible(license);
> >>> 
> >>> 	if (attr->insn_cnt == 0 ||
> >>> -	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> >>> +	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
> >>> 		return -E2BIG;
> >>> 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
> >>> 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
> >>> -	    !capable(CAP_SYS_ADMIN))
> >>> +	    !capable_bpf())
> >>> 		return -EPERM;
> >> 
> >> Do we allow load BPF_PROG_TYPE_SOCKET_FILTER and BPF_PROG_TYPE_CGROUP_SKB
> >> without CAP_BPF? If so, maybe highlight in the header?
> > 
> > of course. there is no change in behavior.
> > 'highlight in the header'?
> > you mean in commit log?
> > I think it's a bit weird to describe things in commit that patch
> > is _not_ changing vs things that patch does actually change.
> > This type of comment would be great in a doc though.
> > The doc will be coming separately in the follow up assuming
> > the whole thing lands. I'll remember to note that bit.
> 
> I meant capability.h:
> 
> + * CAP_BPF allows the following BPF operations:
> + * - Loading all types of BPF programs
> 
> But CAP_BPF is not required to load all types of programs. 

yes, but above statement is still correct, right?

And right below it says:
 * CAP_BPF allows the following BPF operations:
 * - Loading all types of BPF programs
 * - Creating all types of BPF maps except:
 *    - stackmap that needs CAP_TRACING
 *    - devmap that needs CAP_NET_ADMIN
 *    - cpumap that needs CAP_SYS_ADMIN
which is also correct, but CAP_BPF is not required
for array, hash, prog_array, percpu, map-in-map ...
except their lru variants...
and except if they contain bpf_spin_lock...
and if they need BTF it currently can be loaded with cap_sys_admin only...

If we say something about socket_filter, cg_skb progs in capability.h
we should clarify maps as well, but then it will become too big for .h
The comments in capability.h already look too long to me.
All that info and a lot more belongs in the doc.

> On a second thought, I am not sure whether we will keep capability.h
> up to date with all features. So this is probably OK.

It's clearly not for cap_sys_admin.
cap_net_admin is not up to date either.
These two are kitchen sink of anything root-like and networking-root like.
Developers didn't bother updating that .h
With CAP_BPF we can enforce through patch acceptance policy that
major new things (like big verifier features) should have a line
in capability.h
Though .h is not a replacement for proper doc which will come as follow up.
Unlike bpf.h that serves as a template for auto-generating man pages
capability.h is not such thing. I won't change often either.
So normal doc is a better way to document all details.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-05  3:15         ` Alexei Starovoitov
@ 2019-09-05 16:08           ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2019-09-05 16:08 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, David S . Miller, Daniel Borkmann,
	Peter Zijlstra, Andy Lutomirski, Networking, bpf, Kernel Team,
	linux-api



> On Sep 4, 2019, at 8:15 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> On Thu, Sep 05, 2019 at 02:51:51AM +0000, Song Liu wrote:
>> 
>> 
>>> On Sep 4, 2019, at 6:32 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>>> 
>>> On Thu, Sep 05, 2019 at 12:34:36AM +0000, Song Liu wrote:
>>>> 
>>>> 
>>>>> On Sep 4, 2019, at 11:43 AM, Alexei Starovoitov <ast@kernel.org> wrote:
>>>>> 
>>>>> Implement permissions as stated in uapi/linux/capability.h
>>>>> 
>>>>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>>>> 
>>>> 
>>>> [...]
>>>> 
>>>>> @@ -1648,11 +1648,11 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
>>>>> 	is_gpl = license_is_gpl_compatible(license);
>>>>> 
>>>>> 	if (attr->insn_cnt == 0 ||
>>>>> -	    attr->insn_cnt > (capable(CAP_SYS_ADMIN) ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
>>>>> +	    attr->insn_cnt > (capable_bpf() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
>>>>> 		return -E2BIG;
>>>>> 	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
>>>>> 	    type != BPF_PROG_TYPE_CGROUP_SKB &&
>>>>> -	    !capable(CAP_SYS_ADMIN))
>>>>> +	    !capable_bpf())
>>>>> 		return -EPERM;
>>>> 
>>>> Do we allow load BPF_PROG_TYPE_SOCKET_FILTER and BPF_PROG_TYPE_CGROUP_SKB
>>>> without CAP_BPF? If so, maybe highlight in the header?
>>> 
>>> of course. there is no change in behavior.
>>> 'highlight in the header'?
>>> you mean in commit log?
>>> I think it's a bit weird to describe things in commit that patch
>>> is _not_ changing vs things that patch does actually change.
>>> This type of comment would be great in a doc though.
>>> The doc will be coming separately in the follow up assuming
>>> the whole thing lands. I'll remember to note that bit.
>> 
>> I meant capability.h:
>> 
>> + * CAP_BPF allows the following BPF operations:
>> + * - Loading all types of BPF programs
>> 
>> But CAP_BPF is not required to load all types of programs. 
> 
> yes, but above statement is still correct, right?
> 
> And right below it says:
> * CAP_BPF allows the following BPF operations:
> * - Loading all types of BPF programs
> * - Creating all types of BPF maps except:
> *    - stackmap that needs CAP_TRACING
> *    - devmap that needs CAP_NET_ADMIN
> *    - cpumap that needs CAP_SYS_ADMIN
> which is also correct, but CAP_BPF is not required
> for array, hash, prog_array, percpu, map-in-map ...
> except their lru variants...
> and except if they contain bpf_spin_lock...
> and if they need BTF it currently can be loaded with cap_sys_admin only...
> 
> If we say something about socket_filter, cg_skb progs in capability.h
> we should clarify maps as well, but then it will become too big for .h
> The comments in capability.h already look too long to me.
> All that info and a lot more belongs in the doc.

Agreed. We cannot put all these details in capability.h. Doc/wikipages 
would be better fit for these information. 

Thanks,
Song



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
  2019-09-05  0:34   ` Song Liu
@ 2019-09-06 16:20   ` kbuild test robot
  2019-09-06 17:22     ` Alexei Starovoitov
  2019-09-07  4:09   ` kbuild test robot
  2 siblings, 1 reply; 11+ messages in thread
From: kbuild test robot @ 2019-09-06 16:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: kbuild-all, davem, daniel, peterz, luto, netdev, bpf,
	kernel-team, linux-api

[-- Attachment #1: Type: text/plain, Size: 2159 bytes --]

Hi Alexei,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Alexei-Starovoitov/capability-introduce-CAP_BPF-and-CAP_TRACING/20190906-215814
base:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   kernel//bpf/syscall.c: In function 'bpf_prog_test_run':
>> kernel//bpf/syscall.c:2087:6: warning: the address of 'capable_bpf_net_admin' will always evaluate as 'true' [-Waddress]
     if (!capable_bpf_net_admin)
         ^

vim +2087 kernel//bpf/syscall.c

  2080	
  2081	static int bpf_prog_test_run(const union bpf_attr *attr,
  2082				     union bpf_attr __user *uattr)
  2083	{
  2084		struct bpf_prog *prog;
  2085		int ret = -ENOTSUPP;
  2086	
> 2087		if (!capable_bpf_net_admin)
  2088			/* test_run callback is available for networking progs only.
  2089			 * Add capable_bpf_tracing() above when tracing progs become runable.
  2090			 */
  2091			return -EPERM;
  2092		if (CHECK_ATTR(BPF_PROG_TEST_RUN))
  2093			return -EINVAL;
  2094	
  2095		if ((attr->test.ctx_size_in && !attr->test.ctx_in) ||
  2096		    (!attr->test.ctx_size_in && attr->test.ctx_in))
  2097			return -EINVAL;
  2098	
  2099		if ((attr->test.ctx_size_out && !attr->test.ctx_out) ||
  2100		    (!attr->test.ctx_size_out && attr->test.ctx_out))
  2101			return -EINVAL;
  2102	
  2103		prog = bpf_prog_get(attr->test.prog_fd);
  2104		if (IS_ERR(prog))
  2105			return PTR_ERR(prog);
  2106	
  2107		if (prog->aux->ops->test_run)
  2108			ret = prog->aux->ops->test_run(prog, attr, uattr);
  2109	
  2110		bpf_prog_put(prog);
  2111		return ret;
  2112	}
  2113	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 70222 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-06 16:20   ` kbuild test robot
@ 2019-09-06 17:22     ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2019-09-06 17:22 UTC (permalink / raw)
  To: kbuild test robot
  Cc: Alexei Starovoitov, kbuild-all, David S. Miller, Daniel Borkmann,
	Peter Zijlstra, Andy Lutomirski, Network Development, bpf,
	Kernel Team, Linux API

On Fri, Sep 6, 2019 at 9:21 AM kbuild test robot <lkp@intel.com> wrote:
>
> Hi Alexei,
>
> I love your patch! Perhaps something to improve:
>
> [auto build test WARNING on bpf-next/master]
>
> url:    https://github.com/0day-ci/linux/commits/Alexei-Starovoitov/capability-introduce-CAP_BPF-and-CAP_TRACING/20190906-215814
> base:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> config: x86_64-allmodconfig (attached as .config)
> compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>    kernel//bpf/syscall.c: In function 'bpf_prog_test_run':
> >> kernel//bpf/syscall.c:2087:6: warning: the address of 'capable_bpf_net_admin' will always evaluate as 'true' [-Waddress]
>      if (!capable_bpf_net_admin)
>          ^

argh. fixing and rebasing.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF
  2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
  2019-09-05  0:34   ` Song Liu
  2019-09-06 16:20   ` kbuild test robot
@ 2019-09-07  4:09   ` kbuild test robot
  2 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2019-09-07  4:09 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: kbuild-all, davem, daniel, peterz, luto, netdev, bpf,
	kernel-team, linux-api

[-- Attachment #1: Type: text/plain, Size: 4029 bytes --]

Hi Alexei,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Alexei-Starovoitov/capability-introduce-CAP_BPF-and-CAP_TRACING/20190906-215814
base:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-c003-201935 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/linux/export.h:45:0,
                    from include/linux/linkage.h:7,
                    from include/linux/kernel.h:8,
                    from include/linux/list.h:9,
                    from include/linux/timer.h:5,
                    from include/linux/workqueue.h:9,
                    from include/linux/bpf.h:9,
                    from kernel/bpf/syscall.c:4:
   kernel/bpf/syscall.c: In function 'bpf_prog_test_run':
   kernel/bpf/syscall.c:2087:6: warning: the address of 'capable_bpf_net_admin' will always evaluate as 'true' [-Waddress]
     if (!capable_bpf_net_admin)
         ^
   include/linux/compiler.h:58:52: note: in definition of macro '__trace_if_var'
    #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                       ^~~~
>> kernel/bpf/syscall.c:2087:2: note: in expansion of macro 'if'
     if (!capable_bpf_net_admin)
     ^~
   kernel/bpf/syscall.c:2087:6: warning: the address of 'capable_bpf_net_admin' will always evaluate as 'true' [-Waddress]
     if (!capable_bpf_net_admin)
         ^
   include/linux/compiler.h:58:61: note: in definition of macro '__trace_if_var'
    #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                ^~~~
>> kernel/bpf/syscall.c:2087:2: note: in expansion of macro 'if'
     if (!capable_bpf_net_admin)
     ^~
   kernel/bpf/syscall.c:2087:6: warning: the address of 'capable_bpf_net_admin' will always evaluate as 'true' [-Waddress]
     if (!capable_bpf_net_admin)
         ^
   include/linux/compiler.h:69:3: note: in definition of macro '__trace_if_value'
     (cond) ?     \
      ^~~~
   include/linux/compiler.h:56:28: note: in expansion of macro '__trace_if_var'
    #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                               ^~~~~~~~~~~~~~
>> kernel/bpf/syscall.c:2087:2: note: in expansion of macro 'if'
     if (!capable_bpf_net_admin)
     ^~

vim +/if +2087 kernel/bpf/syscall.c

  2080	
  2081	static int bpf_prog_test_run(const union bpf_attr *attr,
  2082				     union bpf_attr __user *uattr)
  2083	{
  2084		struct bpf_prog *prog;
  2085		int ret = -ENOTSUPP;
  2086	
> 2087		if (!capable_bpf_net_admin)
  2088			/* test_run callback is available for networking progs only.
  2089			 * Add capable_bpf_tracing() above when tracing progs become runable.
  2090			 */
  2091			return -EPERM;
  2092		if (CHECK_ATTR(BPF_PROG_TEST_RUN))
  2093			return -EINVAL;
  2094	
  2095		if ((attr->test.ctx_size_in && !attr->test.ctx_in) ||
  2096		    (!attr->test.ctx_size_in && attr->test.ctx_in))
  2097			return -EINVAL;
  2098	
  2099		if ((attr->test.ctx_size_out && !attr->test.ctx_out) ||
  2100		    (!attr->test.ctx_size_out && attr->test.ctx_out))
  2101			return -EINVAL;
  2102	
  2103		prog = bpf_prog_get(attr->test.prog_fd);
  2104		if (IS_ERR(prog))
  2105			return PTR_ERR(prog);
  2106	
  2107		if (prog->aux->ops->test_run)
  2108			ret = prog->aux->ops->test_run(prog, attr, uattr);
  2109	
  2110		bpf_prog_put(prog);
  2111		return ret;
  2112	}
  2113	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29761 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-04 18:43 [PATCH v3 bpf-next 1/3] capability: introduce CAP_BPF and CAP_TRACING Alexei Starovoitov
2019-09-04 18:43 ` [PATCH v3 bpf-next 2/3] bpf: implement CAP_BPF Alexei Starovoitov
2019-09-05  0:34   ` Song Liu
2019-09-05  1:32     ` Alexei Starovoitov
2019-09-05  2:51       ` Song Liu
2019-09-05  3:15         ` Alexei Starovoitov
2019-09-05 16:08           ` Song Liu
2019-09-06 16:20   ` kbuild test robot
2019-09-06 17:22     ` Alexei Starovoitov
2019-09-07  4:09   ` kbuild test robot
2019-09-04 18:43 ` [PATCH v3 bpf-next 3/3] perf: implement CAP_TRACING Alexei Starovoitov

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org bpf@archiver.kernel.org
	public-inbox-index bpf


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/ public-inbox