Netdev Archive on lore.kernel.org
* [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task
@ 2019-10-09 15:26 Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 1/4] fs/nsfs.c: added ns_match Carlos Neira
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Carlos Neira @ 2019-10-09 15:26 UTC (permalink / raw)
  To: netdev; +Cc: yhs, ebiederm, brouer, bpf, cneirabustos

Currently bpf_get_current_pid_tgid() is used to do pid filtering in bcc's
scripts, but this helper returns the pid as seen by the root namespace, which
is fine only when a bcc script is not executed inside a container.
When the process of interest is inside a container, pid filtering will not work
if bpf_get_current_pid_tgid() is used.
The new helper addresses this limitation by returning the pid as it is seen by
the namespace in which the script is executing.

In the future, different pid_ns files may belong to different devices, according
to the discussion between Eric Biederman and Yonghong at the 2017 Linux Plumbers
Conference.
To address that situation, the helper requires the inum and dev_t from
/proc/self/ns/pid.
This helper has the same use cases as bpf_get_current_pid_tgid(), since it can
be used to do pid filtering even inside a container.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>

Carlos Neira (4):
  fs/nsfs.c: added ns_match
  bpf: added new helper bpf_get_ns_current_pid_tgid
  tools: Added bpf_get_ns_current_pid_tgid helper
  tools/testing/selftests/bpf: Add self-tests for new helper.

 fs/nsfs.c                                     |  8 ++
 include/linux/bpf.h                           |  1 +
 include/linux/proc_ns.h                       |  2 +
 include/uapi/linux/bpf.h                      | 22 ++++-
 kernel/bpf/core.c                             |  1 +
 kernel/bpf/helpers.c                          | 43 ++++++++++
 kernel/trace/bpf_trace.c                      |  2 +
 tools/include/uapi/linux/bpf.h                | 22 ++++-
 tools/testing/selftests/bpf/bpf_helpers.h     |  4 +
 .../bpf/prog_tests/get_ns_current_pid_tgid.c  | 85 +++++++++++++++++++
 .../bpf/progs/get_ns_current_pid_tgid_kern.c  | 53 ++++++++++++
 11 files changed, 241 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
 create mode 100644 tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v13 1/4] fs/nsfs.c: added ns_match
  2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
@ 2019-10-09 15:26 ` Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid Carlos Neira
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Carlos Neira @ 2019-10-09 15:26 UTC (permalink / raw)
  To: netdev; +Cc: yhs, ebiederm, brouer, bpf, cneirabustos

ns_match returns true if the namespace inode and dev_t match the ones
provided by the caller.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
 fs/nsfs.c               | 8 ++++++++
 include/linux/proc_ns.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/fs/nsfs.c b/fs/nsfs.c
index a0431642c6b5..256f6295d33d 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -245,6 +245,14 @@ struct file *proc_ns_fget(int fd)
 	return ERR_PTR(-EINVAL);
 }
 
+/* Returns true if current namespace matches dev/ino.
+ */
+bool ns_match(const struct ns_common *ns, dev_t dev, ino_t ino)
+{
+	return ((ns->inum == ino) && (nsfs_mnt->mnt_sb->s_dev == dev));
+}
+
+
 static int nsfs_show_path(struct seq_file *seq, struct dentry *dentry)
 {
 	struct inode *inode = d_inode(dentry);
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index d31cb6215905..1da9f33489f3 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -82,6 +82,8 @@ typedef struct ns_common *ns_get_path_helper_t(void *);
 extern void *ns_get_path_cb(struct path *path, ns_get_path_helper_t ns_get_cb,
 			    void *private_data);
 
+extern bool ns_match(const struct ns_common *ns, dev_t dev, ino_t ino);
+
 extern int ns_get_name(char *buf, size_t size, struct task_struct *task,
 			const struct proc_ns_operations *ns_ops);
 extern void nsfs_init(void);
-- 
2.20.1



* [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
  2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 1/4] fs/nsfs.c: added ns_match Carlos Neira
@ 2019-10-09 15:26 ` Carlos Neira
  2019-10-09 16:14   ` Andrii Nakryiko
  2019-10-09 15:26 ` [PATCH v13 3/4] tools: Added bpf_get_ns_current_pid_tgid helper Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
  3 siblings, 1 reply; 11+ messages in thread
From: Carlos Neira @ 2019-10-09 15:26 UTC (permalink / raw)
  To: netdev; +Cc: yhs, ebiederm, brouer, bpf, cneirabustos

Add new bpf helper bpf_get_ns_current_pid_tgid.
This helper returns the pid and tgid of the current task
whose namespace matches the dev_t and inode number provided.
This allows us to instrument a process inside a container.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
 include/linux/bpf.h      |  1 +
 include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
 kernel/bpf/core.c        |  1 +
 kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c |  2 ++
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5b9d22338606..231001475504 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
 extern const struct bpf_func_proto bpf_strtol_proto;
 extern const struct bpf_func_proto bpf_strtoul_proto;
 extern const struct bpf_func_proto bpf_tcp_sock_proto;
+extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
 
 /* Shared helpers among cBPF and eBPF. */
 void bpf_user_rnd_init_once(void);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 77c6be96d676..6ad3f2abf00d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2750,6 +2750,19 @@ union bpf_attr {
  *		**-EOPNOTSUPP** kernel configuration does not enable SYN cookies
  *
  *		**-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
+ *	Return
+ *		0 on success, values for pid and tgid from nsinfo will be as seen
+ *		from the namespace that matches dev and inum from nsinfo.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EINVAL** if dev and inum supplied don't match dev_t and inode number
+ *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
+ *
+ *		**-ENOENT** if /proc/self/ns does not exist.
+ *
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2862,7 +2875,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),          \
+	FN(get_ns_current_pid_tgid),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -3613,4 +3627,10 @@ struct bpf_sockopt {
 	__s32	retval;
 };
 
+struct bpf_pidns_info {
+	__u64 dev;
+	__u64 inum;
+	__u32 pid;
+	__u32 tgid;
+};
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 66088a9e9b9e..b2fd5358f472 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
 const struct bpf_func_proto bpf_get_current_comm_proto __weak;
 const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
 const struct bpf_func_proto bpf_get_local_storage_proto __weak;
+const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
 
 const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
 {
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 5e28718928ca..78a1ce7726aa 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -11,6 +11,8 @@
 #include <linux/uidgid.h>
 #include <linux/filter.h>
 #include <linux/ctype.h>
+#include <linux/pid_namespace.h>
+#include <linux/proc_ns.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
 	.arg4_type	= ARG_PTR_TO_LONG,
 };
 #endif
+
+BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
+	size)
+{
+	struct task_struct *task = current;
+	struct pid_namespace *pidns;
+	int err = -EINVAL;
+
+	if (unlikely(size != sizeof(struct bpf_pidns_info)))
+		goto clear;
+
+	if ((u64)(dev_t)nsdata->dev != nsdata->dev)
+		goto clear;
+
+	if (unlikely(!task))
+		goto clear;
+
+	pidns = task_active_pid_ns(task);
+	if (unlikely(!pidns)) {
+		err = -ENOENT;
+		goto clear;
+	}
+
+	if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
+		goto clear;
+
+	nsdata->pid = task_pid_nr_ns(task, pidns);
+	nsdata->tgid = task_tgid_nr_ns(task, pidns);
+	return 0;
+clear:
+	memset((void *)nsdata, 0, (size_t) size);
+	return err;
+}
+
+const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
+	.func		= bpf_get_ns_current_pid_tgid,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type      = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type      = ARG_CONST_SIZE,
+};
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 44bd08f2443b..32331a1dcb6d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 	case BPF_FUNC_send_signal:
 		return &bpf_send_signal_proto;
+	case BPF_FUNC_get_ns_current_pid_tgid:
+		return &bpf_get_ns_current_pid_tgid_proto;
 	default:
 		return NULL;
 	}
-- 
2.20.1



* [PATCH v13 3/4] tools: Added bpf_get_ns_current_pid_tgid helper
  2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 1/4] fs/nsfs.c: added ns_match Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid Carlos Neira
@ 2019-10-09 15:26 ` Carlos Neira
  2019-10-09 15:26 ` [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
  3 siblings, 0 replies; 11+ messages in thread
From: Carlos Neira @ 2019-10-09 15:26 UTC (permalink / raw)
  To: netdev; +Cc: yhs, ebiederm, brouer, bpf, cneirabustos

Sync tools/include/uapi/linux/bpf.h to include the new helper.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
 tools/include/uapi/linux/bpf.h | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 77c6be96d676..6ad3f2abf00d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2750,6 +2750,19 @@ union bpf_attr {
  *		**-EOPNOTSUPP** kernel configuration does not enable SYN cookies
  *
  *		**-EPROTONOSUPPORT** IP packet version is not 4 or 6
+ *
+ * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
+ *	Return
+ *		0 on success, values for pid and tgid from nsinfo will be as seen
+ *		from the namespace that matches dev and inum from nsinfo.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EINVAL** if dev and inum supplied don't match dev_t and inode number
+ *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
+ *
+ *		**-ENOENT** if /proc/self/ns does not exist.
+ *
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2862,7 +2875,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),          \
+	FN(get_ns_current_pid_tgid),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -3613,4 +3627,10 @@ struct bpf_sockopt {
 	__s32	retval;
 };
 
+struct bpf_pidns_info {
+	__u64 dev;
+	__u64 inum;
+	__u32 pid;
+	__u32 tgid;
+};
 #endif /* _UAPI__LINUX_BPF_H__ */
-- 
2.20.1



* [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper.
  2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
                   ` (2 preceding siblings ...)
  2019-10-09 15:26 ` [PATCH v13 3/4] tools: Added bpf_get_ns_current_pid_tgid helper Carlos Neira
@ 2019-10-09 15:26 ` Carlos Neira
  2019-10-09 16:26   ` Andrii Nakryiko
  3 siblings, 1 reply; 11+ messages in thread
From: Carlos Neira @ 2019-10-09 15:26 UTC (permalink / raw)
  To: netdev; +Cc: yhs, ebiederm, brouer, bpf, cneirabustos

Add self-tests for the new helper.

Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
---
 tools/testing/selftests/bpf/bpf_helpers.h     |  4 +
 .../bpf/prog_tests/get_ns_current_pid_tgid.c  | 85 +++++++++++++++++++
 .../bpf/progs/get_ns_current_pid_tgid_kern.c  | 53 ++++++++++++
 3 files changed, 142 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
 create mode 100644 tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 54a50699bbfd..16261b23e011 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -233,6 +233,10 @@ static int (*bpf_send_signal)(unsigned sig) = (void *)BPF_FUNC_send_signal;
 static long long (*bpf_tcp_gen_syncookie)(struct bpf_sock *sk, void *ip,
 					  int ip_len, void *tcp, int tcp_len) =
 	(void *) BPF_FUNC_tcp_gen_syncookie;
+static unsigned long long (*bpf_get_ns_current_pid_tgid)(struct bpf_pidns_info *nsinfo,
+		unsigned int buf_size) =
+	(void *) BPF_FUNC_get_ns_current_pid_tgid;
+
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
new file mode 100644
index 000000000000..a7bff0ef6677
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+
+void test_get_ns_current_pid_tgid(void)
+{
+	const char *probe_name = "syscalls/sys_enter_nanosleep";
+	const char *file = "get_ns_current_pid_tgid_kern.o";
+	int ns_data_map_fd, duration = 0;
+	struct perf_event_attr attr = {};
+	int err, efd, prog_fd, pmu_fd;
+	__u64 ino, dev, id, nspid;
+	struct bpf_object *obj;
+	struct stat st;
+	__u32 key = 0;
+	char buf[256];
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd);
+	if (CHECK(err, "prog_load", "err %d errno %d\n", err, errno))
+		return;
+
+	ns_data_map_fd = bpf_find_map(__func__, obj, "ns_data_map");
+	if (CHECK_FAIL(ns_data_map_fd < 0))
+		goto close_prog;
+
+	pid_t tid = syscall(SYS_gettid);
+	pid_t pid = getpid();
+
+	id = (__u64) tid << 32 | pid;
+	bpf_map_update_elem(ns_data_map_fd, &key, &id, 0);
+
+	if (stat("/proc/self/ns/pid", &st))
+		goto close_prog;
+
+	dev = st.st_dev;
+	ino = st.st_ino;
+	key = 1;
+	bpf_map_update_elem(ns_data_map_fd, &key, &dev, 0);
+	key = 2;
+	bpf_map_update_elem(ns_data_map_fd, &key, &ino, 0);
+
+	snprintf(buf, sizeof(buf),
+		 "/sys/kernel/debug/tracing/events/%s/id", probe_name);
+	efd = open(buf, O_RDONLY, 0);
+	read(efd, buf, sizeof(buf));
+	close(efd);
+	attr.config = strtol(buf, NULL, 0);
+	attr.type = PERF_TYPE_TRACEPOINT;
+	attr.sample_type = PERF_SAMPLE_RAW;
+	attr.sample_period = 1;
+	attr.wakeup_events = 1;
+
+	pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
+	if (CHECK_FAIL(pmu_fd < 0))
+		goto cleanup;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+	if (CHECK_FAIL(err))
+		goto cleanup;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
+	if (CHECK_FAIL(err))
+		goto cleanup;
+
+	/* trigger some syscalls */
+	sleep(1);
+	key = 3;
+	err = bpf_map_lookup_elem(ns_data_map_fd, &key, &nspid);
+	if (CHECK_FAIL(err))
+		goto cleanup;
+
+	if (CHECK(id != nspid, "Compare user pid/tgid vs. bpf pid/tgid",
+		  "Userspace pid/tgid %llu EBPF pid/tgid %llu\n", id, nspid))
+		goto cleanup;
+
+cleanup:
+	close(pmu_fd);
+close_prog:
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
new file mode 100644
index 000000000000..3659aaa7c71f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
+
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 4);
+	__type(key, __u32);
+	__type(value, __u64);
+} ns_data_map SEC(".maps");
+
+
+SEC("tracepoint/syscalls/sys_enter_nanosleep")
+int trace(void *ctx)
+{
+	__u64 *val, *inum, *dev, nspidtgid, *expected_pid;
+	struct bpf_pidns_info nsdata;
+	__u32 key = 1;
+
+	dev = bpf_map_lookup_elem(&ns_data_map, &key);
+	if (!dev)
+		return 0;
+	key = 2;
+	inum = bpf_map_lookup_elem(&ns_data_map, &key);
+	if (!inum)
+		return 0;
+
+	nsdata.dev = *dev;
+	nsdata.inum = *inum;
+
+	if (bpf_get_ns_current_pid_tgid(&nsdata, sizeof(struct bpf_pidns_info)))
+		return 0;
+
+	nspidtgid = (__u64)nsdata.tgid << 32 | nsdata.pid;
+	key = 0;
+	expected_pid = bpf_map_lookup_elem(&ns_data_map, &key);
+
+	if (!expected_pid || *expected_pid != nspidtgid)
+		return 0;
+
+	key = 3;
+	val = bpf_map_lookup_elem(&ns_data_map, &key);
+
+	if (val)
+		*val = nspidtgid;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+__u32 _version SEC("version") = 1;
-- 
2.20.1



* Re: [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
  2019-10-09 15:26 ` [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid Carlos Neira
@ 2019-10-09 16:14   ` Andrii Nakryiko
  2019-10-09 17:45     ` Carlos Antonio Neira Bustos
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2019-10-09 16:14 UTC (permalink / raw)
  To: Carlos Neira
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 9, 2019 at 8:27 AM Carlos Neira <cneirabustos@gmail.com> wrote:
>
> Add new bpf helper bpf_get_ns_current_pid_tgid.
> This helper returns the pid and tgid of the current task
> whose namespace matches the dev_t and inode number provided.
> This allows us to instrument a process inside a container.
>
> Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> ---
>  include/linux/bpf.h      |  1 +
>  include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
>  kernel/bpf/core.c        |  1 +
>  kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c |  2 ++
>  5 files changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 5b9d22338606..231001475504 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
>  extern const struct bpf_func_proto bpf_strtol_proto;
>  extern const struct bpf_func_proto bpf_strtoul_proto;
>  extern const struct bpf_func_proto bpf_tcp_sock_proto;
> +extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
>
>  /* Shared helpers among cBPF and eBPF. */
>  void bpf_user_rnd_init_once(void);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 77c6be96d676..6ad3f2abf00d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2750,6 +2750,19 @@ union bpf_attr {
>   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
>   *
>   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> + *
> + * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)

Should be:

struct bpf_pidns_info *nsdata

> + *     Return
> + *             0 on success, values for pid and tgid from nsinfo will be as seen
> + *             from the namespace that matches dev and inum from nsinfo.

I think it's cleaner to have a Description section, explaining that it
will return pid/tgid in bpf_pidns_info, and then describe exit codes
in the Return section.

> + *
> + *             On failure, the returned value is one of the following:
> + *
> + *             **-EINVAL** if dev and inum supplied don't match dev_t and inode number
> + *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
> + *
> + *             **-ENOENT** if /proc/self/ns does not exist.
> + *
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2862,7 +2875,8 @@ union bpf_attr {
>         FN(sk_storage_get),             \
>         FN(sk_storage_delete),          \
>         FN(send_signal),                \
> -       FN(tcp_gen_syncookie),
> +       FN(tcp_gen_syncookie),          \
> +       FN(get_ns_current_pid_tgid),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> @@ -3613,4 +3627,10 @@ struct bpf_sockopt {
>         __s32   retval;
>  };
>
> +struct bpf_pidns_info {
> +       __u64 dev;
> +       __u64 inum;

Conventionally this should be named "ino"; that is what ns_match calls
it, so let's stay consistent.

> +       __u32 pid;
> +       __u32 tgid;
> +};

So it seems like dev and inum are treated as input parameters, while
pid/tgid is output parameter, right? Wouldn't it be cleaner to have
dev and inum as explicit arguments into bpf_get_ns_current_pid_tgid()?
What's also not great, is that on failure you'll memset this entire
struct to zero, and user will lose its dev/inum. So in practice you'll
be keeping dev/inum somewhere else, then constructing and filling in
this bpf_pidns_info struct every time you need to invoke
bpf_get_ns_current_pid_tgid.

Maybe it was discussed already, but IMO feels cleaner to have only
pid/tgid in bpf_pidns_info and pass dev/inum as direct arguments.
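
To make the difference concrete, here is a small userspace model of the two
shapes. It is only a sketch: the function names, the output-only struct and
the fixed CUR_* values are hypothetical stand-ins, not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Layout from the v13 patch: inputs and outputs share one struct. */
struct pidns_info_v13 {
	uint64_t dev;
	uint64_t inum;
	uint32_t pid;
	uint32_t tgid;
};

/* Hypothetical output-only layout, with dev/ino passed separately. */
struct pidns_info_out {
	uint32_t pid;
	uint32_t tgid;
};

/* Fixed values standing in for the current task's namespace and ids. */
enum { CUR_DEV = 4, CUR_INO = 42, CUR_PID = 7, CUR_TGID = 7 };

/* v13 shape: on mismatch the whole struct is zeroed, inputs included. */
static int get_ids_v13(struct pidns_info_v13 *info)
{
	assert(info != NULL);
	if (info->dev != CUR_DEV || info->inum != CUR_INO) {
		memset(info, 0, sizeof(*info));
		return -EINVAL;
	}
	info->pid = CUR_PID;
	info->tgid = CUR_TGID;
	return 0;
}

/* Split shape: only the output buffer is ever written. */
static int get_ids_split(uint64_t dev, uint64_t ino,
			 struct pidns_info_out *out)
{
	assert(out != NULL);
	if (dev != CUR_DEV || ino != CUR_INO) {
		memset(out, 0, sizeof(*out));
		return -EINVAL;
	}
	out->pid = CUR_PID;
	out->tgid = CUR_TGID;
	return 0;
}
```

After a failed get_ids_v13() call the caller's dev and inum are gone and have
to be filled in again, while with get_ids_split() they stay in the caller's
own variables and the output struct can be passed in uninitialized.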

>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 66088a9e9b9e..b2fd5358f472 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
>  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
>  const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
>  const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
>
>  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
>  {
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 5e28718928ca..78a1ce7726aa 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -11,6 +11,8 @@
>  #include <linux/uidgid.h>
>  #include <linux/filter.h>
>  #include <linux/ctype.h>
> +#include <linux/pid_namespace.h>
> +#include <linux/proc_ns.h>
>
>  #include "../../lib/kstrtox.h"
>
> @@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
>         .arg4_type      = ARG_PTR_TO_LONG,
>  };
>  #endif
> +
> +BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
> +       size)
> +{
> +       struct task_struct *task = current;
> +       struct pid_namespace *pidns;
> +       int err = -EINVAL;
> +
> +       if (unlikely(size != sizeof(struct bpf_pidns_info)))
> +               goto clear;
> +
> +       if ((u64)(dev_t)nsdata->dev != nsdata->dev)

this seems unlikely() as well :)

> +               goto clear;
> +
> +       if (unlikely(!task))
> +               goto clear;
> +
> +       pidns = task_active_pid_ns(task);
> +       if (unlikely(!pidns)) {
> +               err = -ENOENT;
> +               goto clear;
> +       }
> +
> +       if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
> +               goto clear;
> +
> +       nsdata->pid = task_pid_nr_ns(task, pidns);
> +       nsdata->tgid = task_tgid_nr_ns(task, pidns);
> +       return 0;
> +clear:
> +       memset((void *)nsdata, 0, (size_t) size);
> +       return err;
> +}
> +
> +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
> +       .func           = bpf_get_ns_current_pid_tgid,
> +       .gpl_only       = false,
> +       .ret_type       = RET_INTEGER,
> +       .arg1_type      = ARG_PTR_TO_UNINIT_MEM,

So this is a lie, you do expect part of that struct to be initialized.
One more reason to just split off dev/inum(ino?).


> +       .arg2_type      = ARG_CONST_SIZE,
> +};
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 44bd08f2443b..32331a1dcb6d 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>  #endif
>         case BPF_FUNC_send_signal:
>                 return &bpf_send_signal_proto;
> +       case BPF_FUNC_get_ns_current_pid_tgid:
> +               return &bpf_get_ns_current_pid_tgid_proto;
>         default:
>                 return NULL;
>         }
> --
> 2.20.1
>


* Re: [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper.
  2019-10-09 15:26 ` [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
@ 2019-10-09 16:26   ` Andrii Nakryiko
  2019-10-09 16:44     ` Carlos Antonio Neira Bustos
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2019-10-09 16:26 UTC (permalink / raw)
  To: Carlos Neira
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 9, 2019 at 8:29 AM Carlos Neira <cneirabustos@gmail.com> wrote:
>
> Self tests added for new helper
>
> Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> ---
>  tools/testing/selftests/bpf/bpf_helpers.h     |  4 +
>  .../bpf/prog_tests/get_ns_current_pid_tgid.c  | 85 +++++++++++++++++++
>  .../bpf/progs/get_ns_current_pid_tgid_kern.c  | 53 ++++++++++++
>  3 files changed, 142 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
>  create mode 100644 tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
>
> diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
> index 54a50699bbfd..16261b23e011 100644
> --- a/tools/testing/selftests/bpf/bpf_helpers.h
> +++ b/tools/testing/selftests/bpf/bpf_helpers.h
> @@ -233,6 +233,10 @@ static int (*bpf_send_signal)(unsigned sig) = (void *)BPF_FUNC_send_signal;
>  static long long (*bpf_tcp_gen_syncookie)(struct bpf_sock *sk, void *ip,
>                                           int ip_len, void *tcp, int tcp_len) =
>         (void *) BPF_FUNC_tcp_gen_syncookie;
> +static unsigned long long (*bpf_get_ns_current_pid_tgid)(struct bpf_pidns_info *nsinfo,
> +               unsigned int buf_size) =
> +       (void *) BPF_FUNC_get_ns_current_pid_tgid;
> +

This is obsolete as of two days ago :) We now generate this
automatically from the bpf.h's documentation (which is currently
broken for your helper, I replied on respective patch). So please pull
latest bpf-next and rebase.

>
>  /* llvm builtin functions that eBPF C program may use to
>   * emit BPF_LD_ABS and BPF_LD_IND instructions
> diff --git a/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
> new file mode 100644
> index 000000000000..a7bff0ef6677
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
> +#include <test_progs.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> +#include <sys/syscall.h>
> +
> +void test_get_ns_current_pid_tgid(void)
> +{
> +       const char *probe_name = "syscalls/sys_enter_nanosleep";
> +       const char *file = "get_ns_current_pid_tgid_kern.o";
> +       int ns_data_map_fd, duration = 0;
> +       struct perf_event_attr attr = {};
> +       int err, efd, prog_fd, pmu_fd;
> +       __u64 ino, dev, id, nspid;
> +       struct bpf_object *obj;
> +       struct stat st;
> +       __u32 key = 0;
> +       char buf[256];
> +
> +       err = bpf_prog_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd);
> +       if (CHECK(err, "prog_load", "err %d errno %d\n", err, errno))
> +               return;
> +
> +       ns_data_map_fd = bpf_find_map(__func__, obj, "ns_data_map");
> +       if (CHECK_FAIL(ns_data_map_fd < 0))
> +               goto close_prog;
> +
> +       pid_t tid = syscall(SYS_gettid);
> +       pid_t pid = getpid();
> +
> +       id = (__u64) tid << 32 | pid;
> +       bpf_map_update_elem(ns_data_map_fd, &key, &id, 0);
> +
> +       if (stat("/proc/self/ns/pid", &st))

CHECK() or CHECK_FAIL() ?

> +               goto close_prog;
> +
> +       dev = st.st_dev;
> +       ino = st.st_ino;
> +       key = 1;
> +       bpf_map_update_elem(ns_data_map_fd, &key, &dev, 0);
> +       key = 2;
> +       bpf_map_update_elem(ns_data_map_fd, &key, &ino, 0);
> +
> +       snprintf(buf, sizeof(buf),
> +                "/sys/kernel/debug/tracing/events/%s/id", probe_name);
> +       efd = open(buf, O_RDONLY, 0);
> +       read(efd, buf, sizeof(buf));
> +       close(efd);
> +       attr.config = strtol(buf, NULL, 0);
> +       attr.type = PERF_TYPE_TRACEPOINT;
> +       attr.sample_type = PERF_SAMPLE_RAW;
> +       attr.sample_period = 1;
> +       attr.wakeup_events = 1;
> +
> +       pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
> +       if (CHECK_FAIL(pmu_fd < 0))
> +               goto cleanup;
> +
> +       err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
> +       if (CHECK_FAIL(err))
> +               goto cleanup;
> +
> +       err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
> +       if (CHECK_FAIL(err))
> +               goto cleanup;


All this attaching boilerplate is now obsolete as well, stick to
bpf_program__attach_tracepoint(). But even better, use RAW_TRACEPOINT.

> +
> +       /* trigger some syscalls */
> +       sleep(1);
> +       key = 3;
> +       err = bpf_map_lookup_elem(ns_data_map_fd, &key, &nspid);
> +       if (CHECK_FAIL(err))
> +               goto cleanup;
> +
> +       if (CHECK(id != nspid, "Compare user pid/tgid vs. bpf pid/tgid",
> +                 "Userspace pid/tgid %llu EBPF pid/tgid %llu\n", id, nspid))
> +               goto cleanup;
> +
> +cleanup:
> +       close(pmu_fd);
> +close_prog:
> +       bpf_object__close(obj);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
> new file mode 100644
> index 000000000000..3659aaa7c71f
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
> +
> +#include <linux/bpf.h>
> +#include "bpf_helpers.h"
> +
> +struct {
> +       __uint(type, BPF_MAP_TYPE_ARRAY);
> +       __uint(max_entries, 4);
> +       __type(key, __u32);
> +       __type(value, __u64);
> +} ns_data_map SEC(".maps");
> +
> +
> +SEC("tracepoint/syscalls/sys_enter_nanosleep")
> +int trace(void *ctx)
> +{
> +       __u64 *val, *inum, *dev, nspidtgid, *expected_pid;
> +       struct bpf_pidns_info nsdata;
> +       __u32 key = 1;
> +
> +       dev = bpf_map_lookup_elem(&ns_data_map, &key);
> +       if (!dev)
> +               return 0;
> +       key = 2;
> +       inum = bpf_map_lookup_elem(&ns_data_map, &key);
> +       if (!inum)
> +               return 0;
> +
> +       nsdata.dev = *dev;
> +       nsdata.inum = *inum;
> +
> +       if (bpf_get_ns_current_pid_tgid(&nsdata, sizeof(struct bpf_pidns_info)))
> +               return 0;
> +
> +       nspidtgid = (__u64)nsdata.tgid << 32 | nsdata.pid;
> +       key = 0;
> +       expected_pid = bpf_map_lookup_elem(&ns_data_map, &key);
> +
> +       if (!expected_pid || *expected_pid != nspidtgid)
> +               return 0;
> +
> +       key = 3;
> +       val = bpf_map_lookup_elem(&ns_data_map, &key);

Please use global data for this; it will make this BPF program much
shorter, cleaner, and more to the point. See the recent patch by
Daniel T. Lee, or some of the tests I added (e.g.,
progs/test_core_reloc_kernel.c).

> +
> +       if (val)
> +               *val = nspidtgid;
> +
> +       return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> +__u32 _version SEC("version") = 1;

You can drop the version now (as of recent changes); libbpf doesn't require it anymore.

> --
> 2.20.1
>


* Re: [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper.
  2019-10-09 16:26   ` Andrii Nakryiko
@ 2019-10-09 16:44     ` Carlos Antonio Neira Bustos
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos Antonio Neira Bustos @ 2019-10-09 16:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 09, 2019 at 09:26:32AM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 8:29 AM Carlos Neira <cneirabustos@gmail.com> wrote:
> >
> > Self tests added for new helper
> >
> > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > ---
> >  tools/testing/selftests/bpf/bpf_helpers.h     |  4 +
> >  .../bpf/prog_tests/get_ns_current_pid_tgid.c  | 85 +++++++++++++++++++
> >  .../bpf/progs/get_ns_current_pid_tgid_kern.c  | 53 ++++++++++++
> >  3 files changed, 142 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
> >
> > diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
> > index 54a50699bbfd..16261b23e011 100644
> > --- a/tools/testing/selftests/bpf/bpf_helpers.h
> > +++ b/tools/testing/selftests/bpf/bpf_helpers.h
> > @@ -233,6 +233,10 @@ static int (*bpf_send_signal)(unsigned sig) = (void *)BPF_FUNC_send_signal;
> >  static long long (*bpf_tcp_gen_syncookie)(struct bpf_sock *sk, void *ip,
> >                                           int ip_len, void *tcp, int tcp_len) =
> >         (void *) BPF_FUNC_tcp_gen_syncookie;
> > +static unsigned long long (*bpf_get_ns_current_pid_tgid)(struct bpf_pidns_info *nsinfo,
> > +               unsigned int buf_size) =
> > +       (void *) BPF_FUNC_get_ns_current_pid_tgid;
> > +
> 
> This is obsolete as of two days ago :) We now generate this
> automatically from the bpf.h's documentation (which is currently
> broken for your helper, I replied on respective patch). So please pull
> latest bpf-next and rebase.
> 
> >
> >  /* llvm builtin functions that eBPF C program may use to
> >   * emit BPF_LD_ABS and BPF_LD_IND instructions
> > diff --git a/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
> > new file mode 100644
> > index 000000000000..a7bff0ef6677
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/get_ns_current_pid_tgid.c
> > @@ -0,0 +1,85 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
> > +#include <test_progs.h>
> > +#include <sys/stat.h>
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <unistd.h>
> > +#include <sys/syscall.h>
> > +
> > +void test_get_ns_current_pid_tgid(void)
> > +{
> > +       const char *probe_name = "syscalls/sys_enter_nanosleep";
> > +       const char *file = "get_ns_current_pid_tgid_kern.o";
> > +       int ns_data_map_fd, duration = 0;
> > +       struct perf_event_attr attr = {};
> > +       int err, efd, prog_fd, pmu_fd;
> > +       __u64 ino, dev, id, nspid;
> > +       struct bpf_object *obj;
> > +       struct stat st;
> > +       __u32 key = 0;
> > +       char buf[256];
> > +
> > +       err = bpf_prog_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd);
> > +       if (CHECK(err, "prog_load", "err %d errno %d\n", err, errno))
> > +               return;
> > +
> > +       ns_data_map_fd = bpf_find_map(__func__, obj, "ns_data_map");
> > +       if (CHECK_FAIL(ns_data_map_fd < 0))
> > +               goto close_prog;
> > +
> > +       pid_t tid = syscall(SYS_gettid);
> > +       pid_t pid = getpid();
> > +
> > +       id = (__u64) tid << 32 | pid;
> > +       bpf_map_update_elem(ns_data_map_fd, &key, &id, 0);
> > +
> > +       if (stat("/proc/self/ns/pid", &st))
> 
> CHECK() or CHECK_FAIL() ?
> 
> > +               goto close_prog;
> > +
> > +       dev = st.st_dev;
> > +       ino = st.st_ino;
> > +       key = 1;
> > +       bpf_map_update_elem(ns_data_map_fd, &key, &dev, 0);
> > +       key = 2;
> > +       bpf_map_update_elem(ns_data_map_fd, &key, &ino, 0);
> > +
> > +       snprintf(buf, sizeof(buf),
> > +                "/sys/kernel/debug/tracing/events/%s/id", probe_name);
> > +       efd = open(buf, O_RDONLY, 0);
> > +       read(efd, buf, sizeof(buf));
> > +       close(efd);
> > +       attr.config = strtol(buf, NULL, 0);
> > +       attr.type = PERF_TYPE_TRACEPOINT;
> > +       attr.sample_type = PERF_SAMPLE_RAW;
> > +       attr.sample_period = 1;
> > +       attr.wakeup_events = 1;
> > +
> > +       pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
> > +       if (CHECK_FAIL(pmu_fd < 0))
> > +               goto cleanup;
> > +
> > +       err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
> > +       if (CHECK_FAIL(err))
> > +               goto cleanup;
> > +
> > +       err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
> > +       if (CHECK_FAIL(err))
> > +               goto cleanup;
> 
> 
> All this attaching boilerplate is now obsolete as well, stick to
> bpf_program__attach_tracepoint(). But even better, use RAW_TRACEPOINT.
> 
> > +
> > +       /* trigger some syscalls */
> > +       sleep(1);
> > +       key = 3;
> > +       err = bpf_map_lookup_elem(ns_data_map_fd, &key, &nspid);
> > +       if (CHECK_FAIL(err))
> > +               goto cleanup;
> > +
> > +       if (CHECK(id != nspid, "Compare user pid/tgid vs. bpf pid/tgid",
> > +                 "Userspace pid/tgid %llu EBPF pid/tgid %llu\n", id, nspid))
> > +               goto cleanup;
> > +
> > +cleanup:
> > +       close(pmu_fd);
> > +close_prog:
> > +       bpf_object__close(obj);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
> > new file mode 100644
> > index 000000000000..3659aaa7c71f
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/get_ns_current_pid_tgid_kern.c
> > @@ -0,0 +1,53 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2019 Carlos Neira cneirabustos@gmail.com */
> > +
> > +#include <linux/bpf.h>
> > +#include "bpf_helpers.h"
> > +
> > +struct {
> > +       __uint(type, BPF_MAP_TYPE_ARRAY);
> > +       __uint(max_entries, 4);
> > +       __type(key, __u32);
> > +       __type(value, __u64);
> > +} ns_data_map SEC(".maps");
> > +
> > +
> > +SEC("tracepoint/syscalls/sys_enter_nanosleep")
> > +int trace(void *ctx)
> > +{
> > +       __u64 *val, *inum, *dev, nspidtgid, *expected_pid;
> > +       struct bpf_pidns_info nsdata;
> > +       __u32 key = 1;
> > +
> > +       dev = bpf_map_lookup_elem(&ns_data_map, &key);
> > +       if (!dev)
> > +               return 0;
> > +       key = 2;
> > +       inum = bpf_map_lookup_elem(&ns_data_map, &key);
> > +       if (!inum)
> > +               return 0;
> > +
> > +       nsdata.dev = *dev;
> > +       nsdata.inum = *inum;
> > +
> > +       if (bpf_get_ns_current_pid_tgid(&nsdata, sizeof(struct bpf_pidns_info)))
> > +               return 0;
> > +
> > +       nspidtgid = (__u64)nsdata.tgid << 32 | nsdata.pid;
> > +       key = 0;
> > +       expected_pid = bpf_map_lookup_elem(&ns_data_map, &key);
> > +
> > +       if (!expected_pid || *expected_pid != nspidtgid)
> > +               return 0;
> > +
> > +       key = 3;
> > +       val = bpf_map_lookup_elem(&ns_data_map, &key);
> 
> Please, use global data for this, will make this BPF program much
> shorter, cleaner, and up to the point. See recent patch by Daniel T.
> Lee, or some of the tests I added (e.g.,
> progs/test_core_reloc_kernel.c)
> 
> > +
> > +       if (val)
> > +               *val = nspidtgid;
> > +
> > +       return 0;
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";
> > +__u32 _version SEC("version") = 1;
> 
> You can drop version now (recent stuff), it's not required by libbpf anymore.
> 
> > --
> > 2.20.1
> >
Thanks for checking this out! I'll rebase onto current bpf-next and
check out progs/test_core_reloc_kernel.c.

Bests 


* Re: [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
  2019-10-09 16:14   ` Andrii Nakryiko
@ 2019-10-09 17:45     ` Carlos Antonio Neira Bustos
  2019-10-09 19:50       ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos Antonio Neira Bustos @ 2019-10-09 17:45 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 09, 2019 at 09:14:42AM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 8:27 AM Carlos Neira <cneirabustos@gmail.com> wrote:
> >
> > New bpf helper bpf_get_ns_current_pid_tgid,
> > This helper will return pid and tgid from current task
> > which namespace matches dev_t and inode number provided,
> > this will allows us to instrument a process inside a container.
> >
> > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > ---
> >  include/linux/bpf.h      |  1 +
> >  include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
> >  kernel/bpf/core.c        |  1 +
> >  kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
> >  kernel/trace/bpf_trace.c |  2 ++
> >  5 files changed, 68 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 5b9d22338606..231001475504 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
> >  extern const struct bpf_func_proto bpf_strtol_proto;
> >  extern const struct bpf_func_proto bpf_strtoul_proto;
> >  extern const struct bpf_func_proto bpf_tcp_sock_proto;
> > +extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
> >
> >  /* Shared helpers among cBPF and eBPF. */
> >  void bpf_user_rnd_init_once(void);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 77c6be96d676..6ad3f2abf00d 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2750,6 +2750,19 @@ union bpf_attr {
> >   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
> >   *
> >   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> > + *
> > + * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
> 
> Should be:
> 
> struct bpf_pidns_info *nsdata
> 
> > + *     Return
> > + *             0 on success, values for pid and tgid from nsinfo will be as seen
> > + *             from the namespace that matches dev and inum from nsinfo.
> 
> I think its cleaner to have a Description section, explaining that it
> will return pid/tgid in bpf_pidns_info, and then describe exit codes
> in Return section.
> 
> > + *
> > + *             On failure, the returned value is one of the following:
> > + *
> > + *             **-EINVAL** if dev and inum supplied don't match dev_t and inode number
> > + *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
> > + *
> > + *             **-ENOENT** if /proc/self/ns does not exists.
> > + *
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -2862,7 +2875,8 @@ union bpf_attr {
> >         FN(sk_storage_get),             \
> >         FN(sk_storage_delete),          \
> >         FN(send_signal),                \
> > -       FN(tcp_gen_syncookie),
> > +       FN(tcp_gen_syncookie),          \
> > +       FN(get_ns_current_pid_tgid),
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >   * function eBPF program intends to call
> > @@ -3613,4 +3627,10 @@ struct bpf_sockopt {
> >         __s32   retval;
> >  };
> >
> > +struct bpf_pidns_info {
> > +       __u64 dev;
> > +       __u64 inum;
> 
> seems like conventionally this should be named "ino", this is what
> ns_match calls it, so let's stay consistent.
> 
> > +       __u32 pid;
> > +       __u32 tgid;
> > +};
> 
> So it seems like dev and inum are treated as input parameters, while
> pid/tgid is output parameter, right? Wouldn't it be cleaner to have
> dev and inum as explicit arguments into bpf_get_ns_current_pid_tgid()?
> What's also not great, is that on failure you'll memset this entire
> struct to zero, and user will lose its dev/inum. So in practice you'll
> be keeping dev/inum somewhere else, then constructing and filling in
> this bpf_pidns_info struct every time you need to invoke
> bpf_get_ns_current_pid_tgid.
> 
> Maybe it was discussed already, but IMO feels cleaner to have only
> pid/tgid in bpf_pidns_info and pass dev/inum as direct arguments.
> 
> >  #endif /* _UAPI__LINUX_BPF_H__ */
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 66088a9e9b9e..b2fd5358f472 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> >  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> >  const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> >  const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
> >
> >  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> >  {
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 5e28718928ca..78a1ce7726aa 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -11,6 +11,8 @@
> >  #include <linux/uidgid.h>
> >  #include <linux/filter.h>
> >  #include <linux/ctype.h>
> > +#include <linux/pid_namespace.h>
> > +#include <linux/proc_ns.h>
> >
> >  #include "../../lib/kstrtox.h"
> >
> > @@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
> >         .arg4_type      = ARG_PTR_TO_LONG,
> >  };
> >  #endif
> > +
> > +BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
> > +       size)
> > +{
> > +       struct task_struct *task = current;
> > +       struct pid_namespace *pidns;
> > +       int err = -EINVAL;
> > +
> > +       if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > +               goto clear;
> > +
> > +       if ((u64)(dev_t)nsdata->dev != nsdata->dev)
> 
> this seems unlikely() as well :)
> 
> > +               goto clear;
> > +
> > +       if (unlikely(!task))
> > +               goto clear;
> > +
> > +       pidns = task_active_pid_ns(task);
> > +       if (unlikely(!pidns)) {
> > +               err = -ENOENT;
> > +               goto clear;
> > +       }
> > +
> > +       if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
> > +               goto clear;
> > +
> > +       nsdata->pid = task_pid_nr_ns(task, pidns);
> > +       nsdata->tgid = task_tgid_nr_ns(task, pidns);
> > +       return 0;
> > +clear:
> > +       memset((void *)nsdata, 0, (size_t) size);
> > +       return err;
> > +}
> > +
> > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
> > +       .func           = bpf_get_ns_current_pid_tgid,
> > +       .gpl_only       = false,
> > +       .ret_type       = RET_INTEGER,
> > +       .arg1_type      = ARG_PTR_TO_UNINIT_MEM,
> 
> So this is a lie, you do expect part of that struct to be initialized.
> One more reason to just split off dev/inum(ino?).
> 
> 
> > +       .arg2_type      = ARG_CONST_SIZE,
> > +};
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 44bd08f2443b..32331a1dcb6d 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >  #endif
> >         case BPF_FUNC_send_signal:
> >                 return &bpf_send_signal_proto;
> > +       case BPF_FUNC_get_ns_current_pid_tgid:
> > +               return &bpf_get_ns_current_pid_tgid_proto;
> >         default:
> >                 return NULL;
> >         }
> > --
> > 2.20.1
> >
Thanks for reviewing this; I'll make the changes you suggest.
I'm not sure about removing dev and ino from struct bpf_pidns_info. I
think it is useful to keep them there so we know which namespace the
pid/tgid belong to. Maybe in the future we could filter bpf_pidns_info
structs by dev/ino; just an idea.

Bests


* Re: [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
  2019-10-09 17:45     ` Carlos Antonio Neira Bustos
@ 2019-10-09 19:50       ` Andrii Nakryiko
  2019-10-09 20:30         ` Carlos Antonio Neira Bustos
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2019-10-09 19:50 UTC (permalink / raw)
  To: Carlos Antonio Neira Bustos
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 9, 2019 at 10:45 AM Carlos Antonio Neira Bustos
<cneirabustos@gmail.com> wrote:
>
> On Wed, Oct 09, 2019 at 09:14:42AM -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 9, 2019 at 8:27 AM Carlos Neira <cneirabustos@gmail.com> wrote:
> > >
> > > New bpf helper bpf_get_ns_current_pid_tgid,
> > > This helper will return pid and tgid from current task
> > > which namespace matches dev_t and inode number provided,
> > > this will allows us to instrument a process inside a container.
> > >
> > > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > > ---
> > >  include/linux/bpf.h      |  1 +
> > >  include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
> > >  kernel/bpf/core.c        |  1 +
> > >  kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
> > >  kernel/trace/bpf_trace.c |  2 ++
> > >  5 files changed, 68 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 5b9d22338606..231001475504 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
> > >  extern const struct bpf_func_proto bpf_strtol_proto;
> > >  extern const struct bpf_func_proto bpf_strtoul_proto;
> > >  extern const struct bpf_func_proto bpf_tcp_sock_proto;
> > > +extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
> > >
> > >  /* Shared helpers among cBPF and eBPF. */
> > >  void bpf_user_rnd_init_once(void);
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index 77c6be96d676..6ad3f2abf00d 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -2750,6 +2750,19 @@ union bpf_attr {
> > >   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
> > >   *
> > >   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> > > + *
> > > + * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
> >
> > Should be:
> >
> > struct bpf_pidns_info *nsdata
> >
> > > + *     Return
> > > + *             0 on success, values for pid and tgid from nsinfo will be as seen
> > > + *             from the namespace that matches dev and inum from nsinfo.
> >
> > I think its cleaner to have a Description section, explaining that it
> > will return pid/tgid in bpf_pidns_info, and then describe exit codes
> > in Return section.
> >
> > > + *
> > > + *             On failure, the returned value is one of the following:
> > > + *
> > > + *             **-EINVAL** if dev and inum supplied don't match dev_t and inode number
> > > + *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
> > > + *
> > > + *             **-ENOENT** if /proc/self/ns does not exists.
> > > + *
> > >   */
> > >  #define __BPF_FUNC_MAPPER(FN)          \
> > >         FN(unspec),                     \
> > > @@ -2862,7 +2875,8 @@ union bpf_attr {
> > >         FN(sk_storage_get),             \
> > >         FN(sk_storage_delete),          \
> > >         FN(send_signal),                \
> > > -       FN(tcp_gen_syncookie),
> > > +       FN(tcp_gen_syncookie),          \
> > > +       FN(get_ns_current_pid_tgid),
> > >
> > >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > >   * function eBPF program intends to call
> > > @@ -3613,4 +3627,10 @@ struct bpf_sockopt {
> > >         __s32   retval;
> > >  };
> > >
> > > +struct bpf_pidns_info {
> > > +       __u64 dev;
> > > +       __u64 inum;
> >
> > seems like conventionally this should be named "ino", this is what
> > ns_match calls it, so let's stay consistent.
> >
> > > +       __u32 pid;
> > > +       __u32 tgid;
> > > +};
> >
> > So it seems like dev and inum are treated as input parameters, while
> > pid/tgid is output parameter, right? Wouldn't it be cleaner to have
> > dev and inum as explicit arguments into bpf_get_ns_current_pid_tgid()?
> > What's also not great, is that on failure you'll memset this entire
> > struct to zero, and user will lose its dev/inum. So in practice you'll
> > be keeping dev/inum somewhere else, then constructing and filling in
> > this bpf_pidns_info struct every time you need to invoke
> > bpf_get_ns_current_pid_tgid.
> >
> > Maybe it was discussed already, but IMO feels cleaner to have only
> > pid/tgid in bpf_pidns_info and pass dev/inum as direct arguments.
> >
> > >  #endif /* _UAPI__LINUX_BPF_H__ */
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index 66088a9e9b9e..b2fd5358f472 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> > >  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> > >  const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> > >  const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> > > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
> > >
> > >  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> > >  {
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 5e28718928ca..78a1ce7726aa 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -11,6 +11,8 @@
> > >  #include <linux/uidgid.h>
> > >  #include <linux/filter.h>
> > >  #include <linux/ctype.h>
> > > +#include <linux/pid_namespace.h>
> > > +#include <linux/proc_ns.h>
> > >
> > >  #include "../../lib/kstrtox.h"
> > >
> > > @@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
> > >         .arg4_type      = ARG_PTR_TO_LONG,
> > >  };
> > >  #endif
> > > +
> > > +BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
> > > +       size)
> > > +{
> > > +       struct task_struct *task = current;
> > > +       struct pid_namespace *pidns;
> > > +       int err = -EINVAL;
> > > +
> > > +       if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > > +               goto clear;
> > > +
> > > +       if ((u64)(dev_t)nsdata->dev != nsdata->dev)
> >
> > this seems unlikely() as well :)
> >
> > > +               goto clear;
> > > +
> > > +       if (unlikely(!task))
> > > +               goto clear;
> > > +
> > > +       pidns = task_active_pid_ns(task);
> > > +       if (unlikely(!pidns)) {
> > > +               err = -ENOENT;
> > > +               goto clear;
> > > +       }
> > > +
> > > +       if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
> > > +               goto clear;
> > > +
> > > +       nsdata->pid = task_pid_nr_ns(task, pidns);
> > > +       nsdata->tgid = task_tgid_nr_ns(task, pidns);
> > > +       return 0;
> > > +clear:
> > > +       memset((void *)nsdata, 0, (size_t) size);
> > > +       return err;
> > > +}
> > > +
> > > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
> > > +       .func           = bpf_get_ns_current_pid_tgid,
> > > +       .gpl_only       = false,
> > > +       .ret_type       = RET_INTEGER,
> > > +       .arg1_type      = ARG_PTR_TO_UNINIT_MEM,
> >
> > So this is a lie, you do expect part of that struct to be initialized.
> > One more reason to just split off dev/inum(ino?).
> >
> >
> > > +       .arg2_type      = ARG_CONST_SIZE,
> > > +};
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 44bd08f2443b..32331a1dcb6d 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > >  #endif
> > >         case BPF_FUNC_send_signal:
> > >                 return &bpf_send_signal_proto;
> > > +       case BPF_FUNC_get_ns_current_pid_tgid:
> > > +               return &bpf_get_ns_current_pid_tgid_proto;
> > >         default:
> > >                 return NULL;
> > >         }
> > > --
> > > 2.20.1
> > >
> Thanks for reviewing this, I'll make the changes you suggest.
> I'm not sure about removing dev and ino from struct bpf_pidns_info, I
> think is useful to know to which ns pid/tgid belong to using dev and ino
> to figure it out. Maybe in the future we could filter bpf_pidns_info
> structs by dev/ino, just an idea.

I'm not following. dev/ino are specified by the caller of this helper,
right? So the caller already knows those values. With the current
setup, though, the behavior is weird: the struct's dev/ino are
preserved on success but zeroed out on failure, even though the helper
itself has nothing to do with returning dev/ino. It also plays badly
with ARG_PTR_TO_UNINIT_MEM: that argument type says the memory may be
uninitialized, yet the helper expects it to be at least partially
initialized. I see only downsides, to be honest.
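
To make the suggestion concrete, here is a rough userspace model of the
split interface: dev and ino are passed as plain arguments, and the
struct carries only the pid/tgid output. This is a sketch, not kernel
code. struct mock_task, struct pidns_info, and this
get_ns_current_pid_tgid() are hypothetical stand-ins, the namespace
match is reduced to a field comparison, and dev_t is assumed to be 32
bits wide as it is in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Output-only result: no dev/ino inside. */
struct pidns_info {
	uint32_t pid;
	uint32_t tgid;
};

/* Hypothetical stand-in for the current task and its pid namespace. */
struct mock_task {
	uint64_t ns_dev;	/* device of the namespace file */
	uint64_t ns_ino;	/* inode of the namespace file */
	uint32_t pid;		/* thread id seen inside the namespace */
	uint32_t tgid;		/* process id seen inside the namespace */
};

/*
 * Model of the suggested signature. out is written on every path
 * (filled on success, zeroed on failure), so treating it as
 * uninitialized memory is accurate. The caller's dev/ino are never
 * modified.
 */
static int get_ns_current_pid_tgid(const struct mock_task *task,
				   uint64_t dev, uint64_t ino,
				   struct pidns_info *out, uint32_t size)
{
	assert(out != NULL);

	if (size != sizeof(*out))
		goto clear;
	/* Reject dev values that would lose high bits as a 32-bit dev_t. */
	if ((uint64_t)(uint32_t)dev != dev)
		goto clear;
	if (!task || task->ns_dev != dev || task->ns_ino != ino)
		goto clear;

	out->pid = task->pid;
	out->tgid = task->tgid;
	return 0;
clear:
	/* Caller guarantees size writable bytes (the verifier's job in BPF). */
	memset(out, 0, size);
	return -EINVAL;
}
```

With this shape the caller keeps dev/ino in its own variables across
calls, and a failed call only clears the output struct.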

>
> Bests


* Re: [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid
  2019-10-09 19:50       ` Andrii Nakryiko
@ 2019-10-09 20:30         ` Carlos Antonio Neira Bustos
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos Antonio Neira Bustos @ 2019-10-09 20:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, Yonghong Song, ebiederm, Jesper Dangaard Brouer, bpf

On Wed, Oct 09, 2019 at 12:50:10PM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2019 at 10:45 AM Carlos Antonio Neira Bustos
> <cneirabustos@gmail.com> wrote:
> >
> > On Wed, Oct 09, 2019 at 09:14:42AM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 9, 2019 at 8:27 AM Carlos Neira <cneirabustos@gmail.com> wrote:
> > > >
> > > > New bpf helper bpf_get_ns_current_pid_tgid,
> > > > This helper will return pid and tgid from current task
> > > > which namespace matches dev_t and inode number provided,
> > > > this will allows us to instrument a process inside a container.
> > > >
> > > > Signed-off-by: Carlos Neira <cneirabustos@gmail.com>
> > > > ---
> > > >  include/linux/bpf.h      |  1 +
> > > >  include/uapi/linux/bpf.h | 22 +++++++++++++++++++-
> > > >  kernel/bpf/core.c        |  1 +
> > > >  kernel/bpf/helpers.c     | 43 ++++++++++++++++++++++++++++++++++++++++
> > > >  kernel/trace/bpf_trace.c |  2 ++
> > > >  5 files changed, 68 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index 5b9d22338606..231001475504 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -1055,6 +1055,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
> > > >  extern const struct bpf_func_proto bpf_strtol_proto;
> > > >  extern const struct bpf_func_proto bpf_strtoul_proto;
> > > >  extern const struct bpf_func_proto bpf_tcp_sock_proto;
> > > > +extern const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto;
> > > >
> > > >  /* Shared helpers among cBPF and eBPF. */
> > > >  void bpf_user_rnd_init_once(void);
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index 77c6be96d676..6ad3f2abf00d 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -2750,6 +2750,19 @@ union bpf_attr {
> > > >   *             **-EOPNOTSUPP** kernel configuration does not enable SYN cookies
> > > >   *
> > > >   *             **-EPROTONOSUPPORT** IP packet version is not 4 or 6
> > > > + *
> > > > + * u64 bpf_get_ns_current_pid_tgid(struct *bpf_pidns_info, u32 size)
> > >
> > > Should be:
> > >
> > > struct bpf_pidns_info *nsdata
> > >
> > > > + *     Return
> > > > + *             0 on success, values for pid and tgid from nsinfo will be as seen
> > > > + *             from the namespace that matches dev and inum from nsinfo.
> > >
> > > I think its cleaner to have a Description section, explaining that it
> > > will return pid/tgid in bpf_pidns_info, and then describe exit codes
> > > in Return section.
> > >
> > > > + *
> > > > + *             On failure, the returned value is one of the following:
> > > > + *
> > > > + *             **-EINVAL** if dev and inum supplied don't match dev_t and inode number
> > > > + *              with nsfs of current task, or if dev conversion to dev_t lost high bits.
> > > > + *
> > > > + *             **-ENOENT** if /proc/self/ns does not exists.
> > > > + *
> > > >   */
> > > >  #define __BPF_FUNC_MAPPER(FN)          \
> > > >         FN(unspec),                     \
> > > > @@ -2862,7 +2875,8 @@ union bpf_attr {
> > > >         FN(sk_storage_get),             \
> > > >         FN(sk_storage_delete),          \
> > > >         FN(send_signal),                \
> > > > -       FN(tcp_gen_syncookie),
> > > > +       FN(tcp_gen_syncookie),          \
> > > > +       FN(get_ns_current_pid_tgid),
> > > >
> > > >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > > >   * function eBPF program intends to call
> > > > @@ -3613,4 +3627,10 @@ struct bpf_sockopt {
> > > >         __s32   retval;
> > > >  };
> > > >
> > > > +struct bpf_pidns_info {
> > > > +       __u64 dev;
> > > > +       __u64 inum;
> > >
> > > seems like conventionally this should be named "ino", this is what
> > > ns_match calls it, so let's stay consistent.
> > >
> > > > +       __u32 pid;
> > > > +       __u32 tgid;
> > > > +};
> > >
> > > So it seems like dev and inum are treated as input parameters, while
> > > pid/tgid is output parameter, right? Wouldn't it be cleaner to have
> > > dev and inum as explicit arguments into bpf_get_ns_current_pid_tgid()?
> > > What's also not great, is that on failure you'll memset this entire
> > > struct to zero, and user will lose its dev/inum. So in practice you'll
> > > be keeping dev/inum somewhere else, then constructing and filling in
> > > this bpf_pidns_info struct every time you need to invoke
> > > bpf_get_ns_current_pid_tgid.
> > >
> > > Maybe it was discussed already, but IMO feels cleaner to have only
> > > pid/tgid in bpf_pidns_info and pass dev/inum as direct arguments.
> > >
> > > >  #endif /* _UAPI__LINUX_BPF_H__ */
> > > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > > index 66088a9e9b9e..b2fd5358f472 100644
> > > > --- a/kernel/bpf/core.c
> > > > +++ b/kernel/bpf/core.c
> > > > @@ -2042,6 +2042,7 @@ const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak;
> > > >  const struct bpf_func_proto bpf_get_current_comm_proto __weak;
> > > >  const struct bpf_func_proto bpf_get_current_cgroup_id_proto __weak;
> > > >  const struct bpf_func_proto bpf_get_local_storage_proto __weak;
> > > > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
> > > >
> > > >  const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
> > > >  {
> > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > > index 5e28718928ca..78a1ce7726aa 100644
> > > > --- a/kernel/bpf/helpers.c
> > > > +++ b/kernel/bpf/helpers.c
> > > > @@ -11,6 +11,8 @@
> > > >  #include <linux/uidgid.h>
> > > >  #include <linux/filter.h>
> > > >  #include <linux/ctype.h>
> > > > +#include <linux/pid_namespace.h>
> > > > +#include <linux/proc_ns.h>
> > > >
> > > >  #include "../../lib/kstrtox.h"
> > > >
> > > > @@ -487,3 +489,44 @@ const struct bpf_func_proto bpf_strtoul_proto = {
> > > >         .arg4_type      = ARG_PTR_TO_LONG,
> > > >  };
> > > >  #endif
> > > > +
> > > > +BPF_CALL_2(bpf_get_ns_current_pid_tgid, struct bpf_pidns_info *, nsdata, u32,
> > > > +       size)
> > > > +{
> > > > +       struct task_struct *task = current;
> > > > +       struct pid_namespace *pidns;
> > > > +       int err = -EINVAL;
> > > > +
> > > > +       if (unlikely(size != sizeof(struct bpf_pidns_info)))
> > > > +               goto clear;
> > > > +
> > > > +       if ((u64)(dev_t)nsdata->dev != nsdata->dev)
> > >
> > > this seems unlikely() as well :)
> > >
> > > > +               goto clear;
> > > > +
> > > > +       if (unlikely(!task))
> > > > +               goto clear;
> > > > +
> > > > +       pidns = task_active_pid_ns(task);
> > > > +       if (unlikely(!pidns)) {
> > > > +               err = -ENOENT;
> > > > +               goto clear;
> > > > +       }
> > > > +
> > > > +       if (!ns_match(&pidns->ns, (dev_t)nsdata->dev, nsdata->inum))
> > > > +               goto clear;
> > > > +
> > > > +       nsdata->pid = task_pid_nr_ns(task, pidns);
> > > > +       nsdata->tgid = task_tgid_nr_ns(task, pidns);
> > > > +       return 0;
> > > > +clear:
> > > > +       memset((void *)nsdata, 0, (size_t) size);
> > > > +       return err;
> > > > +}
> > > > +
> > > > +const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto = {
> > > > +       .func           = bpf_get_ns_current_pid_tgid,
> > > > +       .gpl_only       = false,
> > > > +       .ret_type       = RET_INTEGER,
> > > > +       .arg1_type      = ARG_PTR_TO_UNINIT_MEM,
> > >
> > > So this is a lie, you do expect part of that struct to be initialized.
> > > One more reason to just split off dev/inum(ino?).
> > >
> > >
> > > > +       .arg2_type      = ARG_CONST_SIZE,
> > > > +};
> > > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > > index 44bd08f2443b..32331a1dcb6d 100644
> > > > --- a/kernel/trace/bpf_trace.c
> > > > +++ b/kernel/trace/bpf_trace.c
> > > > @@ -735,6 +735,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > > >  #endif
> > > >         case BPF_FUNC_send_signal:
> > > >                 return &bpf_send_signal_proto;
> > > > +       case BPF_FUNC_get_ns_current_pid_tgid:
> > > > +               return &bpf_get_ns_current_pid_tgid_proto;
> > > >         default:
> > > >                 return NULL;
> > > >         }
> > > > --
> > > > 2.20.1
> > > >
> > Thanks for reviewing this, I'll make the changes you suggest.
> > I'm not sure about removing dev and ino from struct bpf_pidns_info; I
> > think it is useful to know which ns the pid/tgid belong to, using dev
> > and ino to figure it out. Maybe in the future we could filter
> > bpf_pidns_info structs by dev/ino, just an idea.
> 
> I'm not following. dev/ino are specified by the caller to this helper,
> right? So the caller already knows those values. With the current setup,
> though, the behavior is weird: this struct's dev/ino is preserved on
> success, but zeroed out on failure, even though the helper itself has
> nothing to do with returning dev/ino. It also plays badly with
> ARG_PTR_TO_UNINIT_MEM, because that memory is expected to be at least
> partially initialized. I see only downsides, to be honest.
> 
> >
> > Bests
Andrii,

Oh, you are right: dev/ino are specified by the caller of this helper,
and, as you point out, the helper does not return them. Based on that,
yes, this should be changed. Now I see the point; I'll change the helper
to take dev/ino as arguments and return only pid/tgid.
Thanks again!

Bests
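
A minimal userspace sketch of the interface shape agreed on above: dev/ino
passed as explicit arguments, and an output-only struct carrying pid/tgid.
This is not the kernel implementation. The names pidns_info_sketch,
fake_pidns, get_ns_pid_tgid_sketch and usage_sketch are hypothetical,
uint32_t stands in for dev_t, and the dev/ino/pid values are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical output-only struct: nothing in it is an input. */
struct pidns_info_sketch {
	uint32_t pid;
	uint32_t tgid;
};

/* Stand-in for the current task's pid namespace (not a kernel type). */
struct fake_pidns {
	uint32_t dev;	/* plays the role of dev_t */
	uint64_t ino;
	uint32_t pid;
	uint32_t tgid;
};

/* Mock helper: dev/ino are inputs, *out is written on every path. */
static int get_ns_pid_tgid_sketch(const struct fake_pidns *cur,
				  uint64_t dev, uint64_t ino,
				  struct pidns_info_sketch *out, uint32_t size)
{
	int err = -EINVAL;

	if (size != sizeof(*out))
		goto clear;

	/* Reject dev values that lose high bits when narrowed to 32 bits. */
	if ((uint64_t)(uint32_t)dev != dev)
		goto clear;

	if (!cur) {
		err = -ENOENT;
		goto clear;
	}

	/* Mirrors the ns_match() idea: both dev and inode must agree. */
	if (cur->dev != (uint32_t)dev || cur->ino != ino)
		goto clear;

	out->pid = cur->pid;
	out->tgid = cur->tgid;
	return 0;
clear:
	/* Only the output struct is zeroed; the caller's dev/ino survive. */
	memset(out, 0, sizeof(*out));
	return err;
}

/* Usage: the caller keeps dev/ino in its own variables across calls. */
int usage_sketch(void)
{
	struct fake_pidns ns = { .dev = 4, .ino = 4026531836ULL,
				 .pid = 42, .tgid = 42 };
	const uint64_t dev = 4, ino = 4026531836ULL;
	struct pidns_info_sketch info;

	assert(get_ns_pid_tgid_sketch(&ns, dev, ino, &info, sizeof(info)) == 0);
	assert(info.pid == 42 && info.tgid == 42);

	/* A mismatched inode fails and clears only the output. */
	assert(get_ns_pid_tgid_sketch(&ns, dev, ino + 1, &info,
				      sizeof(info)) == -EINVAL);
	assert(info.pid == 0 && info.tgid == 0);
	return 0;
}
```

With this shape the output buffer can honestly be treated as uninitialized
memory, since the helper never reads from it, and a failed call no longer
wipes the caller's copy of dev/ino.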


Thread overview: 11+ messages
2019-10-09 15:26 [PATCH v13 0/4] BPF: New helper to obtain namespace data from current task Carlos Neira
2019-10-09 15:26 ` [PATCH v13 1/4] fs/nsfs.c: added ns_match Carlos Neira
2019-10-09 15:26 ` [PATCH v13 2/4] bpf: added new helper bpf_get_ns_current_pid_tgid Carlos Neira
2019-10-09 16:14   ` Andrii Nakryiko
2019-10-09 17:45     ` Carlos Antonio Neira Bustos
2019-10-09 19:50       ` Andrii Nakryiko
2019-10-09 20:30         ` Carlos Antonio Neira Bustos
2019-10-09 15:26 ` [PATCH v13 3/4] tools: Added bpf_get_ns_current_pid_tgid helper Carlos Neira
2019-10-09 15:26 ` [PATCH v13 4/4] tools/testing/selftests/bpf: Add self-tests for new helper Carlos Neira
2019-10-09 16:26   ` Andrii Nakryiko
2019-10-09 16:44     ` Carlos Antonio Neira Bustos
