All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY
@ 2018-05-24  0:18 Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 1/7] perf/core: add perf_get_event() to return perf_event given a struct file Yonghong Song
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Yonghong Song @ 2018-05-24  0:18 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Currently, suppose a userspace application has loaded a bpf program
and attached it to a tracepoint/kprobe/uprobe, and a bpf
introspection tool, e.g., bpftool, wants to show which bpf program
is attached to which tracepoint/kprobe/uprobe. Such attachment
information will be really useful to understand the overall bpf
deployment in the system.

There is a name field (16 bytes) for each program, which could
be used to encode the attachment point. There are some drawbacks
for this approaches. First, bpftool user (e.g., an admin) may not
really understand the association between the name and the
attachment point. Second, if one program is attached to multiple
places, encoding a proper name which can imply all these
attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
Given a pid and fd, this command will return bpf related information
to user space. Right now it only supports tracepoint/kprobe/uprobe
perf event fd's. For such a fd, BPF_TASK_FD_QUERY will return
   . prog_id
   . tracepoint name, or
   . k[ret]probe funcname + offset or kernel addr, or
   . u[ret]probe filename + offset
to the userspace.
The user can use "bpftool prog" to find more information about
bpf program itself with prog_id.

Patch #1 adds function perf_get_event() in kernel/events/core.c.
Patch #2 implements the bpf subcommand BPF_TASK_FD_QUERY.
Patch #3 syncs tools bpf.h header and also add bpf_task_fd_query()
in the libbpf library for samples/selftests/bpftool to use.
Patch #4 adds ksym_get_addr() utility function.
Patch #5 add a test in samples/bpf for querying k[ret]probes and
u[ret]probes.
Patch #6 add a test in tools/testing/selftests/bpf for querying
raw_tracepoint and tracepoint.
Patch #7 add a new subcommand "perf" to bpftool.

Changelogs:
  v3 -> v4:
     . made attr buf_len input/output. The length of
       actual buffter is written to buf_len so user space knows
       what is actually needed. If user provides a buffer
       with length >= 1 but less than required, do partial
       copy and return -ENOSPC.
     . code simplification with put_user.
     . changed query result attach_info to fd_type.
     . add tests at selftests/bpf to test zero len, null buf and
       insufficient buf.
  v2 -> v3:
     . made perf_get_event() return perf_event pointer const.
       this was to ensure that event fields are not meddled.
     . detect whether newly BPF_TASK_FD_QUERY is supported or
       not in "bpftool perf" and warn users if it is not.
  v1 -> v2:
     . changed bpf subcommand name from BPF_PERF_EVENT_QUERY
       to BPF_TASK_FD_QUERY.
     . fixed various "bpftool perf" issues and added documentation
       and auto-completion.

Yonghong Song (7):
  perf/core: add perf_get_event() to return perf_event given a struct
    file
  bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in
    libbpf
  tools/bpf: add ksym_get_addr() in trace_helpers
  samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY
  tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs
  tools/bpftool: add perf subcommand

 include/linux/perf_event.h                       |   5 +
 include/linux/trace_events.h                     |  17 +
 include/uapi/linux/bpf.h                         |  26 ++
 kernel/bpf/syscall.c                             | 115 +++++++
 kernel/events/core.c                             |   8 +
 kernel/trace/bpf_trace.c                         |  48 +++
 kernel/trace/trace_kprobe.c                      |  29 ++
 kernel/trace/trace_uprobe.c                      |  22 ++
 samples/bpf/Makefile                             |   4 +
 samples/bpf/task_fd_query_kern.c                 |  19 ++
 samples/bpf/task_fd_query_user.c                 | 382 +++++++++++++++++++++++
 tools/bpf/bpftool/Documentation/bpftool-perf.rst |  81 +++++
 tools/bpf/bpftool/Documentation/bpftool.rst      |   5 +-
 tools/bpf/bpftool/bash-completion/bpftool        |   9 +
 tools/bpf/bpftool/main.c                         |   3 +-
 tools/bpf/bpftool/main.h                         |   1 +
 tools/bpf/bpftool/perf.c                         | 246 +++++++++++++++
 tools/include/uapi/linux/bpf.h                   |  26 ++
 tools/lib/bpf/bpf.c                              |  23 ++
 tools/lib/bpf/bpf.h                              |   3 +
 tools/testing/selftests/bpf/test_progs.c         | 158 ++++++++++
 tools/testing/selftests/bpf/trace_helpers.c      |  12 +
 tools/testing/selftests/bpf/trace_helpers.h      |   1 +
 23 files changed, 1241 insertions(+), 2 deletions(-)
 create mode 100644 samples/bpf/task_fd_query_kern.c
 create mode 100644 samples/bpf/task_fd_query_user.c
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-perf.rst
 create mode 100644 tools/bpf/bpftool/perf.c

-- 
2.9.5

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v4 1/7] perf/core: add perf_get_event() to return perf_event given a struct file
  2018-05-24  0:18 [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-24  0:18 ` Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2018-05-24  0:18 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

A new extern function, perf_get_event(), is added to return a perf event
given a struct file. This function will be used in later patches.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/perf_event.h | 5 +++++
 kernel/events/core.c       | 8 ++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e71e99e..eec302b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -868,6 +868,7 @@ extern void perf_event_exit_task(struct task_struct *child);
 extern void perf_event_free_task(struct task_struct *task);
 extern void perf_event_delayed_put(struct task_struct *task);
 extern struct file *perf_event_get(unsigned int fd);
+extern const struct perf_event *perf_get_event(struct file *file);
 extern const struct perf_event_attr *perf_event_attrs(struct perf_event *event);
 extern void perf_event_print_debug(void);
 extern void perf_pmu_disable(struct pmu *pmu);
@@ -1289,6 +1290,10 @@ static inline void perf_event_exit_task(struct task_struct *child)	{ }
 static inline void perf_event_free_task(struct task_struct *task)	{ }
 static inline void perf_event_delayed_put(struct task_struct *task)	{ }
 static inline struct file *perf_event_get(unsigned int fd)	{ return ERR_PTR(-EINVAL); }
+static inline const struct perf_event *perf_get_event(struct file *file)
+{
+	return ERR_PTR(-EINVAL);
+}
 static inline const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 {
 	return ERR_PTR(-EINVAL);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67612ce..6eeab86 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11212,6 +11212,14 @@ struct file *perf_event_get(unsigned int fd)
 	return file;
 }
 
+const struct perf_event *perf_get_event(struct file *file)
+{
+	if (file->f_op != &perf_fops)
+		return ERR_PTR(-EINVAL);
+
+	return file->private_data;
+}
+
 const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 {
 	if (!event)
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  2018-05-24  0:18 [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 1/7] perf/core: add perf_get_event() to return perf_event given a struct file Yonghong Song
@ 2018-05-24  0:18 ` Yonghong Song
  2018-05-24  5:07   ` Martin KaFai Lau
  2018-05-24  0:18 ` [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song
  3 siblings, 1 reply; 8+ messages in thread
From: Yonghong Song @ 2018-05-24  0:18 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Currently, suppose a userspace application has loaded a bpf program
and attached it to a tracepoint/kprobe/uprobe, and a bpf
introspection tool, e.g., bpftool, wants to show which bpf program
is attached to which tracepoint/kprobe/uprobe. Such attachment
information will be really useful to understand the overall bpf
deployment in the system.

There is a name field (16 bytes) for each program, which could
be used to encode the attachment point. There are some drawbacks
for this approaches. First, bpftool user (e.g., an admin) may not
really understand the association between the name and the
attachment point. Second, if one program is attached to multiple
places, encoding a proper name which can imply all these
attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
Given a pid and fd, if the <pid, fd> is associated with a
tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
   . prog_id
   . tracepoint name, or
   . k[ret]probe funcname + offset or kernel addr, or
   . u[ret]probe filename + offset
to the userspace.
The user can use "bpftool prog" to find more information about
bpf program itself with prog_id.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/trace_events.h |  17 +++++++
 include/uapi/linux/bpf.h     |  26 ++++++++++
 kernel/bpf/syscall.c         | 115 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c     |  48 ++++++++++++++++++
 kernel/trace/trace_kprobe.c  |  29 +++++++++++
 kernel/trace/trace_uprobe.c  |  22 +++++++++
 6 files changed, 257 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 2bde3ef..d34144a 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
+int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
+			    u32 *fd_type, const char **buf,
+			    u64 *probe_offset, u64 *probe_addr);
 #else
 static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
 {
@@ -504,6 +507,13 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
 {
 	return NULL;
 }
+static inline int bpf_get_perf_event_info(const struct perf_event *event,
+					  u32 *prog_id, u32 *fd_type,
+					  const char **buf, u64 *probe_offset,
+					  u64 *probe_addr)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 enum {
@@ -560,10 +570,17 @@ extern void perf_trace_del(struct perf_event *event, int flags);
 #ifdef CONFIG_KPROBE_EVENTS
 extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
+extern int bpf_get_kprobe_info(const struct perf_event *event,
+			       u32 *fd_type, const char **symbol,
+			       u64 *probe_offset, u64 *probe_addr,
+			       bool perf_type_tracepoint);
 #endif
 #ifdef CONFIG_UPROBE_EVENTS
 extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_uprobe_destroy(struct perf_event *event);
+extern int bpf_get_uprobe_info(const struct perf_event *event,
+			       u32 *fd_type, const char **filename,
+			       u64 *probe_offset, bool perf_type_tracepoint);
 #endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 				     char *filter_str);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c3e502d..0d51946 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -97,6 +97,7 @@ enum bpf_cmd {
 	BPF_RAW_TRACEPOINT_OPEN,
 	BPF_BTF_LOAD,
 	BPF_BTF_GET_FD_BY_ID,
+	BPF_TASK_FD_QUERY,
 };
 
 enum bpf_map_type {
@@ -379,6 +380,22 @@ union bpf_attr {
 		__u32		btf_log_size;
 		__u32		btf_log_level;
 	};
+
+	struct {
+		__u32		pid;		/* input: pid */
+		__u32		fd;		/* input: fd */
+		__u32		flags;		/* input: flags */
+		__u32		buf_len;	/* input/output: buf len */
+		__aligned_u64	buf;		/* input/output:
+						 *   tp_name for tracepoint
+						 *   symbol for kprobe
+						 *   filename for uprobe
+						 */
+		__u32		prog_id;	/* output: prod_id */
+		__u32		fd_type;	/* output: BPF_FD_TYPE_* */
+		__u64		probe_offset;	/* output: probe_offset */
+		__u64		probe_addr;	/* output: probe_addr */
+	} task_fd_query;
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
@@ -2458,4 +2475,13 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+enum bpf_task_fd_type {
+	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
+	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
+	BPF_FD_TYPE_KPROBE,		/* (symbol + offset) or addr */
+	BPF_FD_TYPE_KRETPROBE,		/* (symbol + offset) or addr */
+	BPF_FD_TYPE_UPROBE,		/* filename + offset */
+	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0b4c945..7dd8c86 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -18,7 +18,9 @@
 #include <linux/vmalloc.h>
 #include <linux/mmzone.h>
 #include <linux/anon_inodes.h>
+#include <linux/fdtable.h>
 #include <linux/file.h>
+#include <linux/fs.h>
 #include <linux/license.h>
 #include <linux/filter.h>
 #include <linux/version.h>
@@ -2102,6 +2104,116 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
 	return btf_get_fd_by_id(attr->btf_id);
 }
 
+static int bpf_task_fd_query_copy(const union bpf_attr *attr,
+				    union bpf_attr __user *uattr,
+				    u32 prog_id, u32 fd_type,
+				    const char *buf, u64 probe_offset,
+				    u64 probe_addr)
+{
+	void __user *ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
+	u32 len = buf ? strlen(buf) + 1 : 0, input_len;
+	int err = 0;
+
+	if (put_user(len, &uattr->task_fd_query.buf_len))
+		return -EFAULT;
+	input_len = attr->task_fd_query.buf_len;
+	if (input_len && len && ubuf) {
+		if (input_len < len) {
+			err = -ENOSPC;
+			len = input_len;
+		}
+		if (copy_to_user(ubuf, buf, len))
+			return -EFAULT;
+	}
+
+	if (put_user(prog_id, &uattr->task_fd_query.prog_id) ||
+	    put_user(fd_type, &uattr->task_fd_query.fd_type) ||
+	    put_user(probe_offset, &uattr->task_fd_query.probe_offset) ||
+	    put_user(probe_addr, &uattr->task_fd_query.probe_addr))
+		return -EFAULT;
+
+	return err;
+}
+
+#define BPF_TASK_FD_QUERY_LAST_FIELD task_fd_query.probe_addr
+
+static int bpf_task_fd_query(const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	pid_t pid = attr->task_fd_query.pid;
+	u32 fd = attr->task_fd_query.fd;
+	const struct perf_event *event;
+	struct files_struct *files;
+	struct task_struct *task;
+	struct file *file;
+	int err;
+
+	if (CHECK_ATTR(BPF_TASK_FD_QUERY))
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (attr->task_fd_query.flags != 0)
+		return -EINVAL;
+
+	task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
+	if (!task)
+		return -ENOENT;
+
+	files = get_files_struct(task);
+	put_task_struct(task);
+	if (!files)
+		return -ENOENT;
+
+	err = 0;
+	spin_lock(&files->file_lock);
+	file = fcheck_files(files, fd);
+	if (!file)
+		err = -EBADF;
+	else
+		get_file(file);
+	spin_unlock(&files->file_lock);
+	put_files_struct(files);
+
+	if (err)
+		goto out;
+
+	if (file->f_op == &bpf_raw_tp_fops) {
+		struct bpf_raw_tracepoint *raw_tp = file->private_data;
+		struct bpf_raw_event_map *btp = raw_tp->btp;
+
+		err = bpf_task_fd_query_copy(attr, uattr,
+					     raw_tp->prog->aux->id,
+					     BPF_FD_TYPE_RAW_TRACEPOINT,
+					     btp->tp->name, 0, 0);
+		goto put_file;
+	}
+
+	event = perf_get_event(file);
+	if (!IS_ERR(event)) {
+		u64 probe_offset, probe_addr;
+		u32 prog_id, fd_type;
+		const char *buf;
+
+		err = bpf_get_perf_event_info(event, &prog_id, &fd_type,
+					      &buf, &probe_offset,
+					      &probe_addr);
+		if (!err)
+			err = bpf_task_fd_query_copy(attr, uattr, prog_id,
+						     fd_type, buf,
+						     probe_offset,
+						     probe_addr);
+		goto put_file;
+	}
+
+	err = -ENOTSUPP;
+put_file:
+	fput(file);
+out:
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -2188,6 +2300,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_BTF_GET_FD_BY_ID:
 		err = bpf_btf_get_fd_by_id(&attr);
 		break;
+	case BPF_TASK_FD_QUERY:
+		err = bpf_task_fd_query(&attr, uattr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ce2cbbf..81fdf2f 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -14,6 +14,7 @@
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
 #include <linux/kprobes.h>
+#include <linux/syscalls.h>
 #include <linux/error-injection.h>
 
 #include "trace_probe.h"
@@ -1163,3 +1164,50 @@ int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
 	mutex_unlock(&bpf_event_mutex);
 	return err;
 }
+
+int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
+			    u32 *fd_type, const char **buf,
+			    u64 *probe_offset, u64 *probe_addr)
+{
+	bool is_tracepoint, is_syscall_tp;
+	struct bpf_prog *prog;
+	int flags, err = 0;
+
+	prog = event->prog;
+	if (!prog)
+		return -ENOENT;
+
+	/* not supporting BPF_PROG_TYPE_PERF_EVENT yet */
+	if (prog->type == BPF_PROG_TYPE_PERF_EVENT)
+		return -EOPNOTSUPP;
+
+	*prog_id = prog->aux->id;
+	flags = event->tp_event->flags;
+	is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT;
+	is_syscall_tp = is_syscall_trace_event(event->tp_event);
+
+	if (is_tracepoint || is_syscall_tp) {
+		*buf = is_tracepoint ? event->tp_event->tp->name
+				     : event->tp_event->name;
+		*fd_type = BPF_FD_TYPE_TRACEPOINT;
+		*probe_offset = 0x0;
+		*probe_addr = 0x0;
+	} else {
+		/* kprobe/uprobe */
+		err = -EOPNOTSUPP;
+#ifdef CONFIG_KPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_KPROBE)
+			err = bpf_get_kprobe_info(event, fd_type, buf,
+						  probe_offset, probe_addr,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+#ifdef CONFIG_UPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_UPROBE)
+			err = bpf_get_uprobe_info(event, fd_type, buf,
+						  probe_offset,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+	}
+
+	return err;
+}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 02aed76..daa8157 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1287,6 +1287,35 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 			      head, NULL);
 }
 NOKPROBE_SYMBOL(kretprobe_perf_func);
+
+int bpf_get_kprobe_info(const struct perf_event *event, u32 *fd_type,
+			const char **symbol, u64 *probe_offset,
+			u64 *probe_addr, bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_kprobe *tk;
+
+	if (perf_type_tracepoint)
+		tk = find_trace_kprobe(pevent, group);
+	else
+		tk = event->tp_event->data;
+	if (!tk)
+		return -EINVAL;
+
+	*fd_type = trace_kprobe_is_return(tk) ? BPF_FD_TYPE_KRETPROBE
+					      : BPF_FD_TYPE_KPROBE;
+	if (tk->symbol) {
+		*symbol = tk->symbol;
+		*probe_offset = tk->rp.kp.offset;
+		*probe_addr = 0;
+	} else {
+		*symbol = NULL;
+		*probe_offset = 0;
+		*probe_addr = (unsigned long)tk->rp.kp.addr;
+	}
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 /*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index ac89287..bf89a51 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1161,6 +1161,28 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
 {
 	__uprobe_perf_func(tu, func, regs, ucb, dsize);
 }
+
+int bpf_get_uprobe_info(const struct perf_event *event, u32 *fd_type,
+			const char **filename, u64 *probe_offset,
+			bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_uprobe *tu;
+
+	if (perf_type_tracepoint)
+		tu = find_probe_event(pevent, group);
+	else
+		tu = event->tp_event->data;
+	if (!tu)
+		return -EINVAL;
+
+	*fd_type = is_ret_probe(tu) ? BPF_FD_TYPE_URETPROBE
+				    : BPF_FD_TYPE_UPROBE;
+	*filename = tu->filename;
+	*probe_offset = tu->offset;
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 static int
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf
  2018-05-24  0:18 [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 1/7] perf/core: add perf_get_event() to return perf_event given a struct file Yonghong Song
  2018-05-24  0:18 ` [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-24  0:18 ` Yonghong Song
  2018-05-24  5:12   ` Martin KaFai Lau
  2018-05-24  0:18 ` [PATCH bpf-next v4 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song
  3 siblings, 1 reply; 8+ messages in thread
From: Yonghong Song @ 2018-05-24  0:18 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Sync kernel header bpf.h to tools/include/uapi/linux/bpf.h and
implement bpf_task_fd_query() in libbpf. The test programs
in samples/bpf and tools/testing/selftests/bpf, and later bpftool
will use this libbpf function to query kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/include/uapi/linux/bpf.h | 26 ++++++++++++++++++++++++++
 tools/lib/bpf/bpf.c            | 23 +++++++++++++++++++++++
 tools/lib/bpf/bpf.h            |  3 +++
 3 files changed, 52 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c3e502d..0d51946 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -97,6 +97,7 @@ enum bpf_cmd {
 	BPF_RAW_TRACEPOINT_OPEN,
 	BPF_BTF_LOAD,
 	BPF_BTF_GET_FD_BY_ID,
+	BPF_TASK_FD_QUERY,
 };
 
 enum bpf_map_type {
@@ -379,6 +380,22 @@ union bpf_attr {
 		__u32		btf_log_size;
 		__u32		btf_log_level;
 	};
+
+	struct {
+		__u32		pid;		/* input: pid */
+		__u32		fd;		/* input: fd */
+		__u32		flags;		/* input: flags */
+		__u32		buf_len;	/* input/output: buf len */
+		__aligned_u64	buf;		/* input/output:
+						 *   tp_name for tracepoint
+						 *   symbol for kprobe
+						 *   filename for uprobe
+						 */
+		__u32		prog_id;	/* output: prod_id */
+		__u32		fd_type;	/* output: BPF_FD_TYPE_* */
+		__u64		probe_offset;	/* output: probe_offset */
+		__u64		probe_addr;	/* output: probe_addr */
+	} task_fd_query;
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
@@ -2458,4 +2475,13 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+enum bpf_task_fd_type {
+	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
+	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
+	BPF_FD_TYPE_KPROBE,		/* (symbol + offset) or addr */
+	BPF_FD_TYPE_KRETPROBE,		/* (symbol + offset) or addr */
+	BPF_FD_TYPE_UPROBE,		/* filename + offset */
+	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 442b4cd..9ddc89d 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -643,3 +643,26 @@ int bpf_load_btf(void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size,
 
 	return fd;
 }
+
+int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
+		      __u32 *prog_id, __u32 *fd_type, __u64 *probe_offset,
+		      __u64 *probe_addr)
+{
+	union bpf_attr attr = {};
+	int err;
+
+	attr.task_fd_query.pid = pid;
+	attr.task_fd_query.fd = fd;
+	attr.task_fd_query.flags = flags;
+	attr.task_fd_query.buf = ptr_to_u64(buf);
+	attr.task_fd_query.buf_len = *buf_len;
+
+	err = sys_bpf(BPF_TASK_FD_QUERY, &attr, sizeof(attr));
+	*buf_len = attr.task_fd_query.buf_len;
+	*prog_id = attr.task_fd_query.prog_id;
+	*fd_type = attr.task_fd_query.fd_type;
+	*probe_offset = attr.task_fd_query.probe_offset;
+	*probe_addr = attr.task_fd_query.probe_addr;
+
+	return err;
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index d12344f..0639a30 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -107,4 +107,7 @@ int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
 int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 int bpf_load_btf(void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size,
 		 bool do_log);
+int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
+		      __u32 *prog_id, __u32 *fd_type, __u64 *probe_offset,
+		      __u64 *probe_addr);
 #endif
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v4 4/7] tools/bpf: add ksym_get_addr() in trace_helpers
  2018-05-24  0:18 [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
                   ` (2 preceding siblings ...)
  2018-05-24  0:18 ` [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf Yonghong Song
@ 2018-05-24  0:18 ` Yonghong Song
  3 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2018-05-24  0:18 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Given a kernel function name, ksym_get_addr() will return the kernel
address for this function, or 0 if it cannot find this function name
in /proc/kallsyms. This function will be used later when a kernel
address is used to initiate a kprobe perf event.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/trace_helpers.c | 12 ++++++++++++
 tools/testing/selftests/bpf/trace_helpers.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 8fb4fe8..3868dcb 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -72,6 +72,18 @@ struct ksym *ksym_search(long key)
 	return &syms[0];
 }
 
+long ksym_get_addr(const char *name)
+{
+	int i;
+
+	for (i = 0; i < sym_cnt; i++) {
+		if (strcmp(syms[i].name, name) == 0)
+			return syms[i].addr;
+	}
+
+	return 0;
+}
+
 static int page_size;
 static int page_cnt = 8;
 static struct perf_event_mmap_page *header;
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 36d90e3..3b4bcf7 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -11,6 +11,7 @@ struct ksym {
 
 int load_kallsyms(void);
 struct ksym *ksym_search(long key);
+long ksym_get_addr(const char *name);
 
 typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  2018-05-24  0:18 ` [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-24  5:07   ` Martin KaFai Lau
  2018-05-24 16:00     ` Yonghong Song
  0 siblings, 1 reply; 8+ messages in thread
From: Martin KaFai Lau @ 2018-05-24  5:07 UTC (permalink / raw)
  To: Yonghong Song; +Cc: peterz, ast, daniel, netdev, kernel-team

On Wed, May 23, 2018 at 05:18:42PM -0700, Yonghong Song wrote:
> Currently, suppose a userspace application has loaded a bpf program
> and attached it to a tracepoint/kprobe/uprobe, and a bpf
> introspection tool, e.g., bpftool, wants to show which bpf program
> is attached to which tracepoint/kprobe/uprobe. Such attachment
> information will be really useful to understand the overall bpf
> deployment in the system.
> 
> There is a name field (16 bytes) for each program, which could
> be used to encode the attachment point. There are some drawbacks
> for this approaches. First, bpftool user (e.g., an admin) may not
> really understand the association between the name and the
> attachment point. Second, if one program is attached to multiple
> places, encoding a proper name which can imply all these
> attachments becomes difficult.
> 
> This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
> Given a pid and fd, if the <pid, fd> is associated with a
> tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
>    . prog_id
>    . tracepoint name, or
>    . k[ret]probe funcname + offset or kernel addr, or
>    . u[ret]probe filename + offset
> to the userspace.
> The user can use "bpftool prog" to find more information about
> bpf program itself with prog_id.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  include/linux/trace_events.h |  17 +++++++
>  include/uapi/linux/bpf.h     |  26 ++++++++++
>  kernel/bpf/syscall.c         | 115 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c     |  48 ++++++++++++++++++
>  kernel/trace/trace_kprobe.c  |  29 +++++++++++
>  kernel/trace/trace_uprobe.c  |  22 +++++++++
>  6 files changed, 257 insertions(+)
> 
> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> index 2bde3ef..d34144a 100644
> --- a/include/linux/trace_events.h
> +++ b/include/linux/trace_events.h
> @@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>  int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>  int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>  struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> +			    u32 *fd_type, const char **buf,
> +			    u64 *probe_offset, u64 *probe_addr);
>  #else
>  static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
>  {
> @@ -504,6 +507,13 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
>  {
>  	return NULL;
>  }
> +static inline int bpf_get_perf_event_info(const struct perf_event *event,
> +					  u32 *prog_id, u32 *fd_type,
> +					  const char **buf, u64 *probe_offset,
> +					  u64 *probe_addr)
> +{
> +	return -EOPNOTSUPP;
> +}
>  #endif
>  
>  enum {
> @@ -560,10 +570,17 @@ extern void perf_trace_del(struct perf_event *event, int flags);
>  #ifdef CONFIG_KPROBE_EVENTS
>  extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
>  extern void perf_kprobe_destroy(struct perf_event *event);
> +extern int bpf_get_kprobe_info(const struct perf_event *event,
> +			       u32 *fd_type, const char **symbol,
> +			       u64 *probe_offset, u64 *probe_addr,
> +			       bool perf_type_tracepoint);
>  #endif
>  #ifdef CONFIG_UPROBE_EVENTS
>  extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
>  extern void perf_uprobe_destroy(struct perf_event *event);
> +extern int bpf_get_uprobe_info(const struct perf_event *event,
> +			       u32 *fd_type, const char **filename,
> +			       u64 *probe_offset, bool perf_type_tracepoint);
>  #endif
>  extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
>  				     char *filter_str);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c3e502d..0d51946 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -97,6 +97,7 @@ enum bpf_cmd {
>  	BPF_RAW_TRACEPOINT_OPEN,
>  	BPF_BTF_LOAD,
>  	BPF_BTF_GET_FD_BY_ID,
> +	BPF_TASK_FD_QUERY,
>  };
>  
>  enum bpf_map_type {
> @@ -379,6 +380,22 @@ union bpf_attr {
>  		__u32		btf_log_size;
>  		__u32		btf_log_level;
>  	};
> +
> +	struct {
> +		__u32		pid;		/* input: pid */
> +		__u32		fd;		/* input: fd */
> +		__u32		flags;		/* input: flags */
> +		__u32		buf_len;	/* input/output: buf len */
> +		__aligned_u64	buf;		/* input/output:
> +						 *   tp_name for tracepoint
> +						 *   symbol for kprobe
> +						 *   filename for uprobe
> +						 */
> +		__u32		prog_id;	/* output: prod_id */
> +		__u32		fd_type;	/* output: BPF_FD_TYPE_* */
> +		__u64		probe_offset;	/* output: probe_offset */
> +		__u64		probe_addr;	/* output: probe_addr */
> +	} task_fd_query;
>  } __attribute__((aligned(8)));
>  
>  /* The description below is an attempt at providing documentation to eBPF
> @@ -2458,4 +2475,13 @@ struct bpf_fib_lookup {
>  	__u8	dmac[6];     /* ETH_ALEN */
>  };
>  
> +enum bpf_task_fd_type {
> +	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
> +	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
> +	BPF_FD_TYPE_KPROBE,		/* (symbol + offset) or addr */
> +	BPF_FD_TYPE_KRETPROBE,		/* (symbol + offset) or addr */
> +	BPF_FD_TYPE_UPROBE,		/* filename + offset */
> +	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
> +};
> +
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0b4c945..7dd8c86 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -18,7 +18,9 @@
>  #include <linux/vmalloc.h>
>  #include <linux/mmzone.h>
>  #include <linux/anon_inodes.h>
> +#include <linux/fdtable.h>
>  #include <linux/file.h>
> +#include <linux/fs.h>
>  #include <linux/license.h>
>  #include <linux/filter.h>
>  #include <linux/version.h>
> @@ -2102,6 +2104,116 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
>  	return btf_get_fd_by_id(attr->btf_id);
>  }
>  
> +static int bpf_task_fd_query_copy(const union bpf_attr *attr,
> +				    union bpf_attr __user *uattr,
> +				    u32 prog_id, u32 fd_type,
> +				    const char *buf, u64 probe_offset,
> +				    u64 probe_addr)
> +{
> +	void __user *ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
> +	u32 len = buf ? strlen(buf) + 1 : 0, input_len;
> +	int err = 0;
> +
> +	if (put_user(len, &uattr->task_fd_query.buf_len))
> +		return -EFAULT;
> +	input_len = attr->task_fd_query.buf_len;
> +	if (input_len && len && ubuf) {
When len is 0 and input_len > 0, ubuf will not be touched (and
so not null terminated).

It may be helpful to note in uapi bpf.h that !output_buf_len has to be
checked on top of checking the syscall return value.  It is reasonable for
the userspace to assume that ubuf can be directly used with
strlen()/printf()... as long as the syscall does not return -1/ENOSPC.
I think the comment change could be done in a follow up patch.

or

always null terminate ubuf as long as input_len > 0
and the output_buf_len should be strlen(buf) instead of
strlen(buf) + 1 (i.e. exclude the null char in output_buf_len)
such that the !buf case will have output_buf_len == 0.
The user can depend on ENOSPC or input_buf_len <= output_buf_len
to decide the truncated condition.  This convention should be
closer to the snprintf() situation.

Other than that,

Acked-by: Martin KaFai Lau <kafai@fb.com>

> +		if (input_len < len) {
> +			err = -ENOSPC;
> +			len = input_len;
> +		}
> +		if (copy_to_user(ubuf, buf, len))
> +			return -EFAULT;
> +	}
> +
> +	if (put_user(prog_id, &uattr->task_fd_query.prog_id) ||
> +	    put_user(fd_type, &uattr->task_fd_query.fd_type) ||
> +	    put_user(probe_offset, &uattr->task_fd_query.probe_offset) ||
> +	    put_user(probe_addr, &uattr->task_fd_query.probe_addr))
> +		return -EFAULT;
> +
> +	return err;
> +}
> +
> +#define BPF_TASK_FD_QUERY_LAST_FIELD task_fd_query.probe_addr
> +
> +static int bpf_task_fd_query(const union bpf_attr *attr,
> +			     union bpf_attr __user *uattr)
> +{
> +	pid_t pid = attr->task_fd_query.pid;
> +	u32 fd = attr->task_fd_query.fd;
> +	const struct perf_event *event;
> +	struct files_struct *files;
> +	struct task_struct *task;
> +	struct file *file;
> +	int err;
> +
> +	if (CHECK_ATTR(BPF_TASK_FD_QUERY))
> +		return -EINVAL;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
> +	if (attr->task_fd_query.flags != 0)
> +		return -EINVAL;
> +
> +	task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
> +	if (!task)
> +		return -ENOENT;
> +
> +	files = get_files_struct(task);
> +	put_task_struct(task);
> +	if (!files)
> +		return -ENOENT;
> +
> +	err = 0;
> +	spin_lock(&files->file_lock);
> +	file = fcheck_files(files, fd);
> +	if (!file)
> +		err = -EBADF;
> +	else
> +		get_file(file);
> +	spin_unlock(&files->file_lock);
> +	put_files_struct(files);
> +
> +	if (err)
> +		goto out;
> +
> +	if (file->f_op == &bpf_raw_tp_fops) {
> +		struct bpf_raw_tracepoint *raw_tp = file->private_data;
> +		struct bpf_raw_event_map *btp = raw_tp->btp;
> +
> +		err = bpf_task_fd_query_copy(attr, uattr,
> +					     raw_tp->prog->aux->id,
> +					     BPF_FD_TYPE_RAW_TRACEPOINT,
> +					     btp->tp->name, 0, 0);
> +		goto put_file;
> +	}
> +
> +	event = perf_get_event(file);
> +	if (!IS_ERR(event)) {
> +		u64 probe_offset, probe_addr;
> +		u32 prog_id, fd_type;
> +		const char *buf;
> +
> +		err = bpf_get_perf_event_info(event, &prog_id, &fd_type,
> +					      &buf, &probe_offset,
> +					      &probe_addr);
> +		if (!err)
> +			err = bpf_task_fd_query_copy(attr, uattr, prog_id,
> +						     fd_type, buf,
> +						     probe_offset,
> +						     probe_addr);
> +		goto put_file;
> +	}
> +
> +	err = -ENOTSUPP;
> +put_file:
> +	fput(file);
> +out:
> +	return err;
> +}
> +
>  SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
>  {
>  	union bpf_attr attr = {};
> @@ -2188,6 +2300,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>  	case BPF_BTF_GET_FD_BY_ID:
>  		err = bpf_btf_get_fd_by_id(&attr);
>  		break;
> +	case BPF_TASK_FD_QUERY:
> +		err = bpf_task_fd_query(&attr, uattr);
> +		break;
>  	default:
>  		err = -EINVAL;
>  		break;
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index ce2cbbf..81fdf2f 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -14,6 +14,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/ctype.h>
>  #include <linux/kprobes.h>
> +#include <linux/syscalls.h>
>  #include <linux/error-injection.h>
>  
>  #include "trace_probe.h"
> @@ -1163,3 +1164,50 @@ int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
>  	mutex_unlock(&bpf_event_mutex);
>  	return err;
>  }
> +
> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> +			    u32 *fd_type, const char **buf,
> +			    u64 *probe_offset, u64 *probe_addr)
> +{
> +	bool is_tracepoint, is_syscall_tp;
> +	struct bpf_prog *prog;
> +	int flags, err = 0;
> +
> +	prog = event->prog;
> +	if (!prog)
> +		return -ENOENT;
> +
> +	/* not supporting BPF_PROG_TYPE_PERF_EVENT yet */
> +	if (prog->type == BPF_PROG_TYPE_PERF_EVENT)
> +		return -EOPNOTSUPP;
> +
> +	*prog_id = prog->aux->id;
> +	flags = event->tp_event->flags;
> +	is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT;
> +	is_syscall_tp = is_syscall_trace_event(event->tp_event);
> +
> +	if (is_tracepoint || is_syscall_tp) {
> +		*buf = is_tracepoint ? event->tp_event->tp->name
> +				     : event->tp_event->name;
> +		*fd_type = BPF_FD_TYPE_TRACEPOINT;
> +		*probe_offset = 0x0;
> +		*probe_addr = 0x0;
> +	} else {
> +		/* kprobe/uprobe */
> +		err = -EOPNOTSUPP;
> +#ifdef CONFIG_KPROBE_EVENTS
> +		if (flags & TRACE_EVENT_FL_KPROBE)
> +			err = bpf_get_kprobe_info(event, fd_type, buf,
> +						  probe_offset, probe_addr,
> +						  event->attr.type == PERF_TYPE_TRACEPOINT);
> +#endif
> +#ifdef CONFIG_UPROBE_EVENTS
> +		if (flags & TRACE_EVENT_FL_UPROBE)
> +			err = bpf_get_uprobe_info(event, fd_type, buf,
> +						  probe_offset,
> +						  event->attr.type == PERF_TYPE_TRACEPOINT);
> +#endif
> +	}
> +
> +	return err;
> +}
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index 02aed76..daa8157 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -1287,6 +1287,35 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
>  			      head, NULL);
>  }
>  NOKPROBE_SYMBOL(kretprobe_perf_func);
> +
> +int bpf_get_kprobe_info(const struct perf_event *event, u32 *fd_type,
> +			const char **symbol, u64 *probe_offset,
> +			u64 *probe_addr, bool perf_type_tracepoint)
> +{
> +	const char *pevent = trace_event_name(event->tp_event);
> +	const char *group = event->tp_event->class->system;
> +	struct trace_kprobe *tk;
> +
> +	if (perf_type_tracepoint)
> +		tk = find_trace_kprobe(pevent, group);
> +	else
> +		tk = event->tp_event->data;
> +	if (!tk)
> +		return -EINVAL;
> +
> +	*fd_type = trace_kprobe_is_return(tk) ? BPF_FD_TYPE_KRETPROBE
> +					      : BPF_FD_TYPE_KPROBE;
> +	if (tk->symbol) {
> +		*symbol = tk->symbol;
> +		*probe_offset = tk->rp.kp.offset;
> +		*probe_addr = 0;
> +	} else {
> +		*symbol = NULL;
> +		*probe_offset = 0;
> +		*probe_addr = (unsigned long)tk->rp.kp.addr;
> +	}
> +	return 0;
> +}
>  #endif	/* CONFIG_PERF_EVENTS */
>  
>  /*
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index ac89287..bf89a51 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -1161,6 +1161,28 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
>  {
>  	__uprobe_perf_func(tu, func, regs, ucb, dsize);
>  }
> +
> +int bpf_get_uprobe_info(const struct perf_event *event, u32 *fd_type,
> +			const char **filename, u64 *probe_offset,
> +			bool perf_type_tracepoint)
> +{
> +	const char *pevent = trace_event_name(event->tp_event);
> +	const char *group = event->tp_event->class->system;
> +	struct trace_uprobe *tu;
> +
> +	if (perf_type_tracepoint)
> +		tu = find_probe_event(pevent, group);
> +	else
> +		tu = event->tp_event->data;
> +	if (!tu)
> +		return -EINVAL;
> +
> +	*fd_type = is_ret_probe(tu) ? BPF_FD_TYPE_URETPROBE
> +				    : BPF_FD_TYPE_UPROBE;
> +	*filename = tu->filename;
> +	*probe_offset = tu->offset;
> +	return 0;
> +}
>  #endif	/* CONFIG_PERF_EVENTS */
>  
>  static int
> -- 
> 2.9.5
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf
  2018-05-24  0:18 ` [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf Yonghong Song
@ 2018-05-24  5:12   ` Martin KaFai Lau
  0 siblings, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2018-05-24  5:12 UTC (permalink / raw)
  To: Yonghong Song; +Cc: peterz, ast, daniel, netdev, kernel-team

On Wed, May 23, 2018 at 05:18:43PM -0700, Yonghong Song wrote:
> Sync kernel header bpf.h to tools/include/uapi/linux/bpf.h and
> implement bpf_task_fd_query() in libbpf. The test programs
> in samples/bpf and tools/testing/selftests/bpf, and later bpftool
> will use this libbpf function to query kernel.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  2018-05-24  5:07   ` Martin KaFai Lau
@ 2018-05-24 16:00     ` Yonghong Song
  0 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2018-05-24 16:00 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: peterz, ast, daniel, netdev, kernel-team



On 5/23/18 10:07 PM, Martin KaFai Lau wrote:
> On Wed, May 23, 2018 at 05:18:42PM -0700, Yonghong Song wrote:
>> Currently, suppose a userspace application has loaded a bpf program
>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>> introspection tool, e.g., bpftool, wants to show which bpf program
>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>> information will be really useful to understand the overall bpf
>> deployment in the system.
>>
>> There is a name field (16 bytes) for each program, which could
>> be used to encode the attachment point. There are some drawbacks
>> for this approaches. First, bpftool user (e.g., an admin) may not
>> really understand the association between the name and the
>> attachment point. Second, if one program is attached to multiple
>> places, encoding a proper name which can imply all these
>> attachments becomes difficult.
>>
>> This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
>> Given a pid and fd, if the <pid, fd> is associated with a
>> tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
>>     . prog_id
>>     . tracepoint name, or
>>     . k[ret]probe funcname + offset or kernel addr, or
>>     . u[ret]probe filename + offset
>> to the userspace.
>> The user can use "bpftool prog" to find more information about
>> bpf program itself with prog_id.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/linux/trace_events.h |  17 +++++++
>>   include/uapi/linux/bpf.h     |  26 ++++++++++
>>   kernel/bpf/syscall.c         | 115 +++++++++++++++++++++++++++++++++++++++++++
>>   kernel/trace/bpf_trace.c     |  48 ++++++++++++++++++
>>   kernel/trace/trace_kprobe.c  |  29 +++++++++++
>>   kernel/trace/trace_uprobe.c  |  22 +++++++++
>>   6 files changed, 257 insertions(+)
>>
>> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
>> index 2bde3ef..d34144a 100644
>> --- a/include/linux/trace_events.h
>> +++ b/include/linux/trace_events.h
>> @@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>>   int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>>   int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>>   struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
>> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
>> +			    u32 *fd_type, const char **buf,
>> +			    u64 *probe_offset, u64 *probe_addr);
>>   #else
>>   static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
>>   {
>> @@ -504,6 +507,13 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
>>   {
>>   	return NULL;
>>   }
>> +static inline int bpf_get_perf_event_info(const struct perf_event *event,
>> +					  u32 *prog_id, u32 *fd_type,
>> +					  const char **buf, u64 *probe_offset,
>> +					  u64 *probe_addr)
>> +{
>> +	return -EOPNOTSUPP;
>> +}
>>   #endif
>>   
>>   enum {
>> @@ -560,10 +570,17 @@ extern void perf_trace_del(struct perf_event *event, int flags);
>>   #ifdef CONFIG_KPROBE_EVENTS
>>   extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
>>   extern void perf_kprobe_destroy(struct perf_event *event);
>> +extern int bpf_get_kprobe_info(const struct perf_event *event,
>> +			       u32 *fd_type, const char **symbol,
>> +			       u64 *probe_offset, u64 *probe_addr,
>> +			       bool perf_type_tracepoint);
>>   #endif
>>   #ifdef CONFIG_UPROBE_EVENTS
>>   extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
>>   extern void perf_uprobe_destroy(struct perf_event *event);
>> +extern int bpf_get_uprobe_info(const struct perf_event *event,
>> +			       u32 *fd_type, const char **filename,
>> +			       u64 *probe_offset, bool perf_type_tracepoint);
>>   #endif
>>   extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
>>   				     char *filter_str);
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index c3e502d..0d51946 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -97,6 +97,7 @@ enum bpf_cmd {
>>   	BPF_RAW_TRACEPOINT_OPEN,
>>   	BPF_BTF_LOAD,
>>   	BPF_BTF_GET_FD_BY_ID,
>> +	BPF_TASK_FD_QUERY,
>>   };
>>   
>>   enum bpf_map_type {
>> @@ -379,6 +380,22 @@ union bpf_attr {
>>   		__u32		btf_log_size;
>>   		__u32		btf_log_level;
>>   	};
>> +
>> +	struct {
>> +		__u32		pid;		/* input: pid */
>> +		__u32		fd;		/* input: fd */
>> +		__u32		flags;		/* input: flags */
>> +		__u32		buf_len;	/* input/output: buf len */
>> +		__aligned_u64	buf;		/* input/output:
>> +						 *   tp_name for tracepoint
>> +						 *   symbol for kprobe
>> +						 *   filename for uprobe
>> +						 */
>> +		__u32		prog_id;	/* output: prod_id */
>> +		__u32		fd_type;	/* output: BPF_FD_TYPE_* */
>> +		__u64		probe_offset;	/* output: probe_offset */
>> +		__u64		probe_addr;	/* output: probe_addr */
>> +	} task_fd_query;
>>   } __attribute__((aligned(8)));
>>   
>>   /* The description below is an attempt at providing documentation to eBPF
>> @@ -2458,4 +2475,13 @@ struct bpf_fib_lookup {
>>   	__u8	dmac[6];     /* ETH_ALEN */
>>   };
>>   
>> +enum bpf_task_fd_type {
>> +	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
>> +	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
>> +	BPF_FD_TYPE_KPROBE,		/* (symbol + offset) or addr */
>> +	BPF_FD_TYPE_KRETPROBE,		/* (symbol + offset) or addr */
>> +	BPF_FD_TYPE_UPROBE,		/* filename + offset */
>> +	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
>> +};
>> +
>>   #endif /* _UAPI__LINUX_BPF_H__ */
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 0b4c945..7dd8c86 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -18,7 +18,9 @@
>>   #include <linux/vmalloc.h>
>>   #include <linux/mmzone.h>
>>   #include <linux/anon_inodes.h>
>> +#include <linux/fdtable.h>
>>   #include <linux/file.h>
>> +#include <linux/fs.h>
>>   #include <linux/license.h>
>>   #include <linux/filter.h>
>>   #include <linux/version.h>
>> @@ -2102,6 +2104,116 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
>>   	return btf_get_fd_by_id(attr->btf_id);
>>   }
>>   
>> +static int bpf_task_fd_query_copy(const union bpf_attr *attr,
>> +				    union bpf_attr __user *uattr,
>> +				    u32 prog_id, u32 fd_type,
>> +				    const char *buf, u64 probe_offset,
>> +				    u64 probe_addr)
>> +{
>> +	void __user *ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
>> +	u32 len = buf ? strlen(buf) + 1 : 0, input_len;
>> +	int err = 0;
>> +
>> +	if (put_user(len, &uattr->task_fd_query.buf_len))
>> +		return -EFAULT;
>> +	input_len = attr->task_fd_query.buf_len;
>> +	if (input_len && len && ubuf) {
> When len is 0 and input_len > 0, ubuf will not be touched (and
> so not null terminated).

This follows what we did for cgroup prog array query, when len (to be 
copied) is 0, nothing will be copied. But see below.

> 
> It may be helpful to note in uapi bpf.h that !output_buf_len has to be
> checked on top of checking the syscall return value.  It is reasonable for
> the userspace to assume that ubuf can be directly used with
> strlen()/printf()... as long as the syscall does not return -1/ENOSPC.
> I think the comment change could be done in a follow up patch.
> 
> or
> 
> always null terminate ubuf as long as input_len > 0
> and the output_buf_len should be strlen(buf) instead of
> strlen(buf) + 1 (i.e. exclude the null char in output_buf_len)
> such that the !buf case will have output_buf_len == 0.
> The user can depend on ENOSPC or input_buf_len <= output_buf_len
> to decide the truncated condition.  This convention should be
> closer to the snprintf() situation.


The second approach is better as in cases the user space
may have limited space to print and we should not enforce users
to massage further for string printing under ENOSPC.

I will respin the patch set. Thanks for suggestion!

> Other than that,
> 
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-05-24 16:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-24  0:18 [PATCH bpf-next v4 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
2018-05-24  0:18 ` [PATCH bpf-next v4 1/7] perf/core: add perf_get_event() to return perf_event given a struct file Yonghong Song
2018-05-24  0:18 ` [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
2018-05-24  5:07   ` Martin KaFai Lau
2018-05-24 16:00     ` Yonghong Song
2018-05-24  0:18 ` [PATCH bpf-next v4 3/7] tools/bpf: sync kernel header bpf.h and add bpf_task_fd_query in libbpf Yonghong Song
2018-05-24  5:12   ` Martin KaFai Lau
2018-05-24  0:18 ` [PATCH bpf-next v4 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.