All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY
@ 2018-05-18  5:03 Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Currently, suppose a userspace application has loaded a bpf program
and attached it to a tracepoint/kprobe/uprobe, and a bpf
introspection tool, e.g., bpftool, wants to show which bpf program
is attached to which tracepoint/kprobe/uprobe. Such attachment
information will be really useful to understand the overall bpf
deployment in the system.

There is a name field (16 bytes) for each program, which could
be used to encode the attachment point. There are some drawbacks
for this approaches. First, bpftool user (e.g., an admin) may not
really understand the association between the name and the
attachment point. Second, if one program is attached to multiple
places, encoding a proper name which can imply all these
attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
Given a pid and fd, this command will return bpf related information
to user space. Right now it only supports tracepoint/kprobe/uprobe
perf event fd's. For such a fd, BPF_TASK_FD_QUERY will return
   . prog_id
   . tracepoint name, or
   . k[ret]probe funcname + offset or kernel addr, or
   . u[ret]probe filename + offset
to the userspace.
The user can use "bpftool prog" to find more information about
bpf program itself with prog_id.

Patch #1 adds function perf_get_event() in kernel/events/core.c.
Patch #2 implements the bpf subcommand BPF_TASK_FD_QUERY.
Patch #3 syncs tools bpf.h header and also add bpf_task_fd_query()
in the libbpf library for samples/selftests/bpftool to use.
Patch #4 adds ksym_get_addr() utility function.
Patch #5 add a test in samples/bpf for querying k[ret]probes and
u[ret]probes.
Patch #6 add a test in tools/testing/selftests/bpf for querying
raw_tracepoint and tracepoint.
Patch #7 add a new subcommand "perf" to bpftool.

Changelogs:
  v1 -> v2:
     . changed bpf subcommand name from BPF_PERF_EVENT_QUERY
       to BPF_TASK_FD_QUERY.
     . fixed various "bpftool perf" issues and added documentation
       and auto-completion.

Yonghong Song (7):
  perf/core: add perf_get_event() to return perf_event given a struct
    file
  bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  tools/bpf: sync kernel header bpf.h and add bpf_trace_event_query in
    libbpf
  tools/bpf: add ksym_get_addr() in trace_helpers
  samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY
  tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs
  tools/bpftool: add perf subcommand

 include/linux/perf_event.h                       |   5 +
 include/linux/trace_events.h                     |  15 +
 include/uapi/linux/bpf.h                         |  27 ++
 kernel/bpf/syscall.c                             | 124 ++++++++
 kernel/events/core.c                             |   8 +
 kernel/trace/bpf_trace.c                         |  48 +++
 kernel/trace/trace_kprobe.c                      |  29 ++
 kernel/trace/trace_uprobe.c                      |  22 ++
 samples/bpf/Makefile                             |   4 +
 samples/bpf/task_fd_query_kern.c                 |  19 ++
 samples/bpf/task_fd_query_user.c                 | 379 +++++++++++++++++++++++
 tools/bpf/bpftool/Documentation/bpftool-perf.rst |  81 +++++
 tools/bpf/bpftool/Documentation/bpftool.rst      |   5 +-
 tools/bpf/bpftool/bash-completion/bpftool        |   9 +
 tools/bpf/bpftool/main.c                         |   3 +-
 tools/bpf/bpftool/main.h                         |   1 +
 tools/bpf/bpftool/perf.c                         | 200 ++++++++++++
 tools/include/uapi/linux/bpf.h                   |  27 ++
 tools/lib/bpf/bpf.c                              |  24 ++
 tools/lib/bpf/bpf.h                              |   3 +
 tools/testing/selftests/bpf/test_progs.c         | 133 ++++++++
 tools/testing/selftests/bpf/trace_helpers.c      |  12 +
 tools/testing/selftests/bpf/trace_helpers.h      |   1 +
 23 files changed, 1177 insertions(+), 2 deletions(-)
 create mode 100644 samples/bpf/task_fd_query_kern.c
 create mode 100644 samples/bpf/task_fd_query_user.c
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-perf.rst
 create mode 100644 tools/bpf/bpftool/perf.c

-- 
2.9.5

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 3/7] tools/bpf: sync kernel header bpf.h and add bpf_trace_event_query in libbpf Yonghong Song
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Currently, suppose a userspace application has loaded a bpf program
and attached it to a tracepoint/kprobe/uprobe, and a bpf
introspection tool, e.g., bpftool, wants to show which bpf program
is attached to which tracepoint/kprobe/uprobe. Such attachment
information will be really useful to understand the overall bpf
deployment in the system.

There is a name field (16 bytes) for each program, which could
be used to encode the attachment point. There are some drawbacks
for this approaches. First, bpftool user (e.g., an admin) may not
really understand the association between the name and the
attachment point. Second, if one program is attached to multiple
places, encoding a proper name which can imply all these
attachments becomes difficult.

This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
Given a pid and fd, if the <pid, fd> is associated with a
tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
   . prog_id
   . tracepoint name, or
   . k[ret]probe funcname + offset or kernel addr, or
   . u[ret]probe filename + offset
to the userspace.
The user can use "bpftool prog" to find more information about
bpf program itself with prog_id.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/trace_events.h |  15 ++++++
 include/uapi/linux/bpf.h     |  27 ++++++++++
 kernel/bpf/syscall.c         | 124 +++++++++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c     |  48 +++++++++++++++++
 kernel/trace/trace_kprobe.c  |  29 ++++++++++
 kernel/trace/trace_uprobe.c  |  22 ++++++++
 6 files changed, 265 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 2bde3ef..bd08e11 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
 struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
+int bpf_get_perf_event_info(struct perf_event *event, u32 *prog_id,
+			    u32 *attach_info, const char **buf,
+			    u64 *probe_offset, u64 *probe_addr);
 #else
 static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
 {
@@ -504,6 +507,12 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
 {
 	return NULL;
 }
+static inline int bpf_get_perf_event_info(struct file *file, u32 *prog_id,
+					  u32 *attach_info, const char **buf,
+					  u64 *probe_offset, u64 *probe_addr)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 enum {
@@ -560,10 +569,16 @@ extern void perf_trace_del(struct perf_event *event, int flags);
 #ifdef CONFIG_KPROBE_EVENTS
 extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
+extern int bpf_get_kprobe_info(struct perf_event *event, u32 *attach_info,
+			       const char **symbol, u64 *probe_offset,
+			       u64 *probe_addr, bool perf_type_tracepoint);
 #endif
 #ifdef CONFIG_UPROBE_EVENTS
 extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_uprobe_destroy(struct perf_event *event);
+extern int bpf_get_uprobe_info(struct perf_event *event, u32 *attach_info,
+			       const char **filename, u64 *probe_offset,
+			       bool perf_type_tracepoint);
 #endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 				     char *filter_str);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d94d333..6a22ad4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -97,6 +97,7 @@ enum bpf_cmd {
 	BPF_RAW_TRACEPOINT_OPEN,
 	BPF_BTF_LOAD,
 	BPF_BTF_GET_FD_BY_ID,
+	BPF_TASK_FD_QUERY,
 };
 
 enum bpf_map_type {
@@ -379,6 +380,22 @@ union bpf_attr {
 		__u32		btf_log_size;
 		__u32		btf_log_level;
 	};
+
+	struct {
+		int		pid;		/* input: pid */
+		int		fd;		/* input: fd */
+		__u32		flags;		/* input: flags */
+		__u32		buf_len;	/* input: buf len */
+		__aligned_u64	buf;		/* input/output:
+						 *   tp_name for tracepoint
+						 *   symbol for kprobe
+						 *   filename for uprobe
+						 */
+		__u32		prog_id;	/* output: prod_id */
+		__u32		attach_info;	/* output: BPF_ATTACH_* */
+		__u64		probe_offset;	/* output: probe_offset */
+		__u64		probe_addr;	/* output: probe_addr */
+	} task_fd_query;
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
@@ -2450,4 +2467,14 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+/* used by <task, fd> based query */
+enum {
+	BPF_ATTACH_RAW_TRACEPOINT,	/* tp name */
+	BPF_ATTACH_TRACEPOINT,		/* tp name */
+	BPF_ATTACH_KPROBE,		/* (symbol + offset) or addr */
+	BPF_ATTACH_KRETPROBE,		/* (symbol + offset) or addr */
+	BPF_ATTACH_UPROBE,		/* filename + offset */
+	BPF_ATTACH_URETPROBE,		/* filename + offset */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index bfcde94..2f2bb22 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -18,7 +18,9 @@
 #include <linux/vmalloc.h>
 #include <linux/mmzone.h>
 #include <linux/anon_inodes.h>
+#include <linux/fdtable.h>
 #include <linux/file.h>
+#include <linux/fs.h>
 #include <linux/license.h>
 #include <linux/filter.h>
 #include <linux/version.h>
@@ -2102,6 +2104,125 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
 	return btf_get_fd_by_id(attr->btf_id);
 }
 
+static int bpf_task_fd_query_copy(const union bpf_attr *attr,
+				    union bpf_attr __user *uattr,
+				    u32 prog_id, u32 attach_info,
+				    const char *buf, u64 probe_offset,
+				    u64 probe_addr)
+{
+	__u64 __user *ubuf;
+	int len;
+
+	ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
+	if (buf) {
+		len = strlen(buf);
+		if (attr->task_fd_query.buf_len < len + 1)
+			return -ENOSPC;
+		if (copy_to_user(ubuf, buf, len + 1))
+			return -EFAULT;
+	} else if (attr->task_fd_query.buf_len) {
+		/* copy '\0' to ubuf */
+		__u8 zero = 0;
+
+		if (copy_to_user(ubuf, &zero, 1))
+			return -EFAULT;
+	}
+
+	if (copy_to_user(&uattr->task_fd_query.prog_id, &prog_id,
+			 sizeof(prog_id)) ||
+	    copy_to_user(&uattr->task_fd_query.attach_info, &attach_info,
+			 sizeof(attach_info)) ||
+	    copy_to_user(&uattr->task_fd_query.probe_offset, &probe_offset,
+			 sizeof(probe_offset)) ||
+	    copy_to_user(&uattr->task_fd_query.probe_addr, &probe_addr,
+			 sizeof(probe_addr)))
+		return -EFAULT;
+
+	return 0;
+}
+
+#define BPF_TASK_FD_QUERY_LAST_FIELD task_fd_query.probe_addr
+
+static int bpf_task_fd_query(const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	pid_t pid = attr->task_fd_query.pid;
+	int fd = attr->task_fd_query.fd;
+	struct files_struct *files;
+	struct task_struct *task;
+	struct perf_event *event;
+	struct file *file;
+	int err;
+
+	if (CHECK_ATTR(BPF_TASK_FD_QUERY))
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (attr->task_fd_query.flags != 0)
+		return -EINVAL;
+
+	task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
+	if (!task)
+		return -ENOENT;
+
+	files = get_files_struct(task);
+	put_task_struct(task);
+	if (!files)
+		return -ENOENT;
+
+	err = 0;
+	spin_lock(&files->file_lock);
+	file = fcheck_files(files, fd);
+	if (!file)
+		err = -EBADF;
+	else
+		get_file(file);
+	spin_unlock(&files->file_lock);
+	put_files_struct(files);
+
+	if (err)
+		goto out;
+
+	if (file->f_op == &bpf_raw_tp_fops) {
+		struct bpf_raw_tracepoint *raw_tp = file->private_data;
+		struct bpf_raw_event_map *btp = raw_tp->btp;
+
+		if (!raw_tp->prog)
+			err = -ENOENT;
+		else
+			err = bpf_task_fd_query_copy(attr, uattr,
+						     raw_tp->prog->aux->id,
+						     BPF_ATTACH_RAW_TRACEPOINT,
+						     btp->tp->name, 0, 0);
+		goto put_file;
+	}
+
+	event = perf_get_event(file);
+	if (!IS_ERR(event)) {
+		u64 probe_offset, probe_addr;
+		u32 prog_id, attach_info;
+		const char *buf;
+
+		err = bpf_get_perf_event_info(event, &prog_id, &attach_info,
+					      &buf, &probe_offset,
+					      &probe_addr);
+		if (!err)
+			err = bpf_task_fd_query_copy(attr, uattr, prog_id,
+						     attach_info, buf,
+						     probe_offset,
+						     probe_addr);
+		goto put_file;
+	}
+
+	err = -ENOTSUPP;
+put_file:
+	fput(file);
+out:
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -2188,6 +2309,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_BTF_GET_FD_BY_ID:
 		err = bpf_btf_get_fd_by_id(&attr);
 		break;
+	case BPF_TASK_FD_QUERY:
+		err = bpf_task_fd_query(&attr, uattr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ce2cbbf..b55758a 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -14,6 +14,7 @@
 #include <linux/uaccess.h>
 #include <linux/ctype.h>
 #include <linux/kprobes.h>
+#include <linux/syscalls.h>
 #include <linux/error-injection.h>
 
 #include "trace_probe.h"
@@ -1163,3 +1164,50 @@ int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
 	mutex_unlock(&bpf_event_mutex);
 	return err;
 }
+
+int bpf_get_perf_event_info(struct perf_event *event, u32 *prog_id,
+			    u32 *attach_info, const char **buf,
+			    u64 *probe_offset, u64 *probe_addr)
+{
+	bool is_tracepoint, is_syscall_tp;
+	struct bpf_prog *prog;
+	int flags, err = 0;
+
+	prog = event->prog;
+	if (!prog)
+		return -ENOENT;
+
+	/* not supporting BPF_PROG_TYPE_PERF_EVENT yet */
+	if (prog->type == BPF_PROG_TYPE_PERF_EVENT)
+		return -EOPNOTSUPP;
+
+	*prog_id = prog->aux->id;
+	flags = event->tp_event->flags;
+	is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT;
+	is_syscall_tp = is_syscall_trace_event(event->tp_event);
+
+	if (is_tracepoint || is_syscall_tp) {
+		*buf = is_tracepoint ? event->tp_event->tp->name
+				     : event->tp_event->name;
+		*attach_info = BPF_ATTACH_TRACEPOINT;
+		*probe_offset = 0x0;
+		*probe_addr = 0x0;
+	} else {
+		/* kprobe/uprobe */
+		err = -EOPNOTSUPP;
+#ifdef CONFIG_KPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_KPROBE)
+			err = bpf_get_kprobe_info(event, attach_info, buf,
+						  probe_offset, probe_addr,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+#ifdef CONFIG_UPROBE_EVENTS
+		if (flags & TRACE_EVENT_FL_UPROBE)
+			err = bpf_get_uprobe_info(event, attach_info, buf,
+						  probe_offset,
+						  event->attr.type == PERF_TYPE_TRACEPOINT);
+#endif
+	}
+
+	return err;
+}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 02aed76..e6bcaf6 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1287,6 +1287,35 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 			      head, NULL);
 }
 NOKPROBE_SYMBOL(kretprobe_perf_func);
+
+int bpf_get_kprobe_info(struct perf_event *event, u32 *attach_info,
+			const char **symbol, u64 *probe_offset,
+			u64 *probe_addr, bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_kprobe *tk;
+
+	if (perf_type_tracepoint)
+		tk = find_trace_kprobe(pevent, group);
+	else
+		tk = event->tp_event->data;
+	if (!tk)
+		return -EINVAL;
+
+	*attach_info = trace_kprobe_is_return(tk) ? BPF_ATTACH_KRETPROBE
+						  : BPF_ATTACH_KPROBE;
+	if (tk->symbol) {
+		*symbol = tk->symbol;
+		*probe_offset = tk->rp.kp.offset;
+		*probe_addr = 0;
+	} else {
+		*symbol = NULL;
+		*probe_offset = 0;
+		*probe_addr = (u64)tk->rp.kp.addr;
+	}
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 /*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index ac89287..534cf1a 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1161,6 +1161,28 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
 {
 	__uprobe_perf_func(tu, func, regs, ucb, dsize);
 }
+
+int bpf_get_uprobe_info(struct perf_event *event, u32 *attach_info,
+			const char **filename, u64 *probe_offset,
+			bool perf_type_tracepoint)
+{
+	const char *pevent = trace_event_name(event->tp_event);
+	const char *group = event->tp_event->class->system;
+	struct trace_uprobe *tu;
+
+	if (perf_type_tracepoint)
+		tu = find_probe_event(pevent, group);
+	else
+		tu = event->tp_event->data;
+	if (!tu)
+		return -EINVAL;
+
+	*attach_info = is_ret_probe(tu) ? BPF_ATTACH_URETPROBE
+					: BPF_ATTACH_UPROBE;
+	*filename = tu->filename;
+	*probe_offset = tu->offset;
+	return 0;
+}
 #endif	/* CONFIG_PERF_EVENTS */
 
 static int
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 3/7] tools/bpf: sync kernel header bpf.h and add bpf_trace_event_query in libbpf
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Sync kernel header bpf.h to tools/include/uapi/linux/bpf.h and
implement bpf_trace_event_query() in libbpf. The test programs
in samples/bpf and tools/testing/selftests/bpf, and later bpftool
will use this libbpf function to query kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/include/uapi/linux/bpf.h | 27 +++++++++++++++++++++++++++
 tools/lib/bpf/bpf.c            | 24 ++++++++++++++++++++++++
 tools/lib/bpf/bpf.h            |  3 +++
 3 files changed, 54 insertions(+)

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d94d333..6a22ad4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -97,6 +97,7 @@ enum bpf_cmd {
 	BPF_RAW_TRACEPOINT_OPEN,
 	BPF_BTF_LOAD,
 	BPF_BTF_GET_FD_BY_ID,
+	BPF_TASK_FD_QUERY,
 };
 
 enum bpf_map_type {
@@ -379,6 +380,22 @@ union bpf_attr {
 		__u32		btf_log_size;
 		__u32		btf_log_level;
 	};
+
+	struct {
+		int		pid;		/* input: pid */
+		int		fd;		/* input: fd */
+		__u32		flags;		/* input: flags */
+		__u32		buf_len;	/* input: buf len */
+		__aligned_u64	buf;		/* input/output:
+						 *   tp_name for tracepoint
+						 *   symbol for kprobe
+						 *   filename for uprobe
+						 */
+		__u32		prog_id;	/* output: prod_id */
+		__u32		attach_info;	/* output: BPF_ATTACH_* */
+		__u64		probe_offset;	/* output: probe_offset */
+		__u64		probe_addr;	/* output: probe_addr */
+	} task_fd_query;
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
@@ -2450,4 +2467,14 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+/* used by <task, fd> based query */
+enum {
+	BPF_ATTACH_RAW_TRACEPOINT,	/* tp name */
+	BPF_ATTACH_TRACEPOINT,		/* tp name */
+	BPF_ATTACH_KPROBE,		/* (symbol + offset) or addr */
+	BPF_ATTACH_KRETPROBE,		/* (symbol + offset) or addr */
+	BPF_ATTACH_UPROBE,		/* filename + offset */
+	BPF_ATTACH_URETPROBE,		/* filename + offset */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 6a8a000..da3f336 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -643,3 +643,27 @@ int bpf_load_btf(void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size,
 
 	return fd;
 }
+
+int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 buf_len,
+		      __u32 *prog_id, __u32 *attach_info,
+		      __u64 *probe_offset, __u64 *probe_addr)
+{
+	union bpf_attr attr = {};
+	int err;
+
+	attr.task_fd_query.pid = pid;
+	attr.task_fd_query.fd = fd;
+	attr.task_fd_query.flags = flags;
+	attr.task_fd_query.buf = ptr_to_u64(buf);
+	attr.task_fd_query.buf_len = buf_len;
+
+	err = sys_bpf(BPF_TASK_FD_QUERY, &attr, sizeof(attr));
+	if (!err) {
+		*prog_id = attr.task_fd_query.prog_id;
+		*attach_info = attr.task_fd_query.attach_info;
+		*probe_offset = attr.task_fd_query.probe_offset;
+		*probe_addr = attr.task_fd_query.probe_addr;
+	}
+
+	return err;
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 15bff77..9adfde6 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -107,4 +107,7 @@ int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
 int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 int bpf_load_btf(void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size,
 		 bool do_log);
+int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 buf_len,
+		      __u32 *prog_id, __u32 *prog_info,
+		      __u64 *probe_offset, __u64 *probe_addr);
 #endif
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 4/7] tools/bpf: add ksym_get_addr() in trace_helpers
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 3/7] tools/bpf: sync kernel header bpf.h and add bpf_trace_event_query in libbpf Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 5/7] samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY Yonghong Song
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

Given a kernel function name, ksym_get_addr() will return the kernel
address for this function, or 0 if it cannot find this function name
in /proc/kallsyms. This function will be used later when a kernel
address is used to initiate a kprobe perf event.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/trace_helpers.c | 12 ++++++++++++
 tools/testing/selftests/bpf/trace_helpers.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 8fb4fe8..3868dcb 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -72,6 +72,18 @@ struct ksym *ksym_search(long key)
 	return &syms[0];
 }
 
+long ksym_get_addr(const char *name)
+{
+	int i;
+
+	for (i = 0; i < sym_cnt; i++) {
+		if (strcmp(syms[i].name, name) == 0)
+			return syms[i].addr;
+	}
+
+	return 0;
+}
+
 static int page_size;
 static int page_cnt = 8;
 static struct perf_event_mmap_page *header;
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 36d90e3..3b4bcf7 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -11,6 +11,7 @@ struct ksym {
 
 int load_kallsyms(void);
 struct ksym *ksym_search(long key);
+long ksym_get_addr(const char *name);
 
 typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
 
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 5/7] samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
                   ` (2 preceding siblings ...)
  2018-05-18  5:03 ` [PATCH bpf-next v2 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 6/7] tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand Yonghong Song
  5 siblings, 0 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

This is mostly to test kprobe/uprobe which needs kernel headers.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 samples/bpf/Makefile             |   4 +
 samples/bpf/task_fd_query_kern.c |  19 ++
 samples/bpf/task_fd_query_user.c | 379 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 402 insertions(+)
 create mode 100644 samples/bpf/task_fd_query_kern.c
 create mode 100644 samples/bpf/task_fd_query_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 62d1aa1..7dc85ed 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -51,6 +51,7 @@ hostprogs-y += cpustat
 hostprogs-y += xdp_adjust_tail
 hostprogs-y += xdpsock
 hostprogs-y += xdp_fwd
+hostprogs-y += task_fd_query
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -105,6 +106,7 @@ cpustat-objs := bpf_load.o cpustat_user.o
 xdp_adjust_tail-objs := xdp_adjust_tail_user.o
 xdpsock-objs := bpf_load.o xdpsock_user.o
 xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
+task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -160,6 +162,7 @@ always += cpustat_kern.o
 always += xdp_adjust_tail_kern.o
 always += xdpsock_kern.o
 always += xdp_fwd_kern.o
+always += task_fd_query_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -175,6 +178,7 @@ HOSTCFLAGS_offwaketime_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
 
 HOST_LOADLIBES		+= $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4		+= -lrt
diff --git a/samples/bpf/task_fd_query_kern.c b/samples/bpf/task_fd_query_kern.c
new file mode 100644
index 0000000..f4b0a9e
--- /dev/null
+++ b/samples/bpf/task_fd_query_kern.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/version.h>
+#include <linux/ptrace.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+SEC("kprobe/blk_start_request")
+int bpf_prog1(struct pt_regs *ctx)
+{
+	return 0;
+}
+
+SEC("kretprobe/blk_account_io_completion")
+int bpf_prog2(struct pt_regs *ctx)
+{
+	return 0;
+}
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/task_fd_query_user.c b/samples/bpf/task_fd_query_user.c
new file mode 100644
index 0000000..792ef24
--- /dev/null
+++ b/samples/bpf/task_fd_query_user.c
@@ -0,0 +1,379 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <signal.h>
+#include <unistd.h>
+#include <stdbool.h>
+#include <string.h>
+#include <stdint.h>
+#include <fcntl.h>
+#include <linux/bpf.h>
+#include <sys/ioctl.h>
+#include <sys/resource.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include "libbpf.h"
+#include "bpf_load.h"
+#include "bpf_util.h"
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+#define CHECK_PERROR_RET(condition) ({			\
+	int __ret = !!(condition);			\
+	if (__ret) {					\
+		printf("FAIL: %s:\n", __func__);	\
+		perror("    ");			\
+		return -1;				\
+	}						\
+})
+
+#define CHECK_AND_RET(condition) ({			\
+	int __ret = !!(condition);			\
+	if (__ret)					\
+		return -1;				\
+})
+
+static __u64 ptr_to_u64(void *ptr)
+{
+	return (__u64) (unsigned long) ptr;
+}
+
+#define PMU_TYPE_FILE "/sys/bus/event_source/devices/%s/type"
+static int bpf_find_probe_type(const char *event_type)
+{
+	char buf[256];
+	int fd, ret;
+
+	ret = snprintf(buf, sizeof(buf), PMU_TYPE_FILE, event_type);
+	CHECK_PERROR_RET(ret < 0 || ret >= sizeof(buf));
+
+	fd = open(buf, O_RDONLY);
+	CHECK_PERROR_RET(fd < 0);
+
+	ret = read(fd, buf, sizeof(buf));
+	close(fd);
+	CHECK_PERROR_RET(ret < 0 || ret >= sizeof(buf));
+
+	errno = 0;
+	ret = (int)strtol(buf, NULL, 10);
+	CHECK_PERROR_RET(errno);
+	return ret;
+}
+
+#define PMU_RETPROBE_FILE "/sys/bus/event_source/devices/%s/format/retprobe"
+static int bpf_get_retprobe_bit(const char *event_type)
+{
+	char buf[256];
+	int fd, ret;
+
+	ret = snprintf(buf, sizeof(buf), PMU_RETPROBE_FILE, event_type);
+	CHECK_PERROR_RET(ret < 0 || ret >= sizeof(buf));
+
+	fd = open(buf, O_RDONLY);
+	CHECK_PERROR_RET(fd < 0);
+
+	ret = read(fd, buf, sizeof(buf));
+	close(fd);
+	CHECK_PERROR_RET(ret < 0 || ret >= sizeof(buf));
+	CHECK_PERROR_RET(strlen(buf) < strlen("config:"));
+
+	errno = 0;
+	ret = (int)strtol(buf + strlen("config:"), NULL, 10);
+	CHECK_PERROR_RET(errno);
+	return ret;
+}
+
+static int test_debug_fs_kprobe(int fd_idx, const char *fn_name,
+				__u32 expected_prog_info)
+{
+	__u64 probe_offset, probe_addr;
+	__u32 prog_id, prog_info;
+	char buf[256];
+	int err;
+
+	err = bpf_task_fd_query(getpid(), event_fd[fd_idx], 0, buf, 256,
+				&prog_id, &prog_info, &probe_offset,
+				&probe_addr);
+	if (err < 0) {
+		printf("FAIL: %s, for event_fd idx %d, fn_name %s\n",
+		       __func__, fd_idx, fn_name);
+		perror("    :");
+		return -1;
+	}
+	if (strcmp(buf, fn_name) != 0 ||
+	    prog_info != expected_prog_info ||
+	    probe_offset != 0x0 || probe_addr != 0x0) {
+		printf("FAIL: bpf_trace_event_query(event_fd[1]):\n");
+		printf("buf: %s, prog_info: %u, probe_offset: 0x%llx,"
+		       " probe_addr: 0x%llx\n",
+		       buf, prog_info, probe_offset, probe_addr);
+		return -1;
+	}
+	return 0;
+}
+
+static int test_nondebug_fs_kuprobe_common(const char *event_type,
+	const char *name, __u64 offset, __u64 addr, bool is_return,
+	char *buf, int buf_len, __u32 *prog_id, __u32 *prog_info,
+	__u64 *probe_offset, __u64 *probe_addr)
+{
+	int is_return_bit = bpf_get_retprobe_bit(event_type);
+	int type = bpf_find_probe_type(event_type);
+	struct perf_event_attr attr = {};
+	int fd;
+
+	if (type < 0 || is_return_bit < 0) {
+		printf("FAIL: %s incorrect type (%d) or is_return_bit (%d)\n",
+			__func__, type, is_return_bit);
+		return -1;
+	}
+
+	attr.sample_period = 1;
+	attr.wakeup_events = 1;
+	if (is_return)
+		attr.config |= 1 << is_return_bit;
+
+	if (name) {
+		attr.config1 = ptr_to_u64((void *)name);
+		attr.config2 = offset;
+	} else {
+		attr.config1 = 0;
+		attr.config2 = addr;
+	}
+	attr.size = sizeof(attr);
+	attr.type = type;
+
+	fd = sys_perf_event_open(&attr, -1, 0, -1, 0);
+	CHECK_PERROR_RET(fd < 0);
+
+	CHECK_PERROR_RET(ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0);
+	CHECK_PERROR_RET(ioctl(fd, PERF_EVENT_IOC_SET_BPF, prog_fd[0]) < 0);
+	CHECK_PERROR_RET(bpf_task_fd_query(getpid(), fd, 0, buf, buf_len,
+			 prog_id, prog_info, probe_offset, probe_addr) < 0);
+
+	return 0;
+}
+
+static int test_nondebug_fs_probe(const char *event_type, const char *name,
+				  __u64 offset, __u64 addr, bool is_return,
+				  __u32 expected_prog_info,
+				  __u32 expected_ret_prog_info,
+				  char *buf, int buf_len)
+{
+	__u64 probe_offset, probe_addr;
+	__u32 prog_id, prog_info;
+	int err;
+
+	err = test_nondebug_fs_kuprobe_common(event_type, name,
+					      offset, addr, is_return,
+					      buf, buf_len, &prog_id,
+					      &prog_info, &probe_offset,
+					      &probe_addr);
+	if (err < 0) {
+		printf("FAIL: %s, "
+		       "for name %s, offset 0x%llx, addr 0x%llx, is_return %d\n",
+		       __func__, name ? name : "", offset, addr, is_return);
+		perror("    :");
+		return -1;
+	}
+	if ((is_return && prog_info != expected_ret_prog_info) ||
+	    (!is_return && prog_info != expected_prog_info)) {
+		printf("FAIL: %s, incorrect prog_info %u\n",
+		       __func__, prog_info);
+		return -1;
+	}
+	if (name) {
+		if (strcmp(name, buf) != 0) {
+			printf("FAIL: %s, incorrect buf %s\n", __func__, buf);
+			return -1;
+		}
+		if (probe_offset != offset) {
+			printf("FAIL: %s, incorrect probe_offset 0x%llx\n",
+			       __func__, probe_offset);
+			return -1;
+		}
+	} else {
+		if (buf && buf[0] != '\0') {
+			printf("FAIL: %s, incorrect buf %p\n",
+			       __func__, buf);
+			return -1;
+		}
+
+		if (probe_addr != addr) {
+			printf("FAIL: %s, incorrect probe_addr 0x%llx\n",
+			       __func__, probe_addr);
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int test_debug_fs_uprobe(char *binary_path, long offset, bool is_return)
+{
+	const char *event_type = "uprobe";
+	struct perf_event_attr attr = {};
+	char buf[256], event_alias[256];
+	__u64 probe_offset, probe_addr;
+	__u32 prog_id, prog_info;
+	int err, res, kfd, efd;
+	ssize_t bytes;
+
+	snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/%s_events",
+		 event_type);
+	kfd = open(buf, O_WRONLY | O_APPEND, 0);
+	CHECK_PERROR_RET(kfd < 0);
+
+	res = snprintf(event_alias, sizeof(event_alias), "test_%d", getpid());
+	CHECK_PERROR_RET(res < 0 || res >= sizeof(event_alias));
+
+	res = snprintf(buf, sizeof(buf), "%c:%ss/%s %s:0x%lx",
+		       is_return ? 'r' : 'p', event_type, event_alias,
+		       binary_path, offset);
+	CHECK_PERROR_RET(res < 0 || res >= sizeof(buf));
+	CHECK_PERROR_RET(write(kfd, buf, strlen(buf)) < 0);
+
+	close(kfd);
+	kfd = -1;
+
+	snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id",
+		 event_type, event_alias);
+	efd = open(buf, O_RDONLY, 0);
+	CHECK_PERROR_RET(efd < 0);
+
+	bytes = read(efd, buf, sizeof(buf));
+	CHECK_PERROR_RET(bytes <= 0 || bytes >= sizeof(buf));
+	close(efd);
+	buf[bytes] = '\0';
+
+	attr.config = strtol(buf, NULL, 0);
+	attr.type = PERF_TYPE_TRACEPOINT;
+	attr.sample_period = 1;
+	attr.wakeup_events = 1;
+	kfd = sys_perf_event_open(&attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);
+	CHECK_PERROR_RET(kfd < 0);
+	CHECK_PERROR_RET(ioctl(kfd, PERF_EVENT_IOC_SET_BPF, prog_fd[0]) < 0);
+	CHECK_PERROR_RET(ioctl(kfd, PERF_EVENT_IOC_ENABLE, 0) < 0);
+
+	err = bpf_task_fd_query(getpid(), kfd, 0, buf, 256,
+				&prog_id, &prog_info, &probe_offset,
+				&probe_addr);
+	if (err < 0) {
+		printf("FAIL: %s, binary_path %s\n", __func__, binary_path);
+		perror("    :");
+		return -1;
+	}
+	if ((is_return && prog_info != BPF_ATTACH_URETPROBE) ||
+	    (!is_return && prog_info != BPF_ATTACH_UPROBE)) {
+		printf("FAIL: %s, incorrect prog_info %u\n", __func__,
+		       prog_info);
+		return -1;
+	}
+	if (strcmp(binary_path, buf) != 0) {
+		printf("FAIL: %s, incorrect buf %s\n", __func__, buf);
+		return -1;
+	}
+	if (probe_offset != offset) {
+		printf("FAIL: %s, incorrect probe_offset 0x%llx\n", __func__,
+		       probe_offset);
+		return -1;
+	}
+
+	close(kfd);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	struct rlimit r = {1024*1024, RLIM_INFINITY};
+	extern char __executable_start;
+	char filename[256], buf[256];
+	__u64 uprobe_file_offset;
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
+		perror("setrlimit(RLIMIT_MEMLOCK)");
+		return 1;
+	}
+
+	if (load_kallsyms()) {
+		printf("failed to process /proc/kallsyms\n");
+		return 1;
+	}
+
+	if (load_bpf_file(filename)) {
+		printf("%s", bpf_log_buf);
+		return 1;
+	}
+
+	/* test two functions in the corresponding *_kern.c file */
+	CHECK_AND_RET(test_debug_fs_kprobe(0, "blk_start_request",
+					   BPF_ATTACH_KPROBE));
+	CHECK_AND_RET(test_debug_fs_kprobe(1, "blk_account_io_completion",
+					   BPF_ATTACH_KRETPROBE));
+
+	/* test nondebug fs kprobe */
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", "bpf_check", 0x0, 0x0,
+					     false, BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     buf, sizeof(buf)));
+#ifdef __x86_64__
+	/* set a kprobe on "bpf_check + 0x5", which is x64 specific */
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", "bpf_check", 0x5, 0x0,
+					     false, BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     buf, sizeof(buf)));
+#endif
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", "bpf_check", 0x0, 0x0,
+					     true, BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     buf, sizeof(buf)));
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", NULL, 0x0,
+					     ksym_get_addr("bpf_check"), false,
+					     BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     buf, sizeof(buf)));
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", NULL, 0x0,
+					     ksym_get_addr("bpf_check"), false,
+					     BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     NULL, 0));
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", NULL, 0x0,
+					     ksym_get_addr("bpf_check"), true,
+					     BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     buf, sizeof(buf)));
+	CHECK_AND_RET(test_nondebug_fs_probe("kprobe", NULL, 0x0,
+					     ksym_get_addr("bpf_check"), true,
+					     BPF_ATTACH_KPROBE,
+					     BPF_ATTACH_KRETPROBE,
+					     0, 0));
+
+	/* test nondebug fs uprobe */
+	/* the calculation of uprobe file offset is based on gcc 7.3.1 on x64
+	 * and the default linker script, which defines __executable_start as
+	 * the start of the .text section. The calculation could be different
+	 * on different systems with different compilers. The right way is
+	 * to parse the ELF file. We took a shortcut here.
+	 */
+	uprobe_file_offset = (__u64)main - (__u64)&__executable_start;
+	CHECK_AND_RET(test_nondebug_fs_probe("uprobe", (char *)argv[0],
+					     uprobe_file_offset, 0x0, false,
+					     BPF_ATTACH_UPROBE,
+					     BPF_ATTACH_URETPROBE,
+					     buf, sizeof(buf)));
+	CHECK_AND_RET(test_nondebug_fs_probe("uprobe", (char *)argv[0],
+					     uprobe_file_offset, 0x0, true,
+					     BPF_ATTACH_UPROBE,
+					     BPF_ATTACH_URETPROBE,
+					     buf, sizeof(buf)));
+
+	/* test debug fs uprobe */
+	CHECK_AND_RET(test_debug_fs_uprobe((char *)argv[0], uprobe_file_offset,
+					   false));
+	CHECK_AND_RET(test_debug_fs_uprobe((char *)argv[0], uprobe_file_offset,
+					   true));
+
+	return 0;
+}
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 6/7] tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
                   ` (3 preceding siblings ...)
  2018-05-18  5:03 ` [PATCH bpf-next v2 5/7] samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18  5:03 ` [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand Yonghong Song
  5 siblings, 0 replies; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

The new tests are added to query perf_event information
for raw_tracepoint and tracepoint attachment. For tracepoint,
both syscalls and non-syscalls tracepoints are queries as
they are treated slightly differently inside the kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/test_progs.c | 133 +++++++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 3ecf733..f7ede03 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -1542,6 +1542,137 @@ static void test_get_stack_raw_tp(void)
 	bpf_object__close(obj);
 }
 
+static void test_task_fd_query_rawtp(void)
+{
+	const char *file = "./test_get_stack_rawtp.o";
+	struct perf_event_attr attr = {};
+	__u64 probe_offset, probe_addr;
+	int efd, err, prog_fd, pmu_fd;
+	__u32 prog_id, attach_info;
+	struct bpf_object *obj;
+	__u32 duration = 0;
+	char buf[256];
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_RAW_TRACEPOINT, &obj, &prog_fd);
+	if (CHECK(err, "prog_load raw tp", "err %d errno %d\n", err, errno))
+		return;
+
+	efd = bpf_raw_tracepoint_open("sys_enter", prog_fd);
+	if (CHECK(efd < 0, "raw_tp_open", "err %d errno %d\n", efd, errno))
+		goto close_prog;
+
+	attr.sample_type = PERF_SAMPLE_RAW;
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_BPF_OUTPUT;
+	pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
+	if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n", pmu_fd,
+		  errno))
+		goto close_prog;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+	if (CHECK(err < 0, "ioctl PERF_EVENT_IOC_ENABLE", "err %d errno %d\n",
+		  err, errno))
+		goto close_prog;
+
+	/* query (getpid(), efd */
+	err = bpf_task_fd_query(getpid(), efd, 0, buf, 256, &prog_id,
+				&attach_info, &probe_offset, &probe_addr);
+	if (CHECK(err < 0, "bpf_trace_event_query", "err %d errno %d\n", err,
+		  errno))
+		goto close_prog;
+
+	err = (attach_info == BPF_ATTACH_RAW_TRACEPOINT) &&
+	      (strcmp(buf, "sys_enter") == 0);
+	if (CHECK(!err, "check_results", "attach_info %d tp_name %s\n",
+		  attach_info, buf))
+		goto close_prog;
+
+	goto close_prog_noerr;
+close_prog:
+	error_cnt++;
+close_prog_noerr:
+	bpf_object__close(obj);
+}
+
+static void test_task_fd_query_tp_core(const char *probe_name,
+				       const char *tp_name)
+{
+	const char *file = "./test_tracepoint.o";
+	int err, bytes, efd, prog_fd, pmu_fd;
+	struct perf_event_attr attr = {};
+	__u64 probe_offset, probe_addr;
+	__u32 prog_id, attach_info;
+	struct bpf_object *obj;
+	__u32 duration = 0;
+	char buf[256];
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd);
+	if (CHECK(err, "bpf_prog_load", "err %d errno %d\n", err, errno))
+		goto close_prog;
+
+	snprintf(buf, sizeof(buf),
+		 "/sys/kernel/debug/tracing/events/%s/id", probe_name);
+	efd = open(buf, O_RDONLY, 0);
+	if (CHECK(efd < 0, "open", "err %d errno %d\n", efd, errno))
+		goto close_prog;
+	bytes = read(efd, buf, sizeof(buf));
+	close(efd);
+	if (CHECK(bytes <= 0 || bytes >= sizeof(buf), "read",
+		  "bytes %d errno %d\n", bytes, errno))
+		goto close_prog;
+
+	attr.config = strtol(buf, NULL, 0);
+	attr.type = PERF_TYPE_TRACEPOINT;
+	attr.sample_type = PERF_SAMPLE_RAW;
+	attr.sample_period = 1;
+	attr.wakeup_events = 1;
+	pmu_fd = syscall(__NR_perf_event_open, &attr, -1 /* pid */,
+			 0 /* cpu 0 */, -1 /* group id */,
+			 0 /* flags */);
+	if (CHECK(err, "perf_event_open", "err %d errno %d\n", err, errno))
+		goto close_pmu;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+	if (CHECK(err, "perf_event_ioc_enable", "err %d errno %d\n", err,
+		  errno))
+		goto close_pmu;
+
+	err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
+	if (CHECK(err, "perf_event_ioc_set_bpf", "err %d errno %d\n", err,
+		  errno))
+		goto close_pmu;
+
+	/* query (getpid(), pmu_fd */
+	err = bpf_task_fd_query(getpid(), pmu_fd, 0, buf, 256, &prog_id,
+				&attach_info, &probe_offset, &probe_addr);
+	if (CHECK(err < 0, "bpf_trace_event_query", "err %d errno %d\n", err,
+		  errno))
+		goto close_pmu;
+
+	err = (attach_info == BPF_ATTACH_TRACEPOINT) && !strcmp(buf, tp_name);
+	if (CHECK(!err, "check_results", "attach_info %d tp_name %s\n",
+		  attach_info, buf))
+		goto close_pmu;
+
+	close(pmu_fd);
+	goto close_prog_noerr;
+
+close_pmu:
+	close(pmu_fd);
+close_prog:
+	error_cnt++;
+close_prog_noerr:
+	bpf_object__close(obj);
+}
+
+static void test_task_fd_query_tp(void)
+{
+	test_task_fd_query_tp_core("sched/sched_switch",
+				   "sched_switch");
+	test_task_fd_query_tp_core("syscalls/sys_enter_read",
+				   "sys_enter_read");
+}
+
 int main(void)
 {
 	jit_enabled = is_jit_enabled();
@@ -1561,6 +1692,8 @@ int main(void)
 	test_stacktrace_build_id_nmi();
 	test_stacktrace_map_raw_tp();
 	test_get_stack_raw_tp();
+	test_task_fd_query_rawtp();
+	test_task_fd_query_tp();
 
 	printf("Summary: %d PASSED, %d FAILED\n", pass_cnt, error_cnt);
 	return error_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand
  2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
                   ` (4 preceding siblings ...)
  2018-05-18  5:03 ` [PATCH bpf-next v2 6/7] tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs Yonghong Song
@ 2018-05-18  5:03 ` Yonghong Song
  2018-05-18 20:51   ` Jakub Kicinski
  5 siblings, 1 reply; 9+ messages in thread
From: Yonghong Song @ 2018-05-18  5:03 UTC (permalink / raw)
  To: peterz, ast, daniel, netdev; +Cc: kernel-team

The new command "bpftool perf [show | list]" will traverse
all processes under /proc, and if any fd is associated
with a perf event, it will print out related perf event
information. Documentation is also added.

Below is an example to show the results using bcc commands.
Running the following 4 bcc commands:
  kprobe:     trace.py '__x64_sys_nanosleep'
  kretprobe:  trace.py 'r::__x64_sys_nanosleep'
  tracepoint: trace.py 't:syscalls:sys_enter_nanosleep'
  uprobe:     trace.py 'p:/home/yhs/a.out:main'

The bpftool command line and result:

  $ bpftool perf
  pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
  pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
  pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
  pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159

  $ bpftool -j perf
  {"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
  {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
  {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
  {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}

  $ bpftool prog
  5: kprobe  name probe___x64_sys  tag e495a0c82f2c7a8d  gpl
	  loaded_at 2018-05-15T04:46:37-0700  uid 0
	  xlated 200B  not jited  memlock 4096B  map_ids 4
  7: kprobe  name probe___x64_sys  tag f2fdee479a503abf  gpl
	  loaded_at 2018-05-15T04:48:32-0700  uid 0
	  xlated 200B  not jited  memlock 4096B  map_ids 7
  8: tracepoint  name tracepoint__sys  tag 5390badef2395fcf  gpl
	  loaded_at 2018-05-15T04:48:48-0700  uid 0
	  xlated 200B  not jited  memlock 4096B  map_ids 8
  9: kprobe  name probe_main_1  tag 0a87bdc2e2953b6d  gpl
	  loaded_at 2018-05-15T04:49:52-0700  uid 0
	  xlated 200B  not jited  memlock 4096B  map_ids 9

  $ ps ax | grep "python ./trace.py"
  21711 pts/0    T      0:03 python ./trace.py __x64_sys_write
  21765 pts/0    S+     0:00 python ./trace.py r::__x64_sys_nanosleep
  21767 pts/2    S+     0:00 python ./trace.py t:syscalls:sys_enter_nanosleep
  21800 pts/3    S+     0:00 python ./trace.py p:/home/yhs/a.out:main
  22374 pts/1    S+     0:00 grep --color=auto python ./trace.py

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/bpf/bpftool/Documentation/bpftool-perf.rst |  81 +++++++++
 tools/bpf/bpftool/Documentation/bpftool.rst      |   5 +-
 tools/bpf/bpftool/bash-completion/bpftool        |   9 +
 tools/bpf/bpftool/main.c                         |   3 +-
 tools/bpf/bpftool/main.h                         |   1 +
 tools/bpf/bpftool/perf.c                         | 200 +++++++++++++++++++++++
 6 files changed, 297 insertions(+), 2 deletions(-)
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-perf.rst
 create mode 100644 tools/bpf/bpftool/perf.c

diff --git a/tools/bpf/bpftool/Documentation/bpftool-perf.rst b/tools/bpf/bpftool/Documentation/bpftool-perf.rst
new file mode 100644
index 0000000..3e65375
--- /dev/null
+++ b/tools/bpf/bpftool/Documentation/bpftool-perf.rst
@@ -0,0 +1,81 @@
+================
+bpftool-perf
+================
+-------------------------------------------------------------------------------
+tool for inspection of perf related bpf prog attachments
+-------------------------------------------------------------------------------
+
+:Manual section: 8
+
+SYNOPSIS
+========
+
+	**bpftool** [*OPTIONS*] **perf** *COMMAND*
+
+	*OPTIONS* := { [{ **-j** | **--json** }] [{ **-p** | **--pretty** }] }
+
+	*COMMANDS* :=
+	{ **show** | **list** | **help** }
+
+PERF COMMANDS
+=============
+
+|	**bpftool** **perf { show | list }**
+|	**bpftool** **perf help**
+
+DESCRIPTION
+===========
+	**bpftool perf { show | list }**
+		  List all raw_tracepoint, tracepoint, kprobe attachment in the system.
+
+		  Output will start with process id and file descriptor in that process,
+		  followed by bpf program id, attachment information, and attachment point.
+		  The attachment point for raw_tracepoint/tracepoint is the trace probe name.
+		  The attachment point for k[ret]probe is either symbol name and offset,
+		  or a kernel virtual address.
+		  The attachment point for u[ret]probe is the file name and the file offset.
+
+	**bpftool perf help**
+		  Print short help message.
+
+OPTIONS
+=======
+	-h, --help
+		  Print short generic help message (similar to **bpftool help**).
+
+	-v, --version
+		  Print version number (similar to **bpftool version**).
+
+	-j, --json
+		  Generate JSON output. For commands that cannot produce JSON, this
+		  option has no effect.
+
+	-p, --pretty
+		  Generate human-readable JSON output. Implies **-j**.
+
+EXAMPLES
+========
+
+| **# bpftool perf**
+
+::
+
+      pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
+      pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
+      pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
+      pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159
+
+|
+| **# bpftool -j perf**
+
+::
+
+    [{"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
+     {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
+     {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
+     {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}]
+
+
+SEE ALSO
+========
+	**bpftool**\ (8), **bpftool-prog**\ (8), **bpftool-map**\ (8)
diff --git a/tools/bpf/bpftool/Documentation/bpftool.rst b/tools/bpf/bpftool/Documentation/bpftool.rst
index 564cb0d..b6f5d56 100644
--- a/tools/bpf/bpftool/Documentation/bpftool.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool.rst
@@ -16,7 +16,7 @@ SYNOPSIS
 
 	**bpftool** **version**
 
-	*OBJECT* := { **map** | **program** | **cgroup** }
+	*OBJECT* := { **map** | **program** | **cgroup** | **perf** }
 
 	*OPTIONS* := { { **-V** | **--version** } | { **-h** | **--help** }
 	| { **-j** | **--json** } [{ **-p** | **--pretty** }] }
@@ -30,6 +30,8 @@ SYNOPSIS
 
 	*CGROUP-COMMANDS* := { **show** | **list** | **attach** | **detach** | **help** }
 
+	*PERF-COMMANDS* := { **show** | **list** | **help** }
+
 DESCRIPTION
 ===========
 	*bpftool* allows for inspection and simple modification of BPF objects
@@ -56,3 +58,4 @@ OPTIONS
 SEE ALSO
 ========
 	**bpftool-map**\ (8), **bpftool-prog**\ (8), **bpftool-cgroup**\ (8)
+        **bpftool-perf**\ (8)
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index b301c9b..3680ad4 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -448,6 +448,15 @@ _bpftool()
                     ;;
             esac
             ;;
+        cgroup)
+            case $command in
+                *)
+                    [[ $prev == $object ]] && \
+                        COMPREPLY=( $( compgen -W 'help \
+                            show list' -- "$cur" ) )
+                    ;;
+            esac
+            ;;
     esac
 } &&
 complete -F _bpftool bpftool
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 1ec852d..eea7f14 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -87,7 +87,7 @@ static int do_help(int argc, char **argv)
 		"       %s batch file FILE\n"
 		"       %s version\n"
 		"\n"
-		"       OBJECT := { prog | map | cgroup }\n"
+		"       OBJECT := { prog | map | cgroup | perf }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, bin_name, bin_name);
@@ -216,6 +216,7 @@ static const struct cmd cmds[] = {
 	{ "prog",	do_prog },
 	{ "map",	do_map },
 	{ "cgroup",	do_cgroup },
+	{ "perf",	do_perf },
 	{ "version",	do_version },
 	{ 0 }
 };
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 6173cd9..63fdb31 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -119,6 +119,7 @@ int do_prog(int argc, char **arg);
 int do_map(int argc, char **arg);
 int do_event_pipe(int argc, char **argv);
 int do_cgroup(int argc, char **arg);
+int do_perf(int argc, char **arg);
 
 int prog_parse_fd(int *argc, char ***argv);
 int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len);
diff --git a/tools/bpf/bpftool/perf.c b/tools/bpf/bpftool/perf.c
new file mode 100644
index 0000000..baa4bbe
--- /dev/null
+++ b/tools/bpf/bpftool/perf.c
@@ -0,0 +1,200 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright (C) 2018 Facebook
+// Author: Yonghong Song <yhs@fb.com>
+
+#define _GNU_SOURCE
+#include <ctype.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <ftw.h>
+
+#include <bpf.h>
+
+#include "main.h"
+
+static void print_perf_json(int pid, int fd, __u32 prog_id, __u32 attach_info,
+			    char *buf, __u64 probe_offset, __u64 probe_addr)
+{
+	jsonw_start_object(json_wtr);
+	jsonw_int_field(json_wtr, "pid", pid);
+	jsonw_int_field(json_wtr, "fd", fd);
+	jsonw_uint_field(json_wtr, "prog_id", prog_id);
+	switch (attach_info) {
+	case BPF_ATTACH_RAW_TRACEPOINT:
+		jsonw_string_field(json_wtr, "attach_info", "raw_tracepoint");
+		jsonw_string_field(json_wtr, "tracepoint", buf);
+		break;
+	case BPF_ATTACH_TRACEPOINT:
+		jsonw_string_field(json_wtr, "attach_info", "tracepoint");
+		jsonw_string_field(json_wtr, "tracepoint", buf);
+		break;
+	case BPF_ATTACH_KPROBE:
+		jsonw_string_field(json_wtr, "attach_info", "kprobe");
+		if (buf[0] != '\0') {
+			jsonw_string_field(json_wtr, "func", buf);
+			jsonw_lluint_field(json_wtr, "offset", probe_offset);
+		} else {
+			jsonw_lluint_field(json_wtr, "addr", probe_addr);
+		}
+		break;
+	case BPF_ATTACH_KRETPROBE:
+		jsonw_string_field(json_wtr, "attach_info", "kretprobe");
+		if (buf[0] != '\0') {
+			jsonw_string_field(json_wtr, "func", buf);
+			jsonw_lluint_field(json_wtr, "offset", probe_offset);
+		} else {
+			jsonw_lluint_field(json_wtr, "addr", probe_addr);
+		}
+		break;
+	case BPF_ATTACH_UPROBE:
+		jsonw_string_field(json_wtr, "attach_info", "uprobe");
+		jsonw_string_field(json_wtr, "filename", buf);
+		jsonw_lluint_field(json_wtr, "offset", probe_offset);
+		break;
+	case BPF_ATTACH_URETPROBE:
+		jsonw_string_field(json_wtr, "attach_info", "uretprobe");
+		jsonw_string_field(json_wtr, "filename", buf);
+		jsonw_lluint_field(json_wtr, "offset", probe_offset);
+		break;
+	}
+	jsonw_end_object(json_wtr);
+}
+
+static void print_perf_plain(int pid, int fd, __u32 prog_id, __u32 attach_info,
+			    char *buf, __u64 probe_offset, __u64 probe_addr)
+{
+	printf("pid %d  fd %d: prog_id %u  ", pid, fd, prog_id);
+	switch (attach_info) {
+	case BPF_ATTACH_RAW_TRACEPOINT:
+		printf("raw_tracepoint  %s\n", buf);
+		break;
+	case BPF_ATTACH_TRACEPOINT:
+		printf("tracepoint  %s\n", buf);
+		break;
+	case BPF_ATTACH_KPROBE:
+		if (buf[0] != '\0')
+			printf("kprobe  func %s  offset %llu\n", buf,
+			       probe_offset);
+		else
+			printf("kprobe  addr %llu\n", probe_addr);
+		break;
+	case BPF_ATTACH_KRETPROBE:
+		if (buf[0] != '\0')
+			printf("kretprobe  func %s  offset %llu\n", buf,
+			       probe_offset);
+		else
+			printf("kretprobe  addr %llu\n", probe_addr);
+		break;
+	case BPF_ATTACH_UPROBE:
+		printf("uprobe  filename %s  offset %llu\n", buf, probe_offset);
+		break;
+	case BPF_ATTACH_URETPROBE:
+		printf("uretprobe  filename %s  offset %llu\n", buf,
+		       probe_offset);
+		break;
+	}
+}
+
+static int show_proc(const char *fpath, const struct stat *sb,
+		     int tflag, struct FTW *ftwbuf)
+{
+	__u64 probe_offset, probe_addr;
+	__u32 prog_id, attach_info;
+	int err, pid = 0, fd = 0;
+	const char *pch;
+	char buf[4096];
+
+	/* prefix always /proc */
+	pch = fpath + 5;
+	if (*pch == '\0')
+		return 0;
+
+	/* pid should be all numbers */
+	pch++;
+	while (isdigit(*pch)) {
+		pid = pid * 10 + *pch - '0';
+		pch++;
+	}
+	if (*pch == '\0')
+		return 0;
+	if (*pch != '/')
+		return FTW_SKIP_SUBTREE;
+
+	/* check /proc/<pid>/fd directory */
+	pch++;
+	if (strncmp(pch, "fd", 2))
+		return FTW_SKIP_SUBTREE;
+	pch += 2;
+	if (*pch == '\0')
+		return 0;
+	if (*pch != '/')
+		return FTW_SKIP_SUBTREE;
+
+	/* check /proc/<pid>/fd/<fd_num> */
+	pch++;
+	while (isdigit(*pch)) {
+		fd = fd * 10 + *pch - '0';
+		pch++;
+	}
+	if (*pch != '\0')
+		return FTW_SKIP_SUBTREE;
+
+	/* query (pid, fd) for potential perf events */
+	err = bpf_task_fd_query(pid, fd, 0, buf, sizeof(buf), &prog_id,
+				&attach_info, &probe_offset, &probe_addr);
+	if (err < 0)
+		return 0;
+
+	if (json_output)
+		print_perf_json(pid, fd, prog_id, attach_info, buf, probe_offset,
+				probe_addr);
+	else
+		print_perf_plain(pid, fd, prog_id, attach_info, buf, probe_offset,
+				 probe_addr);
+
+	return 0;
+}
+
+static int do_show(int argc, char **argv)
+{
+	int err = 0, nopenfd = 16;
+	int flags = FTW_ACTIONRETVAL | FTW_PHYS;
+
+	if (json_output)
+		jsonw_start_array(json_wtr);
+	if (nftw("/proc", show_proc, nopenfd, flags) == -1) {
+		p_err("%s", strerror(errno));
+		err = -1;
+	}
+	if (json_output)
+		jsonw_end_array(json_wtr);
+
+	return err;
+}
+
+static int do_help(int argc, char **argv)
+{
+	fprintf(stderr,
+		"Usage: %s %s { show | list | help }\n"
+		"",
+		bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "show",	do_show },
+	{ "list",	do_show },
+	{ "help",	do_help },
+	{ 0 }
+};
+
+int do_perf(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand
  2018-05-18  5:03 ` [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand Yonghong Song
@ 2018-05-18 20:51   ` Jakub Kicinski
  2018-05-18 22:13     ` Y Song
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2018-05-18 20:51 UTC (permalink / raw)
  To: Yonghong Song, Quentin Monnet; +Cc: peterz, ast, daniel, netdev, kernel-team

On Thu, 17 May 2018 22:03:10 -0700, Yonghong Song wrote:
> The new command "bpftool perf [show | list]" will traverse
> all processes under /proc, and if any fd is associated
> with a perf event, it will print out related perf event
> information. Documentation is also added.

Thanks for the changes, it looks good with some minor nits which can be
addressed as follow up if there is no other need to respin.  Please
consider it:

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

> Below is an example to show the results using bcc commands.
> Running the following 4 bcc commands:
>   kprobe:     trace.py '__x64_sys_nanosleep'
>   kretprobe:  trace.py 'r::__x64_sys_nanosleep'
>   tracepoint: trace.py 't:syscalls:sys_enter_nanosleep'
>   uprobe:     trace.py 'p:/home/yhs/a.out:main'
> 
> The bpftool command line and result:
> 
>   $ bpftool perf
>   pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
>   pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
>   pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
>   pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159
> 
>   $ bpftool -j perf
>   {"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
>   {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
>   {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
>   {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}

nit: this is now an array

>   $ bpftool prog
>   5: kprobe  name probe___x64_sys  tag e495a0c82f2c7a8d  gpl
> 	  loaded_at 2018-05-15T04:46:37-0700  uid 0
> 	  xlated 200B  not jited  memlock 4096B  map_ids 4
>   7: kprobe  name probe___x64_sys  tag f2fdee479a503abf  gpl
> 	  loaded_at 2018-05-15T04:48:32-0700  uid 0
> 	  xlated 200B  not jited  memlock 4096B  map_ids 7
>   8: tracepoint  name tracepoint__sys  tag 5390badef2395fcf  gpl
> 	  loaded_at 2018-05-15T04:48:48-0700  uid 0
> 	  xlated 200B  not jited  memlock 4096B  map_ids 8
>   9: kprobe  name probe_main_1  tag 0a87bdc2e2953b6d  gpl
> 	  loaded_at 2018-05-15T04:49:52-0700  uid 0
> 	  xlated 200B  not jited  memlock 4096B  map_ids 9
> 
>   $ ps ax | grep "python ./trace.py"
>   21711 pts/0    T      0:03 python ./trace.py __x64_sys_write
>   21765 pts/0    S+     0:00 python ./trace.py r::__x64_sys_nanosleep
>   21767 pts/2    S+     0:00 python ./trace.py t:syscalls:sys_enter_nanosleep
>   21800 pts/3    S+     0:00 python ./trace.py p:/home/yhs/a.out:main
>   22374 pts/1    S+     0:00 grep --color=auto python ./trace.py
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>

> diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
> index b301c9b..3680ad4 100644
> --- a/tools/bpf/bpftool/bash-completion/bpftool
> +++ b/tools/bpf/bpftool/bash-completion/bpftool
> @@ -448,6 +448,15 @@ _bpftool()
>                      ;;
>              esac
>              ;;
> +        cgroup)

s/cgroup/perf/ :)

> +            case $command in
> +                *)
> +                    [[ $prev == $object ]] && \
> +                        COMPREPLY=( $( compgen -W 'help \
> +                            show list' -- "$cur" ) )
> +                    ;;
> +            esac
> +            ;;
>      esac
>  } &&
>  complete -F _bpftool bpftool

> +static int show_proc(const char *fpath, const struct stat *sb,
> +		     int tflag, struct FTW *ftwbuf)
> +{
> +	__u64 probe_offset, probe_addr;
> +	__u32 prog_id, attach_info;
> +	int err, pid = 0, fd = 0;
> +	const char *pch;
> +	char buf[4096];
> +
> +	/* prefix always /proc */
> +	pch = fpath + 5;
> +	if (*pch == '\0')
> +		return 0;
> +
> +	/* pid should be all numbers */
> +	pch++;
> +	while (isdigit(*pch)) {
> +		pid = pid * 10 + *pch - '0';
> +		pch++;
> +	}
> +	if (*pch == '\0')
> +		return 0;
> +	if (*pch != '/')
> +		return FTW_SKIP_SUBTREE;
> +
> +	/* check /proc/<pid>/fd directory */
> +	pch++;
> +	if (strncmp(pch, "fd", 2))
> +		return FTW_SKIP_SUBTREE;
> +	pch += 2;
> +	if (*pch == '\0')
> +		return 0;
> +	if (*pch != '/')
> +		return FTW_SKIP_SUBTREE;
> +
> +	/* check /proc/<pid>/fd/<fd_num> */
> +	pch++;
> +	while (isdigit(*pch)) {
> +		fd = fd * 10 + *pch - '0';
> +		pch++;
> +	}
> +	if (*pch != '\0')
> +		return FTW_SKIP_SUBTREE;
> +
> +	/* query (pid, fd) for potential perf events */
> +	err = bpf_task_fd_query(pid, fd, 0, buf, sizeof(buf), &prog_id,
> +				&attach_info, &probe_offset, &probe_addr);
> +	if (err < 0)
> +		return 0;

nit: it could be nice from user perspective to detect whether kernel
     supports the command and fail if not.  Otherwise user is not sure
     if there is no output because kernel lacks support or because
     there were really no attached progs.  Just a thought, not really
     a requirement.

> +	if (json_output)
> +		print_perf_json(pid, fd, prog_id, attach_info, buf, probe_offset,
> +				probe_addr);
> +	else
> +		print_perf_plain(pid, fd, prog_id, attach_info, buf, probe_offset,
> +				 probe_addr);
> +
> +	return 0;
> +}
> +
> +static int do_show(int argc, char **argv)
> +{
> +	int err = 0, nopenfd = 16;
> +	int flags = FTW_ACTIONRETVAL | FTW_PHYS;

nit: reverse xmas tree

> +	if (json_output)
> +		jsonw_start_array(json_wtr);
> +	if (nftw("/proc", show_proc, nopenfd, flags) == -1) {
> +		p_err("%s", strerror(errno));
> +		err = -1;
> +	}
> +	if (json_output)
> +		jsonw_end_array(json_wtr);
> +
> +	return err;
> +}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand
  2018-05-18 20:51   ` Jakub Kicinski
@ 2018-05-18 22:13     ` Y Song
  0 siblings, 0 replies; 9+ messages in thread
From: Y Song @ 2018-05-18 22:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Yonghong Song, Quentin Monnet, peterz, Alexei Starovoitov,
	Daniel Borkmann, netdev, kernel-team

On Fri, May 18, 2018 at 1:51 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Thu, 17 May 2018 22:03:10 -0700, Yonghong Song wrote:
>> The new command "bpftool perf [show | list]" will traverse
>> all processes under /proc, and if any fd is associated
>> with a perf event, it will print out related perf event
>> information. Documentation is also added.
>
> Thanks for the changes, it looks good with some minor nits which can be
> addressed as follow up if there is no other need to respin.  Please
> consider it:
>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Most likely will need respin. Will make suggested changes then.

>
>> Below is an example to show the results using bcc commands.
>> Running the following 4 bcc commands:
>>   kprobe:     trace.py '__x64_sys_nanosleep'
>>   kretprobe:  trace.py 'r::__x64_sys_nanosleep'
>>   tracepoint: trace.py 't:syscalls:sys_enter_nanosleep'
>>   uprobe:     trace.py 'p:/home/yhs/a.out:main'
>>
>> The bpftool command line and result:
>>
>>   $ bpftool perf
>>   pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
>>   pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
>>   pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
>>   pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159
>>
>>   $ bpftool -j perf
>>   {"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
>>   {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
>>   {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
>>   {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}
>
> nit: this is now an array

Sorry, this is probably updated in middle of work. Will make the change in
the next revision.

>
>>   $ bpftool prog
>>   5: kprobe  name probe___x64_sys  tag e495a0c82f2c7a8d  gpl
>>         loaded_at 2018-05-15T04:46:37-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 4
>>   7: kprobe  name probe___x64_sys  tag f2fdee479a503abf  gpl
>>         loaded_at 2018-05-15T04:48:32-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 7
>>   8: tracepoint  name tracepoint__sys  tag 5390badef2395fcf  gpl
>>         loaded_at 2018-05-15T04:48:48-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 8
>>   9: kprobe  name probe_main_1  tag 0a87bdc2e2953b6d  gpl
>>         loaded_at 2018-05-15T04:49:52-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 9
>>
>>   $ ps ax | grep "python ./trace.py"
>>   21711 pts/0    T      0:03 python ./trace.py __x64_sys_write
>>   21765 pts/0    S+     0:00 python ./trace.py r::__x64_sys_nanosleep
>>   21767 pts/2    S+     0:00 python ./trace.py t:syscalls:sys_enter_nanosleep
>>   21800 pts/3    S+     0:00 python ./trace.py p:/home/yhs/a.out:main
>>   22374 pts/1    S+     0:00 grep --color=auto python ./trace.py
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>
>> diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
>> index b301c9b..3680ad4 100644
>> --- a/tools/bpf/bpftool/bash-completion/bpftool
>> +++ b/tools/bpf/bpftool/bash-completion/bpftool
>> @@ -448,6 +448,15 @@ _bpftool()
>>                      ;;
>>              esac
>>              ;;
>> +        cgroup)
>
> s/cgroup/perf/ :)

A mistake in my side to consolidate different version of code.
I did have "perf" in one of my versions and tested it properly.

>
>> +            case $command in
>> +                *)
>> +                    [[ $prev == $object ]] && \
>> +                        COMPREPLY=( $( compgen -W 'help \
>> +                            show list' -- "$cur" ) )
>> +                    ;;
>> +            esac
>> +            ;;
>>      esac
>>  } &&
>>  complete -F _bpftool bpftool
>
>> +static int show_proc(const char *fpath, const struct stat *sb,
>> +                  int tflag, struct FTW *ftwbuf)
>> +{
>> +     __u64 probe_offset, probe_addr;
>> +     __u32 prog_id, attach_info;
>> +     int err, pid = 0, fd = 0;
>> +     const char *pch;
>> +     char buf[4096];
>> +
>> +     /* prefix always /proc */
>> +     pch = fpath + 5;
>> +     if (*pch == '\0')
>> +             return 0;
>> +
>> +     /* pid should be all numbers */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             pid = pid * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd directory */
>> +     pch++;
>> +     if (strncmp(pch, "fd", 2))
>> +             return FTW_SKIP_SUBTREE;
>> +     pch += 2;
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd/<fd_num> */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             fd = fd * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch != '\0')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* query (pid, fd) for potential perf events */
>> +     err = bpf_task_fd_query(pid, fd, 0, buf, sizeof(buf), &prog_id,
>> +                             &attach_info, &probe_offset, &probe_addr);
>> +     if (err < 0)
>> +             return 0;
>
> nit: it could be nice from user perspective to detect whether kernel
>      supports the command and fail if not.  Otherwise user is not sure
>      if there is no output because kernel lacks support or because
>      there were really no attached progs.  Just a thought, not really
>      a requirement.

I agree with you. it is good to output an error if the kernel does not
support the syscall (e.g., either non-root or new subcommand is not
supported).

>
>> +     if (json_output)
>> +             print_perf_json(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                             probe_addr);
>> +     else
>> +             print_perf_plain(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                              probe_addr);
>> +
>> +     return 0;
>> +}
>> +
>> +static int do_show(int argc, char **argv)
>> +{
>> +     int err = 0, nopenfd = 16;
>> +     int flags = FTW_ACTIONRETVAL | FTW_PHYS;
>
> nit: reverse xmas tree

Will make the change in the next revision.

>
>> +     if (json_output)
>> +             jsonw_start_array(json_wtr);
>> +     if (nftw("/proc", show_proc, nopenfd, flags) == -1) {
>> +             p_err("%s", strerror(errno));
>> +             err = -1;
>> +     }
>> +     if (json_output)
>> +             jsonw_end_array(json_wtr);
>> +
>> +     return err;
>> +}

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-05-18 22:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-18  5:03 [PATCH bpf-next v2 0/7] bpf: implement BPF_TASK_FD_QUERY Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 3/7] tools/bpf: sync kernel header bpf.h and add bpf_trace_event_query in libbpf Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 4/7] tools/bpf: add ksym_get_addr() in trace_helpers Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 5/7] samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 6/7] tools/bpf: add two BPF_TASK_FD_QUERY tests in test_progs Yonghong Song
2018-05-18  5:03 ` [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand Yonghong Song
2018-05-18 20:51   ` Jakub Kicinski
2018-05-18 22:13     ` Y Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.