linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE
@ 2019-08-16 22:31 Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-16 22:31 UTC (permalink / raw)
  To: bpf, songliubraving, yhs, andriin, peterz, mingo, acme
  Cc: Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

It's useful to know [uk]probe's nmissed and nhit stats. For example with
tracing tools, it's important to know when events may have been lost.
debugfs currently exposes a control file to get this information, but
it is not compatible with probes registered with the perf API.

While bpf programs may be able to manually count nhit, there is no way
to gather nmissed. In other words, it is currently not possible to
retrieve information about FD-based probes.

This patch adds a new ioctl that lets users query nmissed (as well as
nhit for completeness). We currently only add support for [uk]probes
but leave the possibility open for other probes like tracepoint.

v2 -> v3:
- Introduce bpf_link_type and associated getter to track underlying link
  types
- Add back size field in perf_event_query_probe for forward/backwards
  compat
- Remove NULL checks, fix typos

v1 -> v2:
- More descriptive cover letter
- Make API more generic and support uprobes as well
- Use casters/getters for libbpf instead of single getter
- Fix typos
- Remove size field from ioctl struct
- Split out libbpf.h sync to tools dir to separate commit

Daniel Xu (4):
  tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  libbpf: Add helpers to extract perf fd from bpf_link
  tracing/probe: Sync perf_event.h to tools
  tracing/probe: Add self test for PERF_EVENT_IOC_QUERY_PROBE

 include/linux/trace_events.h                  |  12 ++
 include/uapi/linux/perf_event.h               |  23 ++++
 kernel/events/core.c                          |  20 ++++
 kernel/trace/trace_kprobe.c                   |  24 ++++
 kernel/trace/trace_uprobe.c                   |  24 ++++
 tools/include/uapi/linux/perf_event.h         |  23 ++++
 tools/lib/bpf/libbpf.c                        |  21 ++++
 tools/lib/bpf/libbpf.h                        |  13 +++
 tools/lib/bpf/libbpf.map                      |   4 +
 .../selftests/bpf/prog_tests/attach_probe.c   | 106 ++++++++++++++++++
 10 files changed, 270 insertions(+)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-16 22:31 [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
@ 2019-08-16 22:31 ` Daniel Xu
  2019-08-19 18:22   ` kbuild test robot
                     ` (3 more replies)
  2019-08-16 22:31 ` [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link Daniel Xu
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-16 22:31 UTC (permalink / raw)
  To: bpf, songliubraving, yhs, andriin, peterz, mingo, acme
  Cc: Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

It's useful to know [uk]probe's nmissed and nhit stats. For example with
tracing tools, it's important to know when events may have been lost.
debugfs currently exposes a control file to get this information, but
it is not compatible with probes registered with the perf API.

While bpf programs may be able to manually count nhit, there is no way
to gather nmissed. In other words, it is currently not possible to
retrieve information about FD-based probes.

This patch adds a new ioctl that lets users query nmissed (as well as
nhit for completeness). We currently only add support for [uk]probes
but leave the possibility open for other probes like tracepoint.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 include/linux/trace_events.h    | 12 ++++++++++++
 include/uapi/linux/perf_event.h | 23 +++++++++++++++++++++++
 kernel/events/core.c            | 20 ++++++++++++++++++++
 kernel/trace/trace_kprobe.c     | 24 ++++++++++++++++++++++++
 kernel/trace/trace_uprobe.c     | 24 ++++++++++++++++++++++++
 5 files changed, 103 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 5150436783e8..61558f19696a 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -586,6 +586,12 @@ extern int bpf_get_kprobe_info(const struct perf_event *event,
 			       u32 *fd_type, const char **symbol,
 			       u64 *probe_offset, u64 *probe_addr,
 			       bool perf_type_tracepoint);
+extern int perf_kprobe_event_query(struct perf_event *event, void __user *info);
+#else
+int perf_kprobe_event_query(struct perf_event *event, void __user *info)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 #ifdef CONFIG_UPROBE_EVENTS
 extern int  perf_uprobe_init(struct perf_event *event,
@@ -594,6 +600,12 @@ extern void perf_uprobe_destroy(struct perf_event *event);
 extern int bpf_get_uprobe_info(const struct perf_event *event,
 			       u32 *fd_type, const char **filename,
 			       u64 *probe_offset, bool perf_type_tracepoint);
+extern int perf_uprobe_event_query(struct perf_event *event, void __user *info);
+#else
+int perf_uprobe_event_query(struct perf_event *event, void __user *info)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 				     char *filter_str);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7198ddd0c6b1..8783d29a807a 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -447,6 +447,28 @@ struct perf_event_query_bpf {
 	__u32	ids[0];
 };
 
+/*
+ * Structure used by below PERF_EVENT_IOC_QUERY_PROBE command
+ * to query information about the probe attached to the perf
+ * event. Currently only supports [uk]probes.
+ */
+struct perf_event_query_probe {
+	/*
+	 * Size of structure for forward/backward compatibility
+	 */
+	__u64	size;
+	/*
+	 * Set by the kernel to indicate number of times this probe
+	 * was temporarily disabled
+	 */
+	__u64	nmissed;
+	/*
+	 * Set by the kernel to indicate number of times this probe
+	 * was hit
+	 */
+	__u64	nhit;
+};
+
 /*
  * Ioctls that can be done on a perf event fd:
  */
@@ -462,6 +484,7 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_QUERY_PROBE		_IOR('$', 12, struct perf_event_query_probe *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0463c1151bae..ed33d50511a3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5060,6 +5060,8 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd);
 static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			  struct perf_event_attr *attr);
+static int perf_probe_event_query(struct perf_event *event,
+				    void __user *info);
 
 static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
@@ -5143,6 +5145,10 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 
 		return perf_event_modify_attr(event,  &new_attr);
 	}
+#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
+	case PERF_EVENT_IOC_QUERY_PROBE:
+		return perf_probe_event_query(event, (void __user *)arg);
+#endif
 	default:
 		return -ENOTTY;
 	}
@@ -8833,6 +8839,20 @@ static inline void perf_tp_register(void)
 #endif
 }
 
+static int perf_probe_event_query(struct perf_event *event,
+				    void __user *info)
+{
+#ifdef CONFIG_KPROBE_EVENTS
+	if (event->attr.type == perf_kprobe.type)
+		return perf_kprobe_event_query(event, (void __user *)info);
+#endif
+#ifdef CONFIG_UPROBE_EVENTS
+	if (event->attr.type == perf_uprobe.type)
+		return perf_uprobe_event_query(event, (void __user *)info);
+#endif
+	return -EINVAL;
+}
+
 static void perf_event_free_filter(struct perf_event *event)
 {
 	ftrace_profile_free_filter(event);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 9d483ad9bb6c..89fbe3e97562 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -196,6 +196,30 @@ bool trace_kprobe_error_injectable(struct trace_event_call *call)
 	return within_error_injection_list(trace_kprobe_address(tk));
 }
 
+int perf_kprobe_event_query(struct perf_event *event, void __user *info)
+{
+	struct perf_event_query_probe __user *uquery = info;
+	struct perf_event_query_probe query = {};
+	struct trace_event_call *call = event->tp_event;
+	struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
+	u64 ncopy;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+	if (copy_from_user(&query, uquery,
+			   offsetofend(struct perf_event_query_probe, size)))
+		return -EFAULT;
+
+	ncopy = min_t(u64, query.size, sizeof(query));
+	query.nhit = trace_kprobe_nhit(tk);
+	query.nmissed = tk->rp.kp.nmissed;
+
+	if (copy_to_user(uquery, &query, ncopy))
+		return -EFAULT;
+
+	return 0;
+}
+
 static int register_kprobe_event(struct trace_kprobe *tk);
 static int unregister_kprobe_event(struct trace_kprobe *tk);
 
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 1ceedb9146b1..ecdf2bdb91a7 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1333,6 +1333,30 @@ static inline void init_trace_event_call(struct trace_uprobe *tu)
 	call->data = tu;
 }
 
+int perf_uprobe_event_query(struct perf_event *event, void __user *info)
+{
+	struct perf_event_query_probe __user *uquery = info;
+	struct perf_event_query_probe query = {};
+	struct trace_event_call *call = event->tp_event;
+	struct trace_uprobe *tu = (struct trace_uprobe *)call->data;
+	u64 ncopy;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+	if (copy_from_user(&query, uquery,
+			   offsetofend(struct perf_event_query_probe, size)))
+		return -EFAULT;
+
+	ncopy = min_t(u64, query.size, sizeof(query));
+	query.nhit = tu->nhit;
+	query.nmissed = 0;
+
+	if (copy_to_user(uquery, &query, ncopy))
+		return -EFAULT;
+
+	return 0;
+}
+
 static int register_uprobe_event(struct trace_uprobe *tu)
 {
 	init_trace_event_call(tu);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link
  2019-08-16 22:31 [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
@ 2019-08-16 22:31 ` Daniel Xu
  2019-08-19 17:45   ` Andrii Nakryiko
  2019-08-16 22:31 ` [PATCH v3 bpf-next 3/4] tracing/probe: Sync perf_event.h to tools Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 4/4] tracing/probe: Add self test for PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
  3 siblings, 1 reply; 25+ messages in thread
From: Daniel Xu @ 2019-08-16 22:31 UTC (permalink / raw)
  To: bpf, songliubraving, yhs, andriin, peterz, mingo, acme
  Cc: Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

It is sometimes necessary to perform ioctl's on the underlying perf fd.
There is not currently a way to extract the fd given a bpf_link, so add a
a pair of casting and getting helpers.

The casting and getting helpers are nice because they let us define
broad categories of links that makes it clear to users what they can
expect to extract from what type of link.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 tools/lib/bpf/libbpf.c   | 21 +++++++++++++++++++++
 tools/lib/bpf/libbpf.h   | 13 +++++++++++++
 tools/lib/bpf/libbpf.map |  4 ++++
 3 files changed, 38 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2233f919dd88..41588e13be2b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4876,6 +4876,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 
 struct bpf_link {
 	int (*destroy)(struct bpf_link *link);
+	enum bpf_link_type type;
 };
 
 int bpf_link__destroy(struct bpf_link *link)
@@ -4909,6 +4910,24 @@ static int bpf_link__destroy_perf_event(struct bpf_link *link)
 	return err;
 }
 
+const struct bpf_link_fd *bpf_link__as_fd(const struct bpf_link *link)
+{
+	if (link->type != LIBBPF_LINK_FD)
+		return NULL;
+
+	return (struct bpf_link_fd *)link;
+}
+
+enum bpf_link_type bpf_link__type(const struct bpf_link *link)
+{
+	return link->type;
+}
+
+int bpf_link_fd__fd(const struct bpf_link_fd *link)
+{
+	return link->fd;
+}
+
 struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog,
 						int pfd)
 {
@@ -4932,6 +4951,7 @@ struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog,
 	if (!link)
 		return ERR_PTR(-ENOMEM);
 	link->link.destroy = &bpf_link__destroy_perf_event;
+	link->link.type = LIBBPF_LINK_FD;
 	link->fd = pfd;
 
 	if (ioctl(pfd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0) {
@@ -5225,6 +5245,7 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
 	link = malloc(sizeof(*link));
 	if (!link)
 		return ERR_PTR(-ENOMEM);
+	link->link.type = LIBBPF_LINK_FD;
 	link->link.destroy = &bpf_link__destroy_fd;
 
 	pfd = bpf_raw_tracepoint_open(tp_name, prog_fd);
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index e8f70977d137..2ddef5315ff9 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -166,7 +166,20 @@ LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path);
 LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path);
 LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
 
+enum bpf_link_type {
+	LIBBPF_LINK_FD,
+};
+
 struct bpf_link;
+struct bpf_link_fd;
+
+/* casting APIs */
+LIBBPF_API const struct bpf_link_fd *
+bpf_link__as_fd(const struct bpf_link *link);
+
+/* getters APIs */
+LIBBPF_API enum bpf_link_type bpf_link__type(const struct bpf_link *link);
+LIBBPF_API int bpf_link_fd__fd(const struct bpf_link_fd *link);
 
 LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
 
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 4e72df8e98ba..ee9945177100 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -186,4 +186,8 @@ LIBBPF_0.0.4 {
 } LIBBPF_0.0.3;
 
 LIBBPF_0.0.5 {
+	global:
+		bpf_link__type;
+		bpf_link__as_fd;
+		bpf_link_fd__fd;
 } LIBBPF_0.0.4;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 bpf-next 3/4] tracing/probe: Sync perf_event.h to tools
  2019-08-16 22:31 [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link Daniel Xu
@ 2019-08-16 22:31 ` Daniel Xu
  2019-08-16 22:31 ` [PATCH v3 bpf-next 4/4] tracing/probe: Add self test for PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
  3 siblings, 0 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-16 22:31 UTC (permalink / raw)
  To: bpf, songliubraving, yhs, andriin, peterz, mingo, acme
  Cc: Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 tools/include/uapi/linux/perf_event.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7198ddd0c6b1..8783d29a807a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -447,6 +447,28 @@ struct perf_event_query_bpf {
 	__u32	ids[0];
 };
 
+/*
+ * Structure used by below PERF_EVENT_IOC_QUERY_PROBE command
+ * to query information about the probe attached to the perf
+ * event. Currently only supports [uk]probes.
+ */
+struct perf_event_query_probe {
+	/*
+	 * Size of structure for forward/backward compatibility
+	 */
+	__u64	size;
+	/*
+	 * Set by the kernel to indicate number of times this probe
+	 * was temporarily disabled
+	 */
+	__u64	nmissed;
+	/*
+	 * Set by the kernel to indicate number of times this probe
+	 * was hit
+	 */
+	__u64	nhit;
+};
+
 /*
  * Ioctls that can be done on a perf event fd:
  */
@@ -462,6 +484,7 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_QUERY_PROBE		_IOR('$', 12, struct perf_event_query_probe *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 bpf-next 4/4] tracing/probe: Add self test for PERF_EVENT_IOC_QUERY_PROBE
  2019-08-16 22:31 [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
                   ` (2 preceding siblings ...)
  2019-08-16 22:31 ` [PATCH v3 bpf-next 3/4] tracing/probe: Sync perf_event.h to tools Daniel Xu
@ 2019-08-16 22:31 ` Daniel Xu
  3 siblings, 0 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-16 22:31 UTC (permalink / raw)
  To: bpf, songliubraving, yhs, andriin, peterz, mingo, acme
  Cc: Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
 .../selftests/bpf/prog_tests/attach_probe.c   | 106 ++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
index 5ecc267d98b0..c14db7557881 100644
--- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
+++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
@@ -27,17 +27,27 @@ void test_attach_probe(void)
 	const char *kretprobe_name = "kretprobe/sys_nanosleep";
 	const char *uprobe_name = "uprobe/trigger_func";
 	const char *uretprobe_name = "uretprobe/trigger_func";
+	struct perf_event_query_probe kprobe_query = {};
+	struct perf_event_query_probe kretprobe_query = {};
+	struct perf_event_query_probe uprobe_query = {};
+	struct perf_event_query_probe uretprobe_query = {};
 	const int kprobe_idx = 0, kretprobe_idx = 1;
 	const int uprobe_idx = 2, uretprobe_idx = 3;
 	const char *file = "./test_attach_probe.o";
 	struct bpf_program *kprobe_prog, *kretprobe_prog;
 	struct bpf_program *uprobe_prog, *uretprobe_prog;
 	struct bpf_object *obj;
+	const struct bpf_link_fd *kprobe_fd_link;
+	const struct bpf_link_fd *kretprobe_fd_link;
+	const struct bpf_link_fd *uprobe_fd_link;
+	const struct bpf_link_fd *uretprobe_fd_link;
 	int err, prog_fd, duration = 0, res;
 	struct bpf_link *kprobe_link = NULL;
 	struct bpf_link *kretprobe_link = NULL;
 	struct bpf_link *uprobe_link = NULL;
 	struct bpf_link *uretprobe_link = NULL;
+	int kprobe_fd, kretprobe_fd;
+	int uprobe_fd, uretprobe_fd;
 	int results_map_fd;
 	size_t uprobe_offset;
 	ssize_t base_addr;
@@ -116,6 +126,54 @@ void test_attach_probe(void)
 	/* trigger & validate kprobe && kretprobe */
 	usleep(1);
 
+	kprobe_fd_link = bpf_link__as_fd(kprobe_link);
+	if (CHECK(!kprobe_fd_link, "kprobe_link_as_fd",
+		  "failed to cast link to fd link\n"))
+		goto cleanup;
+
+	kprobe_fd = bpf_link_fd__fd(kprobe_fd_link);
+	if (CHECK(kprobe_fd < 0, "kprobe_get_perf_fd",
+	    "failed to get perf fd from kprobe link\n"))
+		goto cleanup;
+
+	kretprobe_fd_link = bpf_link__as_fd(kretprobe_link);
+	if (CHECK(!kretprobe_fd_link, "kretprobe_link_as_fd",
+		  "failed to cast link to fd link\n"))
+		goto cleanup;
+
+	kretprobe_fd = bpf_link_fd__fd(kretprobe_fd_link);
+	if (CHECK(kretprobe_fd < 0, "kretprobe_get_perf_fd",
+	    "failed to get perf fd from kretprobe link\n"))
+		goto cleanup;
+
+	kprobe_query.size = sizeof(kprobe_query);
+	err = ioctl(kprobe_fd, PERF_EVENT_IOC_QUERY_PROBE, &kprobe_query);
+	if (CHECK(err, "get_kprobe_ioctl",
+		  "failed to issue kprobe query ioctl\n"))
+		goto cleanup;
+	if (CHECK(kprobe_query.nmissed > 0, "get_kprobe_ioctl",
+		  "read incorrect nmissed from kprobe_ioctl: %llu\n",
+		  kprobe_query.nmissed))
+		goto cleanup;
+	if (CHECK(kprobe_query.nhit == 0, "get_kprobe_ioctl",
+		  "read incorrect nhit from kprobe_ioctl: %llu\n",
+		  kprobe_query.nhit))
+		goto cleanup;
+
+	kretprobe_query.size = sizeof(kretprobe_query);
+	err = ioctl(kretprobe_fd, PERF_EVENT_IOC_QUERY_PROBE, &kretprobe_query);
+	if (CHECK(err, "get_kretprobe_ioctl",
+		  "failed to issue kretprobe query ioctl\n"))
+		goto cleanup;
+	if (CHECK(kretprobe_query.nmissed > 0, "get_kretprobe_ioctl",
+		  "read incorrect nmissed from kretprobe_ioctl: %llu\n",
+		  kretprobe_query.nmissed))
+		goto cleanup;
+	if (CHECK(kretprobe_query.nhit <= 0, "get_kretprobe_ioctl",
+		  "read incorrect nhit from kretprobe_ioctl: %llu\n",
+		  kretprobe_query.nhit))
+		goto cleanup;
+
 	err = bpf_map_lookup_elem(results_map_fd, &kprobe_idx, &res);
 	if (CHECK(err, "get_kprobe_res",
 		  "failed to get kprobe res: %d\n", err))
@@ -135,6 +193,54 @@ void test_attach_probe(void)
 	/* trigger & validate uprobe & uretprobe */
 	get_base_addr();
 
+	uprobe_fd_link = bpf_link__as_fd(uprobe_link);
+	if (CHECK(!uprobe_fd_link, "uprobe_link_as_fd",
+		  "failed to cast link to fd link\n"))
+		goto cleanup;
+
+	uprobe_fd = bpf_link_fd__fd(uprobe_fd_link);
+	if (CHECK(uprobe_fd < 0, "uprobe_get_perf_fd",
+	    "failed to get perf fd from uprobe link\n"))
+		goto cleanup;
+
+	uretprobe_fd_link = bpf_link__as_fd(uretprobe_link);
+	if (CHECK(!uretprobe_fd_link, "uretprobe_link_as_fd",
+		  "failed to cast link to fd link\n"))
+		goto cleanup;
+
+	uretprobe_fd = bpf_link_fd__fd(uretprobe_fd_link);
+	if (CHECK(uretprobe_fd < 0, "uretprobe_get_perf_fd",
+	    "failed to get perf fd from uretprobe link\n"))
+		goto cleanup;
+
+	uprobe_query.size = sizeof(uprobe_query);
+	err = ioctl(uprobe_fd, PERF_EVENT_IOC_QUERY_PROBE, &uprobe_query);
+	if (CHECK(err, "get_uprobe_ioctl",
+		  "failed to issue uprobe query ioctl\n"))
+		goto cleanup;
+	if (CHECK(uprobe_query.nmissed > 0, "get_uprobe_ioctl",
+		  "read incorrect nmissed from uprobe_ioctl: %llu\n",
+		  uprobe_query.nmissed))
+		goto cleanup;
+	if (CHECK(uprobe_query.nhit == 0, "get_uprobe_ioctl",
+		  "read incorrect nhit from uprobe_ioctl: %llu\n",
+		  uprobe_query.nhit))
+		goto cleanup;
+
+	uretprobe_query.size = sizeof(uretprobe_query);
+	err = ioctl(uretprobe_fd, PERF_EVENT_IOC_QUERY_PROBE, &uretprobe_query);
+	if (CHECK(err, "get_uretprobe_ioctl",
+		  "failed to issue uretprobe query ioctl\n"))
+		goto cleanup;
+	if (CHECK(uretprobe_query.nmissed > 0, "get_uretprobe_ioctl",
+		  "read incorrect nmissed from uretprobe_ioctl: %llu\n",
+		  uretprobe_query.nmissed))
+		goto cleanup;
+	if (CHECK(uretprobe_query.nhit <= 0, "get_uretprobe_ioctl",
+		  "read incorrect nhit from uretprobe_ioctl: %llu\n",
+		  uretprobe_query.nhit))
+		goto cleanup;
+
 	err = bpf_map_lookup_elem(results_map_fd, &uprobe_idx, &res);
 	if (CHECK(err, "get_uprobe_res",
 		  "failed to get uprobe res: %d\n", err))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link
  2019-08-16 22:31 ` [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link Daniel Xu
@ 2019-08-19 17:45   ` Andrii Nakryiko
  2019-08-19 21:30     ` Daniel Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Andrii Nakryiko @ 2019-08-19 17:45 UTC (permalink / raw)
  To: Daniel Xu
  Cc: bpf, Song Liu, Yonghong Song, Andrii Nakryiko, Peter Ziljstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	alexander.shishkin, Jiri Olsa, Namhyung Kim, open list,
	Kernel Team

On Fri, Aug 16, 2019 at 3:32 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> It is sometimes necessary to perform ioctl's on the underlying perf fd.
> There is not currently a way to extract the fd given a bpf_link, so add a
> a pair of casting and getting helpers.
>
> The casting and getting helpers are nice because they let us define
> broad categories of links that makes it clear to users what they can
> expect to extract from what type of link.
>
> Acked-by: Song Liu <songliubraving@fb.com>
> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
> ---

This looks great, thanks a lot!

I think you might have a conflict with dadb81d0afe7 ("libbpf: make
libbpf.map source of truth for libbpf version") in libbpf.map, so you
might need to pull, rebase and re-post rebased version. But in any
case:

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/lib/bpf/libbpf.c   | 21 +++++++++++++++++++++
>  tools/lib/bpf/libbpf.h   | 13 +++++++++++++
>  tools/lib/bpf/libbpf.map |  4 ++++
>  3 files changed, 38 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2233f919dd88..41588e13be2b 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -4876,6 +4876,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
>
>  struct bpf_link {
>         int (*destroy)(struct bpf_link *link);
> +       enum bpf_link_type type;
>  };
>
>  int bpf_link__destroy(struct bpf_link *link)
> @@ -4909,6 +4910,24 @@ static int bpf_link__destroy_perf_event(struct bpf_link *link)
>         return err;
>  }
>
> +const struct bpf_link_fd *bpf_link__as_fd(const struct bpf_link *link)
> +{
> +       if (link->type != LIBBPF_LINK_FD)
> +               return NULL;
> +
> +       return (struct bpf_link_fd *)link;
> +}
> +
> +enum bpf_link_type bpf_link__type(const struct bpf_link *link)
> +{
> +       return link->type;
> +}
> +
> +int bpf_link_fd__fd(const struct bpf_link_fd *link)
> +{
> +       return link->fd;
> +}
> +
>  struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog,
>                                                 int pfd)
>  {
> @@ -4932,6 +4951,7 @@ struct bpf_link *bpf_program__attach_perf_event(struct bpf_program *prog,
>         if (!link)
>                 return ERR_PTR(-ENOMEM);
>         link->link.destroy = &bpf_link__destroy_perf_event;
> +       link->link.type = LIBBPF_LINK_FD;
>         link->fd = pfd;
>
>         if (ioctl(pfd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0) {
> @@ -5225,6 +5245,7 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(struct bpf_program *prog,
>         link = malloc(sizeof(*link));
>         if (!link)
>                 return ERR_PTR(-ENOMEM);
> +       link->link.type = LIBBPF_LINK_FD;
>         link->link.destroy = &bpf_link__destroy_fd;
>
>         pfd = bpf_raw_tracepoint_open(tp_name, prog_fd);
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index e8f70977d137..2ddef5315ff9 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -166,7 +166,20 @@ LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path);
>  LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path);
>  LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
>
> +enum bpf_link_type {
> +       LIBBPF_LINK_FD,
> +};
> +
>  struct bpf_link;
> +struct bpf_link_fd;
> +
> +/* casting APIs */
> +LIBBPF_API const struct bpf_link_fd *
> +bpf_link__as_fd(const struct bpf_link *link);
> +
> +/* getters APIs */
> +LIBBPF_API enum bpf_link_type bpf_link__type(const struct bpf_link *link);
> +LIBBPF_API int bpf_link_fd__fd(const struct bpf_link_fd *link);
>
>  LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
>
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 4e72df8e98ba..ee9945177100 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -186,4 +186,8 @@ LIBBPF_0.0.4 {
>  } LIBBPF_0.0.3;
>
>  LIBBPF_0.0.5 {
> +       global:
> +               bpf_link__type;
> +               bpf_link__as_fd;
> +               bpf_link_fd__fd;
>  } LIBBPF_0.0.4;
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
@ 2019-08-19 18:22   ` kbuild test robot
  2019-08-20  1:26   ` Alexei Starovoitov
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2019-08-19 18:22 UTC (permalink / raw)
  To: Daniel Xu
  Cc: kbuild-all, bpf, songliubraving, yhs, andriin, peterz, mingo,
	acme, Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

[-- Attachment #1: Type: text/plain, Size: 10924 bytes --]

Hi Daniel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Daniel-Xu/tracing-probe-Add-PERF_EVENT_IOC_QUERY_PROBE-ioctl/20190820-003910
base:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-alldefconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   ld: init/do_mounts.o: in function `perf_kprobe_event_query':
>> do_mounts.c:(.text+0x80): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: init/do_mounts.o: in function `perf_uprobe_event_query':
>> do_mounts.c:(.text+0x90): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: init/noinitramfs.o: in function `perf_kprobe_event_query':
   noinitramfs.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: init/noinitramfs.o: in function `perf_uprobe_event_query':
   noinitramfs.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/entry/common.o: in function `perf_kprobe_event_query':
   common.c:(.text+0x2c0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/entry/common.o: in function `perf_uprobe_event_query':
   common.c:(.text+0x2d0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/process_32.o: in function `perf_kprobe_event_query':
   process_32.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/process_32.o: in function `perf_uprobe_event_query':
   process_32.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/signal.o: in function `perf_kprobe_event_query':
   signal.c:(.text+0x2e0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/signal.o: in function `perf_uprobe_event_query':
   signal.c:(.text+0x2f0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/ioport.o: in function `perf_kprobe_event_query':
   ioport.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/ioport.o: in function `perf_uprobe_event_query':
   ioport.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/ldt.o: in function `perf_kprobe_event_query':
   ldt.c:(.text+0x500): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/ldt.o: in function `perf_uprobe_event_query':
   ldt.c:(.text+0x510): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/process.o: in function `perf_kprobe_event_query':
   process.c:(.text+0xe0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/process.o: in function `perf_uprobe_event_query':
   process.c:(.text+0xf0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: arch/x86/kernel/tls.o: in function `perf_kprobe_event_query':
   tls.c:(.text+0x2c0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: arch/x86/kernel/tls.o: in function `perf_uprobe_event_query':
   tls.c:(.text+0x2d0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/fork.o: in function `perf_kprobe_event_query':
   fork.c:(.text+0x510): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/fork.o: in function `perf_uprobe_event_query':
   fork.c:(.text+0x520): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/exec_domain.o: in function `perf_kprobe_event_query':
   exec_domain.c:(.text+0x20): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/exec_domain.o: in function `perf_uprobe_event_query':
   exec_domain.c:(.text+0x30): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/cpu.o: in function `perf_kprobe_event_query':
   cpu.c:(.text+0x1b0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/cpu.o: in function `perf_uprobe_event_query':
   cpu.c:(.text+0x1c0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/exit.o: in function `perf_kprobe_event_query':
   exit.c:(.text+0x260): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/exit.o: in function `perf_uprobe_event_query':
   exit.c:(.text+0x270): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/sysctl.o: in function `perf_kprobe_event_query':
   sysctl.c:(.text+0x1790): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/sysctl.o: in function `perf_uprobe_event_query':
   sysctl.c:(.text+0x17a0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/sysctl_binary.o: in function `perf_kprobe_event_query':
   sysctl_binary.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/sysctl_binary.o: in function `perf_uprobe_event_query':
   sysctl_binary.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/capability.o: in function `perf_kprobe_event_query':
   capability.c:(.text+0x170): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/capability.o: in function `perf_uprobe_event_query':
   capability.c:(.text+0x180): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/ptrace.o: in function `perf_kprobe_event_query':
   ptrace.c:(.text+0x680): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/ptrace.o: in function `perf_uprobe_event_query':
   ptrace.c:(.text+0x690): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/signal.o: in function `perf_kprobe_event_query':
   signal.c:(.text+0x810): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/signal.o: in function `perf_uprobe_event_query':
   signal.c:(.text+0x820): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/sys.o: in function `perf_kprobe_event_query':
   sys.c:(.text+0xa40): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/sys.o: in function `perf_uprobe_event_query':
   sys.c:(.text+0xa50): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/umh.o: in function `perf_kprobe_event_query':
   umh.c:(.text+0x4b0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/umh.o: in function `perf_uprobe_event_query':
   umh.c:(.text+0x4c0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/pid.o: in function `perf_kprobe_event_query':
   pid.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/pid.o: in function `perf_uprobe_event_query':
   pid.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/nsproxy.o: in function `perf_kprobe_event_query':
   nsproxy.c:(.text+0x1e0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/nsproxy.o: in function `perf_uprobe_event_query':
   nsproxy.c:(.text+0x1f0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/reboot.o: in function `perf_kprobe_event_query':
   reboot.c:(.text+0x90): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/reboot.o: in function `perf_uprobe_event_query':
   reboot.c:(.text+0xa0): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/groups.o: in function `perf_kprobe_event_query':
   groups.c:(.text+0x20): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/groups.o: in function `perf_uprobe_event_query':
   groups.c:(.text+0x30): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/sched/core.o: in function `perf_kprobe_event_query':
   core.c:(.text+0xdf0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/sched/core.o: in function `perf_uprobe_event_query':
   core.c:(.text+0xe00): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here
   ld: kernel/sched/loadavg.o: in function `perf_kprobe_event_query':
   loadavg.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'; init/main.o:main.c:(.text+0x80): first defined here
   ld: kernel/sched/loadavg.o: in function `perf_uprobe_event_query':
   loadavg.c:(.text+0x10): multiple definition of `perf_uprobe_event_query'; init/main.o:main.c:(.text+0x90): first defined here

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 10579 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link
  2019-08-19 17:45   ` Andrii Nakryiko
@ 2019-08-19 21:30     ` Daniel Xu
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-19 21:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Song Liu, Yonghong Song, Andrii Nakryiko, Peter Ziljstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	alexander.shishkin, Jiri Olsa, Namhyung Kim, open list,
	Kernel Team

On Mon, Aug 19, 2019, at 10:45 AM, Andrii Nakryiko wrote:
> On Fri, Aug 16, 2019 at 3:32 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> >
> > It is sometimes necessary to perform ioctl's on the underlying perf fd.
> > There is not currently a way to extract the fd given a bpf_link, so add a
> > a pair of casting and getting helpers.
> >
> > The casting and getting helpers are nice because they let us define
> > broad categories of links that makes it clear to users what they can
> > expect to extract from what type of link.
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
> > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
> > ---
> 
> This looks great, thanks a lot!
> 
> I think you might have a conflict with dadb81d0afe7 ("libbpf: make
> libbpf.map source of truth for libbpf version") in libbpf.map, so you
> might need to pull, rebase and re-post rebased version. But in any
> case:
> 

The patchset is already rebased on top :). Thanks for the tip.

Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
  2019-08-19 18:22   ` kbuild test robot
@ 2019-08-20  1:26   ` Alexei Starovoitov
  2019-08-20  2:34     ` Daniel Xu
  2019-08-20  2:04   ` kbuild test robot
  2019-08-20 14:45   ` Peter Zijlstra
  3 siblings, 1 reply; 25+ messages in thread
From: Alexei Starovoitov @ 2019-08-20  1:26 UTC (permalink / raw)
  To: Daniel Xu
  Cc: bpf, Song Liu, Yonghong Song, Andrii Nakryiko, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, LKML, Kernel Team

On Fri, Aug 16, 2019 at 3:33 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> It's useful to know [uk]probe's nmissed and nhit stats. For example with
> tracing tools, it's important to know when events may have been lost.
> debugfs currently exposes a control file to get this information, but
> it is not compatible with probes registered with the perf API.
>
> While bpf programs may be able to manually count nhit, there is no way
> to gather nmissed. In other words, it is currently not possible to
> retrieve information about FD-based probes.
>
> This patch adds a new ioctl that lets users query nmissed (as well as
> nhit for completeness). We currently only add support for [uk]probes
> but leave the possibility open for other probes like tracepoint.
>
> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
...
> +int perf_kprobe_event_query(struct perf_event *event, void __user *info)
> +{
> +       struct perf_event_query_probe __user *uquery = info;
> +       struct perf_event_query_probe query = {};
> +       struct trace_event_call *call = event->tp_event;
> +       struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> +       u64 ncopy;
> +
> +       if (!capable(CAP_SYS_ADMIN))
> +               return -EPERM;
> +       if (copy_from_user(&query, uquery,
> +                          offsetofend(struct perf_event_query_probe, size)))
> +               return -EFAULT;
> +
> +       ncopy = min_t(u64, query.size, sizeof(query));
> +       query.nhit = trace_kprobe_nhit(tk);
> +       query.nmissed = tk->rp.kp.nmissed;
> +
> +       if (copy_to_user(uquery, &query, ncopy))
> +               return -EFAULT;

shouldn't kernel update query.size before copying back?
Otherwise how user space would know which fields
were populated?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
  2019-08-19 18:22   ` kbuild test robot
  2019-08-20  1:26   ` Alexei Starovoitov
@ 2019-08-20  2:04   ` kbuild test robot
  2019-08-20 14:45   ` Peter Zijlstra
  3 siblings, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2019-08-20  2:04 UTC (permalink / raw)
  To: Daniel Xu
  Cc: kbuild-all, bpf, songliubraving, yhs, andriin, peterz, mingo,
	acme, Daniel Xu, ast, alexander.shishkin, jolsa, namhyung,
	linux-kernel, kernel-team

[-- Attachment #1: Type: text/plain, Size: 7936 bytes --]

Hi Daniel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Daniel-Xu/tracing-probe-Add-PERF_EVENT_IOC_QUERY_PROBE-ioctl/20190820-003910
base:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   init/do_mounts.o: In function `perf_kprobe_event_query':
>> do_mounts.c:(.text+0x4e8): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   init/do_mounts.o: In function `perf_uprobe_event_query':
>> do_mounts.c:(.text+0x4f0): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   init/do_mounts_initrd.o: In function `perf_kprobe_event_query':
   do_mounts_initrd.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   init/do_mounts_initrd.o: In function `perf_uprobe_event_query':
   do_mounts_initrd.c:(.text+0x8): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   init/initramfs.o: In function `perf_kprobe_event_query':
   initramfs.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   init/initramfs.o: In function `perf_uprobe_event_query':
   initramfs.c:(.text+0x8): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/process.o: In function `perf_kprobe_event_query':
   process.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/process.o: In function `perf_uprobe_event_query':
   process.c:(.text+0x8): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/signal.o: In function `perf_kprobe_event_query':
   signal.c:(.text+0x11e0): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/signal.o: In function `perf_uprobe_event_query':
   signal.c:(.text+0x11e8): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/sys.o: In function `perf_kprobe_event_query':
   sys.c:(.text+0xb8): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/sys.o: In function `perf_uprobe_event_query':
   sys.c:(.text+0xc0): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/traps.o: In function `perf_kprobe_event_query':
   traps.c:(.text+0x510): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/traps.o: In function `perf_uprobe_event_query':
   traps.c:(.text+0x518): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/syscall.o: In function `perf_kprobe_event_query':
   syscall.c:(.text+0x168): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/syscall.o: In function `perf_uprobe_event_query':
   syscall.c:(.text+0x170): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/sys32.o: In function `perf_kprobe_event_query':
   sys32.c:(.text+0x228): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/sys32.o: In function `perf_uprobe_event_query':
   sys32.c:(.text+0x230): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/signal32.o: In function `perf_kprobe_event_query':
   signal32.c:(.text+0x1548): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/signal32.o: In function `perf_uprobe_event_query':
   signal32.c:(.text+0x1550): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kernel/sys_compat.o: In function `perf_kprobe_event_query':
   sys_compat.c:(.text+0x0): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kernel/sys_compat.o: In function `perf_uprobe_event_query':
   sys_compat.c:(.text+0x8): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   arch/arm64/kvm/../../../virt/kvm/eventfd.o: In function `perf_kprobe_event_query':
   eventfd.c:(.text+0x810): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   arch/arm64/kvm/../../../virt/kvm/eventfd.o: In function `perf_uprobe_event_query':
   eventfd.c:(.text+0x818): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/fork.o: In function `perf_kprobe_event_query':
   fork.c:(.text+0x990): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   kernel/fork.o: In function `perf_uprobe_event_query':
   fork.c:(.text+0x998): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/exec_domain.o: In function `perf_kprobe_event_query':
   exec_domain.c:(.text+0x20): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   kernel/exec_domain.o: In function `perf_uprobe_event_query':
   exec_domain.c:(.text+0x28): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/cpu.o: In function `perf_kprobe_event_query':
   cpu.c:(.text+0x1670): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   kernel/cpu.o: In function `perf_uprobe_event_query':
   cpu.c:(.text+0x1678): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/exit.o: In function `perf_kprobe_event_query':
   exit.c:(.text+0x260): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   kernel/exit.o: In function `perf_uprobe_event_query':
   exit.c:(.text+0x268): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/sysctl.o: In function `perf_kprobe_event_query':
   sysctl.c:(.text+0x2200): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here
   kernel/sysctl.o: In function `perf_uprobe_event_query':
   sysctl.c:(.text+0x2208): multiple definition of `perf_uprobe_event_query'
   init/main.o:main.c:(.text+0x170): first defined here
   kernel/sysctl_binary.o: In function `perf_kprobe_event_query':
   sysctl_binary.c:(.text+0x200): multiple definition of `perf_kprobe_event_query'
   init/main.o:main.c:(.text+0x168): first defined here

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 45361 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-20  1:26   ` Alexei Starovoitov
@ 2019-08-20  2:34     ` Daniel Xu
  2019-08-20  2:52       ` Alexei Starovoitov
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Xu @ 2019-08-20  2:34 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Song Liu, Yonghong Song, Andrii Nakryiko, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, LKML, Kernel Team

On Mon Aug 19, 2019 at 6:26 PM Alexei Starovoitov wrote:
> On Fri, Aug 16, 2019 at 3:33 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> >
> > It's useful to know [uk]probe's nmissed and nhit stats. For example with
> > tracing tools, it's important to know when events may have been lost.
> > debugfs currently exposes a control file to get this information, but
> > it is not compatible with probes registered with the perf API.
> >
> > While bpf programs may be able to manually count nhit, there is no way
> > to gather nmissed. In other words, it is currently not possible to
> > retrieve information about FD-based probes.
> >
> > This patch adds a new ioctl that lets users query nmissed (as well as
> > nhit for completeness). We currently only add support for [uk]probes
> > but leave the possibility open for other probes like tracepoint.
> >
> > Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
> ...
> > +int perf_kprobe_event_query(struct perf_event *event, void __user *info)
> > +{
> > +       struct perf_event_query_probe __user *uquery = info;
> > +       struct perf_event_query_probe query = {};
> > +       struct trace_event_call *call = event->tp_event;
> > +       struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
> > +       u64 ncopy;
> > +
> > +       if (!capable(CAP_SYS_ADMIN))
> > +               return -EPERM;
> > +       if (copy_from_user(&query, uquery,
> > +                          offsetofend(struct perf_event_query_probe, size)))
> > +               return -EFAULT;
> > +
> > +       ncopy = min_t(u64, query.size, sizeof(query));
> > +       query.nhit = trace_kprobe_nhit(tk);
> > +       query.nmissed = tk->rp.kp.nmissed;
> > +
> > +       if (copy_to_user(uquery, &query, ncopy))
> > +               return -EFAULT;
> 
> shouldn't kernel update query.size before copying back?
> Otherwise how user space would know which fields
> were populated?

Ah yes, sorry. Will add that.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-20  2:34     ` Daniel Xu
@ 2019-08-20  2:52       ` Alexei Starovoitov
  0 siblings, 0 replies; 25+ messages in thread
From: Alexei Starovoitov @ 2019-08-20  2:52 UTC (permalink / raw)
  To: Daniel Xu
  Cc: bpf, Song Liu, Yonghong Song, Andrii Nakryiko, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, LKML, Kernel Team

On Mon, Aug 19, 2019 at 7:34 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> Ah yes, sorry. Will add that.

Also please fix build errors.
It looks like buildbot is not happy about few things.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
                     ` (2 preceding siblings ...)
  2019-08-20  2:04   ` kbuild test robot
@ 2019-08-20 14:45   ` Peter Zijlstra
  2019-08-20 17:58     ` Daniel Xu
  3 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2019-08-20 14:45 UTC (permalink / raw)
  To: Daniel Xu
  Cc: bpf, songliubraving, yhs, andriin, mingo, acme, ast,
	alexander.shishkin, jolsa, namhyung, linux-kernel, kernel-team

On Fri, Aug 16, 2019 at 03:31:46PM -0700, Daniel Xu wrote:
> It's useful to know [uk]probe's nmissed and nhit stats. For example with
> tracing tools, it's important to know when events may have been lost.
> debugfs currently exposes a control file to get this information, but
> it is not compatible with probes registered with the perf API.

What is this nmissed and nhit stuff?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-20 14:45   ` Peter Zijlstra
@ 2019-08-20 17:58     ` Daniel Xu
  2019-08-21 11:08       ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Xu @ 2019-08-20 17:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: bpf, songliubraving, yhs, andriin, mingo, acme, ast,
	alexander.shishkin, jolsa, namhyung, linux-kernel, kernel-team

Hi Peter,

On Tue Aug 20, 2019 at 4:45 PM Peter Zijlstra wrote:
> On Fri, Aug 16, 2019 at 03:31:46PM -0700, Daniel Xu wrote:
> > It's useful to know [uk]probe's nmissed and nhit stats. For example with
> > tracing tools, it's important to know when events may have been lost.
> > debugfs currently exposes a control file to get this information, but
> > it is not compatible with probes registered with the perf API.
> 
> What is this nmissed and nhit stuff?

nmissed is the number of times the probe's handler should have been run
but didn't. nhit is the number of times the probes handler has run. I've
documented this information in the uapi header. If you'd like, I can put
it in the commit message too.

Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-20 17:58     ` Daniel Xu
@ 2019-08-21 11:08       ` Peter Zijlstra
  2019-08-21 16:54         ` Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2019-08-21 11:08 UTC (permalink / raw)
  To: Daniel Xu
  Cc: bpf, songliubraving, yhs, andriin, mingo, acme, ast,
	alexander.shishkin, jolsa, namhyung, linux-kernel, kernel-team

On Tue, Aug 20, 2019 at 10:58:47AM -0700, Daniel Xu wrote:
> Hi Peter,
> 
> On Tue Aug 20, 2019 at 4:45 PM Peter Zijlstra wrote:
> > On Fri, Aug 16, 2019 at 03:31:46PM -0700, Daniel Xu wrote:
> > > It's useful to know [uk]probe's nmissed and nhit stats. For example with
> > > tracing tools, it's important to know when events may have been lost.
> > > debugfs currently exposes a control file to get this information, but
> > > it is not compatible with probes registered with the perf API.
> > 
> > What is this nmissed and nhit stuff?
> 
> nmissed is the number of times the probe's handler should have been run
> but didn't. nhit is the number of times the probes handler has run. I've
> documented this information in the uapi header. If you'd like, I can put
> it in the commit message too.

That comment just says: 'number of times this probe was temporarily
disabled', which says exactly nothing.

But reading the kprobe code seems to suggest this happens on recursive
kprobes, which I'm thinking is a dodgy situation in the first place.

ftrace and perf in general don't keep counts of events lost due to
recursion, so why should we do this for kprobes? Also, while you write
to support uprobes, it doesn't actually suffer from this (it cannot,
uprobes cannot recurse), so supporting it makes no sense.

And with that, the name QUERY_PROBE also makes no sense, because it is
not specific to [uk]probes, all software events suffer this.

And I'm not sure an additional ioctl() is the right way, supposing we
want to expose this at all. You've mentioned no alternative approached,
I'm thinking PERF_FORMAT_LOST might be possible, or maybe a
PERF_RECORD_LOST extention.

Of course, then you get to implement it for tracepoints and software
events too.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 11:08       ` Peter Zijlstra
@ 2019-08-21 16:54         ` Yonghong Song
  2019-08-21 18:31           ` Peter Zijlstra
  2019-08-21 20:07           ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 25+ messages in thread
From: Yonghong Song @ 2019-08-21 16:54 UTC (permalink / raw)
  To: Peter Zijlstra, Daniel Xu
  Cc: bpf, Song Liu, Andrii Nakryiko, mingo, acme, Alexei Starovoitov,
	alexander.shishkin, jolsa, namhyung, linux-kernel, Kernel Team,
	Arnaldo Carvalho de Melo



On 8/21/19 4:08 AM, Peter Zijlstra wrote:
> On Tue, Aug 20, 2019 at 10:58:47AM -0700, Daniel Xu wrote:
>> Hi Peter,
>>
>> On Tue Aug 20, 2019 at 4:45 PM Peter Zijlstra wrote:
>>> On Fri, Aug 16, 2019 at 03:31:46PM -0700, Daniel Xu wrote:
>>>> It's useful to know [uk]probe's nmissed and nhit stats. For example with
>>>> tracing tools, it's important to know when events may have been lost.
>>>> debugfs currently exposes a control file to get this information, but
>>>> it is not compatible with probes registered with the perf API.
>>>
>>> What is this nmissed and nhit stuff?
>>
>> nmissed is the number of times the probe's handler should have been run
>> but didn't. nhit is the number of times the probes handler has run. I've
>> documented this information in the uapi header. If you'd like, I can put
>> it in the commit message too.
> 
> That comment just says: 'number of times this probe was temporarily
> disabled', which says exactly nothing.
> 
> But reading the kprobe code seems to suggest this happens on recursive
> kprobes, which I'm thinking is a dodgy situation in the first place.
> 
> ftrace and perf in general don't keep counts of events lost due to
> recursion, so why should we do this for kprobes? Also, while you write
> to support uprobes, it doesn't actually suffer from this (it cannot,
> uprobes cannot recurse), so supporting it makes no sense.
> 
> And with that, the name QUERY_PROBE also makes no sense, because it is
> not specific to [uk]probes, all software events suffer this.
> 
> And I'm not sure an additional ioctl() is the right way, supposing we
> want to expose this at all. You've mentioned no alternative approached,
> I'm thinking PERF_FORMAT_LOST might be possible, or maybe a
> PERF_RECORD_LOST extention.

Things get more complicated when bpf program is executing to replace
ring buffer output.

Currently, in kernel/trace/bpf_trace.c, we have

unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
{
         unsigned int ret;

         if (in_nmi()) /* not supported yet */
                 return 1;

         preempt_disable();

         if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
                 /*
                  * since some bpf program is already running on this cpu,
                  * don't call into another bpf program (same or different)
                  * and don't send kprobe event into ring-buffer,
                  * so return zero here
                  */
                 ret = 0;
                 goto out;
         }
.....

In the above, the events with bpf program attached will be missed
if the context is nmi interrupt, or if some recursion happens even with 
the same or different bpf programs.
In case of recursion, the events will not be sent to ring buffer.

A lot of bpf-based tracing programs uses maps to communicate and
do not allocate ring buffer at all.

Maybe we can still use ioctl based approach which is light weighted
compared to ring buffer approach? If a fd has bpf attached, nhit/nmisses
means the kprobe is processed by bpf program or not.

Currently, for debugfs, the nhit/nmisses info is exposed at
{k|u}probe_profile. Alternative, we could expose the nhit/nmisses
in /proc/self/fdinfo/<fd>. User can query this interface to
get numbers.

Arnaldo has a question on bcc mailing list about the hit/miss
counting of bpf program missed to process events.

https://lists.iovisor.org/g/iovisor-dev/message/1783

Comments?


> 
> Of course, then you get to implement it for tracepoints and software
> events too.
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 16:54         ` Yonghong Song
@ 2019-08-21 18:31           ` Peter Zijlstra
  2019-08-21 18:43             ` Yonghong Song
  2019-08-21 20:07           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2019-08-21 18:31 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

On Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song wrote:
> Currently, in kernel/trace/bpf_trace.c, we have
> 
> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
> {
>          unsigned int ret;
> 
>          if (in_nmi()) /* not supported yet */
>                  return 1;
> 
>          preempt_disable();
> 
>          if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {

Yes, I'm aware of that.

> In the above, the events with bpf program attached will be missed
> if the context is nmi interrupt, or if some recursion happens even with 
> the same or different bpf programs.
> In case of recursion, the events will not be sent to ring buffer.

And while that is significantly worse than what ftrace/perf have, it is
fundamentally the same thing.

perf allows (and iirc ftrace does too) 4 nested context per CPU
(task,softirq,irq,nmi) but any recursion within those context and we
drop stuff.

The BPF stuff is just more eager to drop things on the floor, but it is
fundamentally the same.

> A lot of bpf-based tracing programs uses maps to communicate and
> do not allocate ring buffer at all.

So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
still work fine; but you get to implement it for all software events.

> Maybe we can still use ioctl based approach which is light weighted
> compared to ring buffer approach? If a fd has bpf attached, nhit/nmisses
> means the kprobe is processed by bpf program or not.

There is nothing kprobe specific here. Kprobes just appear to be the
only one actually accounting the recursion cases, but everyone has
them.

> Currently, for debugfs, the nhit/nmisses info is exposed at
> {k|u}probe_profile. Alternative, we could expose the nhit/nmisses
> in /proc/self/fdinfo/<fd>. User can query this interface to
> get numbers.

No, we're not adding stuff to procfs for this.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 18:31           ` Peter Zijlstra
@ 2019-08-21 18:43             ` Yonghong Song
  2019-08-21 20:04               ` Arnaldo Carvalho de Melo
  2019-08-22  7:47               ` Peter Zijlstra
  0 siblings, 2 replies; 25+ messages in thread
From: Yonghong Song @ 2019-08-21 18:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo



On 8/21/19 11:31 AM, Peter Zijlstra wrote:
> On Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song wrote:
>> Currently, in kernel/trace/bpf_trace.c, we have
>>
>> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
>> {
>>           unsigned int ret;
>>
>>           if (in_nmi()) /* not supported yet */
>>                   return 1;
>>
>>           preempt_disable();
>>
>>           if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> 
> Yes, I'm aware of that.
> 
>> In the above, the events with bpf program attached will be missed
>> if the context is nmi interrupt, or if some recursion happens even with
>> the same or different bpf programs.
>> In case of recursion, the events will not be sent to ring buffer.
> 
> And while that is significantly worse than what ftrace/perf have, it is
> fundamentally the same thing.
> 
> perf allows (and iirc ftrace does too) 4 nested context per CPU
> (task,softirq,irq,nmi) but any recursion within those context and we
> drop stuff.
> 
> The BPF stuff is just more eager to drop things on the floor, but it is
> fundamentally the same.
> 
>> A lot of bpf-based tracing programs uses maps to communicate and
>> do not allocate ring buffer at all.
> 
> So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
> still work fine; but you get to implement it for all software events.

Could you give more specifics about PERF_FORMAT_LOST? Googling 
"PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(

> 
>> Maybe we can still use ioctl based approach which is light weighted
>> compared to ring buffer approach? If a fd has bpf attached, nhit/nmisses
>> means the kprobe is processed by bpf program or not.
> 
> There is nothing kprobe specific here. Kprobes just appear to be the
> only one actually accounting the recursion cases, but everyone has
> them.

Sorry to be specific, kprobe is just an example, I actually refers to 
any perf event where bpf can attach to, which theoretically are any
perf events which can be opened with "perf_event_open" syscall although 
some of them (e.g., software events?) may not have bpf running hooks yet.

> 
>> Currently, for debugfs, the nhit/nmisses info is exposed at
>> {k|u}probe_profile. Alternative, we could expose the nhit/nmisses
>> in /proc/self/fdinfo/<fd>. User can query this interface to
>> get numbers.
> 
> No, we're not adding stuff to procfs for this.

No problem. Just a suggestion.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 18:43             ` Yonghong Song
@ 2019-08-21 20:04               ` Arnaldo Carvalho de Melo
  2019-08-22  7:47               ` Peter Zijlstra
  1 sibling, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-21 20:04 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Peter Zijlstra, Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

Em Wed, Aug 21, 2019 at 06:43:49PM +0000, Yonghong Song escreveu:
> On 8/21/19 11:31 AM, Peter Zijlstra wrote:
> > On Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song wrote:
> >> A lot of bpf-based tracing programs uses maps to communicate and
> >> do not allocate ring buffer at all.
> > 
> > So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
> > still work fine; but you get to implement it for all software events.
> 
> Could you give more specifics about PERF_FORMAT_LOST? Googling 
> "PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(

Perhaps he's talking about using read(perf_event_fd, ...) after having set it
up with perf_event_attr.read_format with the-to-be-implemented
PERF_FORMAT_LOST bit?

Look at perf_read() and perf_read_one() in kernel/events/core.c.
 
- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 16:54         ` Yonghong Song
  2019-08-21 18:31           ` Peter Zijlstra
@ 2019-08-21 20:07           ` Arnaldo Carvalho de Melo
  2019-08-21 22:10             ` Yonghong Song
  1 sibling, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-21 20:07 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Peter Zijlstra, Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

Em Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song escreveu:
> Arnaldo has a question on bcc mailing list about the hit/miss
> counting of bpf program missed to process events.
 
> https://lists.iovisor.org/g/iovisor-dev/message/1783

PERF_FORMAT_LOST seems to be a good answer to that? See my other reply
to this thread.

- Arnaldo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 20:07           ` Arnaldo Carvalho de Melo
@ 2019-08-21 22:10             ` Yonghong Song
  0 siblings, 0 replies; 25+ messages in thread
From: Yonghong Song @ 2019-08-21 22:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo



On 8/21/19 1:07 PM, Arnaldo Carvalho de Melo wrote:
> Em Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song escreveu:
>> Arnaldo has a question on bcc mailing list about the hit/miss
>> counting of bpf program missed to process events.
>   
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.iovisor.org_g_iovisor-2Ddev_message_1783&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Rrvq6K3mx2wYBCU6cSXLjJj8Xfb06oymxNZH8ysnlLA&s=IbEaX8v0OulmvKU-pmcAhWNmaHwXgaDd5auVFfRoyJg&e=
> 
> PERF_FORMAT_LOST seems to be a good answer to that? See my other reply
> to this thread.

Just checked. indeed adding PERF_FORMAT_LOST to perf read_format
seems a reasonable approach. ioctl with perf_event_open fd can do the 
same thing, but ioctl should be avoided if we have alternatives.

Thanks for the pointer!

> 
> - Arnaldo
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-21 18:43             ` Yonghong Song
  2019-08-21 20:04               ` Arnaldo Carvalho de Melo
@ 2019-08-22  7:47               ` Peter Zijlstra
  2019-08-22  7:54                 ` Song Liu
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2019-08-22  7:47 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Daniel Xu, bpf, Song Liu, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

On Wed, Aug 21, 2019 at 06:43:49PM +0000, Yonghong Song wrote:
> On 8/21/19 11:31 AM, Peter Zijlstra wrote:

> > So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
> > still work fine; but you get to implement it for all software events.
> 
> Could you give more specifics about PERF_FORMAT_LOST? Googling 
> "PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(

Look at what the other PERF_FORMAT_ flags do? Basically it is adding a
field to the read(2) output.

> >> Maybe we can still use ioctl based approach which is light weighted
> >> compared to ring buffer approach? If a fd has bpf attached, nhit/nmisses
> >> means the kprobe is processed by bpf program or not.
> > 
> > There is nothing kprobe specific here. Kprobes just appear to be the
> > only one actually accounting the recursion cases, but everyone has
> > them.
> 
> Sorry to be specific, kprobe is just an example, I actually refers to 
> any perf event where bpf can attach to, which theoretically are any
> perf events which can be opened with "perf_event_open" syscall although 
> some of them (e.g., software events?) may not have bpf running hooks yet.

Yes, BPF is sucky that way.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-22  7:47               ` Peter Zijlstra
@ 2019-08-22  7:54                 ` Song Liu
  2019-08-22  9:05                   ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Song Liu @ 2019-08-22  7:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yonghong Song, Daniel Xu, bpf, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

Hi Peter, 

> On Aug 22, 2019, at 12:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Wed, Aug 21, 2019 at 06:43:49PM +0000, Yonghong Song wrote:
>> On 8/21/19 11:31 AM, Peter Zijlstra wrote:
> 
>>> So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
>>> still work fine; but you get to implement it for all software events.
>> 
>> Could you give more specifics about PERF_FORMAT_LOST? Googling 
>> "PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(
> 
> Look at what the other PERF_FORMAT_ flags do? Basically it is adding a
> field to the read(2) output.

Do we need to implement PERF_FORMAT_LOST for all software events? If user
space asks for PERF_FORMAT_LOST for events that do not support it, can we
just fail sys_perf_event_open()?

Thanks,
Song


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-22  7:54                 ` Song Liu
@ 2019-08-22  9:05                   ` Peter Zijlstra
  2019-08-22 21:08                     ` Daniel Xu
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2019-08-22  9:05 UTC (permalink / raw)
  To: Song Liu
  Cc: Yonghong Song, Daniel Xu, bpf, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

On Thu, Aug 22, 2019 at 07:54:16AM +0000, Song Liu wrote:
> Hi Peter, 
> 
> > On Aug 22, 2019, at 12:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > On Wed, Aug 21, 2019 at 06:43:49PM +0000, Yonghong Song wrote:
> >> On 8/21/19 11:31 AM, Peter Zijlstra wrote:
> > 
> >>> So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
> >>> still work fine; but you get to implement it for all software events.
> >> 
> >> Could you give more specifics about PERF_FORMAT_LOST? Googling 
> >> "PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(
> > 
> > Look at what the other PERF_FORMAT_ flags do? Basically it is adding a
> > field to the read(2) output.
> 
> Do we need to implement PERF_FORMAT_LOST for all software events? If user
> space asks for PERF_FORMAT_LOST for events that do not support it, can we
> just fail sys_perf_event_open()?

It really shouldn't be hard; and I'm failing to see why kprobes are
special.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
  2019-08-22  9:05                   ` Peter Zijlstra
@ 2019-08-22 21:08                     ` Daniel Xu
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Xu @ 2019-08-22 21:08 UTC (permalink / raw)
  To: Peter Zijlstra, Song Liu
  Cc: Yonghong Song, Daniel Xu, bpf, Andrii Nakryiko, mingo, acme,
	Alexei Starovoitov, alexander.shishkin, jolsa, namhyung,
	linux-kernel, Kernel Team, Arnaldo Carvalho de Melo

On Thu Aug 22, 2019 at 11:05 AM Peter Zijlstra wrote:
> On Thu, Aug 22, 2019 at 07:54:16AM +0000, Song Liu wrote:
> > Hi Peter, 
> > 
> > > On Aug 22, 2019, at 12:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > 
> > > On Wed, Aug 21, 2019 at 06:43:49PM +0000, Yonghong Song wrote:
> > >> On 8/21/19 11:31 AM, Peter Zijlstra wrote:
> > > 
> > >>> So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
> > >>> still work fine; but you get to implement it for all software events.
> > >> 
> > >> Could you give more specifics about PERF_FORMAT_LOST? Googling 
> > >> "PERF_FORMAT_LOST" only yields two emails which we are discussing here :-(
> > > 
> > > Look at what the other PERF_FORMAT_ flags do? Basically it is adding a
> > > field to the read(2) output.
> > 
> > Do we need to implement PERF_FORMAT_LOST for all software events? If user
> > space asks for PERF_FORMAT_LOST for events that do not support it, can we
> > just fail sys_perf_event_open()?
> 
> It really shouldn't be hard; and I'm failing to see why kprobes are
> special.

Thanks for the feedback, everyone. Really appreciate it.

I will look into extending read_format. I'll submit another patch series
after I get the code to work.

Daniel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-08-22 21:16 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-16 22:31 [PATCH v3 bpf-next 0/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE Daniel Xu
2019-08-16 22:31 ` [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl Daniel Xu
2019-08-19 18:22   ` kbuild test robot
2019-08-20  1:26   ` Alexei Starovoitov
2019-08-20  2:34     ` Daniel Xu
2019-08-20  2:52       ` Alexei Starovoitov
2019-08-20  2:04   ` kbuild test robot
2019-08-20 14:45   ` Peter Zijlstra
2019-08-20 17:58     ` Daniel Xu
2019-08-21 11:08       ` Peter Zijlstra
2019-08-21 16:54         ` Yonghong Song
2019-08-21 18:31           ` Peter Zijlstra
2019-08-21 18:43             ` Yonghong Song
2019-08-21 20:04               ` Arnaldo Carvalho de Melo
2019-08-22  7:47               ` Peter Zijlstra
2019-08-22  7:54                 ` Song Liu
2019-08-22  9:05                   ` Peter Zijlstra
2019-08-22 21:08                     ` Daniel Xu
2019-08-21 20:07           ` Arnaldo Carvalho de Melo
2019-08-21 22:10             ` Yonghong Song
2019-08-16 22:31 ` [PATCH v3 bpf-next 2/4] libbpf: Add helpers to extract perf fd from bpf_link Daniel Xu
2019-08-19 17:45   ` Andrii Nakryiko
2019-08-19 21:30     ` Daniel Xu
2019-08-16 22:31 ` [PATCH v3 bpf-next 3/4] tracing/probe: Sync perf_event.h to tools Daniel Xu
2019-08-16 22:31 ` [PATCH v3 bpf-next 4/4] tracing/probe: Add self test for PERF_EVENT_IOC_QUERY_PROBE Daniel Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).