* [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
@ 2023-04-14  8:22 Adrian Hunter
  2023-04-14  8:22 ` [PATCH RFC 1/5] " Adrian Hunter
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

Hi

Here is a stab at adding an ioctl for sideband events.

This is to overcome races when reading the same information
from /proc.

To keep it simple, the ioctl is limited to emitting existing
sideband events (fork, namespaces, comm, mmap) to an already
active context.

There are not yet any perf tools patches at this stage.


Adrian Hunter (5):
      perf: Add ioctl to emit sideband events
      perf: Add fork to the sideband ioctl
      perf: Add namespaces to the sideband ioctl
      perf: Add comm to the sideband ioctl
      perf: Add mmap to the sideband ioctl

 include/uapi/linux/perf_event.h |  19 ++-
 kernel/events/core.c            | 315 +++++++++++++++++++++++++++++++++-------
 2 files changed, 280 insertions(+), 54 deletions(-)


Regards
Adrian

* [PATCH RFC 1/5] perf: Add ioctl to emit sideband events
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
@ 2023-04-14  8:22 ` Adrian Hunter
  2023-04-17 10:57   ` Peter Zijlstra
  2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

perf tools currently read /proc to get sideband information for
processes that are already running, but that races with changes made
by the kernel.

Add an ioctl to output status-only sideband events for a currently
active event on the current CPU. Using timestamps, these status-only
sideband events will be correctly ordered with respect to "real"
sideband events.

The assumption is a user will:
	- open and enable a dummy event to track sideband events
	- call the new ioctl to get sideband information for currently
	  running processes as needed
	- enable the remaining selected events
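
A minimal user-space sketch of that sequence, for illustration only. The
names prefixed sb_/SB_ are hypothetical local mirrors of the uapi
additions in this patch (they are not in any released header), the
choice of attr bits is an assumption, and the ring buffer mapping
needed to actually read the records is omitted:

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical local mirrors of the flags proposed in this patch. */
enum {
	SB_EMIT_FORK       = 1U << 0,
	SB_EMIT_NAMESPACES = 1U << 1,
	SB_EMIT_COMM       = 1U << 2,
	SB_EMIT_MMAP       = 1U << 3,
};

/* Hypothetical mirror of struct perf_event_pid_sb (two __u32 fields). */
struct sb_pid_req {
	uint32_t pid;
	uint32_t emit_flags;
};

static_assert(sizeof(struct sb_pid_req) == 8, "must match two __u32 fields");

/* Same encoding as the proposed PERF_EVENT_IOC_EMIT_SIDEBAND. */
#define SB_IOC_EMIT_SIDEBAND _IOW('$', 12, struct sb_pid_req *)

/* Attributes for a dummy event that only tracks sideband records. */
void sb_init_dummy_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;
	attr->task = 1;
	attr->comm = 1;
	attr->mmap = 1;
	attr->mmap2 = 1;
	attr->namespaces = 1;
	/* Timestamps let status-only records be ordered with real ones. */
	attr->sample_id_all = 1;
	attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME;
}

/* Step 1: open the dummy event on one CPU; attr.disabled is 0, so it starts enabled. */
int sb_open_dummy(int cpu)
{
	struct perf_event_attr attr;

	sb_init_dummy_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1,
			    PERF_FLAG_FD_CLOEXEC);
}

/* Step 2: ask the kernel to emit status-only records for one pid. */
int sb_emit(int fd, pid_t pid, uint32_t flags)
{
	struct sb_pid_req req = { .pid = (uint32_t)pid, .emit_flags = flags };

	return ioctl(fd, SB_IOC_EMIT_SIDEBAND, &req);
}
```

Step 3 would then be the usual ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) on each
of the remaining events. On a kernel without this series the new ioctl
is expected to fail rather than emit anything.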

The initial sideband events to be supported will be: fork, namespaces, comm
and mmap.

Add a new misc flag PERF_RECORD_MISC_STATUS_ONLY to differentiate "real"
sideband events from status-only sideband events.

The limitation that the event must be active is significant. The ioctl
caller must either:
	i)  For a CPU context, set CPU affinity to the correct CPU.
	    Note that this is not needed for system-wide tracing on all
	    CPUs, and is only needed for the period of tracing during
	    which the ioctl is used.
	ii) Use an event opened for the current process on all CPUs.
	    Note that if such an additional event is needed, it also
	    uses additional memory from the user's perf_event_mlock_kb /
	    RLIMIT_MEMLOCK limit.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/uapi/linux/perf_event.h | 19 ++++++-
 kernel/events/core.c            | 87 ++++++++++++++++++++++++++++++++-
 2 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 37675437b768..d44fb0f65484 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -541,6 +541,18 @@ struct perf_event_query_bpf {
 	__u32	ids[];
 };
 
+enum perf_event_emit_flag {
+	PERF_EVENT_EMIT_FORK		= 1U << 0,
+	PERF_EVENT_EMIT_NAMESPACES	= 1U << 1,
+	PERF_EVENT_EMIT_COMM		= 1U << 2,
+	PERF_EVENT_EMIT_MMAP		= 1U << 3,
+};
+
+struct perf_event_pid_sb {
+	__u32	pid;
+	__u32	emit_flags; /* Refer perf_event_emit_flag */
+};
+
 /*
  * Ioctls that can be done on a perf event fd:
  */
@@ -556,6 +568,7 @@ struct perf_event_query_bpf {
 #define PERF_EVENT_IOC_PAUSE_OUTPUT		_IOW('$', 9, __u32)
 #define PERF_EVENT_IOC_QUERY_BPF		_IOWR('$', 10, struct perf_event_query_bpf *)
 #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES	_IOW('$', 11, struct perf_event_attr *)
+#define PERF_EVENT_IOC_EMIT_SIDEBAND		_IOW('$', 12, struct perf_event_pid_sb *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
@@ -743,12 +756,13 @@ struct perf_event_mmap_page {
  * The current state of perf_event_header::misc bits usage:
  * ('|' used bit, '-' unused bit)
  *
- *  012         CDEF
- *  |||---------||||
+ *  012        BCDEF
+ *  |||--------|||||
  *
  *  Where:
  *    0-2     CPUMODE_MASK
  *
+ *    B       STATUS_ONLY
  *    C       PROC_MAP_PARSE_TIMEOUT
  *    D       MMAP_DATA / COMM_EXEC / FORK_EXEC / SWITCH_OUT
  *    E       MMAP_BUILD_ID / EXACT_IP / SCHED_OUT_PREEMPT
@@ -763,6 +777,7 @@ struct perf_event_mmap_page {
 #define PERF_RECORD_MISC_GUEST_KERNEL		(4 << 0)
 #define PERF_RECORD_MISC_GUEST_USER		(5 << 0)
 
+#define PERF_RECORD_MISC_STATUS_ONLY		(1 << 11)
 /*
  * Indicates that /proc/PID/maps parsing are truncated by time out.
  */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index fb3e436bcd4a..5cbcc6851587 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5797,6 +5797,7 @@ static int perf_event_set_output(struct perf_event *event,
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			  struct perf_event_attr *attr);
+static int perf_event_emit_sideband(struct perf_event *event, void __user *arg);
 
 static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
@@ -5924,6 +5925,9 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	if (ret)
 		return ret;
 
+	if (cmd == PERF_EVENT_IOC_EMIT_SIDEBAND)
+		return perf_event_emit_sideband(event, (void __user *)arg);
+
 	ctx = perf_event_ctx_lock(event);
 	ret = _perf_ioctl(event, cmd, arg);
 	perf_event_ctx_unlock(event, ctx);
@@ -5940,6 +5944,7 @@ static long perf_compat_ioctl(struct file *file, unsigned int cmd,
 	case _IOC_NR(PERF_EVENT_IOC_ID):
 	case _IOC_NR(PERF_EVENT_IOC_QUERY_BPF):
 	case _IOC_NR(PERF_EVENT_IOC_MODIFY_ATTRIBUTES):
+	case _IOC_NR(PERF_EVENT_IOC_EMIT_SIDEBAND):
 		/* Fix up pointer size (usually 4 -> 8 in 32-on-64-bit case */
 		if (_IOC_SIZE(cmd) == sizeof(compat_uptr_t)) {
 			cmd &= ~IOCSIZE_MASK;
@@ -12277,7 +12282,7 @@ perf_check_permission(struct perf_event_attr *attr, struct task_struct *task)
 	unsigned int ptrace_mode = PTRACE_MODE_READ_REALCREDS;
 	bool is_capable = perfmon_capable();
 
-	if (attr->sigtrap) {
+	if (attr && attr->sigtrap) {
 		/*
 		 * perf_event_attr::sigtrap sends signals to the other task.
 		 * Require the current task to also have CAP_KILL.
@@ -12810,6 +12815,86 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 }
 EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
 
+static int perf_event_emit_fork(struct perf_event *event, struct task_struct *task)
+{
+	return -EINVAL;
+}
+
+static int perf_event_emit_namespaces(struct perf_event *event, struct task_struct *task)
+{
+	return -EINVAL;
+}
+
+static int perf_event_emit_comm(struct perf_event *event, struct task_struct *task)
+{
+	return -EINVAL;
+}
+
+static int perf_event_emit_mmap(struct perf_event *event, struct task_struct *task)
+{
+	return -EINVAL;
+}
+
+static int perf_event_emit_sideband(struct perf_event *event, void __user *arg)
+{
+	struct perf_event_pid_sb pid_sb;
+	struct perf_event_context *ctx;
+	struct task_struct *task;
+	int err;
+
+	if (copy_from_user(&pid_sb, arg, sizeof(pid_sb)))
+		return -EFAULT;
+
+	if (pid_sb.emit_flags & ~(PERF_EVENT_EMIT_FORK |
+				  PERF_EVENT_EMIT_NAMESPACES |
+				  PERF_EVENT_EMIT_COMM |
+				  PERF_EVENT_EMIT_MMAP))
+		return -EINVAL;
+
+	task = find_lively_task_by_vpid(pid_sb.pid);
+	if (IS_ERR(task))
+		return PTR_ERR(task);
+
+	err = down_read_interruptible(&task->signal->exec_update_lock);
+	if (err)
+		goto out_put_task;
+
+	/* Validate access to pid (same as perf_event_open) */
+	err = -EACCES;
+	if (!perf_check_permission(NULL, task))
+		goto out_cred;
+
+	ctx = perf_event_ctx_lock(event);
+
+	if (pid_sb.emit_flags & PERF_EVENT_EMIT_FORK) {
+		err = perf_event_emit_fork(event, task);
+		if (err)
+			goto out_ctx;
+	}
+	if (pid_sb.emit_flags & PERF_EVENT_EMIT_NAMESPACES) {
+		err = perf_event_emit_namespaces(event, task);
+		if (err)
+			goto out_ctx;
+	}
+	if (pid_sb.emit_flags & PERF_EVENT_EMIT_COMM) {
+		err = perf_event_emit_comm(event, task);
+		if (err)
+			goto out_ctx;
+	}
+	if (pid_sb.emit_flags & PERF_EVENT_EMIT_MMAP) {
+		err = perf_event_emit_mmap(event, task);
+		if (err)
+			goto out_ctx;
+	}
+out_ctx:
+	perf_event_ctx_unlock(event, ctx);
+out_cred:
+	up_read(&task->signal->exec_update_lock);
+out_put_task:
+	put_task_struct(task);
+	return err;
+}
+
 static void __perf_pmu_remove(struct perf_event_context *ctx,
 			      int cpu, struct pmu *pmu,
 			      struct perf_event_groups *groups,
-- 
2.34.1


* [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
  2023-04-14  8:22 ` [PATCH RFC 1/5] " Adrian Hunter
@ 2023-04-14  8:22 ` Adrian Hunter
  2023-04-14 10:36   ` kernel test robot
                     ` (2 more replies)
  2023-04-14  8:22 ` [PATCH RFC 3/5] perf: Add namespaces " Adrian Hunter
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

Support the case of output to an active event, and return an error if
output is not possible in that case. Set PERF_RECORD_MISC_STATUS_ONLY to
differentiate the ioctl status-only sideband event from a "real" sideband
event.

Set the fork parent pid/tid to the real parent for a thread group leader,
or to the thread group leader otherwise.
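
The parent selection rule can be modelled outside the kernel as in the
sketch below. struct sb_task and struct sb_parent are hypothetical
simplifications of task_struct and of the ppid/ptid record fields, and
pid namespace translation (done by perf_event_pid() and
perf_event_tid() in the real code) is ignored:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the task_struct fields that matter here. */
struct sb_task {
	uint32_t tgid;		/* thread group id (the "pid" in perf records) */
	uint32_t tid;		/* thread id */
	uint32_t parent_tgid;	/* real parent's thread group id */
	uint32_t parent_tid;	/* real parent's thread id */
};

struct sb_parent {
	uint32_t ppid;
	uint32_t ptid;
};

/* Parent ids reported by a status-only fork record for task t. */
static struct sb_parent sb_status_fork_parent(const struct sb_task *t)
{
	struct sb_parent p;
	bool leader;

	assert(t != NULL);
	leader = (t->tid == t->tgid);	/* a group leader's tid equals its tgid */

	if (leader) {
		/* Thread group leader: report the real parent. */
		p.ppid = t->parent_tgid;
		p.ptid = t->parent_tid;
	} else {
		/* Other threads: report the thread group leader. */
		p.ppid = t->tgid;
		p.ptid = t->tgid;
	}
	return p;
}
```

For example, a leader with tgid 100 whose real parent is 50/51 reports
ppid 50 and ptid 51, while thread 101 in the same group reports ppid 100
and ptid 100.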

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 kernel/events/core.c | 88 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 73 insertions(+), 15 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5cbcc6851587..4e76596d3bfb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7948,6 +7948,54 @@ perf_iterate_sb(perf_iterate_f output, void *data,
 	rcu_read_unlock();
 }
 
+typedef int (perf_output_f)(struct perf_event *event, void *data);
+
+static int perf_event_output_sb(struct perf_event *event, perf_output_f output, void *data)
+{
+	int err = -ENOENT;
+
+	preempt_disable();
+
+	if (event->state != PERF_EVENT_STATE_ACTIVE ||
+	    !event_filter_match(event) ||
+	    READ_ONCE(event->oncpu) != smp_processor_id())
+		goto out;
+
+	err = output(event, data);
+out:
+	preempt_enable();
+	return err;
+}
+
+struct perf_output_f_data {
+	perf_output_f *func;
+	void *data;
+};
+
+void perf_output_f_wrapper(struct perf_event *event, void *data)
+{
+	struct perf_output_f_data *f_data = data;
+
+	f_data->func(event, f_data->data);
+}
+
+static int perf_output_sb(perf_output_f output, void *data,
+			  struct perf_event_context *task_ctx,
+			  struct perf_event *event)
+{
+	struct perf_output_f_data f_data = {
+		.func = output,
+		.data = data,
+	};
+
+	if (event)
+		return perf_event_output_sb(event, output, data);
+
+	perf_iterate_sb(perf_output_f_wrapper, &f_data, task_ctx);
+
+	return 0;
+}
+
 /*
  * Clear all file-based filters at exec, they'll have to be
  * re-instated when/if these objects are mmapped again.
@@ -8107,8 +8155,7 @@ static int perf_event_task_match(struct perf_event *event)
 	       event->attr.task;
 }
 
-static void perf_event_task_output(struct perf_event *event,
-				   void *data)
+static int perf_event_task_output(struct perf_event *event, void *data)
 {
 	struct perf_task_event *task_event = data;
 	struct perf_output_handle handle;
@@ -8117,7 +8164,7 @@ static void perf_event_task_output(struct perf_event *event,
 	int ret, size = task_event->event_id.header.size;
 
 	if (!perf_event_task_match(event))
-		return;
+		return -ENOENT;
 
 	perf_event_header__init_id(&task_event->event_id.header, &sample, event);
 
@@ -8134,6 +8181,14 @@ static void perf_event_task_output(struct perf_event *event,
 							task->real_parent);
 		task_event->event_id.ptid = perf_event_pid(event,
 							task->real_parent);
+	} else if (task_event->event_id.header.misc & PERF_RECORD_MISC_STATUS_ONLY) {
+		if (thread_group_leader(task)) {
+			task_event->event_id.ppid = perf_event_pid(event, task->real_parent);
+			task_event->event_id.ptid = perf_event_tid(event, task->real_parent);
+		} else {
+			task_event->event_id.ppid = perf_event_pid(event, task);
+			task_event->event_id.ptid = perf_event_pid(event, task);
+		}
 	} else {  /* PERF_RECORD_FORK */
 		task_event->event_id.ppid = perf_event_pid(event, current);
 		task_event->event_id.ptid = perf_event_tid(event, current);
@@ -8148,18 +8203,19 @@ static void perf_event_task_output(struct perf_event *event,
 	perf_output_end(&handle);
 out:
 	task_event->event_id.header.size = size;
+	return ret;
 }
 
-static void perf_event_task(struct task_struct *task,
-			      struct perf_event_context *task_ctx,
-			      int new)
+static int perf_event_task(struct task_struct *task,
+			   struct perf_event_context *task_ctx,
+			   int new, struct perf_event *event)
 {
 	struct perf_task_event task_event;
 
 	if (!atomic_read(&nr_comm_events) &&
 	    !atomic_read(&nr_mmap_events) &&
 	    !atomic_read(&nr_task_events))
-		return;
+		return -ENOENT;
 
 	task_event = (struct perf_task_event){
 		.task	  = task,
@@ -8167,7 +8223,7 @@ static void perf_event_task(struct task_struct *task,
 		.event_id    = {
 			.header = {
 				.type = new ? PERF_RECORD_FORK : PERF_RECORD_EXIT,
-				.misc = 0,
+				.misc = event ? PERF_RECORD_MISC_STATUS_ONLY : 0,
 				.size = sizeof(task_event.event_id),
 			},
 			/* .pid  */
@@ -8178,14 +8234,12 @@ static void perf_event_task(struct task_struct *task,
 		},
 	};
 
-	perf_iterate_sb(perf_event_task_output,
-		       &task_event,
-		       task_ctx);
+	return perf_output_sb(perf_event_task_output, &task_event, task_ctx, event);
 }
 
 void perf_event_fork(struct task_struct *task)
 {
-	perf_event_task(task, NULL, 1);
+	perf_event_task(task, NULL, 1, NULL);
 	perf_event_namespaces(task);
 }
 
@@ -12817,7 +12871,11 @@ EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
 
 static int perf_event_emit_fork(struct perf_event *event, struct task_struct *task)
 {
-	return -EINVAL;
+	if (!event->attr.comm && !event->attr.mmap && !event->attr.mmap2 &&
+	    !event->attr.mmap_data && !event->attr.task)
+		return -EINVAL;
+
+	return perf_event_task(task, NULL, 1, event);
 }
 
 static int perf_event_emit_namespaces(struct perf_event *event, struct task_struct *task)
@@ -13115,7 +13173,7 @@ static void perf_event_exit_task_context(struct task_struct *child)
 	 * won't get any samples after PERF_RECORD_EXIT. We can however still
 	 * get a few PERF_RECORD_READ events.
 	 */
-	perf_event_task(child, child_ctx, 0);
+	perf_event_task(child, child_ctx, 0, NULL);
 
 	list_for_each_entry_safe(child_event, next, &child_ctx->event_list, event_entry)
 		perf_event_exit_event(child_event, child_ctx);
@@ -13157,7 +13215,7 @@ void perf_event_exit_task(struct task_struct *child)
 	 * child contexts and sets child->perf_event_ctxp[] to NULL.
 	 * At this point we need to send EXIT events to cpu contexts.
 	 */
-	perf_event_task(child, NULL, 0);
+	perf_event_task(child, NULL, 0, NULL);
 }
 
 static void perf_free_event(struct perf_event *event,
-- 
2.34.1


* [PATCH RFC 3/5] perf: Add namespaces to the sideband ioctl
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
  2023-04-14  8:22 ` [PATCH RFC 1/5] " Adrian Hunter
  2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
@ 2023-04-14  8:22 ` Adrian Hunter
  2023-04-14  8:22 ` [PATCH RFC 4/5] perf: Add comm " Adrian Hunter
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

Support the case of output to an active event, and return an error if
output is not possible in that case. Set PERF_RECORD_MISC_STATUS_ONLY to
differentiate the ioctl status-only sideband event from a "real" sideband
event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 kernel/events/core.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4e76596d3bfb..ed4af231853a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8364,8 +8364,7 @@ static int perf_event_namespaces_match(struct perf_event *event)
 	return event->attr.namespaces;
 }
 
-static void perf_event_namespaces_output(struct perf_event *event,
-					 void *data)
+static int perf_event_namespaces_output(struct perf_event *event, void *data)
 {
 	struct perf_namespaces_event *namespaces_event = data;
 	struct perf_output_handle handle;
@@ -8374,7 +8373,7 @@ static void perf_event_namespaces_output(struct perf_event *event,
 	int ret;
 
 	if (!perf_event_namespaces_match(event))
-		return;
+		return -ENOENT;
 
 	perf_event_header__init_id(&namespaces_event->event_id.header,
 				   &sample, event);
@@ -8395,6 +8394,7 @@ static void perf_event_namespaces_output(struct perf_event *event,
 	perf_output_end(&handle);
 out:
 	namespaces_event->event_id.header.size = header_size;
+	return ret;
 }
 
 static void perf_fill_ns_link_info(struct perf_ns_link_info *ns_link_info,
@@ -8414,20 +8414,20 @@ static void perf_fill_ns_link_info(struct perf_ns_link_info *ns_link_info,
 	}
 }
 
-void perf_event_namespaces(struct task_struct *task)
+static int __perf_event_namespaces(struct task_struct *task, struct perf_event *event)
 {
 	struct perf_namespaces_event namespaces_event;
 	struct perf_ns_link_info *ns_link_info;
 
 	if (!atomic_read(&nr_namespaces_events))
-		return;
+		return -ENOENT;
 
 	namespaces_event = (struct perf_namespaces_event){
 		.task	= task,
 		.event_id  = {
 			.header = {
 				.type = PERF_RECORD_NAMESPACES,
-				.misc = 0,
+				.misc = event ? PERF_RECORD_MISC_STATUS_ONLY : 0,
 				.size = sizeof(namespaces_event.event_id),
 			},
 			/* .pid */
@@ -8467,9 +8467,12 @@ void perf_event_namespaces(struct task_struct *task)
 			       task, &cgroupns_operations);
 #endif
 
-	perf_iterate_sb(perf_event_namespaces_output,
-			&namespaces_event,
-			NULL);
+	return perf_output_sb(perf_event_namespaces_output, &namespaces_event, NULL, event);
+}
+
+void perf_event_namespaces(struct task_struct *task)
+{
+	__perf_event_namespaces(task, NULL);
 }
 
 /*
@@ -12880,7 +12883,10 @@ static int perf_event_emit_fork(struct perf_event *event, struct task_struct *ta
 
 static int perf_event_emit_namespaces(struct perf_event *event, struct task_struct *task)
 {
-	return -EINVAL;
+	if (!event->attr.namespaces)
+		return -EINVAL;
+
+	return __perf_event_namespaces(task, event);
 }
 
 static int perf_event_emit_comm(struct perf_event *event, struct task_struct *task)
-- 
2.34.1


* [PATCH RFC 4/5] perf: Add comm to the sideband ioctl
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
                   ` (2 preceding siblings ...)
  2023-04-14  8:22 ` [PATCH RFC 3/5] perf: Add namespaces " Adrian Hunter
@ 2023-04-14  8:22 ` Adrian Hunter
  2023-04-14  8:23 ` [PATCH RFC 5/5] perf: Add mmap " Adrian Hunter
  2023-04-17 11:02 ` [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Peter Zijlstra
  5 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

Support the case of output to an active event, and return an error if
output is not possible in that case. Set PERF_RECORD_MISC_STATUS_ONLY to
differentiate the ioctl status-only sideband event from a "real" sideband
event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 kernel/events/core.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index ed4af231853a..cddc02c2e411 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8265,8 +8265,7 @@ static int perf_event_comm_match(struct perf_event *event)
 	return event->attr.comm;
 }
 
-static void perf_event_comm_output(struct perf_event *event,
-				   void *data)
+static int perf_event_comm_output(struct perf_event *event, void *data)
 {
 	struct perf_comm_event *comm_event = data;
 	struct perf_output_handle handle;
@@ -8275,7 +8274,7 @@ static void perf_event_comm_output(struct perf_event *event,
 	int ret;
 
 	if (!perf_event_comm_match(event))
-		return;
+		return -ENOENT;
 
 	perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
 	ret = perf_output_begin(&handle, &sample, event,
@@ -8296,9 +8295,10 @@ static void perf_event_comm_output(struct perf_event *event,
 	perf_output_end(&handle);
 out:
 	comm_event->event_id.header.size = size;
+	return ret;
 }
 
-static void perf_event_comm_event(struct perf_comm_event *comm_event)
+static int perf_event_comm_event(struct perf_comm_event *comm_event, struct perf_event *event)
 {
 	char comm[TASK_COMM_LEN];
 	unsigned int size;
@@ -8312,17 +8312,15 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event)
 
 	comm_event->event_id.header.size = sizeof(comm_event->event_id) + size;
 
-	perf_iterate_sb(perf_event_comm_output,
-		       comm_event,
-		       NULL);
+	return perf_output_sb(perf_event_comm_output, comm_event, NULL, event);
 }
 
-void perf_event_comm(struct task_struct *task, bool exec)
+static int __perf_event_comm(struct task_struct *task, bool exec, struct perf_event *event)
 {
 	struct perf_comm_event comm_event;
 
 	if (!atomic_read(&nr_comm_events))
-		return;
+		return -ENOENT;
 
 	comm_event = (struct perf_comm_event){
 		.task	= task,
@@ -8331,7 +8329,8 @@ void perf_event_comm(struct task_struct *task, bool exec)
 		.event_id  = {
 			.header = {
 				.type = PERF_RECORD_COMM,
-				.misc = exec ? PERF_RECORD_MISC_COMM_EXEC : 0,
+				.misc = (exec ? PERF_RECORD_MISC_COMM_EXEC : 0) |
+					(event ? PERF_RECORD_MISC_STATUS_ONLY : 0),
 				/* .size */
 			},
 			/* .pid */
@@ -8339,7 +8338,12 @@ void perf_event_comm(struct task_struct *task, bool exec)
 		},
 	};
 
-	perf_event_comm_event(&comm_event);
+	return perf_event_comm_event(&comm_event, event);
+}
+
+void perf_event_comm(struct task_struct *task, bool exec)
+{
+	__perf_event_comm(task, exec, NULL);
 }
 
 /*
@@ -12891,7 +12895,10 @@ static int perf_event_emit_namespaces(struct perf_event *event, struct task_stru
 
 static int perf_event_emit_comm(struct perf_event *event, struct task_struct *task)
 {
-	return -EINVAL;
+	if (!event->attr.comm)
+		return -EINVAL;
+
+	return __perf_event_comm(task, false, event);
 }
 
 static int perf_event_emit_mmap(struct perf_event *event, struct task_struct *task)
-- 
2.34.1


* [PATCH RFC 5/5] perf: Add mmap to the sideband ioctl
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
                   ` (3 preceding siblings ...)
  2023-04-14  8:22 ` [PATCH RFC 4/5] perf: Add comm " Adrian Hunter
@ 2023-04-14  8:23 ` Adrian Hunter
  2023-04-17 11:02 ` [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Peter Zijlstra
  5 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-14  8:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

Support the case of output to an active event, and return an error if
output is not possible in that case. Set PERF_RECORD_MISC_STATUS_ONLY to
differentiate the ioctl status-only sideband event from a "real" sideband
event.

Set the mmap pid/tid from the task passed to the ioctl instead of from
current. The existing mmap path passes no task and so still uses current.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 kernel/events/core.c | 91 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 18 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index cddc02c2e411..317bdf5f919a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8584,6 +8584,7 @@ static void perf_event_cgroup(struct cgroup *cgrp)
 
 struct perf_mmap_event {
 	struct vm_area_struct	*vma;
+	struct task_struct	*task;
 
 	const char		*file_name;
 	int			file_size;
@@ -8605,19 +8606,25 @@ struct perf_mmap_event {
 	} event_id;
 };
 
+static int perf_event_mmap_match_vma(struct perf_event *event,
+				     struct vm_area_struct *vma)
+{
+	int executable = vma->vm_flags & VM_EXEC;
+
+	return (!executable && event->attr.mmap_data) ||
+	       (executable && (event->attr.mmap || event->attr.mmap2));
+}
+
 static int perf_event_mmap_match(struct perf_event *event,
 				 void *data)
 {
 	struct perf_mmap_event *mmap_event = data;
 	struct vm_area_struct *vma = mmap_event->vma;
-	int executable = vma->vm_flags & VM_EXEC;
 
-	return (!executable && event->attr.mmap_data) ||
-	       (executable && (event->attr.mmap || event->attr.mmap2));
+	return perf_event_mmap_match_vma(event, vma);
 }
 
-static void perf_event_mmap_output(struct perf_event *event,
-				   void *data)
+static int perf_event_mmap_output(struct perf_event *event, void *data)
 {
 	struct perf_mmap_event *mmap_event = data;
 	struct perf_output_handle handle;
@@ -8628,7 +8635,7 @@ static void perf_event_mmap_output(struct perf_event *event,
 	int ret;
 
 	if (!perf_event_mmap_match(event, data))
-		return;
+		return -ENOENT;
 
 	if (event->attr.mmap2) {
 		mmap_event->event_id.header.type = PERF_RECORD_MMAP2;
@@ -8646,8 +8653,8 @@ static void perf_event_mmap_output(struct perf_event *event,
 	if (ret)
 		goto out;
 
-	mmap_event->event_id.pid = perf_event_pid(event, current);
-	mmap_event->event_id.tid = perf_event_tid(event, current);
+	mmap_event->event_id.pid = perf_event_pid(event, mmap_event->task);
+	mmap_event->event_id.tid = perf_event_tid(event, mmap_event->task);
 
 	use_build_id = event->attr.build_id && mmap_event->build_id_size;
 
@@ -8681,9 +8688,10 @@ static void perf_event_mmap_output(struct perf_event *event,
 out:
 	mmap_event->event_id.header.size = size;
 	mmap_event->event_id.header.type = type;
+	return ret;
 }
 
-static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
+static int perf_event_mmap_event(struct perf_mmap_event *mmap_event, struct perf_event *event)
 {
 	struct vm_area_struct *vma = mmap_event->vma;
 	struct file *file = vma->vm_file;
@@ -8694,6 +8702,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
 	char tmp[16];
 	char *buf = NULL;
 	char *name;
+	int ret;
 
 	if (vma->vm_flags & VM_READ)
 		prot |= PROT_READ;
@@ -8795,11 +8804,10 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
 	if (atomic_read(&nr_build_id_events))
 		build_id_parse(vma, mmap_event->build_id, &mmap_event->build_id_size);
 
-	perf_iterate_sb(perf_event_mmap_output,
-		       mmap_event,
-		       NULL);
+	ret = perf_output_sb(perf_event_mmap_output, mmap_event, NULL, event);
 
 	kfree(buf);
+	return ret;
 }
 
 /*
@@ -8899,21 +8907,25 @@ static void perf_addr_filters_adjust(struct vm_area_struct *vma)
 	rcu_read_unlock();
 }
 
-void perf_event_mmap(struct vm_area_struct *vma)
+static int __perf_event_mmap(struct vm_area_struct *vma,
+			     struct perf_event *event,
+			     struct task_struct *task)
 {
 	struct perf_mmap_event mmap_event;
 
 	if (!atomic_read(&nr_mmap_events))
-		return;
+		return -ENOENT;
 
 	mmap_event = (struct perf_mmap_event){
 		.vma	= vma,
+		.task	= task ?: current,
 		/* .file_name */
 		/* .file_size */
 		.event_id  = {
 			.header = {
 				.type = PERF_RECORD_MMAP,
-				.misc = PERF_RECORD_MISC_USER,
+				.misc = PERF_RECORD_MISC_USER |
+					(event ? PERF_RECORD_MISC_STATUS_ONLY : 0),
 				/* .size */
 			},
 			/* .pid */
@@ -8930,8 +8942,14 @@ void perf_event_mmap(struct vm_area_struct *vma)
 		/* .flags (attr_mmap2 only) */
 	};
 
-	perf_addr_filters_adjust(vma);
-	perf_event_mmap_event(&mmap_event);
+	if (!event)
+		perf_addr_filters_adjust(vma);
+	return perf_event_mmap_event(&mmap_event, event);
+}
+
+void perf_event_mmap(struct vm_area_struct *vma)
+{
+	__perf_event_mmap(vma, NULL, NULL);
 }
 
 void perf_event_aux_event(struct perf_event *event, unsigned long head,
@@ -12901,9 +12919,46 @@ static int perf_event_emit_comm(struct perf_event *event, struct task_struct *ta
 	return __perf_event_comm(task, false, event);
 }
 
+static int perf_event_mm_emit_mmap(struct perf_event *event,
+				   struct task_struct *task,
+				   struct mm_struct *mm)
+{
+	struct vm_area_struct *vma;
+	VMA_ITERATOR(vmi, mm, 0);
+	int err;
+
+	for_each_vma(vmi, vma) {
+		if (!perf_event_mmap_match_vma(event, vma))
+			continue;
+		err = __perf_event_mmap(vma, event, task);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static int perf_event_emit_mmap(struct perf_event *event, struct task_struct *task)
 {
-	return -EINVAL;
+	struct mm_struct *mm;
+	int err;
+
+	if (!event->attr.mmap_data && !event->attr.mmap && !event->attr.mmap2)
+		return -EINVAL;
+
+	mm = get_task_mm(task);
+	if (!mm)
+		return 0;
+
+	mmap_read_lock(mm);
+
+	err = perf_event_mm_emit_mmap(event, task, mm);
+
+	mmap_read_unlock(mm);
+
+	mmput(mm);
+
+	return err;
 }
 
 static int perf_event_emit_sideband(struct perf_event *event, void __user *arg)
-- 
2.34.1


* Re: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
  2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
@ 2023-04-14 10:36   ` kernel test robot
  2023-04-14 11:17   ` kernel test robot
  2023-04-14 13:33   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-04-14 10:36 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: oe-kbuild-all

Hi Adrian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on acme/perf/core]
[also build test WARNING on tip/perf/core tip/master tip/auto-latest linus/master v6.3-rc6 next-20230413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
patch link:    https://lore.kernel.org/r/20230414082300.34798-3-adrian.hunter%40intel.com
patch subject: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
config: x86_64-randconfig-a013-20230410 (https://download.01.org/0day-ci/archive/20230414/202304141847.w65deipC-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/48bf5fef7be6160c89995b4f98ffed4312999e96
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
        git checkout 48bf5fef7be6160c89995b4f98ffed4312999e96
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/events/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304141847.w65deipC-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/events/core.c:7975:6: warning: no previous prototype for 'perf_output_f_wrapper' [-Wmissing-prototypes]
    7975 | void perf_output_f_wrapper(struct perf_event *event, void *data)
         |      ^~~~~~~~~~~~~~~~~~~~~


vim +/perf_output_f_wrapper +7975 kernel/events/core.c

  7974	
> 7975	void perf_output_f_wrapper(struct perf_event *event, void *data)
  7976	{
  7977		struct perf_output_f_data *f_data = data;
  7978	
  7979		f_data->func(event, f_data->data);
  7980	}
  7981	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

* Re: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
  2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
  2023-04-14 10:36   ` kernel test robot
@ 2023-04-14 11:17   ` kernel test robot
  2023-04-14 13:33   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-04-14 11:17 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: llvm, oe-kbuild-all

Hi Adrian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on acme/perf/core]
[also build test WARNING on tip/perf/core tip/master tip/auto-latest linus/master v6.3-rc6 next-20230413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
patch link:    https://lore.kernel.org/r/20230414082300.34798-3-adrian.hunter%40intel.com
patch subject: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
config: i386-randconfig-a002-20230410 (https://download.01.org/0day-ci/archive/20230414/202304141902.2MeBvBcO-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/48bf5fef7be6160c89995b4f98ffed4312999e96
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
        git checkout 48bf5fef7be6160c89995b4f98ffed4312999e96
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/events/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304141902.2MeBvBcO-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/events/core.c:7975:6: warning: no previous prototype for function 'perf_output_f_wrapper' [-Wmissing-prototypes]
   void perf_output_f_wrapper(struct perf_event *event, void *data)
        ^
   kernel/events/core.c:7975:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void perf_output_f_wrapper(struct perf_event *event, void *data)
   ^
   static 
   1 warning generated.


vim +/perf_output_f_wrapper +7975 kernel/events/core.c

  7974	
> 7975	void perf_output_f_wrapper(struct perf_event *event, void *data)
  7976	{
  7977		struct perf_output_f_data *f_data = data;
  7978	
  7979		f_data->func(event, f_data->data);
  7980	}
  7981	
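
As clang's note above suggests, the usual fix for this warning is to give
the function internal linkage. A minimal standalone sketch follows. The
layout of struct perf_output_f_data and the perf_output_f typedef are
assumptions inferred from how the wrapper uses them, and count_call() and
demo_wrapper_calls() are hypothetical helpers added only to exercise it:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct perf_event; left opaque here. */
struct perf_event;

/* Assumed callback type, inferred from f_data->func(event, f_data->data). */
typedef void (perf_output_f)(struct perf_event *event, void *data);

/* Assumed layout, inferred from the wrapper body in the report. */
struct perf_output_f_data {
	perf_output_f *func;
	void *data;
};

/*
 * 'static' limits the symbol to this translation unit, so neither
 * -Wmissing-prototypes nor sparse expects a declaration in a header.
 */
static void perf_output_f_wrapper(struct perf_event *event, void *data)
{
	struct perf_output_f_data *f_data = data;

	f_data->func(event, f_data->data);
}

/* Hypothetical callback: counts how often it is invoked. */
static void count_call(struct perf_event *event, void *data)
{
	(void)event;
	(*(int *)data)++;
}

/* Hypothetical driver: routes one call through the wrapper. */
int demo_wrapper_calls(void)
{
	int calls = 0;
	struct perf_output_f_data f_data = { .func = count_call, .data = &calls };

	perf_output_f_wrapper(NULL, &f_data);
	assert(calls == 1);
	return calls;
}
```

The same one-word change would also address the sparse report for this
symbol ("Should it be static?").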

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

* Re: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
  2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
  2023-04-14 10:36   ` kernel test robot
  2023-04-14 11:17   ` kernel test robot
@ 2023-04-14 13:33   ` kernel test robot
  2 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2023-04-14 13:33 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: oe-kbuild-all

Hi Adrian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on acme/perf/core]
[also build test WARNING on tip/perf/core tip/master tip/auto-latest linus/master v6.3-rc6 next-20230413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
patch link:    https://lore.kernel.org/r/20230414082300.34798-3-adrian.hunter%40intel.com
patch subject: [PATCH RFC 2/5] perf: Add fork to the sideband ioctl
config: x86_64-randconfig-s023 (https://download.01.org/0day-ci/archive/20230414/202304142105.MjfwYVLq-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.4-39-gce1a6720-dirty
        # https://github.com/intel-lab-lkp/linux/commit/48bf5fef7be6160c89995b4f98ffed4312999e96
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Adrian-Hunter/perf-Add-ioctl-to-emit-sideband-events/20230414-162719
        git checkout 48bf5fef7be6160c89995b4f98ffed4312999e96
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 olddefconfig
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/events/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304142105.MjfwYVLq-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/events/core.c:1375:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:1375:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:1375:15: sparse:    struct perf_event_context *
   kernel/events/core.c:1388:28: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:1388:28: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:1388:28: sparse:    struct perf_event_context *
   kernel/events/core.c:3460:20: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3460:20: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3460:20: sparse:    struct perf_event_context *
   kernel/events/core.c:3464:18: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3464:18: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3464:18: sparse:    struct perf_event_context *
   kernel/events/core.c:3465:23: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3465:23: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3465:23: sparse:    struct perf_event_context *
   kernel/events/core.c:3514:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3514:25: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3514:25: sparse:    struct perf_event_context *
   kernel/events/core.c:3515:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3515:25: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3515:25: sparse:    struct perf_event_context *
   kernel/events/core.c:3914:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:3914:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:3914:15: sparse:    struct perf_event_context *
   kernel/events/core.c:4306:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:4306:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:4306:15: sparse:    struct perf_event_context *
   kernel/events/core.c:4785:25: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:4785:25: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:4785:25: sparse:    struct perf_event_context *
   kernel/events/core.c:6170:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6170:9: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6170:9: sparse:    struct perf_buffer *
   kernel/events/core.c:5637:24: sparse: sparse: incorrect type in assignment (different base types) @@     expected restricted __poll_t [usertype] events @@     got int @@
   kernel/events/core.c:5882:22: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:5882:22: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:5882:22: sparse:    struct perf_buffer *
   kernel/events/core.c:6010:14: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6010:14: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6010:14: sparse:    struct perf_buffer *
   kernel/events/core.c:6043:14: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6043:14: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6043:14: sparse:    struct perf_buffer *
   kernel/events/core.c:6100:14: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6100:14: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6100:14: sparse:    struct perf_buffer *
   kernel/events/core.c:6191:14: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6191:14: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6191:14: sparse:    struct perf_buffer *
   kernel/events/core.c:6207:14: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:6207:14: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:6207:14: sparse:    struct perf_buffer *
   kernel/events/core.c:7943:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:7943:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:7943:15: sparse:    struct perf_event_context *
>> kernel/events/core.c:7975:6: sparse: sparse: symbol 'perf_output_f_wrapper' was not declared. Should it be static?
   kernel/events/core.c:8078:13: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:8078:13: sparse:    struct perf_buffer [noderef] __rcu *
   kernel/events/core.c:8078:13: sparse:    struct perf_buffer *
   kernel/events/core.c:8181:61: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/events/core.c:8181:61: sparse:     expected struct task_struct *p
   kernel/events/core.c:8181:61: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/events/core.c:8183:61: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/events/core.c:8183:61: sparse:     expected struct task_struct *p
   kernel/events/core.c:8183:61: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/events/core.c:8186:79: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/events/core.c:8186:79: sparse:     expected struct task_struct *p
   kernel/events/core.c:8186:79: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/events/core.c:8187:79: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *p @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/events/core.c:8187:79: sparse:     expected struct task_struct *p
   kernel/events/core.c:8187:79: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/events/core.c:8889:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:8889:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:8889:15: sparse:    struct perf_event_context *
   kernel/events/core.c:9929:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9929:9: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9929:9: sparse:    struct swevent_hlist *
   kernel/events/core.c:9968:17: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9968:17: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9968:17: sparse:    struct swevent_hlist *
   kernel/events/core.c:10224:23: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:10224:23: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:10224:23: sparse:    struct perf_event_context *
   kernel/events/core.c:11335:1: sparse: sparse: symbol 'dev_attr_nr_addr_filters' was not declared. Should it be static?
   kernel/events/core.c:13160:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:13160:9: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:13160:9: sparse:    struct perf_event_context *
   kernel/events/core.c:13254:15: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:13254:15: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:13254:15: sparse:    struct perf_event_context *
   kernel/events/core.c:13266:9: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:13266:9: sparse:    struct perf_event_context [noderef] __rcu *
   kernel/events/core.c:13266:9: sparse:    struct perf_event_context *
   kernel/events/core.c:13687:17: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:13687:17: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:13687:17: sparse:    struct swevent_hlist *
   kernel/events/core.c:162:9: sparse: sparse: context imbalance in 'perf_ctx_lock' - wrong count at exit
   kernel/events/core.c:170:17: sparse: sparse: context imbalance in 'perf_ctx_unlock' - unexpected unlock
   kernel/events/core.c: note: in included file (through include/linux/rculist.h, include/linux/dcache.h, include/linux/fs.h):
   include/linux/rcupdate.h:802:9: sparse: sparse: context imbalance in 'perf_lock_task_context' - different lock contexts for basic block
   kernel/events/core.c:1422:17: sparse: sparse: context imbalance in 'perf_pin_task_context' - unexpected unlock
   kernel/events/core.c:2775:9: sparse: sparse: context imbalance in '__perf_install_in_context' - wrong count at exit
   kernel/events/core.c:4759:17: sparse: sparse: context imbalance in 'find_get_context' - unexpected unlock
   kernel/events/core.c: note: in included file:
   kernel/events/internal.h:209:1: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected void const [noderef] __user *from @@     got void const *buf @@
   kernel/events/core.c:9778:17: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9778:17: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9778:17: sparse:    struct swevent_hlist *
   kernel/events/core.c:9798:17: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9798:17: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9798:17: sparse:    struct swevent_hlist *
   kernel/events/core.c:9918:16: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist *
   kernel/events/core.c:9918:16: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist *
   kernel/events/core.c:9918:16: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist [noderef] __rcu *
   kernel/events/core.c:9918:16: sparse:    struct swevent_hlist *

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

* Re: [PATCH RFC 1/5] perf: Add ioctl to emit sideband events
  2023-04-14  8:22 ` [PATCH RFC 1/5] " Adrian Hunter
@ 2023-04-17 10:57   ` Peter Zijlstra
  2023-04-18  6:29     ` Adrian Hunter
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2023-04-17 10:57 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

On Fri, Apr 14, 2023 at 11:22:56AM +0300, Adrian Hunter wrote:
> perf tools currently read /proc to get this information, but that
> races with changes made by the kernel.
> 
> Add an ioctl to output status-only sideband events for a currently
> active event on the current CPU. Using timestamps, these status-only
> sideband events will be correctly ordered with respect to "real"
> sideband events.
> 
> The assumption is a user will:
> 	- open and enable a dummy event to track sideband events
> 	- call the new ioctl to get sideband information for currently
> 	  running processes as needed
> 	- enable the remaining selected events
> 
> The initial sideband events to be supported will be: fork, namespaces, comm
> and mmap.
> 
> Add a new misc flag PERF_RECORD_MISC_STATUS_ONLY to differentiate "real"
> sideband events from status-only sideband events.
> 
> The limitation that the event must be active is significant. The ioctl
> caller must either:
> 	i)  For a CPU context, set CPU affinity to the correct CPU.
> 	    Note, obviously that would not need to be done for system-wide
> 	    tracing on all CPUs. It would also only need to be done for the
> 	    period of tracing when the ioctl is to be used.
> 	ii) Use an event opened for the current process on all CPUs.
> 	    Note, if such an additional event is needed, it would also use
> 	    additional memory from the user's perf_event_mlock_kb /
> 	    RLIMIT_MEMLOCK limit.

Why would a single per-task event not work? I see nothing in the code
that would require a per-task-per-cpu setup. Or am I just having trouble
reading again?

* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
                   ` (4 preceding siblings ...)
  2023-04-14  8:23 ` [PATCH RFC 5/5] perf: Add mmap " Adrian Hunter
@ 2023-04-17 11:02 ` Peter Zijlstra
  2023-04-17 16:37   ` Ian Rogers
  2023-04-18  6:18   ` Adrian Hunter
  5 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2023-04-17 11:02 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
> Hi
> 
> Here is a stab at adding an ioctl for sideband events.
> 
> This is to overcome races when reading the same information
> from /proc.

What races? Are you talking about reading old state in /proc, the kernel
delivering a sideband event for the new state, and then you writing the
old state out?

Surely that's something perf tool can fix without kernel changes?

* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-17 11:02 ` [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Peter Zijlstra
@ 2023-04-17 16:37   ` Ian Rogers
  2023-04-18  7:03     ` Adrian Hunter
  2023-04-18  6:18   ` Adrian Hunter
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Rogers @ 2023-04-17 16:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Adrian Hunter, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel

On Mon, Apr 17, 2023 at 4:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
> > Hi
> >
> > Here is a stab at adding an ioctl for sideband events.
> >
> > This is to overcome races when reading the same information
> > from /proc.
>
> What races? Are you talking about reading old state in /proc, the kernel
> delivering a sideband event for the new state, and then you writing the
> old state out?
>
> Surely that's something perf tool can fix without kernel changes?

So my reading is that during event synthesis there are races between
reading the different /proc files. There is still, I believe, a race
in perf record/top with uid filtering which reminds me of this.
The uid filtering race is that we scan /proc to find processes (pids)
for a uid, we then synthesize the maps for each of these pids but if a
pid starts or exits we either error out or don't sample that pid. I
believe the error out behavior is easy to hit 100% of the time making
uid mode of limited use.
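
To illustrate the scan described above, here is a rough sketch that
tolerates a pid vanishing between the /proc directory scan and the open of
its maps file, instead of erroring out. It is not perf tool code:
count_readable_maps_for_uid() and its 'vanished' counter are hypothetical
names, and a real tool would parse the mappings rather than just open them.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count processes owned by 'uid' whose
 * /proc/<pid>/maps could be opened. Pids that disappear (or become
 * unreadable) during the scan are counted in *vanished and skipped.
 * Returns the count, or -1 if /proc itself cannot be opened.
 */
int count_readable_maps_for_uid(uid_t uid, int *vanished)
{
	DIR *proc;
	struct dirent *de;
	int count = 0;

	*vanished = 0;
	proc = opendir("/proc");
	if (!proc)
		return -1;

	while ((de = readdir(proc)) != NULL) {
		char path[288];
		struct stat st;
		FILE *maps;

		/* Only numeric entries are process directories */
		if (!isdigit((unsigned char)de->d_name[0]))
			continue;

		snprintf(path, sizeof(path), "/proc/%s", de->d_name);
		if (stat(path, &st) < 0) {
			/* Process exited after readdir(): skip it */
			(*vanished)++;
			continue;
		}
		if (st.st_uid != uid)
			continue;

		snprintf(path, sizeof(path), "/proc/%s/maps", de->d_name);
		maps = fopen(path, "r");
		if (!maps) {
			/* Gone or not permitted: skip rather than fail */
			(*vanished)++;
			continue;
		}
		/* A real tool would read and parse the mappings here */
		fclose(maps);
		count++;
	}

	closedir(proc);
	return count;
}
```

Skipping only avoids the hard error. A process that starts after the scan
is still missed unless a sideband event reports it.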

This may be for something other than synthesis, but for synthesis a
few points are:
 - as servers get bigger and consequently more jobs get consolidated
on them, synthesis is slow (hence --num-thread-synthesize) and also
the events dominate the perf.data file - perhaps >90% of the file
size, and a lot of that will be for processes with no samples in them.
Another issue here is that all those file descriptors don't come for
free in the kernel.
 - BPF has buildid+offset stack traces that remove the need for
synthesis by having more expensive stack generation. I believe this is
unpopular as adding this as a variant for every kind of event would be
hard, but perhaps we can do some low-hanging fruit like instructions
and cycles.
 - I believe Jiri looked at doing synthesis with BPF. Perhaps we could
do something similar to the off-cpu and tail-synthesize, where more
things happen at the tail end of perf. Off-cpu records data in maps
that it then synthesizes into samples.

There is also a long standing issue around not sampling munmap (or
mremap) that causes plenty of issues. Perhaps if we had less mmap in
the perf.data file we could add these.

Thanks,
Ian

* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-17 11:02 ` [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Peter Zijlstra
  2023-04-17 16:37   ` Ian Rogers
@ 2023-04-18  6:18   ` Adrian Hunter
  2023-04-18 13:36     ` Adrian Hunter
  1 sibling, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2023-04-18  6:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

On 17/04/23 14:02, Peter Zijlstra wrote:
> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>> Hi
>>
>> Here is a stab at adding an ioctl for sideband events.
>>
>> This is to overcome races when reading the same information
>> from /proc.
> 
> What races? Are you talking about reading old state in /proc, the kernel
> delivering a sideband event for the new state, and then you writing the
> old state out?
> 
> Surely that's something perf tool can fix without kernel changes?

Yes, and it was a bit of a brain fart not to realise that.

There may still be corner cases, where different kinds of events are
interdependent, perhaps NAMESPACES events vs MMAP events could
have ordering issues.

Putting that aside, the ioctl may be quicker than reading from
/proc.  I could get some numbers and see what people think.


* Re: [PATCH RFC 1/5] perf: Add ioctl to emit sideband events
  2023-04-17 10:57   ` Peter Zijlstra
@ 2023-04-18  6:29     ` Adrian Hunter
  0 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-18  6:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-perf-users, linux-kernel

On 17/04/23 13:57, Peter Zijlstra wrote:
> On Fri, Apr 14, 2023 at 11:22:56AM +0300, Adrian Hunter wrote:
>> perf tools currently read /proc to get this information, but that
>> races with changes made by the kernel.
>>
>> Add an ioctl to output status-only sideband events for a currently
>> active event on the current CPU. Using timestamps, these status-only
>> sideband events will be correctly ordered with respect to "real"
>> sideband events.
>>
>> The assumption is a user will:
>> 	- open and enable a dummy event to track sideband events
>> 	- call the new ioctl to get sideband information for currently
>> 	  running processes as needed
>> 	- enable the remaining selected events
>>
>> The initial sideband events to be supported will be: fork, namespaces, comm
>> and mmap.
>>
>> Add a new misc flag PERF_RECORD_MISC_STATUS_ONLY to differentiate "real"
>> sideband events from status-only sideband events.
>>
>> The limitation that the event must be active is significant. The ioctl
>> caller must either:
>> 	i)  For a CPU context, set CPU affinity to the correct CPU.
>> 	    Note, obviously that would not need to be done for system-wide
>> 	    tracing on all CPUs. It would also only need to be done for the
>> 	    period of tracing when the ioctl is to be used.
>> 	ii) Use an event opened for the current process on all CPUs.
>> 	    Note, if such an additional event is needed, it would also use
>> 	    additional memory from the user's perf_event_mlock_kb /
>> 	    RLIMIT_MEMLOCK limit.
> 
> Why would a single per-task event not work? I see nothing in the code
> that would require a per-task-per-cpu setup. Or am I just having trouble
> reading again?

Sorry, "all CPUS" should have been "cpu=-1"
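
For reference, a per-task event with cpu=-1 could be opened roughly as
follows. This is only a sketch using the existing perf_event_open() syscall:
the helper names are hypothetical, the attribute bits chosen are an
assumption about what a sideband-tracking dummy event would want, and the
proposed ioctl itself is left out because its request code does not appear
in this part of the thread.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill 'attr' for a dummy event that tracks sideband. */
void init_dummy_sideband_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;	/* counts nothing itself */
	attr->task = 1;				/* fork/exit records */
	attr->comm = 1;				/* comm records */
	attr->mmap = 1;				/* executable mmap records */
	attr->mmap2 = 1;			/* extended mmap records */
	attr->namespaces = 1;			/* namespaces records */
	attr->sample_id_all = 1;		/* sample_id on sideband records */
	attr->sample_type = PERF_SAMPLE_TIME | PERF_SAMPLE_TID;
	/* attr->disabled stays 0, so the event is enabled once opened */
}

/*
 * Hypothetical helper: open the event for the calling thread (pid 0)
 * on any CPU (cpu -1). Returns a file descriptor, or -1 with errno set.
 */
int open_dummy_sideband_event(void)
{
	struct perf_event_attr attr;

	init_dummy_sideband_attr(&attr);
	return (int)syscall(SYS_perf_event_open, &attr, 0 /* pid */,
			    -1 /* cpu */, -1 /* group_fd */,
			    PERF_FLAG_FD_CLOEXEC);
}
```

Whether the open succeeds depends on perf_event_paranoid and the caller's
privileges, so callers should check the return value.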


* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-17 16:37   ` Ian Rogers
@ 2023-04-18  7:03     ` Adrian Hunter
  0 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2023-04-18  7:03 UTC (permalink / raw)
  To: Ian Rogers, Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users,
	linux-kernel

On 17/04/23 19:37, Ian Rogers wrote:
> On Mon, Apr 17, 2023 at 4:02 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> Here is a stab at adding an ioctl for sideband events.
>>>
>>> This is to overcome races when reading the same information
>>> from /proc.
>>
>> What races? Are you talking about reading old state in /proc, the kernel
>> delivering a sideband event for the new state, and then you writing the
>> old state out?
>>
>> Surely that's something perf tool can fix without kernel changes?
> 
> So my reading is that during event synthesis there are races between
> reading the different /proc files. There is still, I believe, a race
> in perf record/top with uid filtering which reminds me of this.
> The uid filtering race is that we scan /proc to find processes (pids)
> for a uid, we then synthesize the maps for each of these pids but if a
> pid starts or exits we either error out or don't sample that pid. I
> believe the error out behavior is easy to hit 100% of the time making
> uid mode of limited use.
> 
> This may be for something other than synthesis, but for synthesis a
> few points are:
>  - as servers get bigger and consequently more jobs get consolidated
> on them, synthesis is slow (hence --num-thread-synthesize) and also
> the events dominate the perf.data file - perhaps >90% of the file
> size, and a lot of that will be for processes with no samples in them.

Note also, for hardware tracing, it isn't generally possible to know
that during tracing, and figuring it out afterwards and working
backwards may not be feasible.

> Another issue here is that all those file descriptors don't come for
> free in the kernel.
>  - BPF has buildid+offset stack traces that remove the need for
> synthesis by having more expensive stack generation. I believe this is
> unpopular as adding this as a variant for every kind of event would be
> hard, but perhaps we can do some low-hanging fruit like instructions
> and cycles.
>  - I believe Jiri looked at doing synthesis with BPF. Perhaps we could
> do something similar to the off-cpu and tail-synthesize, where more
> things happen at the tail end of perf. Off-cpu records data in maps
> that it then synthesizes into samples.
> 
> There is also a long standing issue around not sampling munmap (or
> mremap) that causes plenty of issues. Perhaps if we had less mmap in
> the perf.data file we could add these.
> 
> Thanks,
> Ian


* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-18  6:18   ` Adrian Hunter
@ 2023-04-18 13:36     ` Adrian Hunter
  2023-04-18 15:51       ` Ian Rogers
  0 siblings, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2023-04-18 13:36 UTC (permalink / raw)
  To: Peter Zijlstra, Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Ian Rogers, linux-perf-users, linux-kernel

On 18/04/23 09:18, Adrian Hunter wrote:
> On 17/04/23 14:02, Peter Zijlstra wrote:
>> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> Here is a stab at adding an ioctl for sideband events.
>>>
>>> This is to overcome races when reading the same information
>>> from /proc.
>>
>> What races? Are you talking about reading old state in /proc, the kernel
>> delivering a sideband event for the new state, and then you writing the
>> old state out?
>>
>> Surely that's something perf tool can fix without kernel changes?
> 
> Yes, and it was a bit of a brain fart not to realise that.
> 
> There may still be corner cases, where different kinds of events are
> interdependent, perhaps NAMESPACES events vs MMAP events could
> have ordering issues.
> 
> Putting that aside, the ioctl may be quicker than reading from
> /proc.  I could get some numbers and see what people think.
> 

Here's a result with a quick hack to use the ioctl but without
handling the buffer becoming full (hence the -m4M)

# ps -e | wc -l
1171
# perf.old stat -- perf.old record -o old.data --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB old.data (100 samples) ]

 Performance counter stats for 'perf.old record -o old.data --namespaces -a true':

            498.15 msec task-clock                       #    0.987 CPUs utilized             
               126      context-switches                 #  252.935 /sec                      
                64      cpu-migrations                   #  128.475 /sec                      
              4396      page-faults                      #    8.825 K/sec                     
        1927096347      cycles                           #    3.868 GHz                       
        4563059399      instructions                     #    2.37  insn per cycle            
         914232559      branches                         #    1.835 G/sec                     
           6618052      branch-misses                    #    0.72% of all branches           
        9633787105      slots                            #   19.339 G/sec                     
        4394300990      topdown-retiring                 #     38.8% Retiring                 
        3693815286      topdown-bad-spec                 #     32.6% Bad Speculation          
        1692356927      topdown-fe-bound                 #     14.9% Frontend Bound           
        1544151518      topdown-be-bound                 #     13.6% Backend Bound            

       0.504636742 seconds time elapsed

       0.158237000 seconds user
       0.340625000 seconds sys

# perf.old stat -- perf.new record -o new.data -m4M --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB new.data (103 samples) ]

 Performance counter stats for 'perf.new record -o new.data -m4M --namespaces -a true':

            386.61 msec task-clock                       #    0.988 CPUs utilized             
               100      context-switches                 #  258.658 /sec                      
                65      cpu-migrations                   #  168.128 /sec                      
              4935      page-faults                      #   12.765 K/sec                     
        1495905137      cycles                           #    3.869 GHz                       
        3647660473      instructions                     #    2.44  insn per cycle            
         735822370      branches                         #    1.903 G/sec                     
           5765668      branch-misses                    #    0.78% of all branches           
        7477722620      slots                            #   19.342 G/sec                     
        3415835954      topdown-retiring                 #     39.5% Retiring                 
        2748625759      topdown-bad-spec                 #     31.8% Bad Speculation          
        1221594670      topdown-fe-bound                 #     14.1% Frontend Bound           
        1256150733      topdown-be-bound                 #     14.5% Backend Bound            

       0.391472763 seconds time elapsed

       0.141207000 seconds user
       0.246277000 seconds sys

# ls -lh old.data
-rw------- 1 root root 1.2M Apr 18 13:19 old.data
# ls -lh new.data
-rw------- 1 root root 1.2M Apr 18 13:19 new.data
# 



* Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events
  2023-04-18 13:36     ` Adrian Hunter
@ 2023-04-18 15:51       ` Ian Rogers
  0 siblings, 0 replies; 17+ messages in thread
From: Ian Rogers @ 2023-04-18 15:51 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel

On Tue, Apr 18, 2023 at 6:36 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 18/04/23 09:18, Adrian Hunter wrote:
> > On 17/04/23 14:02, Peter Zijlstra wrote:
> >> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
> >>> Hi
> >>>
> >>> Here is a stab at adding an ioctl for sideband events.
> >>>
> >>> This is to overcome races when reading the same information
> >>> from /proc.
> >>
> >> What races? Are you talking about reading old state in /proc, the kernel
> >> delivering a sideband event for the new state, and then you writing the
> >> old state out?
> >>
> >> Surely that's something perf tool can fix without kernel changes?
> >
> > Yes, and it was a bit of a brain fart not to realise that.
> >
> > There may still be corner cases, where different kinds of events are
> > interdependent, perhaps NAMESPACES events vs MMAP events could
> > have ordering issues.
> >
> > Putting that aside, the ioctl may be quicker than reading from
> > /proc.  I could get some numbers and see what people think.
> >
>
> Here's a result with a quick hack to use the ioctl but without
> handling the buffer becoming full (hence the -m4M)
>
> # ps -e | wc -l
> 1171
> # perf.old stat -- perf.old record -o old.data --namespaces -a true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.095 MB old.data (100 samples) ]
>
>  Performance counter stats for 'perf.old record -o old.data --namespaces -a true':
>
>             498.15 msec task-clock                       #    0.987 CPUs utilized
>                126      context-switches                 #  252.935 /sec
>                 64      cpu-migrations                   #  128.475 /sec
>               4396      page-faults                      #    8.825 K/sec
>         1927096347      cycles                           #    3.868 GHz
>         4563059399      instructions                     #    2.37  insn per cycle
>          914232559      branches                         #    1.835 G/sec
>            6618052      branch-misses                    #    0.72% of all branches
>         9633787105      slots                            #   19.339 G/sec
>         4394300990      topdown-retiring                 #     38.8% Retiring
>         3693815286      topdown-bad-spec                 #     32.6% Bad Speculation
>         1692356927      topdown-fe-bound                 #     14.9% Frontend Bound
>         1544151518      topdown-be-bound                 #     13.6% Backend Bound
>
>        0.504636742 seconds time elapsed
>
>        0.158237000 seconds user
>        0.340625000 seconds sys
>
> # perf.old stat -- perf.new record -o new.data -m4M --namespaces -a true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.095 MB new.data (103 samples) ]
>
>  Performance counter stats for 'perf.new record -o new.data -m4M --namespaces -a true':
>
>             386.61 msec task-clock                       #    0.988 CPUs utilized
>                100      context-switches                 #  258.658 /sec
>                 65      cpu-migrations                   #  168.128 /sec
>               4935      page-faults                      #   12.765 K/sec
>         1495905137      cycles                           #    3.869 GHz
>         3647660473      instructions                     #    2.44  insn per cycle
>          735822370      branches                         #    1.903 G/sec
>            5765668      branch-misses                    #    0.78% of all branches
>         7477722620      slots                            #   19.342 G/sec
>         3415835954      topdown-retiring                 #     39.5% Retiring
>         2748625759      topdown-bad-spec                 #     31.8% Bad Speculation
>         1221594670      topdown-fe-bound                 #     14.1% Frontend Bound
>         1256150733      topdown-be-bound                 #     14.5% Backend Bound
>
>        0.391472763 seconds time elapsed
>
>        0.141207000 seconds user
>        0.246277000 seconds sys
>
> # ls -lh old.data
> -rw------- 1 root root 1.2M Apr 18 13:19 old.data
> # ls -lh new.data
> -rw------- 1 root root 1.2M Apr 18 13:19 new.data
> #

Cool, so the headline is a ~20%, or roughly 1 billion instruction,
reduction in perf startup overhead?

Thanks,
Ian
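
As a quick check of that headline, the relative reductions can be
recomputed from the counters quoted above. The dictionary keys and the
reduction() helper are just names chosen for this sketch. It is a single
run of each binary, and the new run used -m4M and captured 103 samples
against 100, so treat the result as indicative only.

```python
# Counter values copied from the two 'perf stat' outputs quoted above.
old = {
    "instructions": 4563059399,
    "cycles": 1927096347,
    "task_clock_ms": 498.15,
    "elapsed_s": 0.504636742,
    "sys_s": 0.340625,
}
new = {
    "instructions": 3647660473,
    "cycles": 1495905137,
    "task_clock_ms": 386.61,
    "elapsed_s": 0.391472763,
    "sys_s": 0.246277,
}


def reduction(before, after):
    """Percentage by which 'after' is smaller than 'before'."""
    return (before - after) / before * 100.0


for key in old:
    saved = old[key] - new[key]
    print(f"{key:>14}: saved {saved:.6g} ({reduction(old[key], new[key]):.1f}%)")
```

The instruction count drops by a little over 900 million, which is about
20%; cycles, task-clock and elapsed time each drop by a slightly larger
fraction.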


Thread overview: 17+ messages
2023-04-14  8:22 [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Adrian Hunter
2023-04-14  8:22 ` [PATCH RFC 1/5] " Adrian Hunter
2023-04-17 10:57   ` Peter Zijlstra
2023-04-18  6:29     ` Adrian Hunter
2023-04-14  8:22 ` [PATCH RFC 2/5] perf: Add fork to the sideband ioctl Adrian Hunter
2023-04-14 10:36   ` kernel test robot
2023-04-14 11:17   ` kernel test robot
2023-04-14 13:33   ` kernel test robot
2023-04-14  8:22 ` [PATCH RFC 3/5] perf: Add namespaces " Adrian Hunter
2023-04-14  8:22 ` [PATCH RFC 4/5] perf: Add comm " Adrian Hunter
2023-04-14  8:23 ` [PATCH RFC 5/5] perf: Add mmap " Adrian Hunter
2023-04-17 11:02 ` [PATCH RFC 0/5] perf: Add ioctl to emit sideband events Peter Zijlstra
2023-04-17 16:37   ` Ian Rogers
2023-04-18  7:03     ` Adrian Hunter
2023-04-18  6:18   ` Adrian Hunter
2023-04-18 13:36     ` Adrian Hunter
2023-04-18 15:51       ` Ian Rogers
