* [PATCH RFC V2 0/4] perf/core: Add ability for an event to "pause" or "resume" AUX area tracing
@ 2023-12-08 17:24 ` Adrian Hunter
  0 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-08 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

Hi

Hardware traces, such as instruction traces, can produce a vast amount of
trace data, so being able to reduce tracing to more specific circumstances
can be useful.

The ability to pause or resume tracing when another event happens can do
that.

These patches add such a facility and show how it would work for Intel
Processor Trace.

Maintainers of other AUX area tracing implementations are requested to
consider if this is something they might employ and then whether or not
the ABI would work for them.

Changes to perf tools are not fleshed out yet.


Changes in RFC V2:

      Use ->stop() / ->start() instead of ->pause_resume()
      Move aux_start_paused bit into aux_output_cfg
      Tighten up when Intel PT pause / resume is allowed
      Add an example of how it might work for CoreSight


Adrian Hunter (4):
      perf/core: Add aux_pause, aux_resume, aux_start_paused
      perf/x86/intel/pt: Add support for pause / resume
      perf tools: Add support for AUX area pause / resume
      coresight: Have a stab at support for pause / resume

 arch/x86/events/intel/pt.c                       | 63 ++++++++++++++++++++-
 arch/x86/events/intel/pt.h                       |  4 ++
 drivers/hwtracing/coresight/coresight-etm-perf.c | 29 ++++++++--
 include/linux/perf_event.h                       | 15 +++++
 include/uapi/linux/perf_event.h                  | 11 +++-
 kernel/events/core.c                             | 72 +++++++++++++++++++++++-
 kernel/events/internal.h                         |  1 +
 tools/include/uapi/linux/perf_event.h            | 11 +++-
 tools/perf/util/auxtrace.c                       |  4 ++
 tools/perf/util/evsel.c                          |  9 +++
 tools/perf/util/evsel_config.h                   |  6 ++
 tools/perf/util/parse-events.c                   | 33 +++++++++++
 tools/perf/util/parse-events.h                   |  3 +
 tools/perf/util/parse-events.l                   |  3 +
 tools/perf/util/perf_event_attr_fprintf.c        |  3 +
 15 files changed, 255 insertions(+), 12 deletions(-)


Regards
Adrian

* [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-08 17:24 ` Adrian Hunter
@ 2023-12-08 17:24   ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-08 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

Hardware traces, such as instruction traces, can produce a vast amount of
trace data, so being able to reduce tracing to more specific circumstances
can be useful.

The ability to pause or resume tracing when another event happens can do
that.

Add ability for an event to "pause" or "resume" AUX area tracing.

Add aux_pause bit to perf_event_attr to indicate that, if the event
happens, the associated AUX area tracing should be paused. Ditto
aux_resume. Do not allow aux_pause and aux_resume to be set together.

Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
event that it should start in a "paused" state.

Add aux_paused to struct perf_event for AUX area events to keep track of
the "paused" state. aux_paused is initialized to aux_start_paused.

Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
callbacks. Call as needed, during __perf_event_output(). Add
aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
handler. Pause/resume in NMI context will miss out if it coincides with
another pause/resume.
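
The pause/resume bookkeeping and the NMI guard described above can be
modelled in user space. What follows is a minimal sketch, not the kernel
code: struct sketch_event, sketch_aux_pause() and nmi_hook are hypothetical
names, counters stand in for the ->stop() / ->start() callbacks, and a
nested call from the hook stands in for an NMI arriving mid-operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct sketch_event {
	unsigned int aux_paused;   /* models perf_event::aux_paused */
	int in_pause_resume;       /* models perf_buffer::aux_in_pause_resume */
	int stops;                 /* simulated ->stop(PERF_EF_PAUSE) calls */
	int starts;                /* simulated ->start(PERF_EF_RESUME) calls */
	void (*nmi_hook)(struct sketch_event *ev); /* simulated NMI, may be NULL */
};

/* Result of the most recent simulated NMI attempt (-1: none yet). */
int nested_result = -1;

/* Returns true if the request was processed, false if it lost to the guard. */
bool sketch_aux_pause(struct sketch_event *ev, bool pause)
{
	assert(ev != NULL);

	if (ev->in_pause_resume)
		return false;      /* nested (NMI-like) caller loses */
	ev->in_pause_resume = 1;

	/* Only act on a state change, as __perf_event_aux_pause() does. */
	if (pause && !ev->aux_paused) {
		ev->aux_paused = 1;
		ev->stops++;
	} else if (!pause && ev->aux_paused) {
		ev->aux_paused = 0;
		ev->starts++;
	}

	if (ev->nmi_hook)
		ev->nmi_hook(ev);  /* an "NMI" arriving while the guard is held */

	ev->in_pause_resume = 0;
	return true;
}

/* Simulated NMI handler that tries to resume tracing. */
void nmi_tries_resume(struct sketch_event *ev)
{
	nested_result = sketch_aux_pause(ev, false);
}
```

In this model a resume attempted from the hook while a pause is in progress
returns false and leaves aux_paused untouched, which is the "miss out"
behaviour the paragraph above refers to.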

To use aux_pause or aux_resume, an event must be in a group with the AUX
area event as the group leader.
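
The attribute rules above, together with the checks in the diff below, can
be summarised as a small validation function. This is a sketch under
assumptions: struct sketch_attr, sketch_check_attr() and the SKETCH_CAP_*
names are hypothetical, and returning -EINVAL when the group leader lacks
pause support is an assumption about how the caller of
perf_get_aux_event() reports the failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values copied from the PERF_PMU_CAP_* definitions in the patch. */
#define SKETCH_CAP_AUX_OUTPUT	0x0080
#define SKETCH_CAP_AUX_PAUSE	0x0200

/* Hypothetical stand-in for the relevant perf_event_attr bits. */
struct sketch_attr {
	unsigned int aux_output       : 1;
	unsigned int aux_pause        : 1;
	unsigned int aux_resume       : 1;
	unsigned int aux_start_paused : 1;
};

/*
 * pmu_caps:    capabilities of the event's own PMU
 * leader_caps: capabilities of the group leader's PMU
 * Returns 0 if the combination is acceptable, a negative errno otherwise.
 */
int sketch_check_attr(const struct sketch_attr *attr,
		      unsigned int pmu_caps, unsigned int leader_caps)
{
	assert(attr != NULL);

	/* aux_output cannot be combined with pause / resume. */
	if (attr->aux_output &&
	    (!(pmu_caps & SKETCH_CAP_AUX_OUTPUT) ||
	     attr->aux_pause || attr->aux_resume))
		return -EOPNOTSUPP;

	/* An event either pauses or resumes, never both. */
	if (attr->aux_pause && attr->aux_resume)
		return -EINVAL;

	/* Only an AUX area PMU with pause support can start paused. */
	if (attr->aux_start_paused && !(pmu_caps & SKETCH_CAP_AUX_PAUSE))
		return -EOPNOTSUPP;

	/* Assumed errno: the group leader must support pause / resume. */
	if ((attr->aux_pause || attr->aux_resume) &&
	    !(leader_caps & SKETCH_CAP_AUX_PAUSE))
		return -EINVAL;

	return 0;
}
```

Note that the first three checks look at the event's own PMU, whereas the
last one looks at the group leader, because the pausing or resuming event
is usually not itself an AUX area event.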

Example (requires Intel PT and tools patches also):

 $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.041 MB perf.data ]
 $ perf script --call-trace
 uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
 uname    5712 [007]    83.855582518:  psb offs: 0
 uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
 uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
 uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
 uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
 uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
 uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
 uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
 uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
 uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
 uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
 uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
 uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
 uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
 uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
 uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
 uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
 uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
 uname    5712 [007]    83.855584175: 0x0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/linux/perf_event.h      | 15 +++++++
 include/uapi/linux/perf_event.h | 11 ++++-
 kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
 kernel/events/internal.h        |  1 +
 4 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e85cd1c0eaf3..252c4aac3b79 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -291,6 +291,7 @@ struct perf_event_pmu_context;
 #define PERF_PMU_CAP_NO_EXCLUDE			0x0040
 #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
 #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
+#define PERF_PMU_CAP_AUX_PAUSE			0x0200
 
 struct perf_output_handle;
 
@@ -363,6 +364,8 @@ struct pmu {
 #define PERF_EF_START	0x01		/* start the counter when adding    */
 #define PERF_EF_RELOAD	0x02		/* reload the counter when starting */
 #define PERF_EF_UPDATE	0x04		/* update the counter when stopping */
+#define PERF_EF_PAUSE	0x08		/* AUX area event, pause tracing */
+#define PERF_EF_RESUME	0x10		/* AUX area event, resume tracing */
 
 	/*
 	 * Adds/Removes a counter to/from the PMU, can be done inside a
@@ -402,6 +405,15 @@ struct pmu {
 	 *
 	 * ->start() with PERF_EF_RELOAD will reprogram the counter
 	 *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
+	 *
+	 * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
+	 * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
+	 * PERF_EF_RESUME.
+	 *
+	 * ->start() with PERF_EF_RESUME will start as simply as possible but
+	 * only if the counter is not otherwise stopped. Will not overlap
+	 * another ->start() with PERF_EF_RESUME nor ->stop() with
+	 * PERF_EF_PAUSE.
 	 */
 	void (*start)			(struct perf_event *event, int flags);
 	void (*stop)			(struct perf_event *event, int flags);
@@ -797,6 +809,9 @@ struct perf_event {
 	/* for aux_output events */
 	struct perf_event		*aux_event;
 
+	/* for AUX area events */
+	unsigned int			aux_paused;
+
 	void (*destroy)(struct perf_event *);
 	struct rcu_head			rcu_head;
 
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 39c6a250dd1b..437bc2a8d50c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -507,7 +507,16 @@ struct perf_event_attr {
 	__u16	sample_max_stack;
 	__u16	__reserved_2;
 	__u32	aux_sample_size;
-	__u32	__reserved_3;
+
+	union {
+		__u32	aux_output_cfg;
+		struct {
+			__u32	aux_pause        :  1, /* on overflow, pause AUX area tracing */
+				aux_resume       :  1, /* on overflow, resume AUX area tracing */
+				aux_start_paused :  1, /* start AUX area tracing paused */
+				__reserved_3     : 29;
+		};
+	};
 
 	/*
 	 * User provided data if sigtrap=1, passed back to user via
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4c72a41f11af..c1e11884d06e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2060,7 +2060,8 @@ static void perf_put_aux_event(struct perf_event *event)
 
 static bool perf_need_aux_event(struct perf_event *event)
 {
-	return !!event->attr.aux_output || !!event->attr.aux_sample_size;
+	return event->attr.aux_output || event->attr.aux_sample_size ||
+	       event->attr.aux_pause || event->attr.aux_resume;
 }
 
 static int perf_get_aux_event(struct perf_event *event,
@@ -2085,6 +2086,10 @@ static int perf_get_aux_event(struct perf_event *event,
 	    !perf_aux_output_match(event, group_leader))
 		return 0;
 
+	if ((event->attr.aux_pause || event->attr.aux_resume) &&
+	    !(group_leader->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE))
+		return 0;
+
 	if (event->attr.aux_sample_size && !group_leader->pmu->snapshot_aux)
 		return 0;
 
@@ -7773,6 +7778,47 @@ void perf_prepare_header(struct perf_event_header *header,
 	WARN_ON_ONCE(header->size & 7);
 }
 
+static void __perf_event_aux_pause(struct perf_event *event, bool pause)
+{
+	if (pause) {
+		if (!READ_ONCE(event->aux_paused)) {
+			WRITE_ONCE(event->aux_paused, 1);
+			event->pmu->stop(event, PERF_EF_PAUSE);
+		}
+	} else {
+		if (READ_ONCE(event->aux_paused)) {
+			WRITE_ONCE(event->aux_paused, 0);
+			event->pmu->start(event, PERF_EF_RESUME);
+		}
+	}
+}
+
+static void perf_event_aux_pause(struct perf_event *event, bool pause)
+{
+	struct perf_buffer *rb;
+	unsigned long flags;
+
+	if (WARN_ON_ONCE(!event))
+		return;
+
+	rb = ring_buffer_get(event);
+	if (!rb)
+		return;
+
+	local_irq_save(flags);
+	/* Guard against NMI, NMI loses here */
+	if (READ_ONCE(rb->aux_in_pause_resume))
+		goto out_restore;
+	WRITE_ONCE(rb->aux_in_pause_resume, 1);
+	barrier();
+	__perf_event_aux_pause(event, pause);
+	barrier();
+	WRITE_ONCE(rb->aux_in_pause_resume, 0);
+out_restore:
+	local_irq_restore(flags);
+	ring_buffer_put(rb);
+}
+
 static __always_inline int
 __perf_event_output(struct perf_event *event,
 		    struct perf_sample_data *data,
@@ -7786,6 +7832,9 @@ __perf_event_output(struct perf_event *event,
 	struct perf_event_header header;
 	int err;
 
+	if (event->attr.aux_pause)
+		perf_event_aux_pause(event->aux_event, true);
+
 	/* protect the callchain buffers */
 	rcu_read_lock();
 
@@ -7802,6 +7851,10 @@ __perf_event_output(struct perf_event *event,
 
 exit:
 	rcu_read_unlock();
+
+	if (event->attr.aux_resume)
+		perf_event_aux_pause(event->aux_event, false);
+
 	return err;
 }
 
@@ -11941,10 +11994,23 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	}
 
 	if (event->attr.aux_output &&
-	    !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
+	    (!(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT) ||
+	     event->attr.aux_pause || event->attr.aux_resume)) {
+		err = -EOPNOTSUPP;
+		goto err_pmu;
+	}
+
+	if (event->attr.aux_pause && event->attr.aux_resume) {
+		err = -EINVAL;
+		goto err_pmu;
+	}
+
+	if (event->attr.aux_start_paused &&
+	    !(pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)) {
 		err = -EOPNOTSUPP;
 		goto err_pmu;
 	}
+	event->aux_paused = event->attr.aux_start_paused;
 
 	if (cgroup_fd != -1) {
 		err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
@@ -12741,7 +12807,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 	 * Grouping is not supported for kernel events, neither is 'AUX',
 	 * make sure the caller's intentions are adjusted.
 	 */
-	if (attr->aux_output)
+	if (attr->aux_output || attr->aux_output_cfg)
 		return ERR_PTR(-EINVAL);
 
 	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 5150d5f84c03..3320f78117dc 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -51,6 +51,7 @@ struct perf_buffer {
 	void				(*free_aux)(void *);
 	refcount_t			aux_refcount;
 	int				aux_in_sampling;
+	int				aux_in_pause_resume;
 	void				**aux_pages;
 	void				*aux_priv;
 
-- 
2.34.1


* [PATCH RFC V2 2/4] perf/x86/intel/pt: Add support for pause / resume
  2023-12-08 17:24 ` Adrian Hunter
@ 2023-12-08 17:24   ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-08 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

Prevent tracing from starting if aux_paused.

Implement support for PERF_EF_PAUSE / PERF_EF_RESUME. When aux_paused, stop
tracing. When not aux_paused, only start tracing if it isn't currently
meant to be stopped.
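
The gating described above can be illustrated with a small user-space
model. It is only a sketch: struct sketch_pt and the sketch_pt_*()
functions are hypothetical, a boolean stands in for the TraceEn bit, the
hardware status checks are left out, and the pause_allowed test on the
pause path is inferred from the comment in pt_config() rather than taken
from code shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU PT state; not the kernel structure. */
struct sketch_pt {
	bool aux_paused;      /* models event->aux_paused */
	bool trace_en;        /* stands in for the TraceEn bit */
	bool resume_allowed;  /* models pt->resume_allowed */
	bool pause_allowed;   /* models pt->pause_allowed */
};

/* Start tracing unless the event is currently paused. */
void sketch_pt_config_start(struct sketch_pt *pt)
{
	assert(pt != NULL);
	if (pt->aux_paused)
		return;
	pt->trace_en = true;
}

/* Normal start: allow resume before starting, allow pause afterwards. */
void sketch_pt_config(struct sketch_pt *pt)
{
	pt->resume_allowed = true;
	sketch_pt_config_start(pt);
	pt->pause_allowed = true;
}

/* Interrupt path that could not continue: tracing stays off, no resume. */
void sketch_pt_interrupt_failed(struct sketch_pt *pt)
{
	pt->trace_en = false;
	pt->resume_allowed = false;
}

/* Core clears aux_paused, then the driver resumes only if allowed. */
void sketch_pt_resume(struct sketch_pt *pt)
{
	pt->aux_paused = false;
	if (pt->resume_allowed)
		sketch_pt_config_start(pt);
}

/* Core sets aux_paused, then the driver stops only if allowed (assumed). */
void sketch_pt_pause(struct sketch_pt *pt)
{
	pt->aux_paused = true;
	if (pt->pause_allowed)
		pt->trace_en = false;
}
```

In this model an event that starts paused leaves tracing off after
sketch_pt_config(), a later resume turns it on, and a resume after a failed
interrupt path does nothing because tracing is meant to be stopped.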

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/events/intel/pt.c | 63 ++++++++++++++++++++++++++++++++++++--
 arch/x86/events/intel/pt.h |  4 +++
 2 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 42a55794004a..692b51849d1c 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -418,6 +418,9 @@ static void pt_config_start(struct perf_event *event)
 	struct pt *pt = this_cpu_ptr(&pt_ctx);
 	u64 ctl = event->hw.config;
 
+	if (READ_ONCE(event->aux_paused))
+		return;
+
 	ctl |= RTIT_CTL_TRACEEN;
 	if (READ_ONCE(pt->vmx_on))
 		perf_aux_output_flag(&pt->handle, PERF_AUX_FLAG_PARTIAL);
@@ -534,7 +537,20 @@ static void pt_config(struct perf_event *event)
 	reg |= (event->attr.config & PT_CONFIG_MASK);
 
 	event->hw.config = reg;
+
+	/*
+	 * Allow resume before starting so as not to overwrite a value set by a
+	 * PMI.
+	 */
+	WRITE_ONCE(pt->resume_allowed, 1);
+
 	pt_config_start(event);
+
+	/*
+	 * Allow pause after starting so its pt_config_stop() doesn't race with
+	 * pt_config_start().
+	 */
+	WRITE_ONCE(pt->pause_allowed, 1);
 }
 
 static void pt_config_stop(struct perf_event *event)
@@ -1507,6 +1523,7 @@ void intel_pt_interrupt(void)
 		buf = perf_aux_output_begin(&pt->handle, event);
 		if (!buf) {
 			event->hw.state = PERF_HES_STOPPED;
+			pt->resume_allowed = 0;
 			return;
 		}
 
@@ -1515,6 +1532,7 @@ void intel_pt_interrupt(void)
 		ret = pt_buffer_reset_markers(buf, &pt->handle);
 		if (ret) {
 			perf_aux_output_end(&pt->handle, 0);
+			pt->resume_allowed = 0;
 			return;
 		}
 
@@ -1569,6 +1587,26 @@ static void pt_event_start(struct perf_event *event, int mode)
 	struct pt *pt = this_cpu_ptr(&pt_ctx);
 	struct pt_buffer *buf;
 
+	if (mode & PERF_EF_RESUME) {
+		if (READ_ONCE(pt->resume_allowed)) {
+			u64 status;
+
+			/*
+			 * Only if the trace is not active and the error and
+			 * stopped bits are clear, is it safe to start, but a
+			 * PMI might have just cleared these, so resume_allowed
+			 * must be checked again also.
+			 */
+			rdmsrl(MSR_IA32_RTIT_STATUS, status);
+			if (!(status & (RTIT_STATUS_TRIGGEREN |
+					RTIT_STATUS_ERROR |
+					RTIT_STATUS_STOPPED)) &&
+			   READ_ONCE(pt->resume_allowed))
+				pt_config_start(event);
+		}
+		return;
+	}
+
 	buf = perf_aux_output_begin(&pt->handle, event);
 	if (!buf)
 		goto fail_stop;
@@ -1597,6 +1635,16 @@ static void pt_event_stop(struct perf_event *event, int mode)
 {
 	struct pt *pt = this_cpu_ptr(&pt_ctx);
 
+	if (mode & PERF_EF_PAUSE) {
+		if (READ_ONCE(pt->pause_allowed))
+			pt_config_stop(event);
+		return;
+	}
+
+	/* Protect against racing */
+	WRITE_ONCE(pt->pause_allowed, 0);
+	WRITE_ONCE(pt->resume_allowed, 0);
+
 	/*
 	 * Protect against the PMI racing with disabling wrmsr,
 	 * see comment in intel_pt_interrupt().
@@ -1655,8 +1703,12 @@ static long pt_event_snapshot_aux(struct perf_event *event,
 	/*
 	 * Here, handle_nmi tells us if the tracing is on
 	 */
-	if (READ_ONCE(pt->handle_nmi))
+	if (READ_ONCE(pt->handle_nmi)) {
+		/* Protect against racing */
+		WRITE_ONCE(pt->pause_allowed, 0);
+		WRITE_ONCE(pt->resume_allowed, 0);
 		pt_config_stop(event);
+	}
 
 	pt_read_offset(buf);
 	pt_update_head(pt);
@@ -1673,8 +1725,11 @@ static long pt_event_snapshot_aux(struct perf_event *event,
 	 * Compiler barrier not needed as we couldn't have been
 	 * preempted by anything that touches pt->handle_nmi.
 	 */
-	if (pt->handle_nmi)
+	if (pt->handle_nmi) {
+		WRITE_ONCE(pt->resume_allowed, 1);
 		pt_config_start(event);
+		WRITE_ONCE(pt->pause_allowed, 1);
+	}
 
 	return ret;
 }
@@ -1790,7 +1845,9 @@ static __init int pt_init(void)
 	if (!intel_pt_validate_hw_cap(PT_CAP_topa_multiple_entries))
 		pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG;
 
-	pt_pmu.pmu.capabilities	|= PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE;
+	pt_pmu.pmu.capabilities		|= PERF_PMU_CAP_EXCLUSIVE |
+					   PERF_PMU_CAP_ITRACE |
+					   PERF_PMU_CAP_AUX_PAUSE;
 	pt_pmu.pmu.attr_groups		 = pt_attr_groups;
 	pt_pmu.pmu.task_ctx_nr		 = perf_sw_context;
 	pt_pmu.pmu.event_init		 = pt_event_init;
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 96906a62aacd..b9527205e028 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -117,6 +117,8 @@ struct pt_filters {
  * @filters:		last configured filters
  * @handle_nmi:		do handle PT PMI on this cpu, there's an active event
  * @vmx_on:		1 if VMX is ON on this cpu
+ * @pause_allowed:	PERF_EF_PAUSE is allowed to stop tracing
+ * @resume_allowed:	PERF_EF_RESUME is allowed to start tracing
  * @output_base:	cached RTIT_OUTPUT_BASE MSR value
  * @output_mask:	cached RTIT_OUTPUT_MASK MSR value
  */
@@ -125,6 +127,8 @@ struct pt {
 	struct pt_filters	filters;
 	int			handle_nmi;
 	int			vmx_on;
+	int			pause_allowed;
+	int			resume_allowed;
 	u64			output_base;
 	u64			output_mask;
 };
-- 
2.34.1


* [PATCH RFC V2 3/4] perf tools: Add support for AUX area pause / resume
  2023-12-08 17:24 ` Adrian Hunter
@ 2023-12-08 17:24   ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-08 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

Add config terms aux-pause, aux-resume and aux-start-paused.

Still to do: validation, fallbacks for perf_event_open, documentation.
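
As a rough illustration of what the three config terms end up setting, the sketch below models the new attribute word in isolation. It is not the real header: union aux_cfg_model and both helpers are hypothetical, and it declares the bit-fields with a 32-bit base type (the patch uses __u64) so that the union stays four bytes on common ABIs. It also shows why a single test of aux_output_cfg is enough to detect that any of the bits is set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the new perf_event_attr member; not the real header. */
union aux_cfg_model {
	uint32_t aux_output_cfg;
	struct {
		uint32_t aux_pause        :  1, /* on overflow, pause AUX area tracing */
			 aux_resume       :  1, /* on overflow, resume AUX area tracing */
			 aux_start_paused :  1, /* start AUX area tracing paused */
			 reserved         : 29;
	};
};

/* Build the word that a trigger event configured with "aux-pause" would carry. */
static union aux_cfg_model make_pause_trigger(void)
{
	union aux_cfg_model cfg;

	memset(&cfg, 0, sizeof(cfg));
	cfg.aux_pause = 1;
	return cfg;
}

/* Same style of check as the kernel-counter path: any bit set means "reject". */
static int kernel_counter_allowed(const union aux_cfg_model *cfg)
{
	return cfg->aux_output_cfg == 0;
}
```

Testing the whole word rather than each bit keeps the check valid if more bits are later carved out of the reserved field.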

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/include/uapi/linux/perf_event.h     | 11 +++++++-
 tools/perf/util/auxtrace.c                |  4 +++
 tools/perf/util/evsel.c                   |  9 +++++++
 tools/perf/util/evsel_config.h            |  6 +++++
 tools/perf/util/parse-events.c            | 33 +++++++++++++++++++++++
 tools/perf/util/parse-events.h            |  3 +++
 tools/perf/util/parse-events.l            |  3 +++
 tools/perf/util/perf_event_attr_fprintf.c |  3 +++
 8 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 3a64499b0f5d..9db32bc10d5b 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -511,7 +511,16 @@ struct perf_event_attr {
 	__u16	sample_max_stack;
 	__u16	__reserved_2;
 	__u32	aux_sample_size;
-	__u32	__reserved_3;
+
+	union {
+		__u32	aux_output_cfg;
+		struct {
+			__u64	aux_pause        :  1, /* on overflow, pause AUX area tracing */
+				aux_resume       :  1, /* on overflow, resume AUX area tracing */
+				aux_start_paused :  1, /* start AUX area tracing paused */
+				__reserved_3     : 29;
+		};
+	};
 
 	/*
 	 * User provided data if sigtrap=1, passed back to user via
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a0368202a746..4a7ca8b0d100 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -814,6 +814,10 @@ void auxtrace_regroup_aux_output(struct evlist *evlist)
 		if (evsel__is_aux_event(evsel))
 			aux_evsel = evsel;
 		term = evsel__get_config_term(evsel, AUX_OUTPUT);
+		if (!term)
+			term = evsel__get_config_term(evsel, AUX_PAUSE);
+		if (!term)
+			term = evsel__get_config_term(evsel, AUX_RESUME);
 		/* If possible, group with the AUX event */
 		if (term && aux_evsel)
 			evlist__regroup(evlist, aux_evsel, evsel);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a5da74e3a517..03553c104954 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1001,6 +1001,15 @@ static void evsel__apply_config_terms(struct evsel *evsel,
 		case EVSEL__CONFIG_TERM_AUX_OUTPUT:
 			attr->aux_output = term->val.aux_output ? 1 : 0;
 			break;
+		case EVSEL__CONFIG_TERM_AUX_PAUSE:
+			attr->aux_pause = term->val.aux_pause ? 1 : 0;
+			break;
+		case EVSEL__CONFIG_TERM_AUX_RESUME:
+			attr->aux_resume = term->val.aux_resume ? 1 : 0;
+			break;
+		case EVSEL__CONFIG_TERM_AUX_START_PAUSED:
+			attr->aux_start_paused = term->val.aux_start_paused ? 1 : 0;
+			break;
 		case EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE:
 			/* Already applied by auxtrace */
 			break;
diff --git a/tools/perf/util/evsel_config.h b/tools/perf/util/evsel_config.h
index aee6f808b512..85ad183b5637 100644
--- a/tools/perf/util/evsel_config.h
+++ b/tools/perf/util/evsel_config.h
@@ -25,6 +25,9 @@ enum evsel_term_type {
 	EVSEL__CONFIG_TERM_BRANCH,
 	EVSEL__CONFIG_TERM_PERCORE,
 	EVSEL__CONFIG_TERM_AUX_OUTPUT,
+	EVSEL__CONFIG_TERM_AUX_PAUSE,
+	EVSEL__CONFIG_TERM_AUX_RESUME,
+	EVSEL__CONFIG_TERM_AUX_START_PAUSED,
 	EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE,
 	EVSEL__CONFIG_TERM_CFG_CHG,
 };
@@ -44,6 +47,9 @@ struct evsel_config_term {
 		unsigned long max_events;
 		bool	      percore;
 		bool	      aux_output;
+		bool	      aux_pause;
+		bool	      aux_resume;
+		bool	      aux_start_paused;
 		u32	      aux_sample_size;
 		u64	      cfg_chg;
 		char	      *str;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index aa2f5c6fc7fc..615b04d5fb30 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -768,6 +768,9 @@ static const char *config_term_name(enum parse_events__term_type term_type)
 		[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
 		[PARSE_EVENTS__TERM_TYPE_PERCORE]		= "percore",
 		[PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT]		= "aux-output",
+		[PARSE_EVENTS__TERM_TYPE_AUX_PAUSE]		= "aux-pause",
+		[PARSE_EVENTS__TERM_TYPE_AUX_RESUME]		= "aux-resume",
+		[PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED]	= "aux-start-paused",
 		[PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE]	= "aux-sample-size",
 		[PARSE_EVENTS__TERM_TYPE_METRIC_ID]		= "metric-id",
 		[PARSE_EVENTS__TERM_TYPE_RAW]                   = "raw",
@@ -817,6 +820,9 @@ config_term_avail(enum parse_events__term_type term_type, struct parse_events_er
 	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
 	case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 	case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
+	case PARSE_EVENTS__TERM_TYPE_AUX_PAUSE:
+	case PARSE_EVENTS__TERM_TYPE_AUX_RESUME:
+	case PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED:
 	case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 	case PARSE_EVENTS__TERM_TYPE_RAW:
 	case PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE:
@@ -936,6 +942,15 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_AUX_PAUSE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_AUX_RESUME:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 		CHECK_TYPE_VAL(NUM);
 		if (term->val.num > UINT_MAX) {
@@ -1036,6 +1051,9 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
 	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 	case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
+	case PARSE_EVENTS__TERM_TYPE_AUX_PAUSE:
+	case PARSE_EVENTS__TERM_TYPE_AUX_RESUME:
+	case PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED:
 	case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 		return config_term_common(attr, term, err);
 	case PARSE_EVENTS__TERM_TYPE_USER:
@@ -1170,6 +1188,18 @@ do {								\
 			ADD_CONFIG_TERM_VAL(AUX_OUTPUT, aux_output,
 					    term->val.num ? 1 : 0, term->weak);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_AUX_PAUSE:
+			ADD_CONFIG_TERM_VAL(AUX_PAUSE, aux_pause,
+					    term->val.num ? 1 : 0, term->weak);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_AUX_RESUME:
+			ADD_CONFIG_TERM_VAL(AUX_RESUME, aux_resume,
+					    term->val.num ? 1 : 0, term->weak);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED:
+			ADD_CONFIG_TERM_VAL(AUX_START_PAUSED, aux_start_paused,
+					    term->val.num ? 1 : 0, term->weak);
+			break;
 		case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 			ADD_CONFIG_TERM_VAL(AUX_SAMPLE_SIZE, aux_sample_size,
 					    term->val.num, term->weak);
@@ -1232,6 +1262,9 @@ static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head
 		case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 		case PARSE_EVENTS__TERM_TYPE_PERCORE:
 		case PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT:
+		case PARSE_EVENTS__TERM_TYPE_AUX_PAUSE:
+		case PARSE_EVENTS__TERM_TYPE_AUX_RESUME:
+		case PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED:
 		case PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE:
 		case PARSE_EVENTS__TERM_TYPE_METRIC_ID:
 		case PARSE_EVENTS__TERM_TYPE_RAW:
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 63c0a36a4bf1..ff0871385b50 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -74,6 +74,9 @@ enum parse_events__term_type {
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
 	PARSE_EVENTS__TERM_TYPE_PERCORE,
 	PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT,
+	PARSE_EVENTS__TERM_TYPE_AUX_PAUSE,
+	PARSE_EVENTS__TERM_TYPE_AUX_RESUME,
+	PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED,
 	PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE,
 	PARSE_EVENTS__TERM_TYPE_METRIC_ID,
 	PARSE_EVENTS__TERM_TYPE_RAW,
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index e86c45675e1d..56963013c3af 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -244,6 +244,9 @@ overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 percore			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
 aux-output		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
+aux-pause		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_PAUSE); }
+aux-resume		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_RESUME); }
+aux-start-paused	{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_START_PAUSED); }
 aux-sample-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
 metric-id		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_METRIC_ID); }
 cpu-cycles|cycles				{ return hw_term(yyscanner, PERF_COUNT_HW_CPU_CYCLES); }
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 8f04d3b7f3ec..e6ba0ac73182 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -309,6 +309,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(inherit_thread, p_unsigned);
 	PRINT_ATTRf(remove_on_exec, p_unsigned);
 	PRINT_ATTRf(sigtrap, p_unsigned);
+	PRINT_ATTRf(aux_start_paused, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned, false);
 	PRINT_ATTRf(bp_type, p_unsigned);
@@ -323,6 +324,8 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(sample_max_stack, p_unsigned);
 	PRINT_ATTRf(aux_sample_size, p_unsigned);
 	PRINT_ATTRf(sig_data, p_unsigned);
+	PRINT_ATTRf(aux_pause, p_unsigned);
+	PRINT_ATTRf(aux_resume, p_unsigned);
 
 	return ret;
 }
-- 
2.34.1


* [PATCH RFC V2 4/4] coresight: Have a stab at support for pause / resume
  2023-12-08 17:24 ` Adrian Hunter
@ 2023-12-08 17:24   ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-08 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

For discussion only, un-tested, not even compiled...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../hwtracing/coresight/coresight-etm-perf.c  | 29 ++++++++++++++++---
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 5ca6278baff4..36e774405c51 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -45,6 +45,7 @@ static bool etm_perf_up;
 struct etm_ctxt {
 	struct perf_output_handle handle;
 	struct etm_event_data *event_data;
+	int pr_allowed;
 };
 
 static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt);
@@ -452,6 +453,13 @@ static void etm_event_start(struct perf_event *event, int flags)
 	struct list_head *path;
 	u64 hw_id;
 
+	if (mode & PERF_EF_RESUME) {
+		if (!READ_ONCE(ctxt->pr_allowed))
+			return;
+	} else if (READ_ONCE(event->aux_paused)) {
+		goto out_pr_allowed;
+	}
+
 	if (!csdev)
 		goto fail;
 
@@ -514,6 +522,8 @@ static void etm_event_start(struct perf_event *event, int flags)
 	event->hw.state = 0;
 	/* Save the event_data for this ETM */
 	ctxt->event_data = event_data;
+out_pr_allowed:
+	WRITE_ONCE(ctxt->pr_allowed, 1);
 	return;
 
 fail_disable_path:
@@ -530,6 +540,7 @@ static void etm_event_start(struct perf_event *event, int flags)
 	}
 fail:
 	event->hw.state = PERF_HES_STOPPED;
+	WRITE_ONCE(ctxt->pr_allowed, 0);
 	return;
 }
 
@@ -543,6 +554,11 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	struct etm_event_data *event_data;
 	struct list_head *path;
 
+	if (mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed))
+		return;
+
+	WRITE_ONCE(ctxt->pr_allowed, 0);
+
 	/*
 	 * If we still have access to the event_data via handle,
 	 * confirm that we haven't messed up the tracking.
@@ -556,7 +572,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	ctxt->event_data = NULL;
 
 	if (event->hw.state == PERF_HES_STOPPED)
-		return;
+		goto out_pr_allowed;
 
 	/* We must have a valid event_data for a running event */
 	if (WARN_ON(!event_data))
@@ -627,6 +643,10 @@ static void etm_event_stop(struct perf_event *event, int mode)
 
 	/* Disabling the path make its elements available to other sessions */
 	coresight_disable_path(path);
+
+out_pr_allowed:
+	if (mode & PERF_EF_PAUSE)
+		WRITE_ONCE(ctxt->pr_allowed, 1);
 }
 
 static int etm_event_add(struct perf_event *event, int mode)
@@ -634,7 +654,7 @@ static int etm_event_add(struct perf_event *event, int mode)
 	int ret = 0;
 	struct hw_perf_event *hwc = &event->hw;
 
-	if (mode & PERF_EF_START) {
+	if (mode & PERF_EF_START && !READ_ONCE(event->aux_paused)) {
 		etm_event_start(event, 0);
 		if (hwc->state & PERF_HES_STOPPED)
 			ret = -EINVAL;
@@ -886,8 +906,9 @@ int __init etm_perf_init(void)
 {
 	int ret;
 
-	etm_pmu.capabilities		= (PERF_PMU_CAP_EXCLUSIVE |
-					   PERF_PMU_CAP_ITRACE);
+	etm_pmu.capabilities		= PERF_PMU_CAP_EXCLUSIVE |
+					  PERF_PMU_CAP_ITRACE |
+					  PERF_PMU_CAP_AUX_PAUSE;
 
 	etm_pmu.attr_groups		= etm_pmu_attr_groups;
 	etm_pmu.task_ctx_nr		= perf_sw_context;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 4/4] coresight: Have a stab at support for pause / resume
  2023-12-08 17:24   ` Adrian Hunter
  (?)
@ 2023-12-09 17:52   ` kernel test robot
  -1 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-12-09 17:52 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: oe-kbuild-all

Hi Adrian,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:

[auto build test ERROR on perf-tools-next/perf-tools-next]
[also build test ERROR on tip/perf/core perf-tools/perf-tools linus/master v6.7-rc4 next-20231208]
[cannot apply to acme/perf/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Hunter/perf-core-Add-aux_pause-aux_resume-aux_start_paused/20231209-013450
base:   https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git perf-tools-next
patch link:    https://lore.kernel.org/r/20231208172449.35444-5-adrian.hunter%40intel.com
patch subject: [PATCH RFC V2 4/4] coresight: Have a stab at support for pause / resume
config: arm-randconfig-002-20231209 (https://download.01.org/0day-ci/archive/20231210/202312100131.QphwFtTa-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231210/202312100131.QphwFtTa-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312100131.QphwFtTa-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/hwtracing/coresight/coresight-etm-perf.c: In function 'etm_event_start':
>> drivers/hwtracing/coresight/coresight-etm-perf.c:456:13: error: 'mode' undeclared (first use in this function); did you mean 'node'?
     456 |         if (mode & PERF_EF_RESUME) {
         |             ^~~~
         |             node
   drivers/hwtracing/coresight/coresight-etm-perf.c:456:13: note: each undeclared identifier is reported only once for each function it appears in


vim +456 drivers/hwtracing/coresight/coresight-etm-perf.c

   445	
   446	static void etm_event_start(struct perf_event *event, int flags)
   447	{
   448		int cpu = smp_processor_id();
   449		struct etm_event_data *event_data;
   450		struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt);
   451		struct perf_output_handle *handle = &ctxt->handle;
   452		struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);
   453		struct list_head *path;
   454		u64 hw_id;
   455	
 > 456		if (mode & PERF_EF_RESUME) {
   457			if (!READ_ONCE(ctxt->pr_allowed))
   458				return;
   459		} else if (READ_ONCE(event->aux_paused)) {
   460			goto out_pr_allowed;
   461		}
   462	
   463		if (!csdev)
   464			goto fail;
   465	
   466		/* Have we messed up our tracking ? */
   467		if (WARN_ON(ctxt->event_data))
   468			goto fail;
   469	
   470		/*
   471		 * Deal with the ring buffer API and get a handle on the
   472		 * session's information.
   473		 */
   474		event_data = perf_aux_output_begin(handle, event);
   475		if (!event_data)
   476			goto fail;
   477	
   478		/*
   479		 * Check if this ETM is allowed to trace, as decided
   480		 * at etm_setup_aux(). This could be due to an unreachable
   481		 * sink from this ETM. We can't do much in this case if
   482		 * the sink was specified or hinted to the driver. For
   483		 * now, simply don't record anything on this ETM.
   484		 *
   485		 * As such we pretend that everything is fine, and let
   486		 * it continue without actually tracing. The event could
   487		 * continue tracing when it moves to a CPU where it is
   488		 * reachable to a sink.
   489		 */
   490		if (!cpumask_test_cpu(cpu, &event_data->mask))
   491			goto out;
   492	
   493		path = etm_event_cpu_path(event_data, cpu);
   494		/* We need a sink, no need to continue without one */
   495		sink = coresight_get_sink(path);
   496		if (WARN_ON_ONCE(!sink))
   497			goto fail_end_stop;
   498	
   499		/* Nothing will happen without a path */
   500		if (coresight_enable_path(path, CS_MODE_PERF, handle))
   501			goto fail_end_stop;
   502	
   503		/* Finally enable the tracer */
   504		if (coresight_enable_source(csdev, CS_MODE_PERF, event))
   505			goto fail_disable_path;
   506	
   507		/*
   508		 * output cpu / trace ID in perf record, once for the lifetime
   509		 * of the event.
   510		 */
   511		if (!cpumask_test_cpu(cpu, &event_data->aux_hwid_done)) {
   512			cpumask_set_cpu(cpu, &event_data->aux_hwid_done);
   513			hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
   514					   CS_AUX_HW_ID_CURR_VERSION);
   515			hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
   516					    coresight_trace_id_read_cpu_id(cpu));
   517			perf_report_aux_output_id(event, hw_id);
   518		}
   519	
   520	out:
   521		/* Tell the perf core the event is alive */
   522		event->hw.state = 0;
   523		/* Save the event_data for this ETM */
   524		ctxt->event_data = event_data;
   525	out_pr_allowed:
   526		WRITE_ONCE(ctxt->pr_allowed, 1);
   527		return;
   528	
   529	fail_disable_path:
   530		coresight_disable_path(path);
   531	fail_end_stop:
   532		/*
   533		 * Check if the handle is still associated with the event,
   534		 * to handle cases where if the sink failed to start the
   535		 * trace and TRUNCATED the handle already.
   536		 */
   537		if (READ_ONCE(handle->event)) {
   538			perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
   539			perf_aux_output_end(handle, 0);
   540		}
   541	fail:
   542		event->hw.state = PERF_HES_STOPPED;
   543		WRITE_ONCE(ctxt->pr_allowed, 0);
   544		return;
   545	}
   546	
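
For readers following the report, the pause / resume gating that this hunk
adds can be modelled in user space. What follows is only a toy sketch under
stated assumptions, not kernel code: sketch_ctxt, sketch_start(),
sketch_stop() and the SKETCH_EF_* values are hypothetical stand-ins for
etm_ctxt, etm_event_start(), etm_event_stop() and the PERF_EF_* flags, and a
plain "tracing" boolean replaces the real path and source enabling. It tests
'flags' in the start routine, because that is the parameter name declared at
line 446, whereas line 456 uses the undeclared 'mode'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only; not the kernel's PERF_EF_*. */
#define SKETCH_EF_PAUSE		0x10
#define SKETCH_EF_RESUME	0x20

/* Toy stand-in for the per-CPU context plus the bits of event state we need. */
struct sketch_ctxt {
	int pr_allowed;		/* pause / resume currently permitted */
	bool aux_paused;	/* paused state, owned by the caller here */
	bool tracing;		/* stands in for "hardware is enabled" */
};

/* Models the gate at the top of the start routine; the parameter is 'flags'. */
static void sketch_start(struct sketch_ctxt *c, int flags)
{
	assert(c != NULL);

	if (flags & SKETCH_EF_RESUME) {
		/* A resume is ignored unless a start or pause allowed it */
		if (!c->pr_allowed)
			return;
	} else if (c->aux_paused) {
		/* Ordinary start while paused: do not trace, but allow resume */
		goto out_pr_allowed;
	}

	c->tracing = true;
out_pr_allowed:
	c->pr_allowed = 1;
}

/* Models the gate in the stop routine; here the parameter really is 'mode'. */
static void sketch_stop(struct sketch_ctxt *c, int mode)
{
	assert(c != NULL);

	if ((mode & SKETCH_EF_PAUSE) && !c->pr_allowed)
		return;

	/* Block nested pause / resume while the stop is in progress */
	c->pr_allowed = 0;
	c->tracing = false;

	/* Only a pause leaves the door open for a later resume */
	if (mode & SKETCH_EF_PAUSE)
		c->pr_allowed = 1;
}
```

In this model an ordinary stop (mode without the pause bit) leaves pr_allowed
at 0, so a later resume is silently dropped until the next ordinary start.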

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH RFC V3 4/4] coresight: Have a stab at support for pause / resume
  2023-12-08 17:24   ` Adrian Hunter
@ 2023-12-15  6:42     ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-15  6:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	James Clark, coresight, linux-arm-kernel, Yicong Yang,
	Jonathan Cameron, Will Deacon, Arnaldo Carvalho de Melo,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

For discussion only, un-tested...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---


Changes in V3:

	'mode' -> 'flags' so it at least compiles


 .../hwtracing/coresight/coresight-etm-perf.c  | 29 ++++++++++++++++---
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 5ca6278baff4..7a69e6417ed4 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -45,6 +45,7 @@ static bool etm_perf_up;
 struct etm_ctxt {
 	struct perf_output_handle handle;
 	struct etm_event_data *event_data;
+	int pr_allowed;
 };
 
 static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt);
@@ -452,6 +453,13 @@ static void etm_event_start(struct perf_event *event, int flags)
 	struct list_head *path;
 	u64 hw_id;
 
+	if (flags & PERF_EF_RESUME) {
+		if (!READ_ONCE(ctxt->pr_allowed))
+			return;
+	} else if (READ_ONCE(event->aux_paused)) {
+		goto out_pr_allowed;
+	}
+
 	if (!csdev)
 		goto fail;
 
@@ -514,6 +522,8 @@ static void etm_event_start(struct perf_event *event, int flags)
 	event->hw.state = 0;
 	/* Save the event_data for this ETM */
 	ctxt->event_data = event_data;
+out_pr_allowed:
+	WRITE_ONCE(ctxt->pr_allowed, 1);
 	return;
 
 fail_disable_path:
@@ -530,6 +540,7 @@ static void etm_event_start(struct perf_event *event, int flags)
 	}
 fail:
 	event->hw.state = PERF_HES_STOPPED;
+	WRITE_ONCE(ctxt->pr_allowed, 0);
 	return;
 }
 
@@ -543,6 +554,11 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	struct etm_event_data *event_data;
 	struct list_head *path;
 
+	if (mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed))
+		return;
+
+	WRITE_ONCE(ctxt->pr_allowed, 0);
+
 	/*
 	 * If we still have access to the event_data via handle,
 	 * confirm that we haven't messed up the tracking.
@@ -556,7 +572,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	ctxt->event_data = NULL;
 
 	if (event->hw.state == PERF_HES_STOPPED)
-		return;
+		goto out_pr_allowed;
 
 	/* We must have a valid event_data for a running event */
 	if (WARN_ON(!event_data))
@@ -627,6 +643,10 @@ static void etm_event_stop(struct perf_event *event, int mode)
 
 	/* Disabling the path make its elements available to other sessions */
 	coresight_disable_path(path);
+
+out_pr_allowed:
+	if (mode & PERF_EF_PAUSE)
+		WRITE_ONCE(ctxt->pr_allowed, 1);
 }
 
 static int etm_event_add(struct perf_event *event, int mode)
@@ -634,7 +654,7 @@ static int etm_event_add(struct perf_event *event, int mode)
 	int ret = 0;
 	struct hw_perf_event *hwc = &event->hw;
 
-	if (mode & PERF_EF_START) {
+	if (mode & PERF_EF_START && !READ_ONCE(event->aux_paused)) {
 		etm_event_start(event, 0);
 		if (hwc->state & PERF_HES_STOPPED)
 			ret = -EINVAL;
@@ -886,8 +906,9 @@ int __init etm_perf_init(void)
 {
 	int ret;
 
-	etm_pmu.capabilities		= (PERF_PMU_CAP_EXCLUSIVE |
-					   PERF_PMU_CAP_ITRACE);
+	etm_pmu.capabilities		= PERF_PMU_CAP_EXCLUSIVE |
+					  PERF_PMU_CAP_ITRACE |
+					  PERF_PMU_CAP_AUX_PAUSE;
 
 	etm_pmu.attr_groups		= etm_pmu_attr_groups;
 	etm_pmu.task_ctx_nr		= perf_sw_context;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 0/4] perf/core: Add ability for an event to "pause" or "resume" AUX area tracing
  2023-12-08 17:24 ` Adrian Hunter
@ 2023-12-19  6:05   ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-19  6:05 UTC (permalink / raw)
  To: Peter Zijlstra, James Clark
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users

On 8/12/23 19:24, Adrian Hunter wrote:
> Hi
> 
> Hardware traces, such as instruction traces, can produce a vast amount of
> trace data, so being able to reduce tracing to more specific circumstances
> can be useful.
> 
> The ability to pause or resume tracing when another event happens, can do
> that.
> 
> These patches add such a facility and show how it would work for Intel
> Processor Trace.
> 
> Maintainers of other AUX area tracing implementations are requested to
> consider if this is something they might employ and then whether or not
> the ABI would work for them.
> 
> Changes to perf tools are not fleshed out yet.
> 
> 
> Changes in RFC V2:
> 
>       Use ->stop() / ->start() instead of ->pause_resume()
>       Move aux_start_paused bit into aux_output_cfg
>       Tighten up when Intel PT pause / resume is allowed
>       Add an example of how it might work for CoreSight

Any comments?

> 
> 
> Adrian Hunter (4):
>       perf/core: Add aux_pause, aux_resume, aux_start_paused
>       perf/x86/intel/pt: Add support for pause / resume
>       perf tools: Add support for AUX area pause / resume
>       coresight: Have a stab at support for pause / resume
> 
>  arch/x86/events/intel/pt.c                       | 63 ++++++++++++++++++++-
>  arch/x86/events/intel/pt.h                       |  4 ++
>  drivers/hwtracing/coresight/coresight-etm-perf.c | 29 ++++++++--
>  include/linux/perf_event.h                       | 15 +++++
>  include/uapi/linux/perf_event.h                  | 11 +++-
>  kernel/events/core.c                             | 72 +++++++++++++++++++++++-
>  kernel/events/internal.h                         |  1 +
>  tools/include/uapi/linux/perf_event.h            | 11 +++-
>  tools/perf/util/auxtrace.c                       |  4 ++
>  tools/perf/util/evsel.c                          |  9 +++
>  tools/perf/util/evsel_config.h                   |  6 ++
>  tools/perf/util/parse-events.c                   | 33 +++++++++++
>  tools/perf/util/parse-events.h                   |  3 +
>  tools/perf/util/parse-events.l                   |  3 +
>  tools/perf/util/perf_event_attr_fprintf.c        |  3 +
>  15 files changed, 255 insertions(+), 12 deletions(-)
> 
> 
> Regards
> Adrian


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-08 17:24   ` Adrian Hunter
@ 2023-12-19 13:42     ` Arnaldo Carvalho de Melo
  -1 siblings, 0 replies; 31+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-19 13:42 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Heiko Carstens, Thomas Richter, Hendrik Brueckner,
	Suzuki K Poulose, Mike Leach, James Clark, coresight,
	linux-arm-kernel, Yicong Yang, Jonathan Cameron, Will Deacon,
	Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users

Em Fri, Dec 08, 2023 at 07:24:46PM +0200, Adrian Hunter escreveu:
> Hardware traces, such as instruction traces, can produce a vast amount of
> trace data, so being able to reduce tracing to more specific circumstances
> can be useful.

> The ability to pause or resume tracing when another event happens, can do
> that.
 
> Add ability for an event to "pause" or "resume" AUX area tracing.

We need this as well for the usual ring buffer, 'perf report' has:

        --switch-off <event>
                          Stop considering events after the occurrence of this event
        --switch-on <event>
                          Consider events after the occurrence of this event

And 'perf record' has:

       --switch-output-event
           Events that will cause the switch of the perf.data file,
           auto-selecting --switch-output=signal, the results are similar as
           internally the side band thread will also send a SIGUSR2 to the
           main one.

But those are all in userspace; what you're doing is in the kernel, and
the example you used is synchronous, i.e. you're only interested in
what happens after you enter the syscall, and you stop when the syscall
exits (though here you'll catch more stuff in the AUX trace, i.e. there
is a "race" between intel_pt inserting events in the AUX trace and the
syscall exit switching it off).

Also being able to group the { resume, what-to-enable, pause } is
powerful, as we could have multiple such groups to record those
"slices", not just the --switch-off/--switch-on global ones.
 
> Add aux_pause bit to perf_event_attr to indicate that, if the event
> happens, the associated AUX area tracing should be paused. Ditto
> aux_resume. Do not allow aux_pause and aux_resume to be set together.
> 
> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
> event that it should start in a "paused" state.
> 
> Add aux_paused to struct perf_event for AUX area events to keep track of
> the "paused" state. aux_paused is initialized to aux_start_paused.
> 
> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
> callbacks. Call as needed, during __perf_event_output(). Add
> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
> handler. Pause/resume in NMI context will miss out if it coincides with
> another pause/resume.
> 
> To use aux_pause or aux_resume, an event must be in a group with the AUX
> area event as the group leader.
> 
> Example (requires Intel PT and tools patches also):
> 
>  $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname

The user interface looks nice: it asks for intel_pt to be armed but to
start paused and to collect only inside the kernel, then sets up another
event to resume collection of whatever is using the AUX area, intel_pt in
this case, and one more event, sys_exit_newuname, to pause it again.

So this implicitly selects the CPU where the aux-resume took place and
in this specific case we ended up being lucky and that process wasn't
migrated to another CPU in the middle of the syscall...

Scratch that, you're not tracing system wide, but just the 'uname'
process being started from perf, perfect.
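
To picture the grouping, here is a toy user-space model of my reading of the
commit message quoted above. It is only a sketch and not the kernel
implementation: every sketch_* name is hypothetical, "traced" merely counts
steps that would have been recorded, and the NMI and race handling mentioned
in the commit message is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One action per trigger, so pause and resume can never be set together. */
enum sketch_action { SKETCH_NONE, SKETCH_AUX_PAUSE, SKETCH_AUX_RESUME };

/* Stand-in for the AUX area event, which is the group leader. */
struct sketch_aux_event {
	bool aux_start_paused;	/* requested at event creation */
	bool aux_paused;	/* current paused state */
	unsigned long traced;	/* steps that would have been recorded */
};

/* Stand-in for a group member carrying aux-pause or aux-resume. */
struct sketch_trigger {
	struct sketch_aux_event *leader;
	enum sketch_action action;
};

/* aux_paused is initialised from aux_start_paused */
static void sketch_aux_init(struct sketch_aux_event *aux)
{
	assert(aux != NULL);
	aux->aux_paused = aux->aux_start_paused;
	aux->traced = 0;
}

/* When the trigger event happens, pause or resume the group leader */
static void sketch_trigger_fire(const struct sketch_trigger *t)
{
	assert(t != NULL && t->leader != NULL);
	if (t->action == SKETCH_AUX_PAUSE)
		t->leader->aux_paused = true;
	else if (t->action == SKETCH_AUX_RESUME)
		t->leader->aux_paused = false;
}

/* One unit of traced execution: recorded only while not paused */
static void sketch_step(struct sketch_aux_event *aux)
{
	assert(aux != NULL);
	if (!aux->aux_paused)
		aux->traced++;
}
```

In terms of the example command line, the leader corresponds to
intel_pt/aux-start-paused/, a SKETCH_AUX_RESUME trigger to
sys_enter_newuname/aux-resume/ and a SKETCH_AUX_PAUSE trigger to
sys_exit_newuname/aux-pause/. Several such groups would each keep their own
leader state, which is what would make the per-group "slices" possible.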



- Arnaldo

>  Linux
>  [ perf record: Woken up 1 times to write data ]
>  [ perf record: Captured and wrote 0.041 MB perf.data ]
>  $ perf script --call-trace
>  uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
>  uname    5712 [007]    83.855582518:  psb offs: 0
>  uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
>  uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
>  uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
>  uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
>  uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
>  uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>  uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
>  uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
>  uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
>  uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>  uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
>  uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
>  uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
>  uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>  uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>  uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>  uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>  uname    5712 [007]    83.855584175: 0x0
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  include/linux/perf_event.h      | 15 +++++++
>  include/uapi/linux/perf_event.h | 11 ++++-
>  kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
>  kernel/events/internal.h        |  1 +
>  4 files changed, 95 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index e85cd1c0eaf3..252c4aac3b79 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -291,6 +291,7 @@ struct perf_event_pmu_context;
>  #define PERF_PMU_CAP_NO_EXCLUDE			0x0040
>  #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
>  #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
> +#define PERF_PMU_CAP_AUX_PAUSE			0x0200
>  
>  struct perf_output_handle;
>  
> @@ -363,6 +364,8 @@ struct pmu {
>  #define PERF_EF_START	0x01		/* start the counter when adding    */
>  #define PERF_EF_RELOAD	0x02		/* reload the counter when starting */
>  #define PERF_EF_UPDATE	0x04		/* update the counter when stopping */
> +#define PERF_EF_PAUSE	0x08		/* AUX area event, pause tracing */
> +#define PERF_EF_RESUME	0x10		/* AUX area event, resume tracing */
>  
>  	/*
>  	 * Adds/Removes a counter to/from the PMU, can be done inside a
> @@ -402,6 +405,15 @@ struct pmu {
>  	 *
>  	 * ->start() with PERF_EF_RELOAD will reprogram the counter
>  	 *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
> +	 *
> +	 * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
> +	 * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
> +	 * PERF_EF_RESUME.
> +	 *
> +	 * ->start() with PERF_EF_RESUME will start as simply as possible but
> +	 * only if the counter is not otherwise stopped. Will not overlap
> +	 * another ->start() with PERF_EF_RESUME nor ->stop() with
> +	 * PERF_EF_PAUSE.
>  	 */
>  	void (*start)			(struct perf_event *event, int flags);
>  	void (*stop)			(struct perf_event *event, int flags);
> @@ -797,6 +809,9 @@ struct perf_event {
>  	/* for aux_output events */
>  	struct perf_event		*aux_event;
>  
> +	/* for AUX area events */
> +	unsigned int			aux_paused;
> +
>  	void (*destroy)(struct perf_event *);
>  	struct rcu_head			rcu_head;
>  
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 39c6a250dd1b..437bc2a8d50c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -507,7 +507,16 @@ struct perf_event_attr {
>  	__u16	sample_max_stack;
>  	__u16	__reserved_2;
>  	__u32	aux_sample_size;
> -	__u32	__reserved_3;
> +
> +	union {
> +		__u32	aux_output_cfg;
> +		struct {
> +			__u64	aux_pause        :  1, /* on overflow, pause AUX area tracing */
> +				aux_resume       :  1, /* on overflow, resume AUX area tracing */
> +				aux_start_paused :  1, /* start AUX area tracing paused */
> +				__reserved_3     : 29;
> +		};
> +	};
>  
>  	/*
>  	 * User provided data if sigtrap=1, passed back to user via
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4c72a41f11af..c1e11884d06e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2060,7 +2060,8 @@ static void perf_put_aux_event(struct perf_event *event)
>  
>  static bool perf_need_aux_event(struct perf_event *event)
>  {
> -	return !!event->attr.aux_output || !!event->attr.aux_sample_size;
> +	return event->attr.aux_output || event->attr.aux_sample_size ||
> +	       event->attr.aux_pause || event->attr.aux_resume;
>  }
>  
>  static int perf_get_aux_event(struct perf_event *event,
> @@ -2085,6 +2086,10 @@ static int perf_get_aux_event(struct perf_event *event,
>  	    !perf_aux_output_match(event, group_leader))
>  		return 0;
>  
> +	if ((event->attr.aux_pause || event->attr.aux_resume) &&
> +	    !(group_leader->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE))
> +		return 0;
> +
>  	if (event->attr.aux_sample_size && !group_leader->pmu->snapshot_aux)
>  		return 0;
>  
> @@ -7773,6 +7778,47 @@ void perf_prepare_header(struct perf_event_header *header,
>  	WARN_ON_ONCE(header->size & 7);
>  }
>  
> +static void __perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	if (pause) {
> +		if (!READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 1);
> +			event->pmu->stop(event, PERF_EF_PAUSE);
> +		}
> +	} else {
> +		if (READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 0);
> +			event->pmu->start(event, PERF_EF_RESUME);
> +		}
> +	}
> +}
> +
> +static void perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	struct perf_buffer *rb;
> +	unsigned long flags;
> +
> +	if (WARN_ON_ONCE(!event))
> +		return;
> +
> +	rb = ring_buffer_get(event);
> +	if (!rb)
> +		return;
> +
> +	local_irq_save(flags);
> +	/* Guard against NMI, NMI loses here */
> +	if (READ_ONCE(rb->aux_in_pause_resume))
> +		goto out_restore;
> +	WRITE_ONCE(rb->aux_in_pause_resume, 1);
> +	barrier();
> +	__perf_event_aux_pause(event, pause);
> +	barrier();
> +	WRITE_ONCE(rb->aux_in_pause_resume, 0);
> +out_restore:
> +	local_irq_restore(flags);
> +	ring_buffer_put(rb);
> +}
> +
>  static __always_inline int
>  __perf_event_output(struct perf_event *event,
>  		    struct perf_sample_data *data,
> @@ -7786,6 +7832,9 @@ __perf_event_output(struct perf_event *event,
>  	struct perf_event_header header;
>  	int err;
>  
> +	if (event->attr.aux_pause)
> +		perf_event_aux_pause(event->aux_event, true);
> +
>  	/* protect the callchain buffers */
>  	rcu_read_lock();
>  
> @@ -7802,6 +7851,10 @@ __perf_event_output(struct perf_event *event,
>  
>  exit:
>  	rcu_read_unlock();
> +
> +	if (event->attr.aux_resume)
> +		perf_event_aux_pause(event->aux_event, false);
> +
>  	return err;
>  }
>  
> @@ -11941,10 +11994,23 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>  	}
>  
>  	if (event->attr.aux_output &&
> -	    !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
> +	    (!(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT) ||
> +	     event->attr.aux_pause || event->attr.aux_resume)) {
> +		err = -EOPNOTSUPP;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_pause && event->attr.aux_resume) {
> +		err = -EINVAL;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_start_paused &&
> +	    !(pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)) {
>  		err = -EOPNOTSUPP;
>  		goto err_pmu;
>  	}
> +	event->aux_paused = event->attr.aux_start_paused;
>  
>  	if (cgroup_fd != -1) {
>  		err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
> @@ -12741,7 +12807,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
>  	 * Grouping is not supported for kernel events, neither is 'AUX',
>  	 * make sure the caller's intentions are adjusted.
>  	 */
> -	if (attr->aux_output)
> +	if (attr->aux_output || attr->aux_output_cfg)
>  		return ERR_PTR(-EINVAL);
>  
>  	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
> diff --git a/kernel/events/internal.h b/kernel/events/internal.h
> index 5150d5f84c03..3320f78117dc 100644
> --- a/kernel/events/internal.h
> +++ b/kernel/events/internal.h
> @@ -51,6 +51,7 @@ struct perf_buffer {
>  	void				(*free_aux)(void *);
>  	refcount_t			aux_refcount;
>  	int				aux_in_sampling;
> +	int				aux_in_pause_resume;
>  	void				**aux_pages;
>  	void				*aux_priv;
>  
> -- 
> 2.34.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-08 17:24   ` Adrian Hunter
@ 2023-12-20 15:54     ` James Clark
  -1 siblings, 0 replies; 31+ messages in thread
From: James Clark @ 2023-12-20 15:54 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users



On 08/12/2023 17:24, Adrian Hunter wrote:
> Hardware traces, such as instruction traces, can produce a vast amount of
> trace data, so being able to reduce tracing to more specific circumstances
> can be useful.
> 
> The ability to pause or resume tracing when another event happens, can do
> that.
> 
> Add ability for an event to "pause" or "resume" AUX area tracing.
> 
> Add aux_pause bit to perf_event_attr to indicate that, if the event
> happens, the associated AUX area tracing should be paused. Ditto
> aux_resume. Do not allow aux_pause and aux_resume to be set together.
> 
> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
> event that it should start in a "paused" state.
> 
> Add aux_paused to struct perf_event for AUX area events to keep track of
> the "paused" state. aux_paused is initialized to aux_start_paused.
> 
> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
> callbacks. Call as needed, during __perf_event_output(). Add
> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
> handler. Pause/resume in NMI context will miss out if it coincides with
> another pause/resume.
> 
> To use aux_pause or aux_resume, an event must be in a group with the AUX
> area event as the group leader.
> 
> Example (requires Intel PT and tools patches also):
> 
>  $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname

I think it might be useful to have an aux-toggle option as well, and
then you could do sampling if you put it on a PMU counter with an
interval. Unless you can make two events for the same counter with
different intervals, and one does resume and the other does pause? I'm
not sure if that would work?
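
One thing to weigh for a toggle is how it behaves when a request is
dropped, for example because it lost against another pause/resume in
progress. The sketch below is only a user-space model: REQ_TOGGLE is a
hypothetical request that is not part of the posted patches, and apply()
and replay() are names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; REQ_TOGGLE does not exist in the posted ABI. */
enum req { REQ_PAUSE, REQ_RESUME, REQ_TOGGLE };

/* Return the new paused state after applying one request. */
static bool apply(bool paused, enum req r)
{
	switch (r) {
	case REQ_PAUSE:
		return true;
	case REQ_RESUME:
		return false;
	case REQ_TOGGLE:
		return !paused;
	}
	return paused;
}

/*
 * Replay n requests starting from the given paused state, skipping the
 * request at index 'dropped' (pass n to drop nothing).
 * Returns the final paused state.
 */
static bool replay(bool paused, const enum req *r, size_t n, size_t dropped)
{
	assert(r != NULL && dropped <= n);

	for (size_t i = 0; i < n; i++) {
		if (i != dropped)
			paused = apply(paused, r[i]);
	}
	return paused;
}
```

With explicit resume/pause pairs, dropping one request costs at most one
slice and the next request puts the state right again. With toggles,
every later request is inverted until another one is dropped.
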

Other than that it looks ok. I got Coresight working with a couple of
changes to what you posted on here, but that can always be done more
thoroughly later if we leave PERF_PMU_CAP_AUX_PAUSE off Coresight for now.

Thanks
James

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V3 4/4] coresight: Have a stab at support for pause / resume
  2023-12-15  6:42     ` Adrian Hunter
@ 2023-12-20 15:59       ` James Clark
  -1 siblings, 0 replies; 31+ messages in thread
From: James Clark @ 2023-12-20 15:59 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users, Peter Zijlstra



On 15/12/2023 06:42, Adrian Hunter wrote:
> For discussion only, un-tested...
> 

If anyone wants to test Coresight, the diff below is required to get the
most basic use case working. It also probably needs more thought and
some edge case handling:

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 596c01e37624..bd0767356277 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -556,7 +556,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	struct etm_event_data *event_data;
 	struct list_head *path;
 
-	if (mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed))
+	if ((mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed)) ||
+	    event->hw.state == PERF_HES_STOPPED)
 		return;
 
 	WRITE_ONCE(ctxt->pr_allowed, 0);
@@ -573,9 +574,6 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	/* Clear the event_data as this ETM is stopping the trace. */
 	ctxt->event_data = NULL;
 
-	if (event->hw.state == PERF_HES_STOPPED)
-		goto out_pr_allowed;
-
 	/* We must have a valid event_data for a running event */
 	if (WARN_ON(!event_data))
 		return;
@@ -586,7 +584,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	 * nothing needs to be torn down other than outputting a
 	 * zero sized record.
 	 */
-	if (handle->event && (mode & PERF_EF_UPDATE) &&
+	if (handle->event && (mode & (PERF_EF_UPDATE | PERF_EF_PAUSE)) &&
 	    !cpumask_test_cpu(cpu, &event_data->mask)) {
 		event->hw.state = PERF_HES_STOPPED;
 		perf_aux_output_end(handle, 0);
@@ -616,7 +614,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	 * handle due to lack of buffer space), we don't
 	 * have to do anything here.
 	 */
-	if (handle->event && (mode & PERF_EF_UPDATE)) {
+	if (handle->event && (mode & (PERF_EF_UPDATE | PERF_EF_PAUSE))) {
 		if (WARN_ON_ONCE(handle->event != event))
 			return;
 
@@ -646,7 +644,6 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	/* Disabling the path make its elements available to other sessions */
 	coresight_disable_path(path);
 
-out_pr_allowed:
 	if (mode & PERF_EF_PAUSE)
 		WRITE_ONCE(ctxt->pr_allowed, 1);
 }
@@ -656,7 +653,7 @@ static int etm_event_add(struct perf_event *event, int mode)
 	int ret = 0;
 	struct hw_perf_event *hwc = &event->hw;
 
-	if (mode & PERF_EF_START && !READ_ONCE(event->aux_paused)) {
+	if (mode & PERF_EF_START) {
 		etm_event_start(event, 0);
 		if (hwc->state & PERF_HES_STOPPED)
 			ret = -EINVAL;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-20 15:54     ` James Clark
@ 2023-12-20 16:16       ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2023-12-20 16:16 UTC (permalink / raw)
  To: James Clark, Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users

On 20/12/23 17:54, James Clark wrote:
> 
> 
> On 08/12/2023 17:24, Adrian Hunter wrote:
>> Hardware traces, such as instruction traces, can produce a vast amount of
>> trace data, so being able to reduce tracing to more specific circumstances
>> can be useful.
>>
>> The ability to pause or resume tracing when another event happens, can do
>> that.
>>
>> Add ability for an event to "pause" or "resume" AUX area tracing.
>>
>> Add aux_pause bit to perf_event_attr to indicate that, if the event
>> happens, the associated AUX area tracing should be paused. Ditto
>> aux_resume. Do not allow aux_pause and aux_resume to be set together.
>>
>> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
>> event that it should start in a "paused" state.
>>
>> Add aux_paused to struct perf_event for AUX area events to keep track of
>> the "paused" state. aux_paused is initialized to aux_start_paused.
>>
>> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
>> callbacks. Call as needed, during __perf_event_output(). Add
>> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
>> handler. Pause/resume in NMI context will miss out if it coincides with
>> another pause/resume.
>>
>> To use aux_pause or aux_resume, an event must be in a group with the AUX
>> area event as the group leader.
>>
>> Example (requires Intel PT and tools patches also):
>>
>>  $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
> 
> I think it might be useful to have an aux-toggle option as well, and
> then you could do sampling if you put it on a PMU counter with an
> interval. Unless you can make two events for the same counter with
> different intervals, and one does resume and the other does pause? I'm
> not sure if that would work?

There is already ->snapshot_aux() for sampling. Is that what you mean?

> 
> Other than that it looks ok. I got Coresight working with a couple of
> changes to what you posted on here, but that can always be done more
> thoroughly later if we leave PERF_PMU_CAP_AUX_PAUSE off Coresight for now.

Thanks a lot for looking at this!
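
The pause/resume bookkeeping described in the quoted commit message can be modelled in user space roughly as follows. This is a single-threaded sketch under stated assumptions: struct model_event, struct model_rb and model_aux_pause() are hypothetical stand-ins, the counters replace the real ->stop() / ->start() callbacks, and interrupt masking and compiler barriers are not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct perf_event. */
struct model_event {
	unsigned int aux_paused;	/* mirrors perf_event::aux_paused */
	int stops;			/* counts ->stop(PERF_EF_PAUSE) calls */
	int starts;			/* counts ->start(PERF_EF_RESUME) calls */
};

/* Hypothetical stand-in for the relevant part of struct perf_buffer. */
struct model_rb {
	int aux_in_pause_resume;	/* mirrors perf_buffer::aux_in_pause_resume */
};

/*
 * Returns false if the request was dropped because another pause/resume
 * was already in progress (the "NMI loses" case), true otherwise.
 * A repeated pause or a repeated resume is accepted but does nothing.
 */
static bool model_aux_pause(struct model_rb *rb, struct model_event *ev,
			    bool pause)
{
	if (rb->aux_in_pause_resume)
		return false;

	rb->aux_in_pause_resume = 1;
	if (pause && !ev->aux_paused) {
		ev->aux_paused = 1;
		ev->stops++;
	} else if (!pause && ev->aux_paused) {
		ev->aux_paused = 0;
		ev->starts++;
	}
	rb->aux_in_pause_resume = 0;
	return true;
}
```

With an event that starts paused (aux_paused = 1), a first resume increments starts once and a second resume changes nothing, which is the behaviour the aux_paused flag is there to provide.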


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-08 17:24   ` Adrian Hunter
@ 2023-12-20 17:41     ` Suzuki K Poulose
  -1 siblings, 0 replies; 31+ messages in thread
From: Suzuki K Poulose @ 2023-12-20 17:41 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Mike Leach, James Clark,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users

On 08/12/2023 17:24, Adrian Hunter wrote:
> Hardware traces, such as instruction traces, can produce a vast amount of
> trace data, so being able to reduce tracing to more specific circumstances
> can be useful.
> 
> The ability to pause or resume tracing when another event happens, can do
> that.
> 
> Add ability for an event to "pause" or "resume" AUX area tracing.
> 
> Add aux_pause bit to perf_event_attr to indicate that, if the event
> happens, the associated AUX area tracing should be paused. Ditto
> aux_resume. Do not allow aux_pause and aux_resume to be set together.
> 
> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
> event that it should start in a "paused" state.
> 
> Add aux_paused to struct perf_event for AUX area events to keep track of
> the "paused" state. aux_paused is initialized to aux_start_paused.
> 
> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
> callbacks. Call as needed, during __perf_event_output(). Add
> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
> handler. Pause/resume in NMI context will miss out if it coincides with
> another pause/resume.
> 
> To use aux_pause or aux_resume, an event must be in a group with the AUX
> area event as the group leader.
> 
> Example (requires Intel PT and tools patches also):
> 
>   $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
>   Linux
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.041 MB perf.data ]
>   $ perf script --call-trace
>   uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
>   uname    5712 [007]    83.855582518:  psb offs: 0
>   uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>   uname    5712 [007]    83.855584175: 0x0
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>   include/linux/perf_event.h      | 15 +++++++
>   include/uapi/linux/perf_event.h | 11 ++++-
>   kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
>   kernel/events/internal.h        |  1 +
>   4 files changed, 95 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index e85cd1c0eaf3..252c4aac3b79 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -291,6 +291,7 @@ struct perf_event_pmu_context;
>   #define PERF_PMU_CAP_NO_EXCLUDE			0x0040
>   #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
>   #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
> +#define PERF_PMU_CAP_AUX_PAUSE			0x0200
>   
>   struct perf_output_handle;
>   
> @@ -363,6 +364,8 @@ struct pmu {
>   #define PERF_EF_START	0x01		/* start the counter when adding    */
>   #define PERF_EF_RELOAD	0x02		/* reload the counter when starting */
>   #define PERF_EF_UPDATE	0x04		/* update the counter when stopping */
> +#define PERF_EF_PAUSE	0x08		/* AUX area event, pause tracing */
> +#define PERF_EF_RESUME	0x10		/* AUX area event, resume tracing */
>   
>   	/*
>   	 * Adds/Removes a counter to/from the PMU, can be done inside a
> @@ -402,6 +405,15 @@ struct pmu {
>   	 *
>   	 * ->start() with PERF_EF_RELOAD will reprogram the counter
>   	 *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
> +	 *
> +	 * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
> +	 * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
> +	 * PERF_EF_RESUME.
> +	 *
> +	 * ->start() with PERF_EF_RESUME will start as simply as possible but
> +	 * only if the counter is not otherwise stopped. Will not overlap
> +	 * another ->start() with PERF_EF_RESUME nor ->stop() with
> +	 * PERF_EF_PAUSE.
>   	 */
>   	void (*start)			(struct perf_event *event, int flags);
>   	void (*stop)			(struct perf_event *event, int flags);
> @@ -797,6 +809,9 @@ struct perf_event {
>   	/* for aux_output events */
>   	struct perf_event		*aux_event;
>   
> +	/* for AUX area events */
> +	unsigned int			aux_paused;
> +
>   	void (*destroy)(struct perf_event *);
>   	struct rcu_head			rcu_head;
>   
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 39c6a250dd1b..437bc2a8d50c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -507,7 +507,16 @@ struct perf_event_attr {
>   	__u16	sample_max_stack;
>   	__u16	__reserved_2;
>   	__u32	aux_sample_size;
> -	__u32	__reserved_3;
> +
> +	union {
> +		__u32	aux_output_cfg;
> +		struct {
> +			__u64	aux_pause        :  1, /* on overflow, pause AUX area tracing */

Did you mean __u32 ? Otherwise looks good to me.

Suzuki
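
One way to see why the bit-field base type matters here is to compare the two layouts in user space. This is a sketch under assumptions: the type names are hypothetical, uint32_t / uint64_t stand in for __u32 / __u64, and bit-fields of these types are an implementation-defined extension that GCC and Clang accept.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the anonymous struct as posted (__u64 base type). */
struct cfg_bits_u64 {
	uint64_t aux_pause        :  1,
		 aux_resume       :  1,
		 aux_start_paused :  1,
		 reserved         : 29;
};

/* The same fields with a 32-bit base type. */
struct cfg_bits_u32 {
	uint32_t aux_pause        :  1,
		 aux_resume       :  1,
		 aux_start_paused :  1,
		 reserved         : 29;
};

union cfg_u64 { uint32_t aux_output_cfg; struct cfg_bits_u64 bits; };
union cfg_u32 { uint32_t aux_output_cfg; struct cfg_bits_u32 bits; };

/* The 32-bit variant exactly overlays the __u32 field it replaces. */
static_assert(sizeof(union cfg_u32) == sizeof(uint32_t),
	      "32-bit bit-fields fit in 4 bytes");

/* The 64-bit variant is never smaller; on typical LP64 ABIs it is 8 bytes. */
static_assert(sizeof(union cfg_u64) >= sizeof(union cfg_u32),
	      "64-bit variant is not smaller");
```

On ABIs where the 64-bit variant is padded to 8 bytes, the union would no longer be a drop-in replacement for the 4-byte __reserved_3 field.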

> +				aux_resume       :  1, /* on overflow, resume AUX area tracing */
> +				aux_start_paused :  1, /* start AUX area tracing paused */
> +				__reserved_3     : 29;
> +		};
> +	};
>   
>   	/*
>   	 * User provided data if sigtrap=1, passed back to user via
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4c72a41f11af..c1e11884d06e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2060,7 +2060,8 @@ static void perf_put_aux_event(struct perf_event *event)
>   
>   static bool perf_need_aux_event(struct perf_event *event)
>   {
> -	return !!event->attr.aux_output || !!event->attr.aux_sample_size;
> +	return event->attr.aux_output || event->attr.aux_sample_size ||
> +	       event->attr.aux_pause || event->attr.aux_resume;
>   }
>   
>   static int perf_get_aux_event(struct perf_event *event,
> @@ -2085,6 +2086,10 @@ static int perf_get_aux_event(struct perf_event *event,
>   	    !perf_aux_output_match(event, group_leader))
>   		return 0;
>   
> +	if ((event->attr.aux_pause || event->attr.aux_resume) &&
> +	    !(group_leader->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE))
> +		return 0;
> +
>   	if (event->attr.aux_sample_size && !group_leader->pmu->snapshot_aux)
>   		return 0;
>   
> @@ -7773,6 +7778,47 @@ void perf_prepare_header(struct perf_event_header *header,
>   	WARN_ON_ONCE(header->size & 7);
>   }
>   
> +static void __perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	if (pause) {
> +		if (!READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 1);
> +			event->pmu->stop(event, PERF_EF_PAUSE);
> +		}
> +	} else {
> +		if (READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 0);
> +			event->pmu->start(event, PERF_EF_RESUME);
> +		}
> +	}
> +}
> +
> +static void perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	struct perf_buffer *rb;
> +	unsigned long flags;
> +
> +	if (WARN_ON_ONCE(!event))
> +		return;
> +
> +	rb = ring_buffer_get(event);
> +	if (!rb)
> +		return;
> +
> +	local_irq_save(flags);
> +	/* Guard against NMI, NMI loses here */
> +	if (READ_ONCE(rb->aux_in_pause_resume))
> +		goto out_restore;
> +	WRITE_ONCE(rb->aux_in_pause_resume, 1);
> +	barrier();
> +	__perf_event_aux_pause(event, pause);
> +	barrier();
> +	WRITE_ONCE(rb->aux_in_pause_resume, 0);
> +out_restore:
> +	local_irq_restore(flags);
> +	ring_buffer_put(rb);
> +}
> +
>   static __always_inline int
>   __perf_event_output(struct perf_event *event,
>   		    struct perf_sample_data *data,
> @@ -7786,6 +7832,9 @@ __perf_event_output(struct perf_event *event,
>   	struct perf_event_header header;
>   	int err;
>   
> +	if (event->attr.aux_pause)
> +		perf_event_aux_pause(event->aux_event, true);
> +
>   	/* protect the callchain buffers */
>   	rcu_read_lock();
>   
> @@ -7802,6 +7851,10 @@ __perf_event_output(struct perf_event *event,
>   
>   exit:
>   	rcu_read_unlock();
> +
> +	if (event->attr.aux_resume)
> +		perf_event_aux_pause(event->aux_event, false);
> +
>   	return err;
>   }
>   
> @@ -11941,10 +11994,23 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>   	}
>   
>   	if (event->attr.aux_output &&
> -	    !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
> +	    (!(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT) ||
> +	     event->attr.aux_pause || event->attr.aux_resume)) {
> +		err = -EOPNOTSUPP;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_pause && event->attr.aux_resume) {
> +		err = -EINVAL;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_start_paused &&
> +	    !(pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)) {
>   		err = -EOPNOTSUPP;
>   		goto err_pmu;
>   	}
> +	event->aux_paused = event->attr.aux_start_paused;
>   
>   	if (cgroup_fd != -1) {
>   		err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
> @@ -12741,7 +12807,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
>   	 * Grouping is not supported for kernel events, neither is 'AUX',
>   	 * make sure the caller's intentions are adjusted.
>   	 */
> -	if (attr->aux_output)
> +	if (attr->aux_output || attr->aux_output_cfg)
>   		return ERR_PTR(-EINVAL);
>   
>   	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
> diff --git a/kernel/events/internal.h b/kernel/events/internal.h
> index 5150d5f84c03..3320f78117dc 100644
> --- a/kernel/events/internal.h
> +++ b/kernel/events/internal.h
> @@ -51,6 +51,7 @@ struct perf_buffer {
>   	void				(*free_aux)(void *);
>   	refcount_t			aux_refcount;
>   	int				aux_in_sampling;
> +	int				aux_in_pause_resume;
>   	void				**aux_pages;
>   	void				*aux_priv;
>   
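
The attribute checks added to perf_event_alloc() in the quoted patch can be summarised by a small user-space model. This is a sketch only: struct model_attr and model_check_aux_attr() are hypothetical, the capability values are copied from the quoted hunk, and the separate group-leader check in perf_get_aux_event() is not modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Capability bits as they appear in the quoted include/linux/perf_event.h hunk. */
#define PERF_PMU_CAP_AUX_OUTPUT 0x0080
#define PERF_PMU_CAP_AUX_PAUSE  0x0200

/* Hypothetical subset of perf_event_attr: only the bits checked here. */
struct model_attr {
	bool aux_output;
	bool aux_pause;
	bool aux_resume;
	bool aux_start_paused;
};

/*
 * Returns 0 or a negative errno, testing in the same order as the quoted
 * perf_event_alloc() hunk. pmu_caps is the capability mask of the PMU
 * that the event itself belongs to.
 */
static int model_check_aux_attr(const struct model_attr *a,
				unsigned int pmu_caps)
{
	/* aux_output cannot be combined with pause/resume. */
	if (a->aux_output &&
	    (!(pmu_caps & PERF_PMU_CAP_AUX_OUTPUT) ||
	     a->aux_pause || a->aux_resume))
		return -EOPNOTSUPP;

	/* An event may pause or resume, not both. */
	if (a->aux_pause && a->aux_resume)
		return -EINVAL;

	/* Only a PMU that supports pausing can start paused. */
	if (a->aux_start_paused && !(pmu_caps & PERF_PMU_CAP_AUX_PAUSE))
		return -EOPNOTSUPP;

	return 0;
}
```

For instance, an attribute with both aux_pause and aux_resume set yields -EINVAL regardless of the PMU capabilities, matching "Do not allow aux_pause and aux_resume to be set together" in the commit message.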


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
@ 2023-12-20 17:41     ` Suzuki K Poulose
  0 siblings, 0 replies; 31+ messages in thread
From: Suzuki K Poulose @ 2023-12-20 17:41 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Mike Leach, James Clark,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users

On 08/12/2023 17:24, Adrian Hunter wrote:
> Hardware traces, such as instruction traces, can produce a vast amount of
> trace data, so being able to reduce tracing to more specific circumstances
> can be useful.
> 
> The ability to pause or resume tracing when another event happens, can do
> that.
> 
> Add ability for an event to "pause" or "resume" AUX area tracing.
> 
> Add aux_pause bit to perf_event_attr to indicate that, if the event
> happens, the associated AUX area tracing should be paused. Ditto
> aux_resume. Do not allow aux_pause and aux_resume to be set together.
> 
> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
> event that it should start in a "paused" state.
> 
> Add aux_paused to struct perf_event for AUX area events to keep track of
> the "paused" state. aux_paused is initialized to aux_start_paused.
> 
> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
> callbacks. Call as needed, during __perf_event_output(). Add
> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
> handler. Pause/resume in NMI context will miss out if it coincides with
> another pause/resume.
> 
> To use aux_pause or aux_resume, an event must be in a group with the AUX
> area event as the group leader.
> 
> Example (requires Intel PT and tools patches also):
> 
>   $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
>   Linux
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.041 MB perf.data ]
>   $ perf script --call-trace
>   uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
>   uname    5712 [007]    83.855582518:  psb offs: 0
>   uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>   uname    5712 [007]    83.855584175: 0x0
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>   include/linux/perf_event.h      | 15 +++++++
>   include/uapi/linux/perf_event.h | 11 ++++-
>   kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
>   kernel/events/internal.h        |  1 +
>   4 files changed, 95 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index e85cd1c0eaf3..252c4aac3b79 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -291,6 +291,7 @@ struct perf_event_pmu_context;
>   #define PERF_PMU_CAP_NO_EXCLUDE			0x0040
>   #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
>   #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
> +#define PERF_PMU_CAP_AUX_PAUSE			0x0200
>   
>   struct perf_output_handle;
>   
> @@ -363,6 +364,8 @@ struct pmu {
>   #define PERF_EF_START	0x01		/* start the counter when adding    */
>   #define PERF_EF_RELOAD	0x02		/* reload the counter when starting */
>   #define PERF_EF_UPDATE	0x04		/* update the counter when stopping */
> +#define PERF_EF_PAUSE	0x08		/* AUX area event, pause tracing */
> +#define PERF_EF_RESUME	0x10		/* AUX area event, resume tracing */
>   
>   	/*
>   	 * Adds/Removes a counter to/from the PMU, can be done inside a
> @@ -402,6 +405,15 @@ struct pmu {
>   	 *
>   	 * ->start() with PERF_EF_RELOAD will reprogram the counter
>   	 *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
> +	 *
> +	 * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
> +	 * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
> +	 * PERF_EF_RESUME.
> +	 *
> +	 * ->start() with PERF_EF_RESUME will start as simply as possible but
> +	 * only if the counter is not otherwise stopped. Will not overlap
> +	 * another ->start() with PERF_EF_RESUME nor ->stop() with
> +	 * PERF_EF_PAUSE.
>   	 */
>   	void (*start)			(struct perf_event *event, int flags);
>   	void (*stop)			(struct perf_event *event, int flags);
> @@ -797,6 +809,9 @@ struct perf_event {
>   	/* for aux_output events */
>   	struct perf_event		*aux_event;
>   
> +	/* for AUX area events */
> +	unsigned int			aux_paused;
> +
>   	void (*destroy)(struct perf_event *);
>   	struct rcu_head			rcu_head;
>   
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 39c6a250dd1b..437bc2a8d50c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -507,7 +507,16 @@ struct perf_event_attr {
>   	__u16	sample_max_stack;
>   	__u16	__reserved_2;
>   	__u32	aux_sample_size;
> -	__u32	__reserved_3;
> +
> +	union {
> +		__u32	aux_output_cfg;
> +		struct {
> +			__u64	aux_pause        :  1, /* on overflow, pause AUX area tracing */

Did you mean __u32? Otherwise looks good to me.

Suzuki

> +				aux_resume       :  1, /* on overflow, resume AUX area tracing */
> +				aux_start_paused :  1, /* start AUX area tracing paused */
> +				__reserved_3     : 29;
> +		};
> +	};
>   
>   	/*
>   	 * User provided data if sigtrap=1, passed back to user via
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4c72a41f11af..c1e11884d06e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2060,7 +2060,8 @@ static void perf_put_aux_event(struct perf_event *event)
>   
>   static bool perf_need_aux_event(struct perf_event *event)
>   {
> -	return !!event->attr.aux_output || !!event->attr.aux_sample_size;
> +	return event->attr.aux_output || event->attr.aux_sample_size ||
> +	       event->attr.aux_pause || event->attr.aux_resume;
>   }
>   
>   static int perf_get_aux_event(struct perf_event *event,
> @@ -2085,6 +2086,10 @@ static int perf_get_aux_event(struct perf_event *event,
>   	    !perf_aux_output_match(event, group_leader))
>   		return 0;
>   
> +	if ((event->attr.aux_pause || event->attr.aux_resume) &&
> +	    !(group_leader->pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE))
> +		return 0;
> +
>   	if (event->attr.aux_sample_size && !group_leader->pmu->snapshot_aux)
>   		return 0;
>   
> @@ -7773,6 +7778,47 @@ void perf_prepare_header(struct perf_event_header *header,
>   	WARN_ON_ONCE(header->size & 7);
>   }
>   
> +static void __perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	if (pause) {
> +		if (!READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 1);
> +			event->pmu->stop(event, PERF_EF_PAUSE);
> +		}
> +	} else {
> +		if (READ_ONCE(event->aux_paused)) {
> +			WRITE_ONCE(event->aux_paused, 0);
> +			event->pmu->start(event, PERF_EF_RESUME);
> +		}
> +	}
> +}
> +
> +static void perf_event_aux_pause(struct perf_event *event, bool pause)
> +{
> +	struct perf_buffer *rb;
> +	unsigned long flags;
> +
> +	if (WARN_ON_ONCE(!event))
> +		return;
> +
> +	rb = ring_buffer_get(event);
> +	if (!rb)
> +		return;
> +
> +	local_irq_save(flags);
> +	/* Guard against NMI, NMI loses here */
> +	if (READ_ONCE(rb->aux_in_pause_resume))
> +		goto out_restore;
> +	WRITE_ONCE(rb->aux_in_pause_resume, 1);
> +	barrier();
> +	__perf_event_aux_pause(event, pause);
> +	barrier();
> +	WRITE_ONCE(rb->aux_in_pause_resume, 0);
> +out_restore:
> +	local_irq_restore(flags);
> +	ring_buffer_put(rb);
> +}
> +
>   static __always_inline int
>   __perf_event_output(struct perf_event *event,
>   		    struct perf_sample_data *data,
> @@ -7786,6 +7832,9 @@ __perf_event_output(struct perf_event *event,
>   	struct perf_event_header header;
>   	int err;
>   
> +	if (event->attr.aux_pause)
> +		perf_event_aux_pause(event->aux_event, true);
> +
>   	/* protect the callchain buffers */
>   	rcu_read_lock();
>   
> @@ -7802,6 +7851,10 @@ __perf_event_output(struct perf_event *event,
>   
>   exit:
>   	rcu_read_unlock();
> +
> +	if (event->attr.aux_resume)
> +		perf_event_aux_pause(event->aux_event, false);
> +
>   	return err;
>   }
>   
> @@ -11941,10 +11994,23 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>   	}
>   
>   	if (event->attr.aux_output &&
> -	    !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
> +	    (!(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT) ||
> +	     event->attr.aux_pause || event->attr.aux_resume)) {
> +		err = -EOPNOTSUPP;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_pause && event->attr.aux_resume) {
> +		err = -EINVAL;
> +		goto err_pmu;
> +	}
> +
> +	if (event->attr.aux_start_paused &&
> +	    !(pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE)) {
>   		err = -EOPNOTSUPP;
>   		goto err_pmu;
>   	}
> +	event->aux_paused = event->attr.aux_start_paused;
>   
>   	if (cgroup_fd != -1) {
>   		err = perf_cgroup_connect(cgroup_fd, event, attr, group_leader);
> @@ -12741,7 +12807,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
>   	 * Grouping is not supported for kernel events, neither is 'AUX',
>   	 * make sure the caller's intentions are adjusted.
>   	 */
> -	if (attr->aux_output)
> +	if (attr->aux_output || attr->aux_output_cfg)
>   		return ERR_PTR(-EINVAL);
>   
>   	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
> diff --git a/kernel/events/internal.h b/kernel/events/internal.h
> index 5150d5f84c03..3320f78117dc 100644
> --- a/kernel/events/internal.h
> +++ b/kernel/events/internal.h
> @@ -51,6 +51,7 @@ struct perf_buffer {
>   	void				(*free_aux)(void *);
>   	refcount_t			aux_refcount;
>   	int				aux_in_sampling;
> +	int				aux_in_pause_resume;
>   	void				**aux_pages;
>   	void				*aux_priv;
>   
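
For anyone following the thread, the guard in perf_event_aux_pause() quoted
above can be modelled in user space roughly as follows. This is only a sketch
under stated assumptions: struct model_event, model_nmi() and the "dropped"
counter are made up for illustration, the nested call merely stands in for an
NMI arriving while ->stop() or ->start() is running, and the interrupt
disabling done by the kernel code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the state the patch keeps per AUX event. */
struct model_event {
	bool paused;          /* models event->aux_paused */
	bool in_pause_resume; /* models rb->aux_in_pause_resume */
	bool nmi_pending;     /* illustration hook: fire one nested request */
	bool nmi_wants_pause; /* what that nested request asks for */
	int stops;            /* times ->stop(PERF_EF_PAUSE) would be called */
	int starts;           /* times ->start(PERF_EF_RESUME) would be called */
	int dropped;          /* nested requests that lost to the guard */
};

void model_aux_pause(struct model_event *ev, bool pause);

/* Stands in for an NMI landing while the PMU callback is running. */
static void model_nmi(struct model_event *ev)
{
	if (ev->nmi_pending) {
		ev->nmi_pending = false;
		model_aux_pause(ev, ev->nmi_wants_pause);
	}
}

void model_aux_pause(struct model_event *ev, bool pause)
{
	assert(ev != NULL);

	/* A request that arrives while another is in progress is dropped. */
	if (ev->in_pause_resume) {
		ev->dropped++;
		return;
	}
	ev->in_pause_resume = true;

	/* Only act on a real state change, as __perf_event_aux_pause() does. */
	if (pause && !ev->paused) {
		ev->paused = true;
		ev->stops++;
		model_nmi(ev);
	} else if (!pause && ev->paused) {
		ev->paused = false;
		ev->starts++;
		model_nmi(ev);
	}

	ev->in_pause_resume = false;
}

/* Start paused, resume twice, then pause with a nested resume request. */
struct model_event model_demo(void)
{
	struct model_event ev = { .paused = true };

	model_aux_pause(&ev, false); /* resume */
	model_aux_pause(&ev, false); /* already running, nothing to do */

	ev.nmi_pending = true;
	ev.nmi_wants_pause = false;
	model_aux_pause(&ev, true);  /* pause; the nested resume is dropped */
	return ev;
}
```

In the model, the nested resume is lost and tracing stays paused, which
matches the "NMI loses here" comment in the patch.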


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-20 16:16       ` Adrian Hunter
@ 2023-12-21 10:05         ` James Clark
  -1 siblings, 0 replies; 31+ messages in thread
From: James Clark @ 2023-12-21 10:05 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users



On 20/12/2023 16:16, Adrian Hunter wrote:
> On 20/12/23 17:54, James Clark wrote:
>>
>>
>> On 08/12/2023 17:24, Adrian Hunter wrote:
>>> Hardware traces, such as instruction traces, can produce a vast amount of
>>> trace data, so being able to reduce tracing to more specific circumstances
>>> can be useful.
>>>
>>> The ability to pause or resume tracing when another event happens, can do
>>> that.
>>>
>>> Add ability for an event to "pause" or "resume" AUX area tracing.
>>>
>>> Add aux_pause bit to perf_event_attr to indicate that, if the event
>>> happens, the associated AUX area tracing should be paused. Ditto
>>> aux_resume. Do not allow aux_pause and aux_resume to be set together.
>>>
>>> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
>>> event that it should start in a "paused" state.
>>>
>>> Add aux_paused to struct perf_event for AUX area events to keep track of
>>> the "paused" state. aux_paused is initialized to aux_start_paused.
>>>
>>> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
>>> callbacks. Call as needed, during __perf_event_output(). Add
>>> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
>>> handler. Pause/resume in NMI context will miss out if it coincides with
>>> another pause/resume.
>>>
>>> To use aux_pause or aux_resume, an event must be in a group with the AUX
>>> area event as the group leader.
>>>
>>> Example (requires Intel PT and tools patches also):
>>>
>>>  $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
>>
>> I think it might be useful to have an aux-toggle option as well, and
>> then you could do sampling if you put it on a PMU counter with an
>> interval. Unless you can make two events for the same counter with
>> different intervals, and one does resume and the other does pause? I'm
>> not sure if that would work?
> 
> There is already ->snapshot_aux() for sampling. Is that what you mean?
> 

I suppose that mostly handles that use case, yes, although there are some
slight differences. It looks like for SAMPLE_AUX, the buffer size for
each sample is fixed and limited to 16 bits in size, whereas between
pause and resume you could potentially have multiple buffers of any
size delivered to userspace.

And it looks like SAMPLE_AUX would leave the trace running even when no
samples were being collected. I suppose you might not want to consume
the memory bandwidth, and would rather turn the trace off between
samples, which pause/resume does, especially if you intend to have long
periods between the samples.

I think if it did turn out to be useful the toggle function can easily
be added later, so I don't intend this comment to be a blocking one.
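
To make the toggle idea concrete, here is a rough user-space model. It is a
sketch only: AUX_ACTION_TOGGLE is hypothetical and not part of the posted
patches, and traced_ticks() with its "tick" time unit is invented purely for
illustration. It counts how long tracing would be on if one periodic event
applied a given action every "interval" ticks.

```c
#include <assert.h>
#include <stdbool.h>

enum aux_action {
	AUX_ACTION_PAUSE,
	AUX_ACTION_RESUME,
	AUX_ACTION_TOGGLE, /* hypothetical: not in the posted patches */
};

/*
 * Count the ticks during which tracing is on, when a single periodic
 * event applies 'action' at the end of every 'interval'-th tick.
 */
unsigned long traced_ticks(bool start_paused, enum aux_action action,
			   unsigned long interval, unsigned long total)
{
	bool paused = start_paused;
	unsigned long traced = 0;
	unsigned long tick;

	assert(interval > 0);

	for (tick = 1; tick <= total; tick++) {
		if (!paused)
			traced++;
		if (tick % interval)
			continue;
		switch (action) {
		case AUX_ACTION_PAUSE:
			paused = true;
			break;
		case AUX_ACTION_RESUME:
			paused = false;
			break;
		case AUX_ACTION_TOGGLE:
			paused = !paused;
			break;
		}
	}
	return traced;
}
```

With a single event, pause or resume alone can only change the state once,
whereas a toggle alternates between on and off windows of equal length.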


>>
>> Other than that it looks ok. I got Coresight working with a couple of
>> changes to what you posted on here, but that can always be done more
>> thoroughly later if we leave PERF_PMU_CAP_AUX_PAUSE off Coresight for now.
> 
> Thanks a lot for looking at this!
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V3 4/4] coresight: Have a stab at support for pause / resume
  2023-12-20 15:59       ` James Clark
@ 2024-01-05 12:56         ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2024-01-05 12:56 UTC (permalink / raw)
  To: James Clark
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Suzuki K Poulose, Mike Leach,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users, Peter Zijlstra

On 20/12/23 17:59, James Clark wrote:
> 
> 
> On 15/12/2023 06:42, Adrian Hunter wrote:
>> For discussion only, un-tested...
>>
> 
> If anyone wants to test Coresight, the diff below is required to get the
> most basic use case working. It also probably needs more thought and
> some edge case handling:

Makes sense to me

> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 596c01e37624..bd0767356277 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -556,7 +556,8 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	struct etm_event_data *event_data;
>  	struct list_head *path;
>  
> -	if (mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed))
> +	if ((mode & PERF_EF_PAUSE && !READ_ONCE(ctxt->pr_allowed)) ||
> +	    event->hw.state == PERF_HES_STOPPED)
>  		return;
>  
>  	WRITE_ONCE(ctxt->pr_allowed, 0);
> @@ -573,9 +574,6 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	/* Clear the event_data as this ETM is stopping the trace. */
>  	ctxt->event_data = NULL;
>  
> -	if (event->hw.state == PERF_HES_STOPPED)
> -		goto out_pr_allowed;
> -
>  	/* We must have a valid event_data for a running event */
>  	if (WARN_ON(!event_data))
>  		return;
> @@ -586,7 +584,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	 * nothing needs to be torn down other than outputting a
>  	 * zero sized record.
>  	 */
> -	if (handle->event && (mode & PERF_EF_UPDATE) &&
> +	if (handle->event && (mode & (PERF_EF_UPDATE | PERF_EF_PAUSE)) &&
>  	    !cpumask_test_cpu(cpu, &event_data->mask)) {
>  		event->hw.state = PERF_HES_STOPPED;
>  		perf_aux_output_end(handle, 0);
> @@ -616,7 +614,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	 * handle due to lack of buffer space), we don't
>  	 * have to do anything here.
>  	 */
> -	if (handle->event && (mode & PERF_EF_UPDATE)) {
> +	if (handle->event && (mode & (PERF_EF_UPDATE | PERF_EF_PAUSE))) {
>  		if (WARN_ON_ONCE(handle->event != event))
>  			return;
>  
> @@ -646,7 +644,6 @@ static void etm_event_stop(struct perf_event *event, int mode)
>  	/* Disabling the path make its elements available to other sessions */
>  	coresight_disable_path(path);
>  
> -out_pr_allowed:
>  	if (mode & PERF_EF_PAUSE)
>  		WRITE_ONCE(ctxt->pr_allowed, 1);
>  }
> @@ -656,7 +653,7 @@ static int etm_event_add(struct perf_event *event, int mode)
>  	int ret = 0;
>  	struct hw_perf_event *hwc = &event->hw;
>  
> -	if (mode & PERF_EF_START && !READ_ONCE(event->aux_paused)) {
> +	if (mode & PERF_EF_START) {
>  		etm_event_start(event, 0);
>  		if (hwc->state & PERF_HES_STOPPED)
>  			ret = -EINVAL;
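
As an aside for readers of the diff, the two mode checks it changes can be
written out with explicit parentheses as below. This is a minimal sketch: the
flag values are the ones from patch 1/4, while the helper names and their
boolean parameters are made up for illustration and do not exist in the
driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in patch 1/4 of this series. */
#define PERF_EF_UPDATE	0x04
#define PERF_EF_PAUSE	0x08
#define PERF_EF_RESUME	0x10

/*
 * Hypothetical helper: true when a ->stop() request should also end the
 * AUX output, i.e. for a normal update and, with the diff, for a pause.
 */
bool stop_ends_aux_output(int mode)
{
	return (mode & (PERF_EF_UPDATE | PERF_EF_PAUSE)) != 0;
}

/*
 * Hypothetical helper mirroring the early return added by the diff.
 * '&' binds tighter than '&&', so the unparenthesised original
 * "mode & PERF_EF_PAUSE && !pr_allowed" already means what is spelled
 * out here.
 */
bool stop_returns_early(int mode, bool pr_allowed, bool already_stopped)
{
	return ((mode & PERF_EF_PAUSE) && !pr_allowed) || already_stopped;
}
```

The second helper shows the effect of moving the PERF_HES_STOPPED test to the
top of etm_event_stop(): an already stopped event returns immediately,
whatever the mode.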


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
  2023-12-20 17:41     ` Suzuki K Poulose
@ 2024-01-05 12:57       ` Adrian Hunter
  -1 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2024-01-05 12:57 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Mike Leach, James Clark,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users, Peter Zijlstra

On 20/12/23 19:41, Suzuki K Poulose wrote:
> On 08/12/2023 17:24, Adrian Hunter wrote:
>> Hardware traces, such as instruction traces, can produce a vast amount of
>> trace data, so being able to reduce tracing to more specific circumstances
>> can be useful.
>>
>> The ability to pause or resume tracing when another event happens, can do
>> that.
>>
>> Add ability for an event to "pause" or "resume" AUX area tracing.
>>
>> Add aux_pause bit to perf_event_attr to indicate that, if the event
>> happens, the associated AUX area tracing should be paused. Ditto
>> aux_resume. Do not allow aux_pause and aux_resume to be set together.
>>
>> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
>> event that it should start in a "paused" state.
>>
>> Add aux_paused to struct perf_event for AUX area events to keep track of
>> the "paused" state. aux_paused is initialized to aux_start_paused.
>>
>> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
>> callbacks. Call as needed, during __perf_event_output(). Add
>> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
>> handler. Pause/resume in NMI context will miss out if it coincides with
>> another pause/resume.
>>
>> To use aux_pause or aux_resume, an event must be in a group with the AUX
>> area event as the group leader.
>>
>> Example (requires Intel PT and tools patches also):
>>
>>   $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
>>   Linux
>>   [ perf record: Woken up 1 times to write data ]
>>   [ perf record: Captured and wrote 0.041 MB perf.data ]
>>   $ perf script --call-trace
>>   uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
>>   uname    5712 [007]    83.855582518:  psb offs: 0
>>   uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>>   uname    5712 [007]    83.855584175: 0x0
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>   include/linux/perf_event.h      | 15 +++++++
>>   include/uapi/linux/perf_event.h | 11 ++++-
>>   kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
>>   kernel/events/internal.h        |  1 +
>>   4 files changed, 95 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index e85cd1c0eaf3..252c4aac3b79 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -291,6 +291,7 @@ struct perf_event_pmu_context;
>>   #define PERF_PMU_CAP_NO_EXCLUDE            0x0040
>>   #define PERF_PMU_CAP_AUX_OUTPUT            0x0080
>>   #define PERF_PMU_CAP_EXTENDED_HW_TYPE        0x0100
>> +#define PERF_PMU_CAP_AUX_PAUSE            0x0200
>>     struct perf_output_handle;
>>   @@ -363,6 +364,8 @@ struct pmu {
>>   #define PERF_EF_START    0x01        /* start the counter when adding    */
>>   #define PERF_EF_RELOAD    0x02        /* reload the counter when starting */
>>   #define PERF_EF_UPDATE    0x04        /* update the counter when stopping */
>> +#define PERF_EF_PAUSE    0x08        /* AUX area event, pause tracing */
>> +#define PERF_EF_RESUME    0x10        /* AUX area event, resume tracing */
>>         /*
>>        * Adds/Removes a counter to/from the PMU, can be done inside a
>> @@ -402,6 +405,15 @@ struct pmu {
>>        *
>>        * ->start() with PERF_EF_RELOAD will reprogram the counter
>>        *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
>> +     *
>> +     * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
>> +     * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
>> +     * PERF_EF_RESUME.
>> +     *
>> +     * ->start() with PERF_EF_RESUME will start as simply as possible but
>> +     * only if the counter is not otherwise stopped. Will not overlap
>> +     * another ->start() with PERF_EF_RESUME nor ->stop() with
>> +     * PERF_EF_PAUSE.
>>        */
>>       void (*start)            (struct perf_event *event, int flags);
>>       void (*stop)            (struct perf_event *event, int flags);
>> @@ -797,6 +809,9 @@ struct perf_event {
>>       /* for aux_output events */
>>       struct perf_event        *aux_event;
>>   +    /* for AUX area events */
>> +    unsigned int            aux_paused;
>> +
>>       void (*destroy)(struct perf_event *);
>>       struct rcu_head            rcu_head;
>>   diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 39c6a250dd1b..437bc2a8d50c 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -507,7 +507,16 @@ struct perf_event_attr {
>>       __u16    sample_max_stack;
>>       __u16    __reserved_2;
>>       __u32    aux_sample_size;
>> -    __u32    __reserved_3;
>> +
>> +    union {
>> +        __u32    aux_output_cfg;
>> +        struct {
>> +            __u64    aux_pause        :  1, /* on overflow, pause AUX area tracing */
> 
> Did you mean __u32? Otherwise looks good to me.

True, thanks!
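
To illustrate why the base type matters, here is a small user-space sketch.
The union names are made up and uint32_t / uint64_t stand in for __u32 /
__u64. A 64-bit bit-field base type is a compiler extension, and the
resulting layout is ABI-dependent; on common 64-bit ABIs such as x86-64 the
"as posted" union is expected to be larger than the 4-byte slot that
__reserved_3 used to occupy, but that is not checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: 64-bit bit-field base type next to a 32-bit member. */
union cfg_as_posted {
	uint32_t aux_output_cfg;
	struct {
		uint64_t aux_pause        :  1,
			 aux_resume       :  1,
			 aux_start_paused :  1,
			 reserved         : 29;
	};
};

/* Layout with the suggested fix: the bit-fields share one 32-bit unit. */
union cfg_with_u32 {
	uint32_t aux_output_cfg;
	struct {
		uint32_t aux_pause        :  1,
			 aux_resume       :  1,
			 aux_start_paused :  1,
			 reserved         : 29;
	};
};

/* The fixed layout must fit exactly where the old __u32 field was. */
_Static_assert(sizeof(union cfg_with_u32) == sizeof(uint32_t),
	       "bit-fields must not grow the 32-bit slot");

/* Extra bytes, if any, that the posted layout takes over the fixed one. */
size_t cfg_extra_bytes(void)
{
	return sizeof(union cfg_as_posted) - sizeof(union cfg_with_u32);
}
```

Any growth reported by cfg_extra_bytes() would shift the fields that follow
in struct perf_event_attr, which is why the bit-fields need to be __u32.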


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused
@ 2024-01-05 12:57       ` Adrian Hunter
  0 siblings, 0 replies; 31+ messages in thread
From: Adrian Hunter @ 2024-01-05 12:57 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Ingo Molnar, Mark Rutland, Alexander Shishkin, Heiko Carstens,
	Thomas Richter, Hendrik Brueckner, Mike Leach, James Clark,
	coresight, linux-arm-kernel, Yicong Yang, Jonathan Cameron,
	Will Deacon, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users, Peter Zijlstra

On 20/12/23 19:41, Suzuki K Poulose wrote:
> On 08/12/2023 17:24, Adrian Hunter wrote:
>> Hardware traces, such as instruction traces, can produce a vast amount of
>> trace data, so being able to reduce tracing to more specific circumstances
>> can be useful.
>>
>> The ability to pause or resume tracing when another event happens, can do
>> that.
>>
>> Add ability for an event to "pause" or "resume" AUX area tracing.
>>
>> Add aux_pause bit to perf_event_attr to indicate that, if the event
>> happens, the associated AUX area tracing should be paused. Ditto
>> aux_resume. Do not allow aux_pause and aux_resume to be set together.
>>
>> Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
>> event that it should start in a "paused" state.
>>
>> Add aux_paused to struct perf_event for AUX area events to keep track of
>> the "paused" state. aux_paused is initialized to aux_start_paused.
>>
>> Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
>> callbacks. Call as needed, during __perf_event_output(). Add
>> aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
>> handler. Pause/resume in NMI context will miss out if it coincides with
>> another pause/resume.
>>
>> To use aux_pause or aux_resume, an event must be in a group with the AUX
>> area event as the group leader.
>>
>> Example (requires Intel PT and tools patches also):
>>
>>   $ perf record --kcore -e '{intel_pt/aux-start-paused/k,syscalls:sys_enter_newuname/aux-resume/,syscalls:sys_exit_newuname/aux-pause/}' uname
>>   Linux
>>   [ perf record: Woken up 1 times to write data ]
>>   [ perf record: Captured and wrote 0.041 MB perf.data ]
>>   $ perf script --call-trace
>>   uname    5712 [007]    83.855580930: name: 0x7ffd9dcebec0
>>   uname    5712 [007]    83.855582518:  psb offs: 0
>>   uname    5712 [007]    83.855582518:  cbr: 42 freq: 4205 MHz (150%)
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    debug_smp_processor_id
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])    __x64_sys_newuname
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])        down_read
>>   uname    5712 [007]    83.855582723: ([kernel.kallsyms])            __cond_resched
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])                in_lock_functions
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_sub
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])        up_read
>>   uname    5712 [007]    83.855582932: ([kernel.kallsyms])            preempt_count_add
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])                in_lock_functions
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])            preempt_count_sub
>>   uname    5712 [007]    83.855583348: ([kernel.kallsyms])        _copy_to_user
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])    syscall_exit_to_user_mode
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])        syscall_exit_work
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])            perf_syscall_exit
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_trace_buf_alloc
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_get_recursion_context
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                perf_tp_event
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_trace_buf_update
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                        tracing_gen_ctx_irq_test
>>   uname    5712 [007]    83.855583557: ([kernel.kallsyms])                    perf_swevent_event
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        __perf_event_account_interrupt
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            __this_cpu_preempt_check
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                        perf_event_output_forward
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                            perf_event_aux_pause
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                ring_buffer_get
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_lock
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    __rcu_read_unlock
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                pt_event_stop
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583765: ([kernel.kallsyms])                                    debug_smp_processor_id
>>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>>   uname    5712 [007]    83.855583973: ([kernel.kallsyms])                                    native_write_msr
>>   uname    5712 [007]    83.855584175: 0x0
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>   include/linux/perf_event.h      | 15 +++++++
>>   include/uapi/linux/perf_event.h | 11 ++++-
>>   kernel/events/core.c            | 72 +++++++++++++++++++++++++++++++--
>>   kernel/events/internal.h        |  1 +
>>   4 files changed, 95 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index e85cd1c0eaf3..252c4aac3b79 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -291,6 +291,7 @@ struct perf_event_pmu_context;
>>   #define PERF_PMU_CAP_NO_EXCLUDE            0x0040
>>   #define PERF_PMU_CAP_AUX_OUTPUT            0x0080
>>   #define PERF_PMU_CAP_EXTENDED_HW_TYPE        0x0100
>> +#define PERF_PMU_CAP_AUX_PAUSE            0x0200
>>
>>   struct perf_output_handle;
>>
>> @@ -363,6 +364,8 @@ struct pmu {
>>   #define PERF_EF_START    0x01        /* start the counter when adding    */
>>   #define PERF_EF_RELOAD    0x02        /* reload the counter when starting */
>>   #define PERF_EF_UPDATE    0x04        /* update the counter when stopping */
>> +#define PERF_EF_PAUSE    0x08        /* AUX area event, pause tracing */
>> +#define PERF_EF_RESUME    0x10        /* AUX area event, resume tracing */
>>
>>       /*
>>        * Adds/Removes a counter to/from the PMU, can be done inside a
>> @@ -402,6 +405,15 @@ struct pmu {
>>        *
>>        * ->start() with PERF_EF_RELOAD will reprogram the counter
>>        *  value, must be preceded by a ->stop() with PERF_EF_UPDATE.
>> +     *
>> +     * ->stop() with PERF_EF_PAUSE will stop as simply as possible. Will not
>> +     * overlap another ->stop() with PERF_EF_PAUSE nor ->start() with
>> +     * PERF_EF_RESUME.
>> +     *
>> +     * ->start() with PERF_EF_RESUME will start as simply as possible but
>> +     * only if the counter is not otherwise stopped. Will not overlap
>> +     * another ->start() with PERF_EF_RESUME nor ->stop() with
>> +     * PERF_EF_PAUSE.
>>        */
>>       void (*start)            (struct perf_event *event, int flags);
>>       void (*stop)            (struct perf_event *event, int flags);
>> @@ -797,6 +809,9 @@ struct perf_event {
>>       /* for aux_output events */
>>       struct perf_event        *aux_event;
>>
>> +    /* for AUX area events */
>> +    unsigned int            aux_paused;
>> +
>>       void (*destroy)(struct perf_event *);
>>       struct rcu_head            rcu_head;
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 39c6a250dd1b..437bc2a8d50c 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -507,7 +507,16 @@ struct perf_event_attr {
>>       __u16    sample_max_stack;
>>       __u16    __reserved_2;
>>       __u32    aux_sample_size;
>> -    __u32    __reserved_3;
>> +
>> +    union {
>> +        __u32    aux_output_cfg;
>> +        struct {
>> +            __u64    aux_pause        :  1, /* on overflow, pause AUX area tracing */
> 
> Did you mean __u32? Otherwise looks good to me.

True, thanks!
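
The constraints stated in the commit message quoted above (aux_pause and aux_resume may not be set together, and an event using either must be in a group led by the AUX area event) can be sketched as a small validation helper. This is a simplified model for illustration, not the kernel's code: struct event_model, its fields, check_aux_cfg() and the choice of -EINVAL are all assumptions made for this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an event and its attributes. */
struct event_model {
	bool aux_pause;          /* pause AUX area tracing when this event fires */
	bool aux_resume;         /* resume AUX area tracing when this event fires */
	bool is_aux_area_event;  /* e.g. an Intel PT event */
	const struct event_model *group_leader; /* NULL if not in a group */
};

/*
 * Returns 0 if the configuration respects the rules from the commit
 * message, or -EINVAL otherwise (the error code is an assumption).
 */
static int check_aux_cfg(const struct event_model *ev)
{
	/* Rule 1: aux_pause and aux_resume must not be set together. */
	if (ev->aux_pause && ev->aux_resume)
		return -EINVAL;

	/* Rule 2: the group leader must be the AUX area event. */
	if (ev->aux_pause || ev->aux_resume) {
		const struct event_model *leader = ev->group_leader;

		if (leader == NULL || !leader->is_aux_area_event)
			return -EINVAL;
	}

	return 0;
}
```

In the perf record example from the commit message, intel_pt would play the role of the group leader, while the sys_enter_newuname and sys_exit_newuname tracepoints would carry aux_resume and aux_pause respectively.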



Thread overview: 31+ messages
2023-12-08 17:24 [PATCH RFC V2 0/4] perf/core: Add ability for an event to "pause" or "resume" AUX area tracing Adrian Hunter
2023-12-08 17:24 ` Adrian Hunter
2023-12-08 17:24 ` [PATCH RFC V2 1/4] perf/core: Add aux_pause, aux_resume, aux_start_paused Adrian Hunter
2023-12-08 17:24   ` Adrian Hunter
2023-12-19 13:42   ` Arnaldo Carvalho de Melo
2023-12-19 13:42     ` Arnaldo Carvalho de Melo
2023-12-20 15:54   ` James Clark
2023-12-20 15:54     ` James Clark
2023-12-20 16:16     ` Adrian Hunter
2023-12-20 16:16       ` Adrian Hunter
2023-12-21 10:05       ` James Clark
2023-12-21 10:05         ` James Clark
2023-12-20 17:41   ` Suzuki K Poulose
2023-12-20 17:41     ` Suzuki K Poulose
2024-01-05 12:57     ` Adrian Hunter
2024-01-05 12:57       ` Adrian Hunter
2023-12-08 17:24 ` [PATCH RFC V2 2/4] perf/x86/intel/pt: Add support for pause / resume Adrian Hunter
2023-12-08 17:24   ` Adrian Hunter
2023-12-08 17:24 ` [PATCH RFC V2 3/4] perf tools: Add support for AUX area " Adrian Hunter
2023-12-08 17:24   ` Adrian Hunter
2023-12-08 17:24 ` [PATCH RFC V2 4/4] coresight: Have a stab at support for " Adrian Hunter
2023-12-08 17:24   ` Adrian Hunter
2023-12-09 17:52   ` kernel test robot
2023-12-15  6:42   ` [PATCH RFC V3 " Adrian Hunter
2023-12-15  6:42     ` Adrian Hunter
2023-12-20 15:59     ` James Clark
2023-12-20 15:59       ` James Clark
2024-01-05 12:56       ` Adrian Hunter
2024-01-05 12:56         ` Adrian Hunter
2023-12-19  6:05 ` [PATCH RFC V2 0/4] perf/core: Add ability for an event to "pause" or "resume" AUX area tracing Adrian Hunter
2023-12-19  6:05   ` Adrian Hunter
