linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/16] perf intel-pt: Sampling improvements
@ 2020-04-01 10:15 Adrian Hunter
  2020-04-01 10:15 ` [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback Adrian Hunter
                   ` (15 more replies)
  0 siblings, 16 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Hi

Here are 3 sampling improvements for Intel PT:

1. Patches 1 to 7
   For reporting purposes, un-group AUX area event
   Please example in patch 7

2. Patches 8 to 11
   Add support for synthesizing callchains for regular events
   Please see example in patch 11

3. Patches 12 to 16
   Add support for leader-sampling with AUX area event
   Please see example in patch 16

Patches also found here:

   git.infradead.org:/srv/git/users/ahunter/linux-perf.git callchain


Adrian Hunter (16):
      perf auxtrace: Add ->evsel_is_auxtrace() callback
      perf intel-pt: Implement ->evsel_is_auxtrace() callback
      perf intel-bts: Implement ->evsel_is_auxtrace() callback
      perf arm-spe: Implement ->evsel_is_auxtrace() callback
      perf cs-etm: Implement ->evsel_is_auxtrace() callback
      perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback
      perf auxtrace: For reporting purposes, un-group AUX area event
      perf auxtrace: Add an option to synthesize callchains for regular events
      perf thread-stack: Add thread_stack__sample_late()
      perf tools: Add support for synthesized sample type
      perf intel-pt: Add support for synthesizing callchains for regular events
      perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()
      perf tools: Move leader-sampling configuration
      perf tools: Rearrange perf_evsel__config_leader_sampling()
      perf tools: Allow multiple read formats
      perf tools: Add support for leader-sampling with AUX area events

 tools/perf/Documentation/itrace.txt    |  1 +
 tools/perf/Documentation/perf-list.txt |  3 ++
 tools/perf/builtin-report.c            |  3 +-
 tools/perf/builtin-script.c            |  2 +-
 tools/perf/util/arm-spe.c              | 10 ++++
 tools/perf/util/auxtrace.c             | 94 +++++++++++++++++++++++++---------
 tools/perf/util/auxtrace.h             | 14 +++++
 tools/perf/util/cs-etm.c               | 11 ++++
 tools/perf/util/evlist.c               |  6 ++-
 tools/perf/util/evsel.c                | 41 +++++++--------
 tools/perf/util/evsel.h                | 18 ++++++-
 tools/perf/util/intel-bts.c            | 10 ++++
 tools/perf/util/intel-pt.c             | 78 +++++++++++++++++++++++++---
 tools/perf/util/record.c               | 62 ++++++++++++++++++++++
 tools/perf/util/s390-cpumcf-kernel.h   |  1 +
 tools/perf/util/s390-cpumsf.c          | 11 +++-
 tools/perf/util/thread-stack.c         | 57 +++++++++++++++++++++
 tools/perf/util/thread-stack.h         |  3 ++
 18 files changed, 367 insertions(+), 58 deletions(-)



Regards
Adrian

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
@ 2020-04-01 10:15 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:15 ` [PATCH 02/16] perf intel-pt: Implement " Adrian Hunter
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter, Kim Phillips,
	Mathieu Poirier, Thomas Richter

Add ->evsel_is_auxtrace() callback to identify if a selected event
is an AUX area event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
---
 tools/perf/util/auxtrace.c |  9 +++++++++
 tools/perf/util/auxtrace.h | 12 ++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 3571ce72ca28..2c4ad6838766 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2577,3 +2577,12 @@ void auxtrace__free(struct perf_session *session)
 
 	return session->auxtrace->free(session);
 }
+
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+				 struct evsel *evsel)
+{
+	if (!session->auxtrace || !session->auxtrace->evsel_is_auxtrace)
+		return false;
+
+	return session->auxtrace->evsel_is_auxtrace(session, evsel);
+}
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index e58ef160b599..db65aae5c2ea 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -21,6 +21,7 @@
 union perf_event;
 struct perf_session;
 struct evlist;
+struct evsel;
 struct perf_tool;
 struct mmap;
 struct perf_sample;
@@ -166,6 +167,8 @@ struct auxtrace {
 			    struct perf_tool *tool);
 	void (*free_events)(struct perf_session *session);
 	void (*free)(struct perf_session *session);
+	bool (*evsel_is_auxtrace)(struct perf_session *session,
+				  struct evsel *evsel);
 };
 
 /**
@@ -584,6 +587,8 @@ void auxtrace__dump_auxtrace_sample(struct perf_session *session,
 int auxtrace__flush_events(struct perf_session *session, struct perf_tool *tool);
 void auxtrace__free_events(struct perf_session *session);
 void auxtrace__free(struct perf_session *session);
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+				 struct evsel *evsel);
 
 #define ITRACE_HELP \
 "				i:	    		synthesize instructions events\n"		\
@@ -749,6 +754,13 @@ void auxtrace_index__free(struct list_head *head __maybe_unused)
 {
 }
 
+static inline
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+				 struct evsel *evsel __maybe_unused)
+{
+	return false;
+}
+
 static inline
 int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
 {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 02/16] perf intel-pt: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
  2020-04-01 10:15 ` [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback Adrian Hunter
@ 2020-04-01 10:15 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 03/16] perf intel-bts: " Adrian Hunter
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:15 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 23c8289c2472..db25c77d82f3 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2715,6 +2715,15 @@ static void intel_pt_free(struct perf_session *session)
 	free(pt);
 }
 
+static bool intel_pt_evsel_is_auxtrace(struct perf_session *session,
+				       struct evsel *evsel)
+{
+	struct intel_pt *pt = container_of(session->auxtrace, struct intel_pt,
+					   auxtrace);
+
+	return evsel->core.attr.type == pt->pmu_type;
+}
+
 static int intel_pt_process_auxtrace_event(struct perf_session *session,
 					   union perf_event *event,
 					   struct perf_tool *tool __maybe_unused)
@@ -3310,6 +3319,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	pt->auxtrace.flush_events = intel_pt_flush;
 	pt->auxtrace.free_events = intel_pt_free_events;
 	pt->auxtrace.free = intel_pt_free;
+	pt->auxtrace.evsel_is_auxtrace = intel_pt_evsel_is_auxtrace;
 	session->auxtrace = &pt->auxtrace;
 
 	if (dump_trace)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 03/16] perf intel-bts: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
  2020-04-01 10:15 ` [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback Adrian Hunter
  2020-04-01 10:15 ` [PATCH 02/16] perf intel-pt: Implement " Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 04/16] perf arm-spe: " Adrian Hunter
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-bts.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 34cb380d19a3..059e1c805ed0 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -728,6 +728,15 @@ static void intel_bts_free(struct perf_session *session)
 	free(bts);
 }
 
+static bool intel_bts_evsel_is_auxtrace(struct perf_session *session,
+					struct evsel *evsel)
+{
+	struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts,
+					     auxtrace);
+
+	return evsel->core.attr.type == bts->pmu_type;
+}
+
 struct intel_bts_synth {
 	struct perf_tool dummy_tool;
 	struct perf_session *session;
@@ -883,6 +892,7 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
 	bts->auxtrace.flush_events = intel_bts_flush;
 	bts->auxtrace.free_events = intel_bts_free_events;
 	bts->auxtrace.free = intel_bts_free;
+	bts->auxtrace.evsel_is_auxtrace = intel_bts_evsel_is_auxtrace;
 	session->auxtrace = &bts->auxtrace;
 
 	intel_bts_print_info(&auxtrace_info->priv[0], INTEL_BTS_PMU_TYPE,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 04/16] perf arm-spe: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (2 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 03/16] perf intel-bts: " Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-02  3:03   ` Leo Yan
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 05/16] perf cs-etm: " Adrian Hunter
                   ` (11 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter, Kim Phillips

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
---
 tools/perf/util/arm-spe.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 53be12b23ff4..b30cc74d0fb4 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -176,6 +176,15 @@ static void arm_spe_free(struct perf_session *session)
 	free(spe);
 }
 
+static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
+				      struct evsel *evsel)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+
+	return evsel->core.attr.type == spe->pmu_type;
+}
+
 static const char * const arm_spe_info_fmts[] = {
 	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
 };
@@ -218,6 +227,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->auxtrace.flush_events = arm_spe_flush;
 	spe->auxtrace.free_events = arm_spe_free_events;
 	spe->auxtrace.free = arm_spe_free;
+	spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
 	session->auxtrace = &spe->auxtrace;
 
 	arm_spe_print_info(&auxtrace_info->priv[0]);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 05/16] perf cs-etm: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (3 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 04/16] perf arm-spe: " Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-01 17:11   ` Mathieu Poirier
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 06/16] perf s390-cpumsf: " Adrian Hunter
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter, Mathieu Poirier

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
---
 tools/perf/util/cs-etm.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 62d2f9b9ce1b..3c802fde4954 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -631,6 +631,16 @@ static void cs_etm__free(struct perf_session *session)
 	zfree(&aux);
 }
 
+static bool cs_etm__evsel_is_auxtrace(struct perf_session *session,
+				      struct evsel *evsel)
+{
+	struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+
+	return evsel->core.attr.type == aux->pmu_type;
+}
+
 static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
 {
 	struct machine *machine;
@@ -2618,6 +2628,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 	etm->auxtrace.flush_events = cs_etm__flush_events;
 	etm->auxtrace.free_events = cs_etm__free_events;
 	etm->auxtrace.free = cs_etm__free;
+	etm->auxtrace.evsel_is_auxtrace = cs_etm__evsel_is_auxtrace;
 	session->auxtrace = &etm->auxtrace;
 
 	etm->unknown_thread = thread__new(999999999, 999999999);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 06/16] perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (4 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 05/16] perf cs-etm: " Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-01 14:10   ` Thomas Richter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 07/16] perf auxtrace: For reporting purposes, un-group AUX area event Adrian Hunter
                   ` (9 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter, Thomas Richter

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
---
 tools/perf/util/s390-cpumcf-kernel.h | 1 +
 tools/perf/util/s390-cpumsf.c        | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/tools/perf/util/s390-cpumcf-kernel.h b/tools/perf/util/s390-cpumcf-kernel.h
index d4356030b504..f55ca07f3ca1 100644
--- a/tools/perf/util/s390-cpumcf-kernel.h
+++ b/tools/perf/util/s390-cpumcf-kernel.h
@@ -11,6 +11,7 @@
 
 #define	S390_CPUMCF_DIAG_DEF	0xfeef	/* Counter diagnostic entry ID */
 #define	PERF_EVENT_CPUM_CF_DIAG	0xBC000	/* Event: Counter sets */
+#define PERF_EVENT_CPUM_SF_DIAG	0xBD000 /* Event: Combined-sampling */
 
 struct cf_ctrset_entry {	/* CPU-M CF counter set entry (8 byte) */
 	unsigned int def:16;	/* 0-15  Data Entry Format */
diff --git a/tools/perf/util/s390-cpumsf.c b/tools/perf/util/s390-cpumsf.c
index 6785cd87aa4d..d7779e48652f 100644
--- a/tools/perf/util/s390-cpumsf.c
+++ b/tools/perf/util/s390-cpumsf.c
@@ -1047,6 +1047,14 @@ static void s390_cpumsf_free(struct perf_session *session)
 	free(sf);
 }
 
+static bool
+s390_cpumsf_evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+			      struct evsel *evsel)
+{
+	return evsel->core.attr.type == PERF_TYPE_RAW &&
+	       evsel->core.attr.config == PERF_EVENT_CPUM_SF_DIAG;
+}
+
 static int s390_cpumsf_get_type(const char *cpuid)
 {
 	int ret, family = 0;
@@ -1142,6 +1150,7 @@ int s390_cpumsf_process_auxtrace_info(union perf_event *event,
 	sf->auxtrace.flush_events = s390_cpumsf_flush;
 	sf->auxtrace.free_events = s390_cpumsf_free_events;
 	sf->auxtrace.free = s390_cpumsf_free;
+	sf->auxtrace.evsel_is_auxtrace = s390_cpumsf_evsel_is_auxtrace;
 	session->auxtrace = &sf->auxtrace;
 
 	if (dump_trace)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 07/16] perf auxtrace: For reporting purposes, un-group AUX area event
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (5 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 06/16] perf s390-cpumsf: " Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 08/16] perf auxtrace: Add an option to synthesize callchains for regular events Adrian Hunter
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

An AUX area event must be the group leader when recording traces in
sample mode, but that does not produce the expected results from
'perf report' because it expects the leader to provide samples. Rather
than teach 'perf report' about AUX area sampling, un-group the AUX
area event during processing, making the 2nd event the leader.

Example:

 $ perf record -e '{intel_pt//u,branch-misses:u}' -c 1 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.080 MB perf.data ]

 Before:

 $ perf report

 Samples: 800  of events 'anon group { intel_pt//u, branch-misses:u }', Event count (approx.): 800
        Children              Self  Command  Shared Object     Symbol
     0.00%  47.50%     0.00%  47.50%  uname    libc-2.28.so      [.] _dl_addr
     0.00%  16.38%     0.00%  16.38%  uname    ld-2.28.so        [.] __GI___tunables_init
     0.00%  54.75%     0.00%   4.75%  uname    ld-2.28.so        [.] dl_main
     0.00%   3.12%     0.00%   3.12%  uname    ld-2.28.so        [.] _dl_map_object_from_fd
     0.00%   2.38%     0.00%   2.38%  uname    ld-2.28.so        [.] strcmp
     0.00%   2.25%     0.00%   2.25%  uname    ld-2.28.so        [.] _dl_check_map_versions
     0.00%   2.00%     0.00%   2.00%  uname    ld-2.28.so        [.] _dl_important_hwcaps
     0.00%   2.00%     0.00%   2.00%  uname    ld-2.28.so        [.] _dl_map_object_deps
     0.00%  51.50%     0.00%   1.50%  uname    ld-2.28.so        [.] _dl_sysdep_start
     0.00%   1.25%     0.00%   1.25%  uname    ld-2.28.so        [.] _dl_load_cache_lookup
     0.00%  51.12%     0.00%   1.12%  uname    ld-2.28.so        [.] _dl_start
     0.00%  50.88%     0.00%   1.12%  uname    ld-2.28.so        [.] do_lookup_x
     0.00%  50.62%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_lookup_symbol_x
     0.00%   1.00%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_map_object
     0.00%   1.00%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_next_ld_env_entry
     0.00%   0.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_cache_libcmp
     0.00%   0.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_new_object
     0.00%  50.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_relocate_object
     0.00%   0.62%     0.00%   0.62%  uname    ld-2.28.so        [.] _dl_init_paths
     0.00%   0.62%     0.00%   0.62%  uname    ld-2.28.so        [.] _dl_name_match_p
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] get_common_indeces.constprop.1
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] memmove
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] memset
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] open_verify.constprop.11
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] _dl_check_all_versions
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] _dl_find_dso_for_object
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] init_tls
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] __tunable_get_val
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_add_to_namespace_list
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_determine_tlsoffset
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_discover_osversion
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] calloc@plt
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] malloc
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] malloc@plt
     0.00%   0.25%     0.00%   0.25%  uname    libc-2.28.so      [.] _nl_load_locale_from_archive
     0.00%   0.25%     0.00%   0.25%  uname    [unknown]         [k] 0xffffffffa3a00010
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] __libc_scratch_buffer_set_array_size
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_allocate_tls_storage
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_catch_exception
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_setup_hash
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_sort_maps
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_sysdep_read_whole_file
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] access
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] calloc
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] mmap64
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] openaux
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] rtld_lock_default_lock_recursive
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] rtld_lock_default_unlock_recursive
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] strchr
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] strlen
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] 0x0000000000001080
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] __strchrnul_avx2
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] _nl_normalize_codeset
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] malloc
     0.00%   0.12%     0.00%   0.12%  uname    [unknown]         [k] 0xffffffffa3a011f0
     0.00%  50.00%     0.00%   0.00%  uname    ld-2.28.so        [.] _dl_start_user
     0.00%  50.00%     0.00%   0.00%  uname    [unknown]         [.] 0000000000000000

 After:

 Samples: 800  of event 'branch-misses:u', Event count (approx.): 800
  Children      Self  Command  Shared Object     Symbol
    54.75%     4.75%  uname    ld-2.28.so        [.] dl_main
    51.50%     1.50%  uname    ld-2.28.so        [.] _dl_sysdep_start
    51.12%     1.12%  uname    ld-2.28.so        [.] _dl_start
    50.88%     0.88%  uname    ld-2.28.so        [.] _dl_relocate_object
    50.88%     1.12%  uname    ld-2.28.so        [.] do_lookup_x
    50.62%     1.00%  uname    ld-2.28.so        [.] _dl_lookup_symbol_x
    50.00%     0.00%  uname    ld-2.28.so        [.] _dl_start_user
    50.00%     0.00%  uname    [unknown]         [.] 0000000000000000
    47.50%    47.50%  uname    libc-2.28.so      [.] _dl_addr
    16.38%    16.38%  uname    ld-2.28.so        [.] __GI___tunables_init
     3.12%     3.12%  uname    ld-2.28.so        [.] _dl_map_object_from_fd
     2.38%     2.38%  uname    ld-2.28.so        [.] strcmp
     2.25%     2.25%  uname    ld-2.28.so        [.] _dl_check_map_versions
     2.00%     2.00%  uname    ld-2.28.so        [.] _dl_important_hwcaps
     2.00%     2.00%  uname    ld-2.28.so        [.] _dl_map_object_deps
     1.25%     1.25%  uname    ld-2.28.so        [.] _dl_load_cache_lookup
     1.00%     1.00%  uname    ld-2.28.so        [.] _dl_map_object
     1.00%     1.00%  uname    ld-2.28.so        [.] _dl_next_ld_env_entry
     0.88%     0.88%  uname    ld-2.28.so        [.] _dl_cache_libcmp
     0.88%     0.88%  uname    ld-2.28.so        [.] _dl_new_object
     0.62%     0.62%  uname    ld-2.28.so        [.] _dl_init_paths
     0.62%     0.62%  uname    ld-2.28.so        [.] _dl_name_match_p
     0.50%     0.50%  uname    ld-2.28.so        [.] get_common_indeces.constprop.1
     0.50%     0.50%  uname    ld-2.28.so        [.] memmove
     0.50%     0.50%  uname    ld-2.28.so        [.] memset
     0.50%     0.50%  uname    ld-2.28.so        [.] open_verify.constprop.11
     0.38%     0.38%  uname    ld-2.28.so        [.] _dl_check_all_versions
     0.38%     0.38%  uname    ld-2.28.so        [.] _dl_find_dso_for_object
     0.38%     0.38%  uname    ld-2.28.so        [.] init_tls
     0.25%     0.25%  uname    ld-2.28.so        [.] __tunable_get_val
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_add_to_namespace_list
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_determine_tlsoffset
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_discover_osversion
     0.25%     0.25%  uname    ld-2.28.so        [.] calloc@plt
     0.25%     0.25%  uname    ld-2.28.so        [.] malloc
     0.25%     0.25%  uname    ld-2.28.so        [.] malloc@plt
     0.25%     0.25%  uname    libc-2.28.so      [.] _nl_load_locale_from_archive
     0.25%     0.25%  uname    [unknown]         [k] 0xffffffffa3a00010
     0.12%     0.12%  uname    ld-2.28.so        [.] __libc_scratch_buffer_set_array_size
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_allocate_tls_storage
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_catch_exception
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_setup_hash
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_sort_maps
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_sysdep_read_whole_file
     0.12%     0.12%  uname    ld-2.28.so        [.] access
     0.12%     0.12%  uname    ld-2.28.so        [.] calloc
     0.12%     0.12%  uname    ld-2.28.so        [.] mmap64
     0.12%     0.12%  uname    ld-2.28.so        [.] openaux
     0.12%     0.12%  uname    ld-2.28.so        [.] rtld_lock_default_lock_recursive
     0.12%     0.12%  uname    ld-2.28.so        [.] rtld_lock_default_unlock_recursive
     0.12%     0.12%  uname    ld-2.28.so        [.] strchr
     0.12%     0.12%  uname    ld-2.28.so        [.] strlen
     0.12%     0.12%  uname    ld-2.28.so        [.] 0x0000000000001080
     0.12%     0.12%  uname    libc-2.28.so      [.] __strchrnul_avx2
     0.12%     0.12%  uname    libc-2.28.so      [.] _nl_normalize_codeset
     0.12%     0.12%  uname    libc-2.28.so      [.] malloc
     0.12%     0.12%  uname    [unknown]         [k] 0xffffffffa3a011f0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 60 ++++++++++++++++++++++++++++++++++----
 1 file changed, 55 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 2c4ad6838766..b60bae8e395c 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1234,29 +1234,79 @@ int perf_event__synthesize_auxtrace_info(struct auxtrace_record *itr,
 	return err;
 }
 
+static void unleader_evsel(struct evlist *evlist, struct evsel *leader)
+{
+	struct evsel *new_leader = NULL;
+	struct evsel *evsel;
+
+	/* Find new leader for the group */
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->leader != leader || evsel == leader)
+			continue;
+		if (!new_leader)
+			new_leader = evsel;
+		evsel->leader = new_leader;
+	}
+
+	/* Update group information */
+	if (new_leader) {
+		zfree(&new_leader->group_name);
+		new_leader->group_name = leader->group_name;
+		leader->group_name = NULL;
+
+		new_leader->core.nr_members = leader->core.nr_members - 1;
+		leader->core.nr_members = 1;
+	}
+}
+
+static void unleader_auxtrace(struct perf_session *session)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(session->evlist, evsel) {
+		if (auxtrace__evsel_is_auxtrace(session, evsel) &&
+		    perf_evsel__is_group_leader(evsel)) {
+			unleader_evsel(session->evlist, evsel);
+		}
+	}
+}
+
 int perf_event__process_auxtrace_info(struct perf_session *session,
 				      union perf_event *event)
 {
 	enum auxtrace_type type = event->auxtrace_info.type;
+	int err;
 
 	if (dump_trace)
 		fprintf(stdout, " type: %u\n", type);
 
 	switch (type) {
 	case PERF_AUXTRACE_INTEL_PT:
-		return intel_pt_process_auxtrace_info(event, session);
+		err = intel_pt_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_INTEL_BTS:
-		return intel_bts_process_auxtrace_info(event, session);
+		err = intel_bts_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_ARM_SPE:
-		return arm_spe_process_auxtrace_info(event, session);
+		err = arm_spe_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_CS_ETM:
-		return cs_etm__process_auxtrace_info(event, session);
+		err = cs_etm__process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_S390_CPUMSF:
-		return s390_cpumsf_process_auxtrace_info(event, session);
+		err = s390_cpumsf_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_UNKNOWN:
 	default:
 		return -EINVAL;
 	}
+
+	if (err)
+		return err;
+
+	unleader_auxtrace(session);
+
+	return 0;
 }
 
 s64 perf_event__process_auxtrace(struct perf_session *session,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 08/16] perf auxtrace: Add an option to synthesize callchains for regular events
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (6 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 07/16] perf auxtrace: For reporting purposes, un-group AUX area event Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 09/16] perf thread-stack: Add thread_stack__sample_late() Adrian Hunter
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Currently, callchains can be synthesized only for synthesized events. Add
an itrace option to synthesize callchains for regular events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/itrace.txt | 1 +
 tools/perf/builtin-report.c         | 3 ++-
 tools/perf/builtin-script.c         | 2 +-
 tools/perf/util/auxtrace.c          | 6 +++++-
 tools/perf/util/auxtrace.h          | 2 ++
 tools/perf/util/s390-cpumsf.c       | 2 +-
 6 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 82ff7dad40c2..671e154ede03 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -10,6 +10,7 @@
 		e	synthesize error events
 		d	create a debug log
 		g	synthesize a call chain (use with i or x)
+		G	synthesize a call chain on existing event records
 		l	synthesize last branch entries (use with i or x)
 		s       skip initial number of events
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 26d8fc27e427..c0cebd53ecf9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -339,6 +339,7 @@ static int report__setup_sample_type(struct report *rep)
 	bool is_pipe = perf_data__is_pipe(session->data);
 
 	if (session->itrace_synth_opts->callchain ||
+	    session->itrace_synth_opts->add_callchain ||
 	    (!is_pipe &&
 	     perf_header__has_feat(&session->header, HEADER_AUXTRACE) &&
 	     !session->itrace_synth_opts->set))
@@ -1332,7 +1333,7 @@ int cmd_report(int argc, const char **argv)
 	if (symbol_conf.cumulate_callchain && !callchain_param.order_set)
 		callchain_param.order = ORDER_CALLER;
 
-	if (itrace_synth_opts.callchain &&
+	if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
 	    (int)itrace_synth_opts.callchain_sz > report.max_stack)
 		report.max_stack = itrace_synth_opts.callchain_sz;
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 186ebf827fa1..a0dc118e987b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3709,7 +3709,7 @@ int cmd_script(int argc, const char **argv)
 		return -1;
 	}
 
-	if (itrace_synth_opts.callchain &&
+	if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
 	    itrace_synth_opts.callchain_sz > scripting_max_stack)
 		scripting_max_stack = itrace_synth_opts.callchain_sz;
 
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index b60bae8e395c..809a09e75c55 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1462,8 +1462,12 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 			synth_opts->branches = true;
 			synth_opts->returns = true;
 			break;
+		case 'G':
 		case 'g':
-			synth_opts->callchain = true;
+			if (p[-1] == 'G')
+				synth_opts->add_callchain = true;
+			else
+				synth_opts->callchain = true;
 			synth_opts->callchain_sz =
 					PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
 			while (*p == ' ' || *p == ',')
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index db65aae5c2ea..dd8a4ff8209e 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -74,6 +74,7 @@ enum itrace_period_type {
  * @calls: limit branch samples to calls (can be combined with @returns)
  * @returns: limit branch samples to returns (can be combined with @calls)
  * @callchain: add callchain to 'instructions' events
+ * @add_callchain: add callchain to existing event records
  * @thread_stack: feed branches to the thread_stack
  * @last_branch: add branch context to 'instruction' events
  * @callchain_sz: maximum callchain size
@@ -101,6 +102,7 @@ struct itrace_synth_opts {
 	bool			calls;
 	bool			returns;
 	bool			callchain;
+	bool			add_callchain;
 	bool			thread_stack;
 	bool			last_branch;
 	unsigned int		callchain_sz;
diff --git a/tools/perf/util/s390-cpumsf.c b/tools/perf/util/s390-cpumsf.c
index d7779e48652f..38a942881d1a 100644
--- a/tools/perf/util/s390-cpumsf.c
+++ b/tools/perf/util/s390-cpumsf.c
@@ -1079,7 +1079,7 @@ static bool check_auxtrace_itrace(struct itrace_synth_opts *itops)
 		itops->pwr_events || itops->errors ||
 		itops->dont_decode || itops->calls || itops->returns ||
 		itops->callchain || itops->thread_stack ||
-		itops->last_branch;
+		itops->last_branch || itops->add_callchain;
 	if (!ison)
 		return true;
 	pr_err("Unsupported --itrace options specified\n");
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 09/16] perf thread-stack: Add thread_stack__sample_late()
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (7 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 08/16] perf auxtrace: Add an option to synthesize callchains for regular events Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Add a thread stack function to create a call chain for hardware events
where the sample records get created some time after the event occurred.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/thread-stack.c | 57 ++++++++++++++++++++++++++++++++++
 tools/perf/util/thread-stack.h |  3 ++
 2 files changed, 60 insertions(+)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 0885967d5bc3..83f6c83f5617 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -497,6 +497,63 @@ void thread_stack__sample(struct thread *thread, int cpu,
 	chain->nr = i;
 }
 
+/*
+ * Hardware sample records, created some time after the event occurred, need to
+ * have subsequent addresses removed from the call chain.
+ */
+void thread_stack__sample_late(struct thread *thread, int cpu,
+			       struct ip_callchain *chain, size_t sz,
+			       u64 sample_ip, u64 kernel_start)
+{
+	struct thread_stack *ts = thread__stack(thread, cpu);
+	u64 sample_context = callchain_context(sample_ip, kernel_start);
+	u64 last_context, context, ip;
+	size_t nr = 0, j;
+
+	if (sz < 2) {
+		chain->nr = 0;
+		return;
+	}
+
+	if (!ts)
+		goto out;
+
+	/*
+	 * When tracing kernel space, kernel addresses occur at the top of the
+	 * call chain after the event occurred but before tracing stopped.
+	 * Skip them.
+	 */
+	for (j = 1; j <= ts->cnt; j++) {
+		ip = ts->stack[ts->cnt - j].ret_addr;
+		context = callchain_context(ip, kernel_start);
+		if (context == PERF_CONTEXT_USER ||
+		    (context == sample_context && ip == sample_ip))
+			break;
+	}
+
+	last_context = sample_ip; /* Use sample_ip as an invalid context */
+
+	for (; nr < sz && j <= ts->cnt; nr++, j++) {
+		ip = ts->stack[ts->cnt - j].ret_addr;
+		context = callchain_context(ip, kernel_start);
+		if (context != last_context) {
+			if (nr >= sz - 1)
+				break;
+			chain->ips[nr++] = context;
+			last_context = context;
+		}
+		chain->ips[nr] = ip;
+	}
+out:
+	if (nr) {
+		chain->nr = nr;
+	} else {
+		chain->ips[0] = sample_context;
+		chain->ips[1] = sample_ip;
+		chain->nr = 2;
+	}
+}
+
 struct call_return_processor *
 call_return_processor__new(int (*process)(struct call_return *cr, u64 *parent_db_id, void *data),
 			   void *data)
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index e1ec5a58f1b2..8962ddc4e1ab 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -85,6 +85,9 @@ int thread_stack__event(struct thread *thread, int cpu, u32 flags, u64 from_ip,
 void thread_stack__set_trace_nr(struct thread *thread, int cpu, u64 trace_nr);
 void thread_stack__sample(struct thread *thread, int cpu, struct ip_callchain *chain,
 			  size_t sz, u64 ip, u64 kernel_start);
+void thread_stack__sample_late(struct thread *thread, int cpu,
+			       struct ip_callchain *chain, size_t sz, u64 ip,
+			       u64 kernel_start);
 int thread_stack__flush(struct thread *thread);
 void thread_stack__free(struct thread *thread);
 size_t thread_stack__depth(struct thread *thread, int cpu);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 10/16] perf tools: Add support for synthesized sample type
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (8 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 09/16] perf thread-stack: Add thread_stack__sample_late() Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-16 14:54   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2020-04-01 10:16 ` [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events Adrian Hunter
                   ` (5 subsequent siblings)
  15 siblings, 3 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

For reporting purposes, an evsel sample can have a callchain synthesized
from AUX area data. Add support for keeping track of synthesized sample
types. Note, the recorded sample_type cannot be changed because it is
needed to continue to parse events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evsel.c |  2 +-
 tools/perf/util/evsel.h | 15 ++++++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index eb880efbce16..60e6cd49dee3 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2136,7 +2136,7 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		}
 	}
 
-	if (evsel__has_callchain(evsel)) {
+	if (type & PERF_SAMPLE_CALLCHAIN) {
 		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
 
 		OVERFLOW_CHECK_u64(array);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 53187c501ee8..e64ed4202cab 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -104,6 +104,14 @@ struct evsel {
 		perf_evsel__sb_cb_t	*cb;
 		void			*data;
 	} side_band;
+	/*
+	 * For reporting purposes, an evsel sample can have a callchain
+	 * synthesized from AUX area data. Keep track of synthesized sample
+	 * types here. Note, the recorded sample_type cannot be changed because
+	 * it is needed to continue to parse events.
+	 * See also evsel__has_callchain().
+	 */
+	__u64			synth_sample_type;
 };
 
 struct perf_missing_features {
@@ -398,7 +406,12 @@ static inline bool perf_evsel__has_branch_hw_idx(const struct evsel *evsel)
 
 static inline bool evsel__has_callchain(const struct evsel *evsel)
 {
-	return (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
+	/*
+	 * For reporting purposes, an evsel sample can have a recorded callchain
+	 * or a callchain synthesized from AUX area data.
+	 */
+	return evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN ||
+	       evsel->synth_sample_type & PERF_SAMPLE_CALLCHAIN;
 }
 
 struct perf_env *perf_evsel__env(struct evsel *evsel);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (9 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-16 15:14   ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() Adrian Hunter
                   ` (4 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Currently, callchains can be synthesized only for synthesized events.
Support also synthesizing callchains for regular events.

Example:

 # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
 Linux
 [ perf record: Woken up 3 times to write data ]
 [ perf record: Captured and wrote 0.532 MB perf.data ]
 # perf script --itrace=Ge | head -20
 uname  4864 2419025.358181:      10000     cycles:
        ffffffffbba56965 apparmor_bprm_committing_creds+0x35 ([kernel.kallsyms])
        ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
        ffffffffbba07422 security_bprm_committing_creds+0x22 ([kernel.kallsyms])
        ffffffffbb89805d install_exec_creds+0xd ([kernel.kallsyms])
        ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])

 uname  4864 2419025.358185:      10000     cycles:
        ffffffffbba56db0 apparmor_bprm_committed_creds+0x20 ([kernel.kallsyms])
        ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
        ffffffffbba07452 security_bprm_committed_creds+0x22 ([kernel.kallsyms])
        ffffffffbb89809a install_exec_creds+0x4a ([kernel.kallsyms])
        ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])

 uname  4864 2419025.358189:      10000     cycles:
        ffffffffbb86fdf6 vma_adjust_trans_huge+0x6 ([kernel.kallsyms])
        ffffffffbb821660 __vma_adjust+0x160 ([kernel.kallsyms])
        ffffffffbb897be7 shift_arg_pages+0x97 ([kernel.kallsyms])
        ffffffffbb897ed9 setup_arg_pages+0x1e9 ([kernel.kallsyms])
        ffffffffbb90d9f2 load_elf_binary+0x3f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 68 ++++++++++++++++++++++++++++++++++----
 1 file changed, 61 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index db25c77d82f3..a659b4a1b3f2 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -124,6 +124,8 @@ struct intel_pt {
 
 	struct range *time_ranges;
 	unsigned int range_cnt;
+
+	struct ip_callchain *chain;
 };
 
 enum switch_state {
@@ -868,6 +870,45 @@ static u64 intel_pt_ns_to_ticks(const struct intel_pt *pt, u64 ns)
 		pt->tc.time_mult;
 }
 
+static struct ip_callchain *intel_pt_alloc_chain(struct intel_pt *pt)
+{
+	size_t sz = sizeof(struct ip_callchain);
+
+	/* Add 1 to callchain_sz for callchain context */
+	sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
+	return zalloc(sz);
+}
+
+static int intel_pt_callchain_init(struct intel_pt *pt)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(pt->session->evlist, evsel) {
+		if (!(evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN))
+			evsel->synth_sample_type |= PERF_SAMPLE_CALLCHAIN;
+	}
+
+	pt->chain = intel_pt_alloc_chain(pt);
+	if (!pt->chain)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void intel_pt_add_callchain(struct intel_pt *pt,
+				   struct perf_sample *sample)
+{
+	struct thread *thread = machine__findnew_thread(pt->machine,
+							sample->pid,
+							sample->tid);
+
+	thread_stack__sample_late(thread, sample->cpu, pt->chain,
+				  pt->synth_opts.callchain_sz + 1, sample->ip,
+				  pt->kernel_start);
+
+	sample->callchain = pt->chain;
+}
+
 static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 						   unsigned int queue_nr)
 {
@@ -880,11 +921,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 		return NULL;
 
 	if (pt->synth_opts.callchain) {
-		size_t sz = sizeof(struct ip_callchain);
-
-		/* Add 1 to callchain_sz for callchain context */
-		sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
-		ptq->chain = zalloc(sz);
+		ptq->chain = intel_pt_alloc_chain(pt);
 		if (!ptq->chain)
 			goto out_free;
 	}
@@ -1992,7 +2029,8 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	if (!(state->type & INTEL_PT_BRANCH))
 		return 0;
 
-	if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
+	if (pt->synth_opts.callchain || pt->synth_opts.add_callchain ||
+	    pt->synth_opts.thread_stack)
 		thread_stack__event(ptq->thread, ptq->cpu, ptq->flags, state->from_ip,
 				    state->to_ip, ptq->insn_len,
 				    state->trace_nr);
@@ -2639,6 +2677,11 @@ static int intel_pt_process_event(struct perf_session *session,
 	if (err)
 		return err;
 
+	if (event->header.type == PERF_RECORD_SAMPLE) {
+		if (pt->synth_opts.add_callchain && !sample->callchain)
+			intel_pt_add_callchain(pt, sample);
+	}
+
 	if (event->header.type == PERF_RECORD_AUX &&
 	    (event->aux.flags & PERF_AUX_FLAG_TRUNCATED) &&
 	    pt->synth_opts.errors) {
@@ -2710,6 +2753,7 @@ static void intel_pt_free(struct perf_session *session)
 	session->auxtrace = NULL;
 	thread__put(pt->unknown_thread);
 	addr_filters__exit(&pt->filts);
+	zfree(&pt->chain);
 	zfree(&pt->filter);
 	zfree(&pt->time_ranges);
 	free(pt);
@@ -3348,6 +3392,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 		    !session->itrace_synth_opts->inject) {
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
+			pt->synth_opts.add_callchain = true;
 		}
 		pt->synth_opts.thread_stack =
 				session->itrace_synth_opts->thread_stack;
@@ -3380,14 +3425,22 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 		pt->branches_filter |= PERF_IP_FLAG_RETURN |
 				       PERF_IP_FLAG_TRACE_BEGIN;
 
-	if (pt->synth_opts.callchain && !symbol_conf.use_callchain) {
+	if ((pt->synth_opts.callchain || pt->synth_opts.add_callchain) &&
+	    !symbol_conf.use_callchain) {
 		symbol_conf.use_callchain = true;
 		if (callchain_register_param(&callchain_param) < 0) {
 			symbol_conf.use_callchain = false;
 			pt->synth_opts.callchain = false;
+			pt->synth_opts.add_callchain = false;
 		}
 	}
 
+	if (pt->synth_opts.add_callchain) {
+		err = intel_pt_callchain_init(pt);
+		if (err)
+			goto err_delete_thread;
+	}
+
 	err = intel_pt_synth_events(pt, session);
 	if (err)
 		goto err_delete_thread;
@@ -3410,6 +3463,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	return 0;
 
 err_delete_thread:
+	zfree(&pt->chain);
 	thread__zput(pt->unknown_thread);
 err_free_queues:
 	intel_pt_log_disable();
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (10 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-18 11:50   ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 13/16] perf tools: Move leader-sampling configuration Adrian Hunter
                   ` (3 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Move and globalize 2 functions so that they can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/auxtrace.c | 19 -------------------
 tools/perf/util/evsel.c    | 20 ++++++++++++++++++++
 tools/perf/util/evsel.h    |  3 +++
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 809a09e75c55..33ad33378a90 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -58,25 +58,6 @@
 #include "symbol/kallsyms.h"
 #include <internal/lib.h>
 
-static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
-{
-	struct perf_pmu *pmu = NULL;
-
-	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
-		if (pmu->type == evsel->core.attr.type)
-			break;
-	}
-
-	return pmu;
-}
-
-static bool perf_evsel__is_aux_event(struct evsel *evsel)
-{
-	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
-
-	return pmu && pmu->auxtrace;
-}
-
 /*
  * Make a group from 'leader' to 'last', requiring that the events were not
  * already grouped to a different leader.
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 60e6cd49dee3..d4ab073c9fe7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -44,6 +44,7 @@
 #include "stat.h"
 #include "string2.h"
 #include "memswap.h"
+#include "pmu.h"
 #include "util.h"
 #include "../perf-sys.h"
 #include "util/parse-branch-options.h"
@@ -100,6 +101,25 @@ int perf_evsel__object_config(size_t object_size,
 	return 0;
 }
 
+struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = NULL;
+
+	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+		if (pmu->type == evsel->core.attr.type)
+			break;
+	}
+
+	return pmu;
+}
+
+bool perf_evsel__is_aux_event(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
+
+	return pmu && pmu->auxtrace;
+}
+
 #define FD(e, x, y) (*(int *)xyarray__entry(e->core.fd, x, y))
 
 int __perf_evsel__sample_size(u64 sample_type)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e64ed4202cab..a463bc65b001 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -158,6 +158,9 @@ int perf_evsel__object_config(size_t object_size,
 			      int (*init)(struct evsel *evsel),
 			      void (*fini)(struct evsel *evsel));
 
+struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel);
+bool perf_evsel__is_aux_event(struct evsel *evsel);
+
 struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
 
 static inline struct evsel *evsel__new(struct perf_event_attr *attr)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 13/16] perf tools: Move leader-sampling configuration
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (11 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-16 15:29   ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evlist: " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 14/16] perf tools: Rearrange perf_evsel__config_leader_sampling() Adrian Hunter
                   ` (2 subsequent siblings)
  15 siblings, 2 replies; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Move leader-sampling configuration in preparation for adding support for
leader sampling with AUX area events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evsel.c  | 19 -------------------
 tools/perf/util/record.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d4ab073c9fe7..8ddcb95396ac 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1022,25 +1022,6 @@ void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
 		}
 	}
 
-	/*
-	 * Disable sampling for all group members other
-	 * than leader in case leader 'leads' the sampling.
-	 */
-	if ((leader != evsel) && leader->sample_read) {
-		attr->freq           = 0;
-		attr->sample_freq    = 0;
-		attr->sample_period  = 0;
-		attr->write_backward = 0;
-
-		/*
-		 * We don't get sample for slave events, we make them
-		 * when delivering group leader sample. Set the slave
-		 * event to follow the master sample_type to ease up
-		 * report.
-		 */
-		attr->sample_type = leader->core.attr.sample_type;
-	}
-
 	if (opts->no_samples)
 		attr->sample_freq = 0;
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 7def66168503..ce383fc1bbbc 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -167,6 +167,31 @@ bool perf_can_aux_sample(void)
 	return true;
 }
 
+static void perf_evsel__config_leader_sampling(struct evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->core.attr;
+	struct evsel *leader = evsel->leader;
+
+	/*
+	 * Disable sampling for all group members other
+	 * than leader in case leader 'leads' the sampling.
+	 */
+	if (leader != evsel && leader->sample_read) {
+		attr->freq           = 0;
+		attr->sample_freq    = 0;
+		attr->sample_period  = 0;
+		attr->write_backward = 0;
+
+		/*
+		 * We don't get sample for slave events, we make them
+		 * when delivering group leader sample. Set the slave
+		 * event to follow the master sample_type to ease up
+		 * report.
+		 */
+		attr->sample_type = leader->core.attr.sample_type;
+	}
+}
+
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			 struct callchain_param *callchain)
 {
@@ -193,6 +218,10 @@ void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			evsel->core.attr.comm_exec = 1;
 	}
 
+	/* Configure leader sampling here now that the sample type is known */
+	evlist__for_each_entry(evlist, evsel)
+		perf_evsel__config_leader_sampling(evsel, evlist);
+
 	if (opts->full_auxtrace) {
 		/*
 		 * Need to be able to synthesize and parse selected events with
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 14/16] perf tools: Rearrange perf_evsel__config_leader_sampling()
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (12 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 13/16] perf tools: Move leader-sampling configuration Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 15/16] perf tools: Allow multiple read formats Adrian Hunter
  2020-04-01 10:16 ` [PATCH 16/16] perf tools: Add support for leader-sampling with AUX area events Adrian Hunter
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

In preparation for adding support for leader sampling with AUX area events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/record.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index ce383fc1bbbc..924c58b3fc36 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -172,24 +172,24 @@ static void perf_evsel__config_leader_sampling(struct evsel *evsel)
 	struct perf_event_attr *attr = &evsel->core.attr;
 	struct evsel *leader = evsel->leader;
 
+	if (leader == evsel || !leader->sample_read)
+		return;
+
 	/*
 	 * Disable sampling for all group members other
 	 * than leader in case leader 'leads' the sampling.
 	 */
-	if (leader != evsel && leader->sample_read) {
-		attr->freq           = 0;
-		attr->sample_freq    = 0;
-		attr->sample_period  = 0;
-		attr->write_backward = 0;
+	attr->freq           = 0;
+	attr->sample_freq    = 0;
+	attr->sample_period  = 0;
+	attr->write_backward = 0;
 
-		/*
-		 * We don't get sample for slave events, we make them
-		 * when delivering group leader sample. Set the slave
-		 * event to follow the master sample_type to ease up
-		 * report.
-		 */
-		attr->sample_type = leader->core.attr.sample_type;
-	}
+	/*
+	 * We don't get a sample for slave events, we make them when delivering
+	 * the group leader sample. Set the slave event to follow the master
+	 * sample_type to ease up reporting.
+	 */
+	attr->sample_type = leader->core.attr.sample_type;
 }
 
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 15/16] perf tools: Allow multiple read formats
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (13 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 14/16] perf tools: Rearrange perf_evsel__config_leader_sampling() Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] perf evlist: " tip-bot2 for Adrian Hunter
  2020-04-01 10:16 ` [PATCH 16/16] perf tools: Add support for leader-sampling with AUX area events Adrian Hunter
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

Tools find the correct evsel, and therefore read format, using the event
ID, so it isn't necessary for all read formats to be the same. In the case
of leader-sampling of AUX area events, dummy tracking events will have
a different read format, so relax the validation to become a debug message
only.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evlist.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1548237b6558..82d9f9bb8975 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1131,8 +1131,10 @@ bool perf_evlist__valid_read_format(struct evlist *evlist)
 	u64 sample_type = first->core.attr.sample_type;
 
 	evlist__for_each_entry(evlist, pos) {
-		if (read_format != pos->core.attr.read_format)
-			return false;
+		if (read_format != pos->core.attr.read_format) {
+			pr_debug("Read format differs %#" PRIx64 " vs %#" PRIx64 "\n",
+				 read_format, (u64)pos->core.attr.read_format);
+		}
 	}
 
 	/* PERF_SAMPLE_READ imples PERF_FORMAT_ID. */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 16/16] perf tools: Add support for leader-sampling with AUX area events
  2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
                   ` (14 preceding siblings ...)
  2020-04-01 10:16 ` [PATCH 15/16] perf tools: Allow multiple read formats Adrian Hunter
@ 2020-04-01 10:16 ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  15 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-01 10:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Adrian Hunter

When AUX area events are used in sampling mode, they must be the group
leader, but the group leader is also used for leader-sampling. However,
it is not desirable to use an AUX area event as the leader for
leader-sampling, because it doesn't have any samples of its own. To support
leader-sampling with AUX area events, use the 2nd event of the group as the
"leader" for the purposes of leader-sampling.

Example:

 # perf record --kcore --aux-sample -e '{intel_pt//,cycles,instructions}:S' -c 10000 uname
 [ perf record: Woken up 3 times to write data ]
 [ perf record: Captured and wrote 0.786 MB perf.data ]
 # perf report
 Samples: 380  of events 'anon group { cycles, instructions }', Event count (approx.): 3026164
           Children              Self  Command  Shared Object      Symbol
 +   38.76%  42.65%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
 +   35.82%  31.33%     0.00%   0.00%  uname    ld-2.28.so         [.] _dl_start_user
 +   34.29%  29.74%     0.55%   0.47%  uname    ld-2.28.so         [.] _dl_start
 +   33.73%  28.62%     1.60%   0.97%  uname    ld-2.28.so         [.] dl_main
 +   33.19%  29.04%     0.52%   0.32%  uname    ld-2.28.so         [.] _dl_sysdep_start
 +   27.83%  33.74%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] do_syscall_64
 +   26.76%  33.29%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
 +   23.78%  20.33%     5.97%   5.25%  uname    [kernel.kallsyms]  [k] page_fault
 +   23.18%  24.60%     0.00%   0.00%  uname    libc-2.28.so       [.] __libc_start_main
 +   22.64%  24.37%     0.00%   0.00%  uname    uname              [.] _start
 +   21.04%  23.27%     0.00%   0.00%  uname    uname              [.] main
 +   19.48%  18.08%     3.72%   3.64%  uname    ld-2.28.so         [.] _dl_relocate_object
 +   19.47%  21.81%     0.00%   0.00%  uname    libc-2.28.so       [.] setlocale
 +   19.44%  21.56%     0.52%   0.61%  uname    libc-2.28.so       [.] _nl_find_locale
 +   17.87%  19.66%     0.00%   0.00%  uname    libc-2.28.so       [.] _nl_load_locale_from_archive
 +   15.71%  13.73%     0.53%   0.52%  uname    [kernel.kallsyms]  [k] do_page_fault
 +   15.18%  13.21%     1.03%   0.68%  uname    [kernel.kallsyms]  [k] handle_mm_fault
 +   14.15%  12.53%     1.01%   1.12%  uname    [kernel.kallsyms]  [k] __handle_mm_fault
 +   12.03%   9.67%     0.54%   0.32%  uname    ld-2.28.so         [.] _dl_map_object
 +   10.55%   8.48%     0.00%   0.00%  uname    ld-2.28.so         [.] openaux
 +   10.55%  20.20%     0.52%   0.61%  uname    libc-2.28.so       [.] __run_exit_handlers

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-list.txt |  3 ++
 tools/perf/util/record.c               | 43 +++++++++++++++++++++++---
 2 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 6345db33c533..cb23667531ab 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -258,6 +258,9 @@ Normally all events in an event group sample, but with :S only
 the first event (the leader) samples, and it only reads the values of the
 other events in the group.
 
+However, in the case AUX area events (e.g. Intel PT or CoreSight), the AUX
+area event must be the leader, so then the second event samples, not the first.
+
 OPTIONS
 -------
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 924c58b3fc36..6d3e3df6e2a1 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -167,17 +167,46 @@ bool perf_can_aux_sample(void)
 	return true;
 }
 
-static void perf_evsel__config_leader_sampling(struct evsel *evsel)
+/*
+ * perf_evsel__config_leader_sampling() uses special rules for leader sampling.
+ * However, if the leader is an AUX area event, then assume the event to sample
+ * is the next event.
+ */
+static struct evsel *perf_evsel__read_sampler(struct evsel *evsel,
+					      struct evlist *evlist)
+{
+	struct evsel *leader = evsel->leader;
+
+	if (perf_evsel__is_aux_event(leader)) {
+		evlist__for_each_entry(evlist, evsel) {
+			if (evsel->leader == leader && evsel != evsel->leader)
+				return evsel;
+		}
+	}
+
+	return leader;
+}
+
+static void perf_evsel__config_leader_sampling(struct evsel *evsel,
+					       struct evlist *evlist)
 {
 	struct perf_event_attr *attr = &evsel->core.attr;
 	struct evsel *leader = evsel->leader;
+	struct evsel *read_sampler;
+
+	if (!leader->sample_read)
+		return;
+
+	read_sampler = perf_evsel__read_sampler(evsel, evlist);
 
-	if (leader == evsel || !leader->sample_read)
+	if (evsel == read_sampler)
 		return;
 
 	/*
-	 * Disable sampling for all group members other
-	 * than leader in case leader 'leads' the sampling.
+	 * Disable sampling for all group members other than the leader in
+	 * case the leader 'leads' the sampling, except when the leader is an
+	 * AUX area event, in which case the 2nd event in the group is the one
+	 * that 'leads' the sampling.
 	 */
 	attr->freq           = 0;
 	attr->sample_freq    = 0;
@@ -188,8 +217,12 @@ static void perf_evsel__config_leader_sampling(struct evsel *evsel)
 	 * We don't get a sample for slave events, we make them when delivering
 	 * the group leader sample. Set the slave event to follow the master
 	 * sample_type to ease up reporting.
+	 * An AUX area event also has sample_type requirements, so also include
+	 * the sample type bits from the leader's sample_type to cover that
+	 * case.
 	 */
-	attr->sample_type = leader->core.attr.sample_type;
+	attr->sample_type = read_sampler->core.attr.sample_type |
+			    leader->core.attr.sample_type;
 }
 
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 06/16] perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 06/16] perf s390-cpumsf: " Adrian Hunter
@ 2020-04-01 14:10   ` Thomas Richter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: Thomas Richter @ 2020-04-01 14:10 UTC (permalink / raw)
  To: Adrian Hunter, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel

On 4/1/20 12:16 PM, Adrian Hunter wrote:
> Implement ->evsel_is_auxtrace() callback.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Thomas Richter <tmricht@linux.ibm.com>
> ---
>  tools/perf/util/s390-cpumcf-kernel.h | 1 +
>  tools/perf/util/s390-cpumsf.c        | 9 +++++++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/tools/perf/util/s390-cpumcf-kernel.h b/tools/perf/util/s390-cpumcf-kernel.h
> index d4356030b504..f55ca07f3ca1 100644
> --- a/tools/perf/util/s390-cpumcf-kernel.h
> +++ b/tools/perf/util/s390-cpumcf-kernel.h
> @@ -11,6 +11,7 @@
>  
>  #define	S390_CPUMCF_DIAG_DEF	0xfeef	/* Counter diagnostic entry ID */
>  #define	PERF_EVENT_CPUM_CF_DIAG	0xBC000	/* Event: Counter sets */
> +#define PERF_EVENT_CPUM_SF_DIAG	0xBD000 /* Event: Combined-sampling */
>  
>  struct cf_ctrset_entry {	/* CPU-M CF counter set entry (8 byte) */
>  	unsigned int def:16;	/* 0-15  Data Entry Format */
> diff --git a/tools/perf/util/s390-cpumsf.c b/tools/perf/util/s390-cpumsf.c
> index 6785cd87aa4d..d7779e48652f 100644
> --- a/tools/perf/util/s390-cpumsf.c
> +++ b/tools/perf/util/s390-cpumsf.c
> @@ -1047,6 +1047,14 @@ static void s390_cpumsf_free(struct perf_session *session)
>  	free(sf);
>  }
>  
> +static bool
> +s390_cpumsf_evsel_is_auxtrace(struct perf_session *session __maybe_unused,
> +			      struct evsel *evsel)
> +{
> +	return evsel->core.attr.type == PERF_TYPE_RAW &&
> +	       evsel->core.attr.config == PERF_EVENT_CPUM_SF_DIAG;
> +}
> +
>  static int s390_cpumsf_get_type(const char *cpuid)
>  {
>  	int ret, family = 0;
> @@ -1142,6 +1150,7 @@ int s390_cpumsf_process_auxtrace_info(union perf_event *event,
>  	sf->auxtrace.flush_events = s390_cpumsf_flush;
>  	sf->auxtrace.free_events = s390_cpumsf_free_events;
>  	sf->auxtrace.free = s390_cpumsf_free;
> +	sf->auxtrace.evsel_is_auxtrace = s390_cpumsf_evsel_is_auxtrace;
>  	session->auxtrace = &sf->auxtrace;
>  
>  	if (dump_trace)
> 

Acked-by: Thomas Richter <tmricht@linux.ibm.com>

-- 
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 05/16] perf cs-etm: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 05/16] perf cs-etm: " Adrian Hunter
@ 2020-04-01 17:11   ` Mathieu Poirier
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: Mathieu Poirier @ 2020-04-01 17:11 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen, linux-kernel

On Wed, Apr 01, 2020 at 01:16:02PM +0300, Adrian Hunter wrote:
> Implement ->evsel_is_auxtrace() callback.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  tools/perf/util/cs-etm.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 62d2f9b9ce1b..3c802fde4954 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -631,6 +631,16 @@ static void cs_etm__free(struct perf_session *session)
>  	zfree(&aux);
>  }
>  
> +static bool cs_etm__evsel_is_auxtrace(struct perf_session *session,
> +				      struct evsel *evsel)
> +{
> +	struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
> +						   struct cs_etm_auxtrace,
> +						   auxtrace);
> +
> +	return evsel->core.attr.type == aux->pmu_type;
> +}
> +
>  static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
>  {
>  	struct machine *machine;
> @@ -2618,6 +2628,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
>  	etm->auxtrace.flush_events = cs_etm__flush_events;
>  	etm->auxtrace.free_events = cs_etm__free_events;
>  	etm->auxtrace.free = cs_etm__free;
> +	etm->auxtrace.evsel_is_auxtrace = cs_etm__evsel_is_auxtrace;
>  	session->auxtrace = &etm->auxtrace;

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>

>  
>  	etm->unknown_thread = thread__new(999999999, 999999999);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 04/16] perf arm-spe: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 04/16] perf arm-spe: " Adrian Hunter
@ 2020-04-02  3:03   ` Leo Yan
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: Leo Yan @ 2020-04-02  3:03 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen, linux-kernel,
	Kim Phillips

On Wed, Apr 01, 2020 at 01:16:01PM +0300, Adrian Hunter wrote:
> Implement ->evsel_is_auxtrace() callback.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Kim Phillips <kim.phillips@arm.com>
> ---
>  tools/perf/util/arm-spe.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 53be12b23ff4..b30cc74d0fb4 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -176,6 +176,15 @@ static void arm_spe_free(struct perf_session *session)
>  	free(spe);
>  }
>  
> +static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
> +				      struct evsel *evsel)
> +{
> +	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
> +					     auxtrace);
> +
> +	return evsel->core.attr.type == spe->pmu_type;
> +}
> +
>  static const char * const arm_spe_info_fmts[] = {
>  	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
>  };
> @@ -218,6 +227,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	spe->auxtrace.flush_events = arm_spe_flush;
>  	spe->auxtrace.free_events = arm_spe_free_events;
>  	spe->auxtrace.free = arm_spe_free;
> +	spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
>  	session->auxtrace = &spe->auxtrace;

Reviewed-by: Leo Yan <leo.yan@linaro.org>

>  	arm_spe_print_info(&auxtrace_info->priv[0]);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 10/16] perf tools: Add support for synthesized sample type
  2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
@ 2020-04-16 14:54   ` Arnaldo Carvalho de Melo
  2020-04-16 14:57     ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: Be consistent when looking which evsel PERF_SAMPLE_ bits are set tip-bot2 for Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: Add support for synthesized sample type tip-bot2 for Adrian Hunter
  2 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-16 14:54 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Wed, Apr 01, 2020 at 01:16:07PM +0300, Adrian Hunter escreveu:
> For reporting purposes, an evsel sample can have a callchain synthesized
> from AUX area data. Add support for keeping track of synthesized sample
> types. Note, the recorded sample_type cannot be changed because it is
> needed to continue to parse events.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/evsel.c |  2 +-
>  tools/perf/util/evsel.h | 15 ++++++++++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index eb880efbce16..60e6cd49dee3 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2136,7 +2136,7 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>  		}
>  	}
>  
> -	if (evsel__has_callchain(evsel)) {
> +	if (type & PERF_SAMPLE_CALLCHAIN) {
>  		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);

This ends up looking unrelated, I had to go and look at the source to
see that this is just a simplification, i.e. earlier in this function
(perf_evsel__parse_sample) we have:

        u64 type = evsel->core.attr.sample_type;

So the above doesn't change anything, good, but slowed reviewing a bit,
please consider next time to have this in a separate patch, I'll do it
this time.

Thanks,

- Arnaldo
  
>  		OVERFLOW_CHECK_u64(array);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 53187c501ee8..e64ed4202cab 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -104,6 +104,14 @@ struct evsel {
>  		perf_evsel__sb_cb_t	*cb;
>  		void			*data;
>  	} side_band;
> +	/*
> +	 * For reporting purposes, an evsel sample can have a callchain
> +	 * synthesized from AUX area data. Keep track of synthesized sample
> +	 * types here. Note, the recorded sample_type cannot be changed because
> +	 * it is needed to continue to parse events.
> +	 * See also evsel__has_callchain().
> +	 */
> +	__u64			synth_sample_type;
>  };
>  
>  struct perf_missing_features {
> @@ -398,7 +406,12 @@ static inline bool perf_evsel__has_branch_hw_idx(const struct evsel *evsel)
>  
>  static inline bool evsel__has_callchain(const struct evsel *evsel)
>  {
> -	return (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
> +	/*
> +	 * For reporting purposes, an evsel sample can have a recorded callchain
> +	 * or a callchain synthesized from AUX area data.
> +	 */
> +	return evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN ||
> +	       evsel->synth_sample_type & PERF_SAMPLE_CALLCHAIN;
>  }
>  
>  struct perf_env *perf_evsel__env(struct evsel *evsel);
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 10/16] perf tools: Add support for synthesized sample type
  2020-04-16 14:54   ` Arnaldo Carvalho de Melo
@ 2020-04-16 14:57     ` Arnaldo Carvalho de Melo
  2020-04-16 15:01       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-16 14:57 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Thu, Apr 16, 2020 at 11:54:26AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Apr 01, 2020 at 01:16:07PM +0300, Adrian Hunter escreveu:
> > For reporting purposes, an evsel sample can have a callchain synthesized
> > from AUX area data. Add support for keeping track of synthesized sample
> > types. Note, the recorded sample_type cannot be changed because it is
> > needed to continue to parse events.
> > 
> > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > ---
> >  tools/perf/util/evsel.c |  2 +-
> >  tools/perf/util/evsel.h | 15 ++++++++++++++-
> >  2 files changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index eb880efbce16..60e6cd49dee3 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -2136,7 +2136,7 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
> >  		}
> >  	}
> >  
> > -	if (evsel__has_callchain(evsel)) {
> > +	if (type & PERF_SAMPLE_CALLCHAIN) {
> >  		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
> 
> This ends up looking unrelated, I had to go and look at the source to
> see that this is just a simplification, i.e. earlier in this function
> (perf_evsel__parse_sample) we have:
> 
>         u64 type = evsel->core.attr.sample_type;
> 
> So the above doesn't change anything, good, but slowed reviewing a bit,
> please consider next time to have this in a separate patch, I'll do it
> this time.

I've added this as the cset comment:

Using 'type' variable for checking for callchains is equivalent to using
evsel__has_callchain(evsel) and is how the other PERF_SAMPLE_ bits are checked
in this function, so use it to be consistent.
 
> Thanks,
> 
> - Arnaldo
>   
> >  		OVERFLOW_CHECK_u64(array);
> > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> > index 53187c501ee8..e64ed4202cab 100644
> > --- a/tools/perf/util/evsel.h
> > +++ b/tools/perf/util/evsel.h
> > @@ -104,6 +104,14 @@ struct evsel {
> >  		perf_evsel__sb_cb_t	*cb;
> >  		void			*data;
> >  	} side_band;
> > +	/*
> > +	 * For reporting purposes, an evsel sample can have a callchain
> > +	 * synthesized from AUX area data. Keep track of synthesized sample
> > +	 * types here. Note, the recorded sample_type cannot be changed because
> > +	 * it is needed to continue to parse events.
> > +	 * See also evsel__has_callchain().
> > +	 */
> > +	__u64			synth_sample_type;
> >  };
> >  
> >  struct perf_missing_features {
> > @@ -398,7 +406,12 @@ static inline bool perf_evsel__has_branch_hw_idx(const struct evsel *evsel)
> >  
> >  static inline bool evsel__has_callchain(const struct evsel *evsel)
> >  {
> > -	return (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
> > +	/*
> > +	 * For reporting purposes, an evsel sample can have a recorded callchain
> > +	 * or a callchain synthesized from AUX area data.
> > +	 */
> > +	return evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN ||
> > +	       evsel->synth_sample_type & PERF_SAMPLE_CALLCHAIN;
> >  }
> >  
> >  struct perf_env *perf_evsel__env(struct evsel *evsel);
> > -- 
> > 2.17.1
> > 
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 10/16] perf tools: Add support for synthesized sample type
  2020-04-16 14:57     ` Arnaldo Carvalho de Melo
@ 2020-04-16 15:01       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-16 15:01 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Thu, Apr 16, 2020 at 11:57:04AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Apr 16, 2020 at 11:54:26AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Apr 01, 2020 at 01:16:07PM +0300, Adrian Hunter escreveu:
> > > +++ b/tools/perf/util/evsel.c
> > > @@ -2136,7 +2136,7 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
> > >  		}
> > >  	}

> > > -	if (evsel__has_callchain(evsel)) {
> > > +	if (type & PERF_SAMPLE_CALLCHAIN) {
> > >  		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);

> > This ends up looking unrelated, I had to go and look at the source to
> > see that this is just a simplification, i.e. earlier in this function
> > (perf_evsel__parse_sample) we have:

> >         u64 type = evsel->core.attr.sample_type;

> > So the above doesn't change anything, good, but slowed reviewing a bit,
> > please consider next time to have this in a separate patch, I'll do it
> > this time.

> I've added this as the cset comment:

> Using 'type' variable for checking for callchains is equivalent to using
> evsel__has_callchain(evsel) and is how the other PERF_SAMPLE_ bits are checked
> in this function, so use it to be consistent.

Then, reading the rest of the patch, evsel__has_callchain() is not
anymore equivalent to just looking at the PERF_SAMPLE_ bit in core.attr
:-)

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-01 10:16 ` [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events Adrian Hunter
@ 2020-04-16 15:14   ` Arnaldo Carvalho de Melo
  2020-04-17 13:50     ` Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
  1 sibling, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-16 15:14 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Namhyung Kim, linux-perf-users

Em Wed, Apr 01, 2020 at 01:16:08PM +0300, Adrian Hunter escreveu:
> Currently, callchains can be synthesized only for synthesized events.
> Support also synthesizing callchains for regular events.

This is super cool, I wonder if we shouldn't do it automatically or just
adding a new type of callchains, i.e.:

	perf record --call-graph pt uname

Should take care of all the details, i.e. do the extra steps below
behind the scenes.

Possibly even find out that the workload specified was built with
-fomit-frame-pointers, that the hardware has Intel PT and do all behind
the scenes for:

	perf record -g uname

Alternatively we could take some less seemingly far fetched approach and
make this configurable via:

	perf config call-graph.record-mode=pt

What do you think?

- Arnaldo
 
> Example:
> 
>  # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
>  Linux
>  [ perf record: Woken up 3 times to write data ]
>  [ perf record: Captured and wrote 0.532 MB perf.data ]
>  # perf script --itrace=Ge | head -20
>  uname  4864 2419025.358181:      10000     cycles:
>         ffffffffbba56965 apparmor_bprm_committing_creds+0x35 ([kernel.kallsyms])
>         ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
>         ffffffffbba07422 security_bprm_committing_creds+0x22 ([kernel.kallsyms])
>         ffffffffbb89805d install_exec_creds+0xd ([kernel.kallsyms])
>         ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
> 
>  uname  4864 2419025.358185:      10000     cycles:
>         ffffffffbba56db0 apparmor_bprm_committed_creds+0x20 ([kernel.kallsyms])
>         ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
>         ffffffffbba07452 security_bprm_committed_creds+0x22 ([kernel.kallsyms])
>         ffffffffbb89809a install_exec_creds+0x4a ([kernel.kallsyms])
>         ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
> 
>  uname  4864 2419025.358189:      10000     cycles:
>         ffffffffbb86fdf6 vma_adjust_trans_huge+0x6 ([kernel.kallsyms])
>         ffffffffbb821660 __vma_adjust+0x160 ([kernel.kallsyms])
>         ffffffffbb897be7 shift_arg_pages+0x97 ([kernel.kallsyms])
>         ffffffffbb897ed9 setup_arg_pages+0x1e9 ([kernel.kallsyms])
>         ffffffffbb90d9f2 load_elf_binary+0x3f2 ([kernel.kallsyms])
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/intel-pt.c | 68 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
> index db25c77d82f3..a659b4a1b3f2 100644
> --- a/tools/perf/util/intel-pt.c
> +++ b/tools/perf/util/intel-pt.c
> @@ -124,6 +124,8 @@ struct intel_pt {
>  
>  	struct range *time_ranges;
>  	unsigned int range_cnt;
> +
> +	struct ip_callchain *chain;
>  };
>  
>  enum switch_state {
> @@ -868,6 +870,45 @@ static u64 intel_pt_ns_to_ticks(const struct intel_pt *pt, u64 ns)
>  		pt->tc.time_mult;
>  }
>  
> +static struct ip_callchain *intel_pt_alloc_chain(struct intel_pt *pt)
> +{
> +	size_t sz = sizeof(struct ip_callchain);
> +
> +	/* Add 1 to callchain_sz for callchain context */
> +	sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
> +	return zalloc(sz);
> +}
> +
> +static int intel_pt_callchain_init(struct intel_pt *pt)
> +{
> +	struct evsel *evsel;
> +
> +	evlist__for_each_entry(pt->session->evlist, evsel) {
> +		if (!(evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN))
> +			evsel->synth_sample_type |= PERF_SAMPLE_CALLCHAIN;
> +	}
> +
> +	pt->chain = intel_pt_alloc_chain(pt);
> +	if (!pt->chain)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static void intel_pt_add_callchain(struct intel_pt *pt,
> +				   struct perf_sample *sample)
> +{
> +	struct thread *thread = machine__findnew_thread(pt->machine,
> +							sample->pid,
> +							sample->tid);
> +
> +	thread_stack__sample_late(thread, sample->cpu, pt->chain,
> +				  pt->synth_opts.callchain_sz + 1, sample->ip,
> +				  pt->kernel_start);
> +
> +	sample->callchain = pt->chain;
> +}
> +
>  static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
>  						   unsigned int queue_nr)
>  {
> @@ -880,11 +921,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
>  		return NULL;
>  
>  	if (pt->synth_opts.callchain) {
> -		size_t sz = sizeof(struct ip_callchain);
> -
> -		/* Add 1 to callchain_sz for callchain context */
> -		sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
> -		ptq->chain = zalloc(sz);
> +		ptq->chain = intel_pt_alloc_chain(pt);
>  		if (!ptq->chain)
>  			goto out_free;
>  	}
> @@ -1992,7 +2029,8 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
>  	if (!(state->type & INTEL_PT_BRANCH))
>  		return 0;
>  
> -	if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
> +	if (pt->synth_opts.callchain || pt->synth_opts.add_callchain ||
> +	    pt->synth_opts.thread_stack)
>  		thread_stack__event(ptq->thread, ptq->cpu, ptq->flags, state->from_ip,
>  				    state->to_ip, ptq->insn_len,
>  				    state->trace_nr);
> @@ -2639,6 +2677,11 @@ static int intel_pt_process_event(struct perf_session *session,
>  	if (err)
>  		return err;
>  
> +	if (event->header.type == PERF_RECORD_SAMPLE) {
> +		if (pt->synth_opts.add_callchain && !sample->callchain)
> +			intel_pt_add_callchain(pt, sample);
> +	}
> +
>  	if (event->header.type == PERF_RECORD_AUX &&
>  	    (event->aux.flags & PERF_AUX_FLAG_TRUNCATED) &&
>  	    pt->synth_opts.errors) {
> @@ -2710,6 +2753,7 @@ static void intel_pt_free(struct perf_session *session)
>  	session->auxtrace = NULL;
>  	thread__put(pt->unknown_thread);
>  	addr_filters__exit(&pt->filts);
> +	zfree(&pt->chain);
>  	zfree(&pt->filter);
>  	zfree(&pt->time_ranges);
>  	free(pt);
> @@ -3348,6 +3392,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
>  		    !session->itrace_synth_opts->inject) {
>  			pt->synth_opts.branches = false;
>  			pt->synth_opts.callchain = true;
> +			pt->synth_opts.add_callchain = true;
>  		}
>  		pt->synth_opts.thread_stack =
>  				session->itrace_synth_opts->thread_stack;
> @@ -3380,14 +3425,22 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
>  		pt->branches_filter |= PERF_IP_FLAG_RETURN |
>  				       PERF_IP_FLAG_TRACE_BEGIN;
>  
> -	if (pt->synth_opts.callchain && !symbol_conf.use_callchain) {
> +	if ((pt->synth_opts.callchain || pt->synth_opts.add_callchain) &&
> +	    !symbol_conf.use_callchain) {
>  		symbol_conf.use_callchain = true;
>  		if (callchain_register_param(&callchain_param) < 0) {
>  			symbol_conf.use_callchain = false;
>  			pt->synth_opts.callchain = false;
> +			pt->synth_opts.add_callchain = false;
>  		}
>  	}
>  
> +	if (pt->synth_opts.add_callchain) {
> +		err = intel_pt_callchain_init(pt);
> +		if (err)
> +			goto err_delete_thread;
> +	}
> +
>  	err = intel_pt_synth_events(pt, session);
>  	if (err)
>  		goto err_delete_thread;
> @@ -3410,6 +3463,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
>  	return 0;
>  
>  err_delete_thread:
> +	zfree(&pt->chain);
>  	thread__zput(pt->unknown_thread);
>  err_free_queues:
>  	intel_pt_log_disable();
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 13/16] perf tools: Move leader-sampling configuration
  2020-04-01 10:16 ` [PATCH 13/16] perf tools: Move leader-sampling configuration Adrian Hunter
@ 2020-04-16 15:29   ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evlist: " tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-16 15:29 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Wed, Apr 01, 2020 at 01:16:10PM +0300, Adrian Hunter escreveu:
> Move leader-sampling configuration in preparation for adding support for
> leader sampling with AUX area events.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/evsel.c  | 19 -------------------
>  tools/perf/util/record.c | 29 +++++++++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d4ab073c9fe7..8ddcb95396ac 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1022,25 +1022,6 @@ void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
>  		}
>  	}
>  
> -	/*
> -	 * Disable sampling for all group members other
> -	 * than leader in case leader 'leads' the sampling.
> -	 */
> -	if ((leader != evsel) && leader->sample_read) {
> -		attr->freq           = 0;
> -		attr->sample_freq    = 0;
> -		attr->sample_period  = 0;
> -		attr->write_backward = 0;
> -
> -		/*
> -		 * We don't get sample for slave events, we make them
> -		 * when delivering group leader sample. Set the slave
> -		 * event to follow the master sample_type to ease up
> -		 * report.
> -		 */
> -		attr->sample_type = leader->core.attr.sample_type;
> -	}
> -
>  	if (opts->no_samples)
>  		attr->sample_freq = 0;
>  
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index 7def66168503..ce383fc1bbbc 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -167,6 +167,31 @@ bool perf_can_aux_sample(void)
>  	return true;
>  }
>  
> +static void perf_evsel__config_leader_sampling(struct evsel *evsel)
> +{
> +	struct perf_event_attr *attr = &evsel->core.attr;
> +	struct evsel *leader = evsel->leader;
> +
> +	/*
> +	 * Disable sampling for all group members other
> +	 * than leader in case leader 'leads' the sampling.
> +	 */
> +	if (leader != evsel && leader->sample_read) {
> +		attr->freq           = 0;
> +		attr->sample_freq    = 0;
> +		attr->sample_period  = 0;
> +		attr->write_backward = 0;
> +
> +		/*
> +		 * We don't get sample for slave events, we make them
> +		 * when delivering group leader sample. Set the slave
> +		 * event to follow the master sample_type to ease up
> +		 * report.
> +		 */
> +		attr->sample_type = leader->core.attr.sample_type;
> +	}
> +}
> +
>  void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
>  			 struct callchain_param *callchain)
>  {
> @@ -193,6 +218,10 @@ void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
>  			evsel->core.attr.comm_exec = 1;
>  	}
>  
> +	/* Configure leader sampling here now that the sample type is known */
> +	evlist__for_each_entry(evlist, evsel)
> +		perf_evsel__config_leader_sampling(evsel, evlist);
> +
>  	if (opts->full_auxtrace) {
>  		/*
>  		 * Need to be able to synthesize and parse selected events with

  INSTALL  trace_plugins
util/record.c: In function ‘perf_evlist__config’:
util/record.c:223:3: error: too many arguments to function ‘perf_evsel__config_leader_sampling’
  223 |   perf_evsel__config_leader_sampling(evsel, evlist);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/record.c:170:13: note: declared here
  170 | static void perf_evsel__config_leader_sampling(struct evsel *evsel)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mv: cannot stat '/tmp/build/perf/util/.record.o.tmp': No such file or directory
make[4]: *** [/home/acme/git/perf/tools/build/Makefile.build:96: /tmp/build/perf/util/record.o] Error 1


----------

I'm removing that evlist arg from here and fixing up the fallout in the
following patches :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-16 15:14   ` Arnaldo Carvalho de Melo
@ 2020-04-17 13:50     ` Adrian Hunter
  2020-04-17 21:37       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: Adrian Hunter @ 2020-04-17 13:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, linux-kernel, Namhyung Kim, linux-perf-users

On 16/04/20 6:14 pm, Arnaldo Carvalho de Melo wrote:
> Em Wed, Apr 01, 2020 at 01:16:08PM +0300, Adrian Hunter escreveu:
>> Currently, callchains can be synthesized only for synthesized events.
>> Support also synthesizing callchains for regular events.
> 
> This is super cool, I wonder if we shouldn't do it automatically or just
> adding a new type of callchains, i.e.:
> 
> 	perf record --call-graph pt uname
> 
> Should take care of all the details, i.e. do the extra steps below
> behind the scenes.
> 
> Possibly even find out that the workload specified was built with
> -fomit-frame-pointers, that the hardware has Intel PT and do all behind
> the scenes for:
> 
> 	perf record -g uname
> 
> Alternatively we could take some less seemingly far fetched approach and
> make this configurable via:
> 
> 	perf config call-graph.record-mode=pt
> 
> What do you think?

Adding a --call-graph option sounds reasonable, and config to define default
callgraph options.  But this was done at Andi Kleen's request, so he may
want to comment.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-17 13:50     ` Adrian Hunter
@ 2020-04-17 21:37       ` Arnaldo Carvalho de Melo
  2020-04-20  3:04         ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-17 21:37 UTC (permalink / raw)
  To: Andi Kleen, Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel, Namhyung Kim,
	linux-perf-users

Em Fri, Apr 17, 2020 at 04:50:00PM +0300, Adrian Hunter escreveu:
> On 16/04/20 6:14 pm, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Apr 01, 2020 at 01:16:08PM +0300, Adrian Hunter escreveu:
> >> Currently, callchains can be synthesized only for synthesized events.
> >> Support also synthesizing callchains for regular events.
> > 
> > This is super cool, I wonder if we shouldn't do it automatically or just
> > adding a new type of callchains, i.e.:
> > 
> > 	perf record --call-graph pt uname
> > 
> > Should take care of all the details, i.e. do the extra steps below
> > behind the scenes.
> > 
> > Possibly even find out that the workload specified was built with
> > -fomit-frame-pointers, that the hardware has Intel PT and do all behind
> > the scenes for:
> > 
> > 	perf record -g uname
> > 
> > Alternatively we could take some less seemingly far fetched approach and
> > make this configurable via:
> > 
> > 	perf config call-graph.record-mode=pt
> > 
> > What do you think?
> 
> Adding a --call-graph option sounds reasonable, and config to define default
> callgraph options.  But this was done at Andi Kleen's request, so he may
> want to comment.

Andi? My concern is that if this is the optimal solution for a good
subset of the machines out there, then we need to make it easy to use,
even transparent, if possible and safe to take that path.

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()
  2020-04-01 10:16 ` [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() Adrian Hunter
@ 2020-04-18 11:50   ` Arnaldo Carvalho de Melo
  2020-04-18 12:04     ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: " tip-bot2 for Adrian Hunter
  1 sibling, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-18 11:50 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Wed, Apr 01, 2020 at 01:16:09PM +0300, Adrian Hunter escreveu:
> Move and globalize 2 functions so that they can be reused.

So this breaks this:

[acme@five perf]$ perf test -v python
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
19: 'import perf' in python                               :
--- start ---
test child forked, pid 168414
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol: perf_pmu__scan
test child finished with -1
---- end ----
'import perf' in python: FAILED!
[acme@five perf]$

I'm trying to fix...
 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/auxtrace.c | 19 -------------------
>  tools/perf/util/evsel.c    | 20 ++++++++++++++++++++
>  tools/perf/util/evsel.h    |  3 +++
>  3 files changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 809a09e75c55..33ad33378a90 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -58,25 +58,6 @@
>  #include "symbol/kallsyms.h"
>  #include <internal/lib.h>
>  
> -static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
> -{
> -	struct perf_pmu *pmu = NULL;
> -
> -	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
> -		if (pmu->type == evsel->core.attr.type)
> -			break;
> -	}
> -
> -	return pmu;
> -}
> -
> -static bool perf_evsel__is_aux_event(struct evsel *evsel)
> -{
> -	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
> -
> -	return pmu && pmu->auxtrace;
> -}
> -
>  /*
>   * Make a group from 'leader' to 'last', requiring that the events were not
>   * already grouped to a different leader.
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 60e6cd49dee3..d4ab073c9fe7 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -44,6 +44,7 @@
>  #include "stat.h"
>  #include "string2.h"
>  #include "memswap.h"
> +#include "pmu.h"
>  #include "util.h"
>  #include "../perf-sys.h"
>  #include "util/parse-branch-options.h"
> @@ -100,6 +101,25 @@ int perf_evsel__object_config(size_t object_size,
>  	return 0;
>  }
>  
> +struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
> +{
> +	struct perf_pmu *pmu = NULL;
> +
> +	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
> +		if (pmu->type == evsel->core.attr.type)
> +			break;
> +	}
> +
> +	return pmu;
> +}
> +
> +bool perf_evsel__is_aux_event(struct evsel *evsel)
> +{
> +	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
> +
> +	return pmu && pmu->auxtrace;
> +}
> +
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->core.fd, x, y))
>  
>  int __perf_evsel__sample_size(u64 sample_type)
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index e64ed4202cab..a463bc65b001 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -158,6 +158,9 @@ int perf_evsel__object_config(size_t object_size,
>  			      int (*init)(struct evsel *evsel),
>  			      void (*fini)(struct evsel *evsel));
>  
> +struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel);
> +bool perf_evsel__is_aux_event(struct evsel *evsel);
> +
>  struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
>  
>  static inline struct evsel *evsel__new(struct perf_event_attr *attr)
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()
  2020-04-18 11:50   ` Arnaldo Carvalho de Melo
@ 2020-04-18 12:04     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-18 12:04 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Sat, Apr 18, 2020 at 08:50:17AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Apr 01, 2020 at 01:16:09PM +0300, Adrian Hunter escreveu:
> > Move and globalize 2 functions so that they can be reused.
> 
> So this breaks this:
> 
> [acme@five perf]$ perf test -v python
> Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
> 19: 'import perf' in python                               :
> --- start ---
> test child forked, pid 168414
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: perf_pmu__scan
> test child finished with -1
> ---- end ----
> 'import perf' in python: FAILED!
> [acme@five perf]$
> 
> I'm trying to fix...

I solved this by moving those functions to tools/perf/util/pmu.c
instead.

Please consider running 'perf test' before sending your patches to me
next time,

Thanks,

- Arnaldo
  
> > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > ---
> >  tools/perf/util/auxtrace.c | 19 -------------------
> >  tools/perf/util/evsel.c    | 20 ++++++++++++++++++++
> >  tools/perf/util/evsel.h    |  3 +++
> >  3 files changed, 23 insertions(+), 19 deletions(-)
> > 
> > diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> > index 809a09e75c55..33ad33378a90 100644
> > --- a/tools/perf/util/auxtrace.c
> > +++ b/tools/perf/util/auxtrace.c
> > @@ -58,25 +58,6 @@
> >  #include "symbol/kallsyms.h"
> >  #include <internal/lib.h>
> >  
> > -static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
> > -{
> > -	struct perf_pmu *pmu = NULL;
> > -
> > -	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
> > -		if (pmu->type == evsel->core.attr.type)
> > -			break;
> > -	}
> > -
> > -	return pmu;
> > -}
> > -
> > -static bool perf_evsel__is_aux_event(struct evsel *evsel)
> > -{
> > -	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
> > -
> > -	return pmu && pmu->auxtrace;
> > -}
> > -
> >  /*
> >   * Make a group from 'leader' to 'last', requiring that the events were not
> >   * already grouped to a different leader.
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 60e6cd49dee3..d4ab073c9fe7 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -44,6 +44,7 @@
> >  #include "stat.h"
> >  #include "string2.h"
> >  #include "memswap.h"
> > +#include "pmu.h"
> >  #include "util.h"
> >  #include "../perf-sys.h"
> >  #include "util/parse-branch-options.h"
> > @@ -100,6 +101,25 @@ int perf_evsel__object_config(size_t object_size,
> >  	return 0;
> >  }
> >  
> > +struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
> > +{
> > +	struct perf_pmu *pmu = NULL;
> > +
> > +	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
> > +		if (pmu->type == evsel->core.attr.type)
> > +			break;
> > +	}
> > +
> > +	return pmu;
> > +}
> > +
> > +bool perf_evsel__is_aux_event(struct evsel *evsel)
> > +{
> > +	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
> > +
> > +	return pmu && pmu->auxtrace;
> > +}
> > +
> >  #define FD(e, x, y) (*(int *)xyarray__entry(e->core.fd, x, y))
> >  
> >  int __perf_evsel__sample_size(u64 sample_type)
> > diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> > index e64ed4202cab..a463bc65b001 100644
> > --- a/tools/perf/util/evsel.h
> > +++ b/tools/perf/util/evsel.h
> > @@ -158,6 +158,9 @@ int perf_evsel__object_config(size_t object_size,
> >  			      int (*init)(struct evsel *evsel),
> >  			      void (*fini)(struct evsel *evsel));
> >  
> > +struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel);
> > +bool perf_evsel__is_aux_event(struct evsel *evsel);
> > +
> >  struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
> >  
> >  static inline struct evsel *evsel__new(struct perf_event_attr *attr)
> > -- 
> > 2.17.1
> > 
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-17 21:37       ` Arnaldo Carvalho de Melo
@ 2020-04-20  3:04         ` Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2020-04-20  3:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Jiri Olsa, linux-kernel, Namhyung Kim, linux-perf-users

> Andi? My concern is that if this is the optimal solution for a good
> subset of the machines out there, then we need to make it easy to use,
> even transparent, if possible and safe to take that path.

I'm not sure it's that great in the general case.  A PT call graph
would need a full PT recording from start to end.

The problem with full PT recording is that it doesn't really work for a
lot of workloads, because for anything doing enough computation the CPU
just generates too much data and you end up with a lot of gaps in the
trace when the perf record flushing cannot keep up.

Also even if it worked you might end up with far too much data that
will take a long time to process.

So I suspect it wouldn't work often enough to be generally useful.

For the leader sample case we just want to use some non PT method
to get the call graph.

-Andi

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evlist: Allow multiple read formats
  2020-04-01 10:16 ` [PATCH 15/16] perf tools: Allow multiple read formats Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     94d3820f2e18d3c88f833baec8d6ad5b3489fa59
Gitweb:        https://git.kernel.org/tip/94d3820f2e18d3c88f833baec8d6ad5b3489fa59
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:12 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Sat, 18 Apr 2020 09:05:00 -03:00

perf evlist: Allow multiple read formats

Tools find the correct evsel, and therefore read format, using the event
ID, so it isn't necessary for all read formats to be the same. In the
case of leader-sampling of AUX area events, dummy tracking events will
have a different read format, so relax the validation to become a debug
message only.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1548237..82d9f9b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1131,8 +1131,10 @@ bool perf_evlist__valid_read_format(struct evlist *evlist)
 	u64 sample_type = first->core.attr.sample_type;
 
 	evlist__for_each_entry(evlist, pos) {
-		if (read_format != pos->core.attr.read_format)
-			return false;
+		if (read_format != pos->core.attr.read_format) {
+			pr_debug("Read format differs %#" PRIx64 " vs %#" PRIx64 "\n",
+				 read_format, (u64)pos->core.attr.read_format);
+		}
 	}
 
 	/* PERF_SAMPLE_READ imples PERF_FORMAT_ID. */

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf tools: Add support for leader-sampling with AUX area events
  2020-04-01 10:16 ` [PATCH 16/16] perf tools: Add support for leader-sampling with AUX area events Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     e345997914a8af5e8362e884d2fee38bd2e9c6d8
Gitweb:        https://git.kernel.org/tip/e345997914a8af5e8362e884d2fee38bd2e9c6d8
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:13 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Sat, 18 Apr 2020 09:05:00 -03:00

perf tools: Add support for leader-sampling with AUX area events

When AUX area events are used in sampling mode, they must be the group
leader, but the group leader is also used for leader-sampling. However,
it is not desirable to use an AUX area event as the leader for
leader-sampling, because it doesn't have any samples of its own. To support
leader-sampling with AUX area events, use the 2nd event of the group as the
"leader" for the purposes of leader-sampling.

Example:

 # perf record --kcore --aux-sample -e '{intel_pt//,cycles,instructions}:S' -c 10000 uname
 [ perf record: Woken up 3 times to write data ]
 [ perf record: Captured and wrote 0.786 MB perf.data ]
 # perf report
 Samples: 380  of events 'anon group { cycles, instructions }', Event count (approx.): 3026164
           Children              Self  Command  Shared Object      Symbol
 +   38.76%  42.65%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
 +   35.82%  31.33%     0.00%   0.00%  uname    ld-2.28.so         [.] _dl_start_user
 +   34.29%  29.74%     0.55%   0.47%  uname    ld-2.28.so         [.] _dl_start
 +   33.73%  28.62%     1.60%   0.97%  uname    ld-2.28.so         [.] dl_main
 +   33.19%  29.04%     0.52%   0.32%  uname    ld-2.28.so         [.] _dl_sysdep_start
 +   27.83%  33.74%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] do_syscall_64
 +   26.76%  33.29%     0.00%   0.00%  uname    [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
 +   23.78%  20.33%     5.97%   5.25%  uname    [kernel.kallsyms]  [k] page_fault
 +   23.18%  24.60%     0.00%   0.00%  uname    libc-2.28.so       [.] __libc_start_main
 +   22.64%  24.37%     0.00%   0.00%  uname    uname              [.] _start
 +   21.04%  23.27%     0.00%   0.00%  uname    uname              [.] main
 +   19.48%  18.08%     3.72%   3.64%  uname    ld-2.28.so         [.] _dl_relocate_object
 +   19.47%  21.81%     0.00%   0.00%  uname    libc-2.28.so       [.] setlocale
 +   19.44%  21.56%     0.52%   0.61%  uname    libc-2.28.so       [.] _nl_find_locale
 +   17.87%  19.66%     0.00%   0.00%  uname    libc-2.28.so       [.] _nl_load_locale_from_archive
 +   15.71%  13.73%     0.53%   0.52%  uname    [kernel.kallsyms]  [k] do_page_fault
 +   15.18%  13.21%     1.03%   0.68%  uname    [kernel.kallsyms]  [k] handle_mm_fault
 +   14.15%  12.53%     1.01%   1.12%  uname    [kernel.kallsyms]  [k] __handle_mm_fault
 +   12.03%   9.67%     0.54%   0.32%  uname    ld-2.28.so         [.] _dl_map_object
 +   10.55%   8.48%     0.00%   0.00%  uname    ld-2.28.so         [.] openaux
 +   10.55%  20.20%     0.52%   0.61%  uname    libc-2.28.so       [.] __run_exit_handlers

Comnmitter notes:

Fixed up this problem:

  util/record.c: In function ‘perf_evlist__config’:
  util/record.c:256:3: error: too few arguments to function ‘perf_evsel__config_leader_sampling’
    256 |   perf_evsel__config_leader_sampling(evsel);
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  util/record.c:190:13: note: declared here
    190 | static void perf_evsel__config_leader_sampling(struct evsel *evsel,
        |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt |  3 ++-
 tools/perf/util/record.c               | 45 +++++++++++++++++++++----
 2 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 6345db3..cb23667 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -258,6 +258,9 @@ Normally all events in an event group sample, but with :S only
 the first event (the leader) samples, and it only reads the values of the
 other events in the group.
 
+However, in the case AUX area events (e.g. Intel PT or CoreSight), the AUX
+area event must be the leader, so then the second event samples, not the first.
+
 OPTIONS
 -------
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 32aeeb8..6d3e3df 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -167,17 +167,46 @@ bool perf_can_aux_sample(void)
 	return true;
 }
 
-static void perf_evsel__config_leader_sampling(struct evsel *evsel)
+/*
+ * perf_evsel__config_leader_sampling() uses special rules for leader sampling.
+ * However, if the leader is an AUX area event, then assume the event to sample
+ * is the next event.
+ */
+static struct evsel *perf_evsel__read_sampler(struct evsel *evsel,
+					      struct evlist *evlist)
+{
+	struct evsel *leader = evsel->leader;
+
+	if (perf_evsel__is_aux_event(leader)) {
+		evlist__for_each_entry(evlist, evsel) {
+			if (evsel->leader == leader && evsel != evsel->leader)
+				return evsel;
+		}
+	}
+
+	return leader;
+}
+
+static void perf_evsel__config_leader_sampling(struct evsel *evsel,
+					       struct evlist *evlist)
 {
 	struct perf_event_attr *attr = &evsel->core.attr;
 	struct evsel *leader = evsel->leader;
+	struct evsel *read_sampler;
+
+	if (!leader->sample_read)
+		return;
+
+	read_sampler = perf_evsel__read_sampler(evsel, evlist);
 
-	if (leader == evsel || !leader->sample_read)
+	if (evsel == read_sampler)
 		return;
 
 	/*
-	 * Disable sampling for all group members other
-	 * than leader in case leader 'leads' the sampling.
+	 * Disable sampling for all group members other than the leader in
+	 * case the leader 'leads' the sampling, except when the leader is an
+	 * AUX area event, in which case the 2nd event in the group is the one
+	 * that 'leads' the sampling.
 	 */
 	attr->freq           = 0;
 	attr->sample_freq    = 0;
@@ -188,8 +217,12 @@ static void perf_evsel__config_leader_sampling(struct evsel *evsel)
 	 * We don't get a sample for slave events, we make them when delivering
 	 * the group leader sample. Set the slave event to follow the master
 	 * sample_type to ease up reporting.
+	 * An AUX area event also has sample_type requirements, so also include
+	 * the sample type bits from the leader's sample_type to cover that
+	 * case.
 	 */
-	attr->sample_type = leader->core.attr.sample_type;
+	attr->sample_type = read_sampler->core.attr.sample_type |
+			    leader->core.attr.sample_type;
 }
 
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
@@ -220,7 +253,7 @@ void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 
 	/* Configure leader sampling here now that the sample type is known */
 	evlist__for_each_entry(evlist, evsel)
-		perf_evsel__config_leader_sampling(evsel);
+		perf_evsel__config_leader_sampling(evsel, evlist);
 
 	if (opts->full_auxtrace) {
 		/*

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evsel: Rearrange perf_evsel__config_leader_sampling()
  2020-04-01 10:16 ` [PATCH 14/16] perf tools: Rearrange perf_evsel__config_leader_sampling() Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     3713eb371c873e6bed713d78f1bdd5e8be0764a3
Gitweb:        https://git.kernel.org/tip/3713eb371c873e6bed713d78f1bdd5e8be0764a3
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:11 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Sat, 18 Apr 2020 09:05:00 -03:00

perf evsel: Rearrange perf_evsel__config_leader_sampling()

In preparation for adding support for leader sampling with AUX area events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/record.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 8870ae4..32aeeb8 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -172,24 +172,24 @@ static void perf_evsel__config_leader_sampling(struct evsel *evsel)
 	struct perf_event_attr *attr = &evsel->core.attr;
 	struct evsel *leader = evsel->leader;
 
+	if (leader == evsel || !leader->sample_read)
+		return;
+
 	/*
 	 * Disable sampling for all group members other
 	 * than leader in case leader 'leads' the sampling.
 	 */
-	if (leader != evsel && leader->sample_read) {
-		attr->freq           = 0;
-		attr->sample_freq    = 0;
-		attr->sample_period  = 0;
-		attr->write_backward = 0;
+	attr->freq           = 0;
+	attr->sample_freq    = 0;
+	attr->sample_period  = 0;
+	attr->write_backward = 0;
 
-		/*
-		 * We don't get sample for slave events, we make them
-		 * when delivering group leader sample. Set the slave
-		 * event to follow the master sample_type to ease up
-		 * report.
-		 */
-		attr->sample_type = leader->core.attr.sample_type;
-	}
+	/*
+	 * We don't get a sample for slave events, we make them when delivering
+	 * the group leader sample. Set the slave event to follow the master
+	 * sample_type to ease up reporting.
+	 */
+	attr->sample_type = leader->core.attr.sample_type;
 }
 
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evlist: Move leader-sampling configuration
  2020-04-01 10:16 ` [PATCH 13/16] perf tools: Move leader-sampling configuration Adrian Hunter
  2020-04-16 15:29   ` Arnaldo Carvalho de Melo
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     5f34278867b78bed77dcbd723056244e9bfc12ef
Gitweb:        https://git.kernel.org/tip/5f34278867b78bed77dcbd723056244e9bfc12ef
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:10 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Sat, 18 Apr 2020 09:05:00 -03:00

perf evlist: Move leader-sampling configuration

Move leader-sampling configuration in preparation for adding support for
leader sampling with AUX area events.

Committer notes:

It only makes sense when configuring an evsel that is part of an evlist,
so the only case where it is called outside perf_evlist__config(), in
some 'perf test' entry, is safe, and even there we should just use
perf_evlist__config(), but since in that case we have just one evsel in
the evlist, it is equivalent.

Also fixed up this problem:

  util/record.c: In function ‘perf_evlist__config’:
  util/record.c:223:3: error: too many arguments to function ‘perf_evsel__config_leader_sampling’
    223 |   perf_evsel__config_leader_sampling(evsel, evlist);
        |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  util/record.c:170:13: note: declared here
    170 | static void perf_evsel__config_leader_sampling(struct evsel *evsel)
        |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c  | 19 -------------------
 tools/perf/util/record.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f320ada..8300e8c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1002,25 +1002,6 @@ void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
 		}
 	}
 
-	/*
-	 * Disable sampling for all group members other
-	 * than leader in case leader 'leads' the sampling.
-	 */
-	if ((leader != evsel) && leader->sample_read) {
-		attr->freq           = 0;
-		attr->sample_freq    = 0;
-		attr->sample_period  = 0;
-		attr->write_backward = 0;
-
-		/*
-		 * We don't get sample for slave events, we make them
-		 * when delivering group leader sample. Set the slave
-		 * event to follow the master sample_type to ease up
-		 * report.
-		 */
-		attr->sample_type = leader->core.attr.sample_type;
-	}
-
 	if (opts->no_samples)
 		attr->sample_freq = 0;
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 7def661..8870ae4 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -167,6 +167,31 @@ bool perf_can_aux_sample(void)
 	return true;
 }
 
+static void perf_evsel__config_leader_sampling(struct evsel *evsel)
+{
+	struct perf_event_attr *attr = &evsel->core.attr;
+	struct evsel *leader = evsel->leader;
+
+	/*
+	 * Disable sampling for all group members other
+	 * than leader in case leader 'leads' the sampling.
+	 */
+	if (leader != evsel && leader->sample_read) {
+		attr->freq           = 0;
+		attr->sample_freq    = 0;
+		attr->sample_period  = 0;
+		attr->write_backward = 0;
+
+		/*
+		 * We don't get sample for slave events, we make them
+		 * when delivering group leader sample. Set the slave
+		 * event to follow the master sample_type to ease up
+		 * report.
+		 */
+		attr->sample_type = leader->core.attr.sample_type;
+	}
+}
+
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			 struct callchain_param *callchain)
 {
@@ -193,6 +218,10 @@ void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
 			evsel->core.attr.comm_exec = 1;
 	}
 
+	/* Configure leader sampling here now that the sample type is known */
+	evlist__for_each_entry(evlist, evsel)
+		perf_evsel__config_leader_sampling(evsel);
+
 	if (opts->full_auxtrace) {
 		/*
 		 * Need to be able to synthesize and parse selected events with

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf intel-pt: Add support for synthesizing callchains for regular events
  2020-04-01 10:16 ` [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events Adrian Hunter
  2020-04-16 15:14   ` Arnaldo Carvalho de Melo
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Andi Kleen, Jiri Olsa,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     2855c05cf14a5ee0d3b58168632acb11ea35721f
Gitweb:        https://git.kernel.org/tip/2855c05cf14a5ee0d3b58168632acb11ea35721f
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:08 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:17 -03:00

perf intel-pt: Add support for synthesizing callchains for regular events

Currently, callchains can be synthesized only for synthesized events.
Support also synthesizing callchains for regular events.

Example:

 # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
 Linux
 [ perf record: Woken up 3 times to write data ]
 [ perf record: Captured and wrote 0.532 MB perf.data ]
 # perf script --itrace=Ge | head -20
 uname  4864 2419025.358181:      10000     cycles:
        ffffffffbba56965 apparmor_bprm_committing_creds+0x35 ([kernel.kallsyms])
        ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
        ffffffffbba07422 security_bprm_committing_creds+0x22 ([kernel.kallsyms])
        ffffffffbb89805d install_exec_creds+0xd ([kernel.kallsyms])
        ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])

 uname  4864 2419025.358185:      10000     cycles:
        ffffffffbba56db0 apparmor_bprm_committed_creds+0x20 ([kernel.kallsyms])
        ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
        ffffffffbba07452 security_bprm_committed_creds+0x22 ([kernel.kallsyms])
        ffffffffbb89809a install_exec_creds+0x4a ([kernel.kallsyms])
        ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])

 uname  4864 2419025.358189:      10000     cycles:
        ffffffffbb86fdf6 vma_adjust_trans_huge+0x6 ([kernel.kallsyms])
        ffffffffbb821660 __vma_adjust+0x160 ([kernel.kallsyms])
        ffffffffbb897be7 shift_arg_pages+0x97 ([kernel.kallsyms])
        ffffffffbb897ed9 setup_arg_pages+0x1e9 ([kernel.kallsyms])
        ffffffffbb90d9f2 load_elf_binary+0x3f2 ([kernel.kallsyms])

Committer testing:

  # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.233 MB perf.data ]
  #

Then, before this patch:

  # perf script --itrace=Ge | head -20
     uname 28642 168664.856384: 10000 cycles: ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
     uname 28642 168664.856388: 10000 cycles: ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
     uname 28642 168664.856392: 10000 cycles: ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
     uname 28642 168664.856396: 10000 cycles: ffffffff982fd4ec __mod_memcg_state+0x1c ([kernel.kallsyms])
     uname 28642 168664.856400: 10000 cycles: ffffffff9829fddd do_mmap+0xfd ([kernel.kallsyms])
     uname 28642 168664.856404: 10000 cycles: ffffffff9829c879 __vma_adjust+0x479 ([kernel.kallsyms])
     uname 28642 168664.856408: 10000 cycles: ffffffff98238e94 __perf_addr_filters_adjust+0x34 ([kernel.kallsyms])
     uname 28642 168664.856412: 10000 cycles: ffffffff98a38e0b down_write+0x1b ([kernel.kallsyms])
     uname 28642 168664.856416: 10000 cycles: ffffffff983006a0 memcg_kmem_get_cache+0x0 ([kernel.kallsyms])
     uname 28642 168664.856421: 10000 cycles: ffffffff98396eaf load_elf_binary+0x92f ([kernel.kallsyms])
     uname 28642 168664.856425: 10000 cycles: ffffffff982e0222 kfree+0x62 ([kernel.kallsyms])
     uname 28642 168664.856428: 10000 cycles: ffffffff9846dfd4 file_has_perm+0x54 ([kernel.kallsyms])
     uname 28642 168664.856433: 10000 cycles: ffffffff98288911 vma_interval_tree_insert+0x51 ([kernel.kallsyms])
     uname 28642 168664.856437: 10000 cycles: ffffffff9823e577 perf_event_mmap_output+0x27 ([kernel.kallsyms])
     uname 28642 168664.856441: 10000 cycles: ffffffff98a26fa0 xas_load+0x40 ([kernel.kallsyms])
     uname 28642 168664.856445: 10000 cycles: ffffffff98004f30 arch_setup_additional_pages+0x0 ([kernel.kallsyms])
     uname 28642 168664.856448: 10000 cycles: ffffffff98a297c0 copy_user_generic_unrolled+0xa0 ([kernel.kallsyms])
     uname 28642 168664.856452: 10000 cycles: ffffffff9853a87a strnlen_user+0x10a ([kernel.kallsyms])
     uname 28642 168664.856456: 10000 cycles: ffffffff986638a7 randomize_page+0x27 ([kernel.kallsyms])
     uname 28642 168664.856460: 10000 cycles: ffffffff98a3b645 _raw_spin_lock+0x5 ([kernel.kallsyms])

  #

And after:

  # perf script --itrace=Ge | head -20
  uname 28642 168664.856384:      10000     cycles:
  	ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
  	ffffffff9831fe87 install_exec_creds+0x17 ([kernel.kallsyms])
  	ffffffff983968d9 load_elf_binary+0x359 ([kernel.kallsyms])
  	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
  	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])

  uname 28642 168664.856388:      10000     cycles:
  	ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
  	ffffffff9831fa83 setup_arg_pages+0x123 ([kernel.kallsyms])
  	ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
  	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
  	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])

  uname 28642 168664.856392:      10000     cycles:
  	ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
  	ffffffff9831f889 shift_arg_pages+0xa9 ([kernel.kallsyms])
  	ffffffff9831fb4f setup_arg_pages+0x1ef ([kernel.kallsyms])
  	ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
  	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
  #

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 68 +++++++++++++++++++++++++++++++++----
 1 file changed, 61 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index db25c77..a659b4a 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -124,6 +124,8 @@ struct intel_pt {
 
 	struct range *time_ranges;
 	unsigned int range_cnt;
+
+	struct ip_callchain *chain;
 };
 
 enum switch_state {
@@ -868,6 +870,45 @@ static u64 intel_pt_ns_to_ticks(const struct intel_pt *pt, u64 ns)
 		pt->tc.time_mult;
 }
 
+static struct ip_callchain *intel_pt_alloc_chain(struct intel_pt *pt)
+{
+	size_t sz = sizeof(struct ip_callchain);
+
+	/* Add 1 to callchain_sz for callchain context */
+	sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
+	return zalloc(sz);
+}
+
+static int intel_pt_callchain_init(struct intel_pt *pt)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(pt->session->evlist, evsel) {
+		if (!(evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN))
+			evsel->synth_sample_type |= PERF_SAMPLE_CALLCHAIN;
+	}
+
+	pt->chain = intel_pt_alloc_chain(pt);
+	if (!pt->chain)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void intel_pt_add_callchain(struct intel_pt *pt,
+				   struct perf_sample *sample)
+{
+	struct thread *thread = machine__findnew_thread(pt->machine,
+							sample->pid,
+							sample->tid);
+
+	thread_stack__sample_late(thread, sample->cpu, pt->chain,
+				  pt->synth_opts.callchain_sz + 1, sample->ip,
+				  pt->kernel_start);
+
+	sample->callchain = pt->chain;
+}
+
 static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 						   unsigned int queue_nr)
 {
@@ -880,11 +921,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 		return NULL;
 
 	if (pt->synth_opts.callchain) {
-		size_t sz = sizeof(struct ip_callchain);
-
-		/* Add 1 to callchain_sz for callchain context */
-		sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
-		ptq->chain = zalloc(sz);
+		ptq->chain = intel_pt_alloc_chain(pt);
 		if (!ptq->chain)
 			goto out_free;
 	}
@@ -1992,7 +2029,8 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	if (!(state->type & INTEL_PT_BRANCH))
 		return 0;
 
-	if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
+	if (pt->synth_opts.callchain || pt->synth_opts.add_callchain ||
+	    pt->synth_opts.thread_stack)
 		thread_stack__event(ptq->thread, ptq->cpu, ptq->flags, state->from_ip,
 				    state->to_ip, ptq->insn_len,
 				    state->trace_nr);
@@ -2639,6 +2677,11 @@ static int intel_pt_process_event(struct perf_session *session,
 	if (err)
 		return err;
 
+	if (event->header.type == PERF_RECORD_SAMPLE) {
+		if (pt->synth_opts.add_callchain && !sample->callchain)
+			intel_pt_add_callchain(pt, sample);
+	}
+
 	if (event->header.type == PERF_RECORD_AUX &&
 	    (event->aux.flags & PERF_AUX_FLAG_TRUNCATED) &&
 	    pt->synth_opts.errors) {
@@ -2710,6 +2753,7 @@ static void intel_pt_free(struct perf_session *session)
 	session->auxtrace = NULL;
 	thread__put(pt->unknown_thread);
 	addr_filters__exit(&pt->filts);
+	zfree(&pt->chain);
 	zfree(&pt->filter);
 	zfree(&pt->time_ranges);
 	free(pt);
@@ -3348,6 +3392,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 		    !session->itrace_synth_opts->inject) {
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
+			pt->synth_opts.add_callchain = true;
 		}
 		pt->synth_opts.thread_stack =
 				session->itrace_synth_opts->thread_stack;
@@ -3380,14 +3425,22 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 		pt->branches_filter |= PERF_IP_FLAG_RETURN |
 				       PERF_IP_FLAG_TRACE_BEGIN;
 
-	if (pt->synth_opts.callchain && !symbol_conf.use_callchain) {
+	if ((pt->synth_opts.callchain || pt->synth_opts.add_callchain) &&
+	    !symbol_conf.use_callchain) {
 		symbol_conf.use_callchain = true;
 		if (callchain_register_param(&callchain_param) < 0) {
 			symbol_conf.use_callchain = false;
 			pt->synth_opts.callchain = false;
+			pt->synth_opts.add_callchain = false;
 		}
 	}
 
+	if (pt->synth_opts.add_callchain) {
+		err = intel_pt_callchain_init(pt);
+		if (err)
+			goto err_delete_thread;
+	}
+
 	err = intel_pt_synth_events(pt, session);
 	if (err)
 		goto err_delete_thread;
@@ -3410,6 +3463,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	return 0;
 
 err_delete_thread:
+	zfree(&pt->chain);
 	thread__zput(pt->unknown_thread);
 err_free_queues:
 	intel_pt_log_disable();

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evsel: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()
  2020-04-01 10:16 ` [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() Adrian Hunter
  2020-04-18 11:50   ` Arnaldo Carvalho de Melo
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     e12ee9f7513cb5dbe8b12aac030dfbeff35b3766
Gitweb:        https://git.kernel.org/tip/e12ee9f7513cb5dbe8b12aac030dfbeff35b3766
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:09 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Sat, 18 Apr 2020 09:04:32 -03:00

perf evsel: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event()

Move and globalize 2 functions from the auxtrace specific sources so
that they can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-13-adrian.hunter@intel.com
[ Move to pmu.c, as moving to evsel.h breaks the python binding ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 19 -------------------
 tools/perf/util/evsel.h    |  3 +++
 tools/perf/util/pmu.c      | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 809a09e..33ad333 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -58,25 +58,6 @@
 #include "symbol/kallsyms.h"
 #include <internal/lib.h>
 
-static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
-{
-	struct perf_pmu *pmu = NULL;
-
-	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
-		if (pmu->type == evsel->core.attr.type)
-			break;
-	}
-
-	return pmu;
-}
-
-static bool perf_evsel__is_aux_event(struct evsel *evsel)
-{
-	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
-
-	return pmu && pmu->auxtrace;
-}
-
 /*
  * Make a group from 'leader' to 'last', requiring that the events were not
  * already grouped to a different leader.
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e64ed42..a463bc6 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -158,6 +158,9 @@ int perf_evsel__object_config(size_t object_size,
 			      int (*init)(struct evsel *evsel),
 			      void (*fini)(struct evsel *evsel));
 
+struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel);
+bool perf_evsel__is_aux_event(struct evsel *evsel);
+
 struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
 
 static inline struct evsel *evsel__new(struct perf_event_attr *attr)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ef6a63f..bc912a8 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -18,6 +18,7 @@
 #include <regex.h>
 #include <perf/cpumap.h>
 #include "debug.h"
+#include "evsel.h"
 #include "pmu.h"
 #include "parse-events.h"
 #include "header.h"
@@ -884,6 +885,25 @@ struct perf_pmu *perf_pmu__scan(struct perf_pmu *pmu)
 	return NULL;
 }
 
+struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = NULL;
+
+	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+		if (pmu->type == evsel->core.attr.type)
+			break;
+	}
+
+	return pmu;
+}
+
+bool perf_evsel__is_aux_event(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
+
+	return pmu && pmu->auxtrace;
+}
+
 struct perf_pmu *perf_pmu__find(const char *name)
 {
 	struct perf_pmu *pmu;

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evsel: Add support for synthesized sample type
  2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
  2020-04-16 14:54   ` Arnaldo Carvalho de Melo
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: Be consistent when looking which evsel PERF_SAMPLE_ bits are set tip-bot2 for Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  2 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     e11869a065e36f3d22a575ccfb1097c262bb4f6e
Gitweb:        https://git.kernel.org/tip/e11869a065e36f3d22a575ccfb1097c262bb4f6e
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:07 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:17 -03:00

perf evsel: Add support for synthesized sample type

For reporting purposes, an evsel sample can have a callchain synthesized
from AUX area data. Add support for keeping track of synthesized sample
types. Note, the recorded sample_type cannot be changed because it is
needed to continue to parse events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 53187c5..e64ed42 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -104,6 +104,14 @@ struct evsel {
 		perf_evsel__sb_cb_t	*cb;
 		void			*data;
 	} side_band;
+	/*
+	 * For reporting purposes, an evsel sample can have a callchain
+	 * synthesized from AUX area data. Keep track of synthesized sample
+	 * types here. Note, the recorded sample_type cannot be changed because
+	 * it is needed to continue to parse events.
+	 * See also evsel__has_callchain().
+	 */
+	__u64			synth_sample_type;
 };
 
 struct perf_missing_features {
@@ -398,7 +406,12 @@ static inline bool perf_evsel__has_branch_hw_idx(const struct evsel *evsel)
 
 static inline bool evsel__has_callchain(const struct evsel *evsel)
 {
-	return (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
+	/*
+	 * For reporting purposes, an evsel sample can have a recorded callchain
+	 * or a callchain synthesized from AUX area data.
+	 */
+	return evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN ||
+	       evsel->synth_sample_type & PERF_SAMPLE_CALLCHAIN;
 }
 
 struct perf_env *perf_evsel__env(struct evsel *evsel);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf thread-stack: Add thread_stack__sample_late()
  2020-04-01 10:16 ` [PATCH 09/16] perf thread-stack: Add thread_stack__sample_late() Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     4fef41bfb1d8d2ada4a18eb3ab80c2682bcbae12
Gitweb:        https://git.kernel.org/tip/4fef41bfb1d8d2ada4a18eb3ab80c2682bcbae12
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:06 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf thread-stack: Add thread_stack__sample_late()

Add a thread stack function to create a call chain for hardware events
where the sample records get created some time after the event occurred.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/thread-stack.c | 57 +++++++++++++++++++++++++++++++++-
 tools/perf/util/thread-stack.h |  3 ++-
 2 files changed, 60 insertions(+)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 0885967..83f6c83 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -497,6 +497,63 @@ void thread_stack__sample(struct thread *thread, int cpu,
 	chain->nr = i;
 }
 
+/*
+ * Hardware sample records, created some time after the event occurred, need to
+ * have subsequent addresses removed from the call chain.
+ */
+void thread_stack__sample_late(struct thread *thread, int cpu,
+			       struct ip_callchain *chain, size_t sz,
+			       u64 sample_ip, u64 kernel_start)
+{
+	struct thread_stack *ts = thread__stack(thread, cpu);
+	u64 sample_context = callchain_context(sample_ip, kernel_start);
+	u64 last_context, context, ip;
+	size_t nr = 0, j;
+
+	if (sz < 2) {
+		chain->nr = 0;
+		return;
+	}
+
+	if (!ts)
+		goto out;
+
+	/*
+	 * When tracing kernel space, kernel addresses occur at the top of the
+	 * call chain after the event occurred but before tracing stopped.
+	 * Skip them.
+	 */
+	for (j = 1; j <= ts->cnt; j++) {
+		ip = ts->stack[ts->cnt - j].ret_addr;
+		context = callchain_context(ip, kernel_start);
+		if (context == PERF_CONTEXT_USER ||
+		    (context == sample_context && ip == sample_ip))
+			break;
+	}
+
+	last_context = sample_ip; /* Use sample_ip as an invalid context */
+
+	for (; nr < sz && j <= ts->cnt; nr++, j++) {
+		ip = ts->stack[ts->cnt - j].ret_addr;
+		context = callchain_context(ip, kernel_start);
+		if (context != last_context) {
+			if (nr >= sz - 1)
+				break;
+			chain->ips[nr++] = context;
+			last_context = context;
+		}
+		chain->ips[nr] = ip;
+	}
+out:
+	if (nr) {
+		chain->nr = nr;
+	} else {
+		chain->ips[0] = sample_context;
+		chain->ips[1] = sample_ip;
+		chain->nr = 2;
+	}
+}
+
 struct call_return_processor *
 call_return_processor__new(int (*process)(struct call_return *cr, u64 *parent_db_id, void *data),
 			   void *data)
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index e1ec5a5..8962ddc 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -85,6 +85,9 @@ int thread_stack__event(struct thread *thread, int cpu, u32 flags, u64 from_ip,
 void thread_stack__set_trace_nr(struct thread *thread, int cpu, u64 trace_nr);
 void thread_stack__sample(struct thread *thread, int cpu, struct ip_callchain *chain,
 			  size_t sz, u64 ip, u64 kernel_start);
+void thread_stack__sample_late(struct thread *thread, int cpu,
+			       struct ip_callchain *chain, size_t sz, u64 ip,
+			       u64 kernel_start);
 int thread_stack__flush(struct thread *thread);
 void thread_stack__free(struct thread *thread);
 size_t thread_stack__depth(struct thread *thread, int cpu);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf evsel: Be consistent when looking which evsel PERF_SAMPLE_ bits are set
  2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
  2020-04-16 14:54   ` Arnaldo Carvalho de Melo
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  2020-04-22 12:17   ` [tip: perf/core] perf evsel: Add support for synthesized sample type tip-bot2 for Adrian Hunter
  2 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     8e94b3243a9af2c49a38fd0d6f2f9beb542e41a4
Gitweb:        https://git.kernel.org/tip/8e94b3243a9af2c49a38fd0d6f2f9beb542e41a4
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:07 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:17 -03:00

perf evsel: Be consistent when looking which evsel PERF_SAMPLE_ bits are set

Using 'type' variable for checking for callchains is equivalent to using
evsel__has_callchain(evsel) and is how the other PERF_SAMPLE_ bits are checked
in this function, so use it to be consistent.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-11-adrian.hunter@intel.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d23db67..f320ada 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2136,7 +2136,7 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		}
 	}
 
-	if (evsel__has_callchain(evsel)) {
+	if (type & PERF_SAMPLE_CALLCHAIN) {
 		const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
 
 		OVERFLOW_CHECK_u64(array);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf auxtrace: Add an option to synthesize callchains for regular events
  2020-04-01 10:16 ` [PATCH 08/16] perf auxtrace: Add an option to synthesize callchains for regular events Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     1c5c25b3fdbd7035f6d53a1a99b5afd577ce13e1
Gitweb:        https://git.kernel.org/tip/1c5c25b3fdbd7035f6d53a1a99b5afd577ce13e1
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:05 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf auxtrace: Add an option to synthesize callchains for regular events

Currently, callchains can be synthesized only for synthesized events. Add
an itrace option to synthesize callchains for regular events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/itrace.txt | 1 +
 tools/perf/builtin-report.c         | 3 ++-
 tools/perf/builtin-script.c         | 2 +-
 tools/perf/util/auxtrace.c          | 6 +++++-
 tools/perf/util/auxtrace.h          | 2 ++
 tools/perf/util/s390-cpumsf.c       | 2 +-
 6 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 82ff7da..671e154 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -10,6 +10,7 @@
 		e	synthesize error events
 		d	create a debug log
 		g	synthesize a call chain (use with i or x)
+		G	synthesize a call chain on existing event records
 		l	synthesize last branch entries (use with i or x)
 		s       skip initial number of events
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 26d8fc2..c0cebd5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -339,6 +339,7 @@ static int report__setup_sample_type(struct report *rep)
 	bool is_pipe = perf_data__is_pipe(session->data);
 
 	if (session->itrace_synth_opts->callchain ||
+	    session->itrace_synth_opts->add_callchain ||
 	    (!is_pipe &&
 	     perf_header__has_feat(&session->header, HEADER_AUXTRACE) &&
 	     !session->itrace_synth_opts->set))
@@ -1332,7 +1333,7 @@ int cmd_report(int argc, const char **argv)
 	if (symbol_conf.cumulate_callchain && !callchain_param.order_set)
 		callchain_param.order = ORDER_CALLER;
 
-	if (itrace_synth_opts.callchain &&
+	if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
 	    (int)itrace_synth_opts.callchain_sz > report.max_stack)
 		report.max_stack = itrace_synth_opts.callchain_sz;
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 8bf3ba2..06b511c 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3537,7 +3537,7 @@ int cmd_script(int argc, const char **argv)
 		return -1;
 	}
 
-	if (itrace_synth_opts.callchain &&
+	if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
 	    itrace_synth_opts.callchain_sz > scripting_max_stack)
 		scripting_max_stack = itrace_synth_opts.callchain_sz;
 
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index b60bae8..809a09e 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1462,8 +1462,12 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 			synth_opts->branches = true;
 			synth_opts->returns = true;
 			break;
+		case 'G':
 		case 'g':
-			synth_opts->callchain = true;
+			if (p[-1] == 'G')
+				synth_opts->add_callchain = true;
+			else
+				synth_opts->callchain = true;
 			synth_opts->callchain_sz =
 					PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
 			while (*p == ' ' || *p == ',')
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index db65aae..dd8a4ff 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -74,6 +74,7 @@ enum itrace_period_type {
  * @calls: limit branch samples to calls (can be combined with @returns)
  * @returns: limit branch samples to returns (can be combined with @calls)
  * @callchain: add callchain to 'instructions' events
+ * @add_callchain: add callchain to existing event records
  * @thread_stack: feed branches to the thread_stack
  * @last_branch: add branch context to 'instruction' events
  * @callchain_sz: maximum callchain size
@@ -101,6 +102,7 @@ struct itrace_synth_opts {
 	bool			calls;
 	bool			returns;
 	bool			callchain;
+	bool			add_callchain;
 	bool			thread_stack;
 	bool			last_branch;
 	unsigned int		callchain_sz;
diff --git a/tools/perf/util/s390-cpumsf.c b/tools/perf/util/s390-cpumsf.c
index d7779e4..38a9428 100644
--- a/tools/perf/util/s390-cpumsf.c
+++ b/tools/perf/util/s390-cpumsf.c
@@ -1079,7 +1079,7 @@ static bool check_auxtrace_itrace(struct itrace_synth_opts *itops)
 		itops->pwr_events || itops->errors ||
 		itops->dont_decode || itops->calls || itops->returns ||
 		itops->callchain || itops->thread_stack ||
-		itops->last_branch;
+		itops->last_branch || itops->add_callchain;
 	if (!ison)
 		return true;
 	pr_err("Unsupported --itrace options specified\n");

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 06/16] perf s390-cpumsf: " Adrian Hunter
  2020-04-01 14:10   ` Thomas Richter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Thomas Richter, Andi Kleen, Jiri Olsa,
	Arnaldo Carvalho de Melo, x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     113fcb46cfd557c549ab6bd9a1d43fda2c3a488c
Gitweb:        https://git.kernel.org/tip/113fcb46cfd557c549ab6bd9a1d43fda2c3a488c
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:03 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/s390-cpumcf-kernel.h |  1 +
 tools/perf/util/s390-cpumsf.c        |  9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/tools/perf/util/s390-cpumcf-kernel.h b/tools/perf/util/s390-cpumcf-kernel.h
index d435603..f55ca07 100644
--- a/tools/perf/util/s390-cpumcf-kernel.h
+++ b/tools/perf/util/s390-cpumcf-kernel.h
@@ -11,6 +11,7 @@
 
 #define	S390_CPUMCF_DIAG_DEF	0xfeef	/* Counter diagnostic entry ID */
 #define	PERF_EVENT_CPUM_CF_DIAG	0xBC000	/* Event: Counter sets */
+#define PERF_EVENT_CPUM_SF_DIAG	0xBD000 /* Event: Combined-sampling */
 
 struct cf_ctrset_entry {	/* CPU-M CF counter set entry (8 byte) */
 	unsigned int def:16;	/* 0-15  Data Entry Format */
diff --git a/tools/perf/util/s390-cpumsf.c b/tools/perf/util/s390-cpumsf.c
index 6785cd8..d7779e4 100644
--- a/tools/perf/util/s390-cpumsf.c
+++ b/tools/perf/util/s390-cpumsf.c
@@ -1047,6 +1047,14 @@ static void s390_cpumsf_free(struct perf_session *session)
 	free(sf);
 }
 
+static bool
+s390_cpumsf_evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+			      struct evsel *evsel)
+{
+	return evsel->core.attr.type == PERF_TYPE_RAW &&
+	       evsel->core.attr.config == PERF_EVENT_CPUM_SF_DIAG;
+}
+
 static int s390_cpumsf_get_type(const char *cpuid)
 {
 	int ret, family = 0;
@@ -1142,6 +1150,7 @@ int s390_cpumsf_process_auxtrace_info(union perf_event *event,
 	sf->auxtrace.flush_events = s390_cpumsf_flush;
 	sf->auxtrace.free_events = s390_cpumsf_free_events;
 	sf->auxtrace.free = s390_cpumsf_free;
+	sf->auxtrace.evsel_is_auxtrace = s390_cpumsf_evsel_is_auxtrace;
 	session->auxtrace = &sf->auxtrace;
 
 	if (dump_trace)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf auxtrace: For reporting purposes, un-group AUX area event
  2020-04-01 10:16 ` [PATCH 07/16] perf auxtrace: For reporting purposes, un-group AUX area event Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     5c7bec0c9c543733fa96fe500e4c245be6f9fd30
Gitweb:        https://git.kernel.org/tip/5c7bec0c9c543733fa96fe500e4c245be6f9fd30
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:04 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf auxtrace: For reporting purposes, un-group AUX area event

An AUX area event must be the group leader when recording traces in
sample mode, but that does not produce the expected results from
'perf report' because it expects the leader to provide samples.

Rather than teach 'perf report' about AUX area sampling, un-group the
AUX area event during processing, making the 2nd event the leader.

Example:

 $ perf record -e '{intel_pt//u,branch-misses:u}' -c 1 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.080 MB perf.data ]

 Before:

 $ perf report

 Samples: 800  of events 'anon group { intel_pt//u, branch-misses:u }', Event count (approx.): 800
        Children              Self  Command  Shared Object     Symbol
     0.00%  47.50%     0.00%  47.50%  uname    libc-2.28.so      [.] _dl_addr
     0.00%  16.38%     0.00%  16.38%  uname    ld-2.28.so        [.] __GI___tunables_init
     0.00%  54.75%     0.00%   4.75%  uname    ld-2.28.so        [.] dl_main
     0.00%   3.12%     0.00%   3.12%  uname    ld-2.28.so        [.] _dl_map_object_from_fd
     0.00%   2.38%     0.00%   2.38%  uname    ld-2.28.so        [.] strcmp
     0.00%   2.25%     0.00%   2.25%  uname    ld-2.28.so        [.] _dl_check_map_versions
     0.00%   2.00%     0.00%   2.00%  uname    ld-2.28.so        [.] _dl_important_hwcaps
     0.00%   2.00%     0.00%   2.00%  uname    ld-2.28.so        [.] _dl_map_object_deps
     0.00%  51.50%     0.00%   1.50%  uname    ld-2.28.so        [.] _dl_sysdep_start
     0.00%   1.25%     0.00%   1.25%  uname    ld-2.28.so        [.] _dl_load_cache_lookup
     0.00%  51.12%     0.00%   1.12%  uname    ld-2.28.so        [.] _dl_start
     0.00%  50.88%     0.00%   1.12%  uname    ld-2.28.so        [.] do_lookup_x
     0.00%  50.62%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_lookup_symbol_x
     0.00%   1.00%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_map_object
     0.00%   1.00%     0.00%   1.00%  uname    ld-2.28.so        [.] _dl_next_ld_env_entry
     0.00%   0.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_cache_libcmp
     0.00%   0.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_new_object
     0.00%  50.88%     0.00%   0.88%  uname    ld-2.28.so        [.] _dl_relocate_object
     0.00%   0.62%     0.00%   0.62%  uname    ld-2.28.so        [.] _dl_init_paths
     0.00%   0.62%     0.00%   0.62%  uname    ld-2.28.so        [.] _dl_name_match_p
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] get_common_indeces.constprop.1
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] memmove
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] memset
     0.00%   0.50%     0.00%   0.50%  uname    ld-2.28.so        [.] open_verify.constprop.11
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] _dl_check_all_versions
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] _dl_find_dso_for_object
     0.00%   0.38%     0.00%   0.38%  uname    ld-2.28.so        [.] init_tls
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] __tunable_get_val
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_add_to_namespace_list
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_determine_tlsoffset
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] _dl_discover_osversion
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] calloc@plt
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] malloc
     0.00%   0.25%     0.00%   0.25%  uname    ld-2.28.so        [.] malloc@plt
     0.00%   0.25%     0.00%   0.25%  uname    libc-2.28.so      [.] _nl_load_locale_from_archive
     0.00%   0.25%     0.00%   0.25%  uname    [unknown]         [k] 0xffffffffa3a00010
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] __libc_scratch_buffer_set_array_size
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_allocate_tls_storage
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_catch_exception
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_setup_hash
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_sort_maps
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] _dl_sysdep_read_whole_file
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] access
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] calloc
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] mmap64
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] openaux
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] rtld_lock_default_lock_recursive
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] rtld_lock_default_unlock_recursive
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] strchr
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] strlen
     0.00%   0.12%     0.00%   0.12%  uname    ld-2.28.so        [.] 0x0000000000001080
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] __strchrnul_avx2
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] _nl_normalize_codeset
     0.00%   0.12%     0.00%   0.12%  uname    libc-2.28.so      [.] malloc
     0.00%   0.12%     0.00%   0.12%  uname    [unknown]         [k] 0xffffffffa3a011f0
     0.00%  50.00%     0.00%   0.00%  uname    ld-2.28.so        [.] _dl_start_user
     0.00%  50.00%     0.00%   0.00%  uname    [unknown]         [.] 0000000000000000

 After:

 Samples: 800  of event 'branch-misses:u', Event count (approx.): 800
  Children      Self  Command  Shared Object     Symbol
    54.75%     4.75%  uname    ld-2.28.so        [.] dl_main
    51.50%     1.50%  uname    ld-2.28.so        [.] _dl_sysdep_start
    51.12%     1.12%  uname    ld-2.28.so        [.] _dl_start
    50.88%     0.88%  uname    ld-2.28.so        [.] _dl_relocate_object
    50.88%     1.12%  uname    ld-2.28.so        [.] do_lookup_x
    50.62%     1.00%  uname    ld-2.28.so        [.] _dl_lookup_symbol_x
    50.00%     0.00%  uname    ld-2.28.so        [.] _dl_start_user
    50.00%     0.00%  uname    [unknown]         [.] 0000000000000000
    47.50%    47.50%  uname    libc-2.28.so      [.] _dl_addr
    16.38%    16.38%  uname    ld-2.28.so        [.] __GI___tunables_init
     3.12%     3.12%  uname    ld-2.28.so        [.] _dl_map_object_from_fd
     2.38%     2.38%  uname    ld-2.28.so        [.] strcmp
     2.25%     2.25%  uname    ld-2.28.so        [.] _dl_check_map_versions
     2.00%     2.00%  uname    ld-2.28.so        [.] _dl_important_hwcaps
     2.00%     2.00%  uname    ld-2.28.so        [.] _dl_map_object_deps
     1.25%     1.25%  uname    ld-2.28.so        [.] _dl_load_cache_lookup
     1.00%     1.00%  uname    ld-2.28.so        [.] _dl_map_object
     1.00%     1.00%  uname    ld-2.28.so        [.] _dl_next_ld_env_entry
     0.88%     0.88%  uname    ld-2.28.so        [.] _dl_cache_libcmp
     0.88%     0.88%  uname    ld-2.28.so        [.] _dl_new_object
     0.62%     0.62%  uname    ld-2.28.so        [.] _dl_init_paths
     0.62%     0.62%  uname    ld-2.28.so        [.] _dl_name_match_p
     0.50%     0.50%  uname    ld-2.28.so        [.] get_common_indeces.constprop.1
     0.50%     0.50%  uname    ld-2.28.so        [.] memmove
     0.50%     0.50%  uname    ld-2.28.so        [.] memset
     0.50%     0.50%  uname    ld-2.28.so        [.] open_verify.constprop.11
     0.38%     0.38%  uname    ld-2.28.so        [.] _dl_check_all_versions
     0.38%     0.38%  uname    ld-2.28.so        [.] _dl_find_dso_for_object
     0.38%     0.38%  uname    ld-2.28.so        [.] init_tls
     0.25%     0.25%  uname    ld-2.28.so        [.] __tunable_get_val
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_add_to_namespace_list
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_determine_tlsoffset
     0.25%     0.25%  uname    ld-2.28.so        [.] _dl_discover_osversion
     0.25%     0.25%  uname    ld-2.28.so        [.] calloc@plt
     0.25%     0.25%  uname    ld-2.28.so        [.] malloc
     0.25%     0.25%  uname    ld-2.28.so        [.] malloc@plt
     0.25%     0.25%  uname    libc-2.28.so      [.] _nl_load_locale_from_archive
     0.25%     0.25%  uname    [unknown]         [k] 0xffffffffa3a00010
     0.12%     0.12%  uname    ld-2.28.so        [.] __libc_scratch_buffer_set_array_size
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_allocate_tls_storage
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_catch_exception
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_setup_hash
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_sort_maps
     0.12%     0.12%  uname    ld-2.28.so        [.] _dl_sysdep_read_whole_file
     0.12%     0.12%  uname    ld-2.28.so        [.] access
     0.12%     0.12%  uname    ld-2.28.so        [.] calloc
     0.12%     0.12%  uname    ld-2.28.so        [.] mmap64
     0.12%     0.12%  uname    ld-2.28.so        [.] openaux
     0.12%     0.12%  uname    ld-2.28.so        [.] rtld_lock_default_lock_recursive
     0.12%     0.12%  uname    ld-2.28.so        [.] rtld_lock_default_unlock_recursive
     0.12%     0.12%  uname    ld-2.28.so        [.] strchr
     0.12%     0.12%  uname    ld-2.28.so        [.] strlen
     0.12%     0.12%  uname    ld-2.28.so        [.] 0x0000000000001080
     0.12%     0.12%  uname    libc-2.28.so      [.] __strchrnul_avx2
     0.12%     0.12%  uname    libc-2.28.so      [.] _nl_normalize_codeset
     0.12%     0.12%  uname    libc-2.28.so      [.] malloc
     0.12%     0.12%  uname    [unknown]         [k] 0xffffffffa3a011f0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 60 +++++++++++++++++++++++++++++++++----
 1 file changed, 55 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 2c4ad68..b60bae8 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1234,29 +1234,79 @@ out_free:
 	return err;
 }
 
+static void unleader_evsel(struct evlist *evlist, struct evsel *leader)
+{
+	struct evsel *new_leader = NULL;
+	struct evsel *evsel;
+
+	/* Find new leader for the group */
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->leader != leader || evsel == leader)
+			continue;
+		if (!new_leader)
+			new_leader = evsel;
+		evsel->leader = new_leader;
+	}
+
+	/* Update group information */
+	if (new_leader) {
+		zfree(&new_leader->group_name);
+		new_leader->group_name = leader->group_name;
+		leader->group_name = NULL;
+
+		new_leader->core.nr_members = leader->core.nr_members - 1;
+		leader->core.nr_members = 1;
+	}
+}
+
+static void unleader_auxtrace(struct perf_session *session)
+{
+	struct evsel *evsel;
+
+	evlist__for_each_entry(session->evlist, evsel) {
+		if (auxtrace__evsel_is_auxtrace(session, evsel) &&
+		    perf_evsel__is_group_leader(evsel)) {
+			unleader_evsel(session->evlist, evsel);
+		}
+	}
+}
+
 int perf_event__process_auxtrace_info(struct perf_session *session,
 				      union perf_event *event)
 {
 	enum auxtrace_type type = event->auxtrace_info.type;
+	int err;
 
 	if (dump_trace)
 		fprintf(stdout, " type: %u\n", type);
 
 	switch (type) {
 	case PERF_AUXTRACE_INTEL_PT:
-		return intel_pt_process_auxtrace_info(event, session);
+		err = intel_pt_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_INTEL_BTS:
-		return intel_bts_process_auxtrace_info(event, session);
+		err = intel_bts_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_ARM_SPE:
-		return arm_spe_process_auxtrace_info(event, session);
+		err = arm_spe_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_CS_ETM:
-		return cs_etm__process_auxtrace_info(event, session);
+		err = cs_etm__process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_S390_CPUMSF:
-		return s390_cpumsf_process_auxtrace_info(event, session);
+		err = s390_cpumsf_process_auxtrace_info(event, session);
+		break;
 	case PERF_AUXTRACE_UNKNOWN:
 	default:
 		return -EINVAL;
 	}
+
+	if (err)
+		return err;
+
+	unleader_auxtrace(session);
+
+	return 0;
 }
 
 s64 perf_event__process_auxtrace(struct perf_session *session,

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf arm-spe: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 04/16] perf arm-spe: " Adrian Hunter
  2020-04-02  3:03   ` Leo Yan
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Leo Yan, Andi Kleen, Jiri Olsa, Kim Phillips,
	Arnaldo Carvalho de Melo, x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     508c71e3f90e4ceee7516d691355a36660a3e5bf
Gitweb:        https://git.kernel.org/tip/508c71e3f90e4ceee7516d691355a36660a3e5bf
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:01 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf arm-spe: Implement ->evsel_is_auxtrace() callback

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/arm-spe.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 53be12b..875a0dd 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -176,6 +176,14 @@ static void arm_spe_free(struct perf_session *session)
 	free(spe);
 }
 
+static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
+				      struct evsel *evsel)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe, auxtrace);
+
+	return evsel->core.attr.type == spe->pmu_type;
+}
+
 static const char * const arm_spe_info_fmts[] = {
 	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
 };
@@ -218,6 +226,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->auxtrace.flush_events = arm_spe_flush;
 	spe->auxtrace.free_events = arm_spe_free_events;
 	spe->auxtrace.free = arm_spe_free;
+	spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
 	session->auxtrace = &spe->auxtrace;
 
 	arm_spe_print_info(&auxtrace_info->priv[0]);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf cs-etm: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 05/16] perf cs-etm: " Adrian Hunter
  2020-04-01 17:11   ` Mathieu Poirier
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  1 sibling, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Mathieu Poirier, Andi Kleen, Jiri Olsa,
	Arnaldo Carvalho de Melo, x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     a58ab57caad02b0d854969e191b5d1d4b0f90930
Gitweb:        https://git.kernel.org/tip/a58ab57caad02b0d854969e191b5d1d4b0f90930
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:02 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf cs-etm: Implement ->evsel_is_auxtrace() callback

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/cs-etm.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 62d2f9b..3c802fd 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -631,6 +631,16 @@ static void cs_etm__free(struct perf_session *session)
 	zfree(&aux);
 }
 
+static bool cs_etm__evsel_is_auxtrace(struct perf_session *session,
+				      struct evsel *evsel)
+{
+	struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+
+	return evsel->core.attr.type == aux->pmu_type;
+}
+
 static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
 {
 	struct machine *machine;
@@ -2618,6 +2628,7 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
 	etm->auxtrace.flush_events = cs_etm__flush_events;
 	etm->auxtrace.free_events = cs_etm__free_events;
 	etm->auxtrace.free = cs_etm__free;
+	etm->auxtrace.evsel_is_auxtrace = cs_etm__evsel_is_auxtrace;
 	session->auxtrace = &etm->auxtrace;
 
 	etm->unknown_thread = thread__new(999999999, 999999999);

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf intel-pt: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:15 ` [PATCH 02/16] perf intel-pt: Implement " Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     6b52bb07c397af274850deb9e4e054bdb6261e73
Gitweb:        https://git.kernel.org/tip/6b52bb07c397af274850deb9e4e054bdb6261e73
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:15:59 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf intel-pt: Implement ->evsel_is_auxtrace() callback

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 23c8289..db25c77 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2715,6 +2715,15 @@ static void intel_pt_free(struct perf_session *session)
 	free(pt);
 }
 
+static bool intel_pt_evsel_is_auxtrace(struct perf_session *session,
+				       struct evsel *evsel)
+{
+	struct intel_pt *pt = container_of(session->auxtrace, struct intel_pt,
+					   auxtrace);
+
+	return evsel->core.attr.type == pt->pmu_type;
+}
+
 static int intel_pt_process_auxtrace_event(struct perf_session *session,
 					   union perf_event *event,
 					   struct perf_tool *tool __maybe_unused)
@@ -3310,6 +3319,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	pt->auxtrace.flush_events = intel_pt_flush;
 	pt->auxtrace.free_events = intel_pt_free_events;
 	pt->auxtrace.free = intel_pt_free;
+	pt->auxtrace.evsel_is_auxtrace = intel_pt_evsel_is_auxtrace;
 	session->auxtrace = &pt->auxtrace;
 
 	if (dump_trace)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf auxtrace: Add ->evsel_is_auxtrace() callback
  2020-04-01 10:15 ` [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Kim Phillips,
	Mathieu Poirier, Thomas Richter, Arnaldo Carvalho de Melo, x86,
	LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     853f37d75c44c305f750d8c4a34d83f03b610fce
Gitweb:        https://git.kernel.org/tip/853f37d75c44c305f750d8c4a34d83f03b610fce
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:15:58 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf auxtrace: Add ->evsel_is_auxtrace() callback

Add ->evsel_is_auxtrace() callback to identify if a selected event
is an AUX area event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c |  9 +++++++++
 tools/perf/util/auxtrace.h | 12 ++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 3571ce7..2c4ad68 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -2577,3 +2577,12 @@ void auxtrace__free(struct perf_session *session)
 
 	return session->auxtrace->free(session);
 }
+
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+				 struct evsel *evsel)
+{
+	if (!session->auxtrace || !session->auxtrace->evsel_is_auxtrace)
+		return false;
+
+	return session->auxtrace->evsel_is_auxtrace(session, evsel);
+}
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index e58ef16..db65aae 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -21,6 +21,7 @@
 union perf_event;
 struct perf_session;
 struct evlist;
+struct evsel;
 struct perf_tool;
 struct mmap;
 struct perf_sample;
@@ -166,6 +167,8 @@ struct auxtrace {
 			    struct perf_tool *tool);
 	void (*free_events)(struct perf_session *session);
 	void (*free)(struct perf_session *session);
+	bool (*evsel_is_auxtrace)(struct perf_session *session,
+				  struct evsel *evsel);
 };
 
 /**
@@ -584,6 +587,8 @@ void auxtrace__dump_auxtrace_sample(struct perf_session *session,
 int auxtrace__flush_events(struct perf_session *session, struct perf_tool *tool);
 void auxtrace__free_events(struct perf_session *session);
 void auxtrace__free(struct perf_session *session);
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+				 struct evsel *evsel);
 
 #define ITRACE_HELP \
 "				i:	    		synthesize instructions events\n"		\
@@ -750,6 +755,13 @@ void auxtrace_index__free(struct list_head *head __maybe_unused)
 }
 
 static inline
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+				 struct evsel *evsel __maybe_unused)
+{
+	return false;
+}
+
+static inline
 int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
 {
 	return 0;

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [tip: perf/core] perf intel-bts: Implement ->evsel_is_auxtrace() callback
  2020-04-01 10:16 ` [PATCH 03/16] perf intel-bts: " Adrian Hunter
@ 2020-04-22 12:17   ` tip-bot2 for Adrian Hunter
  0 siblings, 0 replies; 47+ messages in thread
From: tip-bot2 for Adrian Hunter @ 2020-04-22 12:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Adrian Hunter, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo,
	x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     966246f597deafbcb1d8c126865b4efdc2be776e
Gitweb:        https://git.kernel.org/tip/966246f597deafbcb1d8c126865b4efdc2be776e
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Wed, 01 Apr 2020 13:16:00 +03:00
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Thu, 16 Apr 2020 12:19:15 -03:00

perf intel-bts: Implement ->evsel_is_auxtrace() callback

Implement ->evsel_is_auxtrace() callback.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200401101613.6201-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-bts.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 34cb380..059e1c8 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -728,6 +728,15 @@ static void intel_bts_free(struct perf_session *session)
 	free(bts);
 }
 
+static bool intel_bts_evsel_is_auxtrace(struct perf_session *session,
+					struct evsel *evsel)
+{
+	struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts,
+					     auxtrace);
+
+	return evsel->core.attr.type == bts->pmu_type;
+}
+
 struct intel_bts_synth {
 	struct perf_tool dummy_tool;
 	struct perf_session *session;
@@ -883,6 +892,7 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
 	bts->auxtrace.flush_events = intel_bts_flush;
 	bts->auxtrace.free_events = intel_bts_free_events;
 	bts->auxtrace.free = intel_bts_free;
+	bts->auxtrace.evsel_is_auxtrace = intel_bts_evsel_is_auxtrace;
 	session->auxtrace = &bts->auxtrace;
 
 	intel_bts_print_info(&auxtrace_info->priv[0], INTEL_BTS_PMU_TYPE,

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2020-04-22 12:22 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01 10:15 [PATCH 00/16] perf intel-pt: Sampling improvements Adrian Hunter
2020-04-01 10:15 ` [PATCH 01/16] perf auxtrace: Add ->evsel_is_auxtrace() callback Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:15 ` [PATCH 02/16] perf intel-pt: Implement " Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 03/16] perf intel-bts: " Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 04/16] perf arm-spe: " Adrian Hunter
2020-04-02  3:03   ` Leo Yan
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 05/16] perf cs-etm: " Adrian Hunter
2020-04-01 17:11   ` Mathieu Poirier
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 06/16] perf s390-cpumsf: " Adrian Hunter
2020-04-01 14:10   ` Thomas Richter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 07/16] perf auxtrace: For reporting purposes, un-group AUX area event Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 08/16] perf auxtrace: Add an option to synthesize callchains for regular events Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 09/16] perf thread-stack: Add thread_stack__sample_late() Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 10/16] perf tools: Add support for synthesized sample type Adrian Hunter
2020-04-16 14:54   ` Arnaldo Carvalho de Melo
2020-04-16 14:57     ` Arnaldo Carvalho de Melo
2020-04-16 15:01       ` Arnaldo Carvalho de Melo
2020-04-22 12:17   ` [tip: perf/core] perf evsel: Be consistent when looking which evsel PERF_SAMPLE_ bits are set tip-bot2 for Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] perf evsel: Add support for synthesized sample type tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 11/16] perf intel-pt: Add support for synthesizing callchains for regular events Adrian Hunter
2020-04-16 15:14   ` Arnaldo Carvalho de Melo
2020-04-17 13:50     ` Adrian Hunter
2020-04-17 21:37       ` Arnaldo Carvalho de Melo
2020-04-20  3:04         ` Andi Kleen
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 12/16] perf tools: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() Adrian Hunter
2020-04-18 11:50   ` Arnaldo Carvalho de Melo
2020-04-18 12:04     ` Arnaldo Carvalho de Melo
2020-04-22 12:17   ` [tip: perf/core] perf evsel: " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 13/16] perf tools: Move leader-sampling configuration Adrian Hunter
2020-04-16 15:29   ` Arnaldo Carvalho de Melo
2020-04-22 12:17   ` [tip: perf/core] perf evlist: " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 14/16] perf tools: Rearrange perf_evsel__config_leader_sampling() Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] perf evsel: " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 15/16] perf tools: Allow multiple read formats Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] perf evlist: " tip-bot2 for Adrian Hunter
2020-04-01 10:16 ` [PATCH 16/16] perf tools: Add support for leader-sampling with AUX area events Adrian Hunter
2020-04-22 12:17   ` [tip: perf/core] " tip-bot2 for Adrian Hunter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).