* [GIT PULL 00/44] perf/core improvements and fixes
@ 2017-09-22 14:41 Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 01/44] perf sched timehist: Add pid and tid options Arnaldo Carvalho de Melo
                   ` (20 more replies)
  0 siblings, 21 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern,
	Fenghua Yu, Jiri Olsa, Kan Liang, Li Zhijian, Lukasz Odzioba,
	Martin Kepplinger, Matt Fleming, Mike Kravetz, Namhyung Kim,
	Pei P Jia, Peter Zijlstra, Philip Li, Rik van Riel, Taeung Song,
	Tony Luck, Vikas Shivappa, Wang Nan, Xiaochen Shen,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo


The following changes since commit b130a699c07155a1d6ef7d971a5f3bf0e3818d5a:

  Merge tag 'perf-urgent-for-mingo-4.14-20170912' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2017-09-13 09:25:10 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.15-20170922

for you to fetch changes up to 0a7c74eae307894c6c95316c382f118aef8481e8:

  perf tools: Provide mutex wrappers for pthreads rwlocks (2017-09-21 13:28:06 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- Support direct --user-regs arguments in 'perf record', previously the
  only way to sample PERF_SAMPLE_REGS_USER was implicitly selecting it
  when recording callchains (Andi Kleen)

- Support showing sampled user regs in 'perf script' (Andi Kleen)

- Introduce the concept of weak groups in 'perf stat': try to set up a
  group, but if it's not schedulable, fall back to not using a group. That
  gives us the best of both worlds: groups if they work, but still a
  usable fallback if they don't. E.g.: (Andi Kleen)

  % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1

    125,366,055  branches                                    (80.02%)
      9,208,402  branch-misses       # 7.35% of all branches (80.01%)
     24,560,249  l1d.replacement                             (80.00%)
     43,174,971  l2_lines_in.all                             (80.05%)
     31,891,457  l2_rqsts.all_code_rd                        (79.92%)

- Support metrics in 'stat' and 'list'. A metric is a formula that
  uses multiple events to compute a higher level result (e.g. IPC). (Andi Kleen)

- Add Intel processors vendor event metrics JSON files (Andi Kleen)

- Add 'pid' and 'tid' options to 'perf sched timehist' (David Ahern)

- Generate 'behavior' string table from kernel headers, helps getting
  new parameters when synchronizing kernel headers, like MADV_WIPEONFORK
  and MADV_KEEPONFORK, that are now beautified (Arnaldo Carvalho de Melo)

- Improve TUI progress bar by showing how many bytes from a total were
  processed (Jiri Olsa)

- Use scandir() to replace readdir(), prep work to have the synthesizing
  of PERF_RECORD_ entries for existing threads be multithreaded, making
  'perf top' bearable on high core count systems such as Intel's Knights
  Landing/Mill (Kan Liang)

- Allow creating a ~/.perfconfig file when setting a variable to its
  default value, previously it would bail out and not write such a
  file (Taeung Song)

- Introduce wrapper for allowing purely single threaded apps to avoid
  the costs of locking (Arnaldo Carvalho de Melo)

- Introduce hashtable to reduce the cost of thread lookup (Kan Liang)

- Fix C++ build wrt poison.h using void pointer arithmetic,
  affects only the embedded clang/llvm case, that is disabled by
  default (Arnaldo Carvalho de Melo)

- Fix leaking rec_argv in error cases (Martin Kepplinger)

- Remove Intel CQM perf test, that infrastructure was nuked (Xiaochen Shen)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (27):
      perf tools: Support weak groups in 'perf stat'
      perf vendor events: Support metric_group and no event name in JSON parser
      perf stat: Factor out generic metric printing
      perf stat: Print generic metric header even for failed expressions
      perf pmu: Extract function to get JSON alias map
      perf stat: Support JSON metrics in perf stat
      perf list: Add metric groups to perf list
      perf stat: Don't use ctx for saved values lookup
      perf stat: Support duration_time for metrics
      perf stat: Hide internal duration_time counter
      perf stat: Update walltime_nsecs_stats in interval mode
      perf record: Support direct --user-regs arguments
      perf script: Support user regs
      perf stat: Fall weak group back even for EBADF
      perf vendor events: Add JSON metrics for Broadwell
      perf vendor events: Add JSON metrics for Skylake
      perf vendor events: Add JSON metrics for Sandy Bridge
      perf vendor events: Add JSON metrics for Sandy Bridge EP
      perf vendor events: Add JSON metrics for Ivy Bridge
      perf vendor events: Add JSON metrics for Haswell
      perf vendor events: Add JSON metrics for Ivy Town
      perf vendor events: Add JSON metrics for Haswell EP
      perf vendor events: Add JSON metrics for Broadwell Server
      perf vendor events: Add JSON metrics for Broadwell DE
      perf vendor events: Add JSON metrics for Skylake server
      perf pmu: Improve error messages for missing PMUs
      perf stat: Fix adding multiple event groups

Arnaldo Carvalho de Melo (7):
      perf tools: Make copyfile_offset() static
      perf machine: Optimize a bit the machine__findnew_thread() methods
      perf trace beauty madvise: Generate 'behavior' string table from kernel headers
      tools: Update asm-generic/mman-common.h copy from the kernel
      perf tools: Get all of tools/{arch,include}/ in the MANIFEST
      tools include: Do not use poison with C++
      perf tools: Provide mutex wrappers for pthreads rwlocks

David Ahern (1):
      perf sched timehist: Add pid and tid options

Jiri Olsa (3):
      perf tools: Add python-clean target
      perf ui progress: Add ui specific init function
      perf ui progress: Add size info into progress bar

Kan Liang (2):
      perf tools: Use scandir() to replace readdir()
      perf machine: Use hashtable for machine threads

Martin Kepplinger (1):
      perf tools: Fix leaking rec_argv in error cases

Taeung Song (2):
      perf config: Write a config file just once
      perf config: Allow creating empty config set for config file autogeneration

Xiaochen Shen (1):
      perf tests: Remove Intel CQM perf test

 tools/include/linux/poison.h                       |   5 +
 tools/include/uapi/asm-generic/mman-common.h       |  14 +-
 tools/perf/Documentation/perf-list.txt             |   9 +-
 tools/perf/Documentation/perf-record.txt           |   2 +
 tools/perf/Documentation/perf-sched.txt            |   8 +
 tools/perf/Documentation/perf-script.txt           |   4 +-
 tools/perf/Documentation/perf-stat.txt             |   7 +
 tools/perf/MANIFEST                                |  87 +---
 tools/perf/Makefile.perf                           |  17 +-
 tools/perf/arch/x86/include/arch-tests.h           |   1 -
 tools/perf/arch/x86/tests/Build                    |   1 -
 tools/perf/arch/x86/tests/arch-tests.c             |   4 -
 tools/perf/arch/x86/tests/intel-cqm.c              | 127 ------
 tools/perf/builtin-c2c.c                           |   1 +
 tools/perf/builtin-config.c                        |  22 +-
 tools/perf/builtin-kvm.c                           |   1 -
 tools/perf/builtin-list.c                          |   7 +
 tools/perf/builtin-mem.c                           |   1 +
 tools/perf/builtin-record.c                        |   3 +
 tools/perf/builtin-sched.c                         |   4 +
 tools/perf/builtin-script.c                        |  32 +-
 tools/perf/builtin-stat.c                          |  82 +++-
 tools/perf/builtin-timechart.c                     |   4 +-
 tools/perf/builtin-trace.c                         |  20 +-
 tools/perf/perf.h                                  |   1 +
 .../pmu-events/arch/x86/broadwell/bdw-metrics.json | 164 +++++++
 .../arch/x86/broadwellde/bdwde-metrics.json        | 164 +++++++
 .../arch/x86/broadwellx/bdx-metrics.json           | 164 +++++++
 .../pmu-events/arch/x86/haswell/hsw-metrics.json   | 158 +++++++
 .../pmu-events/arch/x86/haswellx/hsx-metrics.json  | 158 +++++++
 .../pmu-events/arch/x86/ivybridge/ivb-metrics.json | 164 +++++++
 .../pmu-events/arch/x86/ivytown/ivt-metrics.json   | 164 +++++++
 .../pmu-events/arch/x86/jaketown/jkt-metrics.json  | 140 ++++++
 .../arch/x86/sandybridge/snb-metrics.json          | 140 ++++++
 .../pmu-events/arch/x86/skylake/skl-metrics.json   | 164 +++++++
 .../pmu-events/arch/x86/skylakex/skx-metrics.json  | 182 ++++++++
 tools/perf/pmu-events/jevents.c                    |  24 +-
 tools/perf/pmu-events/jevents.h                    |   2 +-
 tools/perf/pmu-events/pmu-events.h                 |   1 +
 tools/perf/tests/builtin-test.c                    |   1 +
 tools/perf/trace/beauty/madvise_behavior.sh        |  10 +
 tools/perf/trace/beauty/mmap.c                     |  38 +-
 tools/perf/ui/progress.c                           |   6 +-
 tools/perf/ui/progress.h                           |  12 +-
 tools/perf/ui/tui/progress.c                       |  32 +-
 tools/perf/util/Build                              |   2 +
 tools/perf/util/config.c                           |   5 +-
 tools/perf/util/data.c                             |   1 +
 tools/perf/util/dso.c                              |  13 +-
 tools/perf/util/dso.h                              |   4 +-
 tools/perf/util/event.c                            |  46 +-
 tools/perf/util/evlist.h                           |   1 +
 tools/perf/util/evsel.c                            |   7 +-
 tools/perf/util/evsel.h                            |   1 +
 tools/perf/util/machine.c                          | 155 ++++---
 tools/perf/util/machine.h                          |  24 +-
 tools/perf/util/map.c                              |  34 +-
 tools/perf/util/map.h                              |   3 +-
 tools/perf/util/metricgroup.c                      | 490 +++++++++++++++++++++
 tools/perf/util/metricgroup.h                      |  31 ++
 tools/perf/util/namespaces.c                       |   1 +
 tools/perf/util/parse-events.c                     |  29 +-
 tools/perf/util/parse-events.h                     |   3 +
 tools/perf/util/parse-events.l                     |   3 +-
 tools/perf/util/pmu.c                              |  55 ++-
 tools/perf/util/pmu.h                              |   2 +
 tools/perf/util/probe-file.c                       |   1 +
 tools/perf/util/rb_resort.h                        |   5 +-
 tools/perf/util/rwsem.c                            |  32 ++
 tools/perf/util/rwsem.h                            |  19 +
 tools/perf/util/session.c                          |   2 +-
 tools/perf/util/stat-shadow.c                      | 110 +++--
 tools/perf/util/stat.h                             |   4 +-
 tools/perf/util/symbol.c                           |   8 +-
 tools/perf/util/thread.c                           |   4 +-
 tools/perf/util/trace-event-info.c                 |   1 -
 tools/perf/util/trace-event-read.c                 |   1 -
 tools/perf/util/util.c                             |  16 +-
 tools/perf/util/util.h                             |   7 +-
 tools/perf/util/vdso.c                             |   4 +-
 tools/perf/util/zlib.c                             |   1 +
 81 files changed, 2988 insertions(+), 489 deletions(-)
 delete mode 100644 tools/perf/arch/x86/tests/intel-cqm.c
 create mode 100644 tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/bdwde-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/haswell/hsw-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/ivybridge/ivb-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/ivt-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/jkt-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
 create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
 create mode 100755 tools/perf/trace/beauty/madvise_behavior.sh
 create mode 100644 tools/perf/util/metricgroup.c
 create mode 100644 tools/perf/util/metricgroup.h
 create mode 100644 tools/perf/util/rwsem.c
 create mode 100644 tools/perf/util/rwsem.h

* [PATCH 01/44] perf sched timehist: Add pid and tid options
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 02/44] perf tools: Support weak groups in 'perf stat' Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, David Ahern, Namhyung Kim,
	Arnaldo Carvalho de Melo

From: David Ahern <dsahern@gmail.com>

Add options to only show events for specific pid(s) and tid(s).

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1504288152-19690-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-sched.txt | 8 ++++++++
 tools/perf/builtin-sched.c              | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index a092a2499e8f..55b67338548e 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -106,6 +106,14 @@ OPTIONS for 'perf sched timehist'
 --max-stack::
 	Maximum number of functions to display in backtrace, default 5.
 
+-p=::
+--pid=::
+	Only show events for given process ID (comma separated list).
+
+-t=::
+--tid=::
+	Only show events for given thread ID (comma separated list).
+
 -s::
 --summary::
     Show only a summary of scheduling by thread with min, max, and average
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 322b4def8411..b7e8812ee80c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3363,6 +3363,10 @@ int cmd_sched(int argc, const char **argv)
 	OPT_STRING(0, "time", &sched.time_str, "str",
 		   "Time span for analysis (start,stop)"),
 	OPT_BOOLEAN(0, "state", &sched.show_state, "Show task state when sched-out"),
+	OPT_STRING('p', "pid", &symbol_conf.pid_list_str, "pid[,pid...]",
+		   "analyze events only for given process id(s)"),
+	OPT_STRING('t', "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
+		   "analyze events only for given thread id(s)"),
 	OPT_PARENT(sched_options)
 	};
 
-- 
2.13.5

* [PATCH 02/44] perf tools: Support weak groups in 'perf stat'
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 01/44] perf sched timehist: Add pid and tid options Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 03/44] perf vendor events: Support metric_group and no event name in JSON parser Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Setting up groups can be difficult due to the complicated scheduling
restrictions of different PMUs.

User tools usually don't understand all these restrictions.

Still, in many cases it is useful to set up groups, and they work most of
the time. However, if the group is set up wrong, some members will not
report any value because they never get scheduled.

Add a concept of a 'weak group': try to set up a group, but if it's not
schedulable, fall back to not using a group. That gives us the best of
both worlds: groups if they work, but still a usable fallback if they
don't.

In theory it would be possible to have more complex fallback strategies
(e.g. try to split the group in half), but the simple fallback of not
using a group seems to work for now.

So far the weak group is only implemented for perf stat, not for record.

Here's an unschedulable group (on IvyBridge with SMT on):

  % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

        73,806,067      branches
         4,848,144      branch-misses             #    6.57% of all branches
        14,754,458      l1d.replacement
        24,905,558      l2_lines_in.all
   <not supported>      l2_rqsts.all_code_rd         <------- will never report anything

With the weak group:

  % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1

       125,366,055      branches                                                      (80.02%)
         9,208,402      branch-misses             #    7.35% of all branches          (80.01%)
        24,560,249      l1d.replacement                                               (80.00%)
        43,174,971      l2_lines_in.all                                               (80.05%)
        31,891,457      l2_rqsts.all_code_rd                                          (79.92%)

The extra event is scheduled, with some extra multiplexing.

v2: Move fallback code to separate function.
Add comment on for_each_group_member
Adjust to new perf_evsel__close interface
v3: Fix debug print out.

Committer testing:

Before:

  # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

   Performance counter stats for 'system wide':

     <not counted>      branches
     <not counted>      branch-misses
     <not counted>      l1d.replacement
     <not counted>      l2_lines_in.all
   <not supported>      l2_rqsts.all_code_rd

       1.002147212 seconds time elapsed

  # perf stat -e '{branches,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1

   Performance counter stats for 'system wide':

        83,207,892      branches
        11,065,444      l1d.replacement
        28,484,024      l2_lines_in.all
        12,186,179      l2_rqsts.all_code_rd

       1.001739493 seconds time elapsed

After:

  # perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}':W -a sleep 1

   Performance counter stats for 'system wide':

       543,323,909      branches                                                      (80.01%)
        27,100,512      branch-misses             #    4.99% of all branches          (80.02%)
        50,402,905      l1d.replacement                                               (80.03%)
        67,385,892      l2_lines_in.all                                               (80.01%)
        21,352,885      l2_rqsts.all_code_rd                                          (79.94%)

       1.001086658 seconds time elapsed

  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170831194036.30146-2-andi@firstfloor.org
[ Add a "'perf stat' only, for now" comment in the man page, suggested by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt |  2 ++
 tools/perf/builtin-stat.c              | 35 ++++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/parse-events.c         |  8 +++++++-
 tools/perf/util/parse-events.l         |  2 +-
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index f709de54707b..75fc17f47298 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -47,6 +47,8 @@ counted. The following modifiers exist:
  P - use maximum detected precise level
  S - read sample value (PERF_SAMPLE_READ)
  D - pin the event to the PMU
+ W - group is weak and will fallback to non-group if not schedulable,
+     only supported in 'perf stat' for now.
 
 The 'p' modifier can be used for specifying how precise the instruction
 address should be. The 'p' modifier can be specified multiple times:
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 69523ed55894..7cc61eb0d83b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -582,6 +582,32 @@ static bool perf_evsel__should_store_id(struct perf_evsel *counter)
 	return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
 }
 
+static struct perf_evsel *perf_evsel__reset_weak_group(struct perf_evsel *evsel)
+{
+	struct perf_evsel *c2, *leader;
+	bool is_open = true;
+
+	leader = evsel->leader;
+	pr_debug("Weak group for %s/%d failed\n",
+			leader->name, leader->nr_members);
+
+	/*
+	 * for_each_group_member doesn't work here because it doesn't
+	 * include the first entry.
+	 */
+	evlist__for_each_entry(evsel_list, c2) {
+		if (c2 == evsel)
+			is_open = false;
+		if (c2->leader == leader) {
+			if (is_open)
+				perf_evsel__close(c2);
+			c2->leader = c2;
+			c2->nr_members = 0;
+		}
+	}
+	return leader;
+}
+
 static int __run_perf_stat(int argc, const char **argv)
 {
 	int interval = stat_config.interval;
@@ -618,6 +644,15 @@ static int __run_perf_stat(int argc, const char **argv)
 	evlist__for_each_entry(evsel_list, counter) {
 try_again:
 		if (create_perf_stat_counter(counter) < 0) {
+
+			/* Weak group failed. Reset the group. */
+			if (errno == EINVAL &&
+			    counter->leader != counter &&
+			    counter->weak_group) {
+				counter = perf_evsel__reset_weak_group(counter);
+				goto try_again;
+			}
+
 			/*
 			 * PPC returns ENXIO for HW counters until 2.6.37
 			 * (behavior changed with commit b0a873e).
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dd2c4b5112a5..db658785d828 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -137,6 +137,7 @@ struct perf_evsel {
 	const char *		metric_name;
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
+	bool			weak_group;
 };
 
 union u64_swap {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index f6257fb4f08c..57d7acf890e0 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1366,6 +1366,7 @@ struct event_modifier {
 	int exclude_GH;
 	int sample_read;
 	int pinned;
+	int weak;
 };
 
 static int get_event_modifier(struct event_modifier *mod, char *str,
@@ -1384,6 +1385,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 
 	int exclude = eu | ek | eh;
 	int exclude_GH = evsel ? evsel->exclude_GH : 0;
+	int weak = 0;
 
 	memset(mod, 0, sizeof(*mod));
 
@@ -1421,6 +1423,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 			sample_read = 1;
 		} else if (*str == 'D') {
 			pinned = 1;
+		} else if (*str == 'W') {
+			weak = 1;
 		} else
 			break;
 
@@ -1451,6 +1455,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
 	mod->exclude_GH = exclude_GH;
 	mod->sample_read = sample_read;
 	mod->pinned = pinned;
+	mod->weak = weak;
 
 	return 0;
 }
@@ -1464,7 +1469,7 @@ static int check_modifier(char *str)
 	char *p = str;
 
 	/* The sizeof includes 0 byte as well. */
-	if (strlen(str) > (sizeof("ukhGHpppPSDI") - 1))
+	if (strlen(str) > (sizeof("ukhGHpppPSDIW") - 1))
 		return -1;
 
 	while (*p) {
@@ -1504,6 +1509,7 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
 		evsel->exclude_GH          = mod.exclude_GH;
 		evsel->sample_read         = mod.sample_read;
 		evsel->precise_max         = mod.precise_max;
+		evsel->weak_group	   = mod.weak;
 
 		if (perf_evsel__is_group_leader(evsel))
 			evsel->attr.pinned = mod.pinned;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index c42edeac451f..fdb5bb52f01f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -161,7 +161,7 @@ name		[a-zA-Z_*?][a-zA-Z0-9_*?.]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
 drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
 /* If you add a modifier you need to update check_modifier() */
-modifier_event	[ukhpPGHSDI]+
+modifier_event	[ukhpPGHSDIW]+
 modifier_bp	[rwx]{1,3}
 
 %%
-- 
2.13.5

* [PATCH 03/44] perf vendor events: Support metric_group and no event name in JSON parser
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 01/44] perf sched timehist: Add pid and tid options Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 02/44] perf tools: Support weak groups in 'perf stat' Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 04/44] perf stat: Factor out generic metric printing Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Some enhancements to the JSON parser to prepare for metrics support:

- Parse the new MetricGroup field
- Support JSON events with no event name, that have only MetricName.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/pmu-events/jevents.c    | 24 ++++++++++++++++++------
 tools/perf/pmu-events/jevents.h    |  2 +-
 tools/perf/pmu-events/pmu-events.h |  1 +
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index d51dc9ca8861..9eb7047bafe4 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -292,7 +292,7 @@ static int print_events_table_entry(void *data, char *name, char *event,
 				    char *desc, char *long_desc,
 				    char *pmu, char *unit, char *perpkg,
 				    char *metric_expr,
-				    char *metric_name)
+				    char *metric_name, char *metric_group)
 {
 	struct perf_entry_data *pd = data;
 	FILE *outfp = pd->outfp;
@@ -304,8 +304,10 @@ static int print_events_table_entry(void *data, char *name, char *event,
 	 */
 	fprintf(outfp, "{\n");
 
-	fprintf(outfp, "\t.name = \"%s\",\n", name);
-	fprintf(outfp, "\t.event = \"%s\",\n", event);
+	if (name)
+		fprintf(outfp, "\t.name = \"%s\",\n", name);
+	if (event)
+		fprintf(outfp, "\t.event = \"%s\",\n", event);
 	fprintf(outfp, "\t.desc = \"%s\",\n", desc);
 	fprintf(outfp, "\t.topic = \"%s\",\n", topic);
 	if (long_desc && long_desc[0])
@@ -320,6 +322,8 @@ static int print_events_table_entry(void *data, char *name, char *event,
 		fprintf(outfp, "\t.metric_expr = \"%s\",\n", metric_expr);
 	if (metric_name)
 		fprintf(outfp, "\t.metric_name = \"%s\",\n", metric_name);
+	if (metric_group)
+		fprintf(outfp, "\t.metric_group = \"%s\",\n", metric_group);
 	fprintf(outfp, "},\n");
 
 	return 0;
@@ -357,6 +361,9 @@ static char *real_event(const char *name, char *event)
 {
 	int i;
 
+	if (!name)
+		return NULL;
+
 	for (i = 0; fixed[i].name; i++)
 		if (!strcasecmp(name, fixed[i].name))
 			return (char *)fixed[i].event;
@@ -369,7 +376,7 @@ int json_events(const char *fn,
 		      char *long_desc,
 		      char *pmu, char *unit, char *perpkg,
 		      char *metric_expr,
-		      char *metric_name),
+		      char *metric_name, char *metric_group),
 	  void *data)
 {
 	int err = -EIO;
@@ -397,6 +404,7 @@ int json_events(const char *fn,
 		char *unit = NULL;
 		char *metric_expr = NULL;
 		char *metric_name = NULL;
+		char *metric_group = NULL;
 		unsigned long long eventcode = 0;
 		struct msrmap *msr = NULL;
 		jsmntok_t *msrval = NULL;
@@ -476,6 +484,8 @@ int json_events(const char *fn,
 				addfield(map, &perpkg, "", "", val);
 			} else if (json_streq(map, field, "MetricName")) {
 				addfield(map, &metric_name, "", "", val);
+			} else if (json_streq(map, field, "MetricGroup")) {
+				addfield(map, &metric_group, "", "", val);
 			} else if (json_streq(map, field, "MetricExpr")) {
 				addfield(map, &metric_expr, "", "", val);
 				for (s = metric_expr; *s; s++)
@@ -501,10 +511,11 @@ int json_events(const char *fn,
 			addfield(map, &event, ",", filter, NULL);
 		if (msr != NULL)
 			addfield(map, &event, ",", msr->pname, msrval);
-		fixname(name);
+		if (name)
+			fixname(name);
 
 		err = func(data, name, real_event(name, event), desc, long_desc,
-				pmu, unit, perpkg, metric_expr, metric_name);
+			   pmu, unit, perpkg, metric_expr, metric_name, metric_group);
 		free(event);
 		free(desc);
 		free(name);
@@ -516,6 +527,7 @@ int json_events(const char *fn,
 		free(unit);
 		free(metric_expr);
 		free(metric_name);
+		free(metric_group);
 		if (err)
 			break;
 		tok += j;
diff --git a/tools/perf/pmu-events/jevents.h b/tools/perf/pmu-events/jevents.h
index 611fac01913d..557994754410 100644
--- a/tools/perf/pmu-events/jevents.h
+++ b/tools/perf/pmu-events/jevents.h
@@ -6,7 +6,7 @@ int json_events(const char *fn,
 				char *long_desc,
 				char *pmu,
 				char *unit, char *perpkg, char *metric_expr,
-				char *metric_name),
+				char *metric_name, char *metric_group),
 		void *data);
 char *get_cpu_str(void);
 
diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index 569eab3688dd..94fa1720f6fd 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -15,6 +15,7 @@ struct pmu_event {
 	const char *perpkg;
 	const char *metric_expr;
 	const char *metric_name;
+	const char *metric_group;
 };
 
 /*
-- 
2.13.5

* [PATCH 04/44] perf stat: Factor out generic metric printing
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 03/44] perf vendor events: Support metric_group and no event name in JSON parser Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 05/44] perf stat: Print generic metric header even for failed expressions Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

The 'perf stat' shadow metric printing already supports generic metrics.
Factor out the code doing that into a separate function that can be
re-used in a later patch.

No behavior changes.

v2: Fix indentation

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 69 ++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index a04cf56d3517..96aa6cbf24d6 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -627,6 +627,46 @@ static void print_smi_cost(int cpu, struct perf_evsel *evsel,
 	out->print_metric(out->ctx, NULL, "%4.0f", "SMI#", smi_num);
 }
 
+static void generic_metric(const char *metric_expr,
+			   struct perf_evsel **metric_events,
+			   char *name,
+			   const char *metric_name,
+			   double avg,
+			   int cpu,
+			   int ctx,
+			   struct perf_stat_output_ctx *out)
+{
+	print_metric_t print_metric = out->print_metric;
+	struct parse_ctx pctx;
+	double ratio;
+	int i;
+	void *ctxp = out->ctx;
+
+	expr__ctx_init(&pctx);
+	expr__add_id(&pctx, name, avg);
+	for (i = 0; metric_events[i]; i++) {
+		struct saved_value *v;
+
+		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		if (!v)
+			break;
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+	}
+	if (!metric_events[i]) {
+		const char *p = metric_expr;
+
+		if (expr__parse(&ratio, &pctx, &p) == 0)
+			print_metric(ctxp, NULL, "%8.1f",
+				metric_name ?
+				metric_name :
+				out->force_header ?  name : "",
+				ratio);
+		else
+			print_metric(ctxp, NULL, NULL, "", 0);
+	} else
+		print_metric(ctxp, NULL, NULL, "", 0);
+}
+
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out)
@@ -819,33 +859,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		else
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
-		struct parse_ctx pctx;
-		int i;
-
-		expr__ctx_init(&pctx);
-		expr__add_id(&pctx, evsel->name, avg);
-		for (i = 0; evsel->metric_events[i]; i++) {
-			struct saved_value *v;
-
-			v = saved_value_lookup(evsel->metric_events[i], cpu, ctx, false);
-			if (!v)
-				break;
-			expr__add_id(&pctx, evsel->metric_events[i]->name,
-					     avg_stats(&v->stats));
-		}
-		if (!evsel->metric_events[i]) {
-			const char *p = evsel->metric_expr;
-
-			if (expr__parse(&ratio, &pctx, &p) == 0)
-				print_metric(ctxp, NULL, "%8.1f",
-					evsel->metric_name ?
-					evsel->metric_name :
-					out->force_header ?  evsel->name : "",
-					ratio);
-			else
-				print_metric(ctxp, NULL, NULL, "", 0);
-		} else
-			print_metric(ctxp, NULL, NULL, "", 0);
+		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
+				evsel->metric_name, avg, cpu, ctx, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];
-- 
2.13.5

* [PATCH 05/44] perf stat: Print generic metric header even for failed expressions
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 04/44] perf stat: Factor out generic metric printing Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 06/44] perf pmu: Extract function to get JSON alias map Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Print the generic metric header even when the expression evaluation
fails. Otherwise an expression that fails in the first collections,
e.g. due to division by zero, may suddenly appear later without a
header.
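
As a rough illustration (a standalone sketch, not code from this patch;
the demo_ names are made up for the example), the header choice on the
successful and on the failed evaluation path of generic_metric() can be
written as two small helpers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Header used when the expression evaluated successfully. */
static const char *demo_header_ok(bool force_header, const char *metric_name,
				  const char *name)
{
	if (metric_name)
		return metric_name;
	return force_header ? name : "";
}

/*
 * Header used when evaluation failed (e.g. division by zero).
 * With this patch the column keeps its name whenever headers are forced,
 * instead of always falling back to an empty string.
 */
static const char *demo_header_failed(bool force_header, const char *metric_name,
				      const char *name)
{
	if (!force_header)
		return "";
	return metric_name ? metric_name : name;
}
```

Both helpers only pick a string. In the real code the result is handed
to print_metric() together with a NULL format on the failure path.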

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 96aa6cbf24d6..8c7ab29169b9 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -662,7 +662,9 @@ static void generic_metric(const char *metric_expr,
 				out->force_header ?  name : "",
 				ratio);
 		else
-			print_metric(ctxp, NULL, NULL, "", 0);
+			print_metric(ctxp, NULL, NULL,
+				     out->force_header ?
+				     (metric_name ? metric_name : name) : "", 0);
 	} else
 		print_metric(ctxp, NULL, NULL, "", 0);
 }
-- 
2.13.5

* [PATCH 06/44] perf pmu: Extract function to get JSON alias map
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 05/44] perf stat: Print generic metric header even for failed expressions Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 07/44] perf stat: Support JSON metrics in perf stat Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Extract the code that gets the per-CPU JSON alias map into a separate
function for reuse. No behavior changes.
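
The lookup itself is a linear scan over a table that ends with a
sentinel entry. A minimal standalone sketch of that pattern follows.
The demo_ types and names are hypothetical stand-ins for struct
pmu_events_map, and the NULL check on the CPUID string is an addition
of this sketch, not something taken from the patch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct pmu_event / struct pmu_events_map. */
struct demo_event {
	const char *name;
};

struct demo_map {
	const char *cpuid;
	const struct demo_event *table;	/* NULL marks the end of the maps */
};

/*
 * Return the map whose cpuid matches, or NULL when there is none.
 * A NULL cpuid (CPUID string unavailable) is treated as "no match";
 * that guard is an assumption made by this sketch.
 */
static const struct demo_map *demo_find_map(const struct demo_map *maps,
					    const char *cpuid)
{
	size_t i;

	if (!cpuid)
		return NULL;

	for (i = 0; maps[i].table; i++) {
		if (!strcmp(maps[i].cpuid, cpuid))
			return &maps[i];
	}
	return NULL;
}
```

Returning NULL for "not found" lets a caller such as
pmu_add_cpu_aliases() bail out early with a single check.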

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 49 +++++++++++++++++++++++++++++++++----------------
 tools/perf/util/pmu.h |  2 ++
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ac16a9db1fb5..ed25d7f88731 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -516,16 +516,8 @@ char * __weak get_cpuid_str(void)
 	return NULL;
 }
 
-/*
- * From the pmu_events_map, find the table of PMU events that corresponds
- * to the current running CPU. Then, add all PMU events from that table
- * as aliases.
- */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static char *perf_pmu__getcpuid(void)
 {
-	int i;
-	struct pmu_events_map *map;
-	struct pmu_event *pe;
 	char *cpuid;
 	static bool printed;
 
@@ -535,22 +527,50 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 	if (!cpuid)
 		cpuid = get_cpuid_str();
 	if (!cpuid)
-		return;
+		return NULL;
 
 	if (!printed) {
 		pr_debug("Using CPUID %s\n", cpuid);
 		printed = true;
 	}
+	return cpuid;
+}
+
+struct pmu_events_map *perf_pmu__find_map(void)
+{
+	struct pmu_events_map *map;
+	char *cpuid = perf_pmu__getcpuid();
+	int i;
 
 	i = 0;
-	while (1) {
+	for (;;) {
 		map = &pmu_events_map[i++];
-		if (!map->table)
-			goto out;
+		if (!map->table) {
+			map = NULL;
+			break;
+		}
 
 		if (!strcmp(map->cpuid, cpuid))
 			break;
 	}
+	free(cpuid);
+	return map;
+}
+
+/*
+ * From the pmu_events_map, find the table of PMU events that corresponds
+ * to the current running CPU. Then, add all PMU events from that table
+ * as aliases.
+ */
+static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+{
+	int i;
+	struct pmu_events_map *map;
+	struct pmu_event *pe;
+
+	map = perf_pmu__find_map();
+	if (!map)
+		return;
 
 	/*
 	 * Found a matching PMU events table. Create aliases
@@ -575,9 +595,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 				(char *)pe->metric_expr,
 				(char *)pe->metric_name);
 	}
-
-out:
-	free(cpuid);
 }
 
 struct perf_event_attr * __weak
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 389e9729331f..060f6abba8ed 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -90,4 +90,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
+struct pmu_events_map *perf_pmu__find_map(void);
+
 #endif /* __PMU_H */
-- 
2.13.5

* [PATCH 07/44] perf stat: Support JSON metrics in perf stat
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 06/44] perf pmu: Extract function to get JSON alias map Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 08/44] perf list: Add metric groups to perf list Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).

Previously, metrics were always tied to an event and automatically
enabled with that event. Change this so that we can have standalone
metrics. They are in the same JSON data structure as events, but don't
have an event name.

Metrics can also be organized into metric groups, which provide a
shortcut for selecting several related metrics at once.
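
A metric can list several groups separated by ';'. The following is a
standalone sketch of matching one requested name against such a list,
ignoring case. It compares entry by entry, which is a simplification
and not the strcasestr() based match_metric() from this patch; the
"all" and "No_group" special cases are left out, and the demo_ names
are made up:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare the first n bytes of a and b, ignoring ASCII case. */
static bool demo_eq_nocase(const char *a, const char *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return false;
	return true;
}

/*
 * Return true if 'want' equals one of the ';'-separated entries in
 * 'groups'. Blanks following a ';' are skipped.
 */
static bool demo_group_matches(const char *groups, const char *want)
{
	size_t wlen;

	if (!groups || !want || !*want)
		return false;
	wlen = strlen(want);

	while (*groups) {
		size_t len;

		while (*groups == ' ')
			groups++;
		len = strcspn(groups, ";");	/* length of this entry */
		if (len == wlen && demo_eq_nocase(groups, want, wlen))
			return true;
		groups += len;
		if (*groups == ';')
			groups++;
	}
	return false;
}
```

Comparing whole entries means "Frontend" does not accidentally select a
metric that is only in "Frontend_Bandwidth".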

Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.
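
For every selected metric, the events used by its expression are added
to the event list as one weak group, "{ev1,ev2}:W", with groups
separated by commas. A standalone sketch of building that string into a
fixed buffer is shown here. The patch itself uses strbuf; the demo_
helpers and the fixed-size buffer are assumptions of this example:

```c
#include <stddef.h>
#include <string.h>

/* Append s to the NUL-terminated string in buf; -1 if it does not fit. */
static int demo_append(char *buf, size_t size, const char *s)
{
	size_t len = strlen(buf), add = strlen(s);

	if (len + add + 1 > size)
		return -1;
	memcpy(buf + len, s, add + 1);
	return 0;
}

/*
 * Append "{id0,id1,...}:W" to buf, preceded by "," when buf already
 * holds an earlier group. Returns 0 on success and -1 on overflow, in
 * which case buf may contain a partially written group.
 */
static int demo_append_weak_group(char *buf, size_t size,
				  const char *const *ids, int idnum)
{
	int i;

	if (idnum <= 0)
		return -1;
	if (buf[0] && demo_append(buf, size, ","))
		return -1;
	for (i = 0; i < idnum; i++) {
		if (demo_append(buf, size, i == 0 ? "{" : ",") ||
		    demo_append(buf, size, ids[i]))
			return -1;
	}
	return demo_append(buf, size, "}:W");
}
```

The resulting string is what gets passed on to the normal event parser,
so the metric code needs no event setup logic of its own.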

Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist. When
computing shadow values, look for metrics in that list. They are then
computed using the existing saved values infrastructure in
stat-shadow.c.
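
After the events are parsed, each metric has to be tied back to the
events that were created for it, i.e. to a run of consecutive entries
in the event list whose names match the metric's ids in order. A
standalone sketch of that search over plain string arrays follows. It
uses a simple quadratic scan rather than the single-pass loop in
find_evsel(), and demo_find_run() is a made-up name:

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the first position in names[0..n) where the idnum entries of
 * ids[] appear consecutively and in order. Returns the start index of
 * the run, or -1 if there is no such run.
 */
static int demo_find_run(const char *const *names, int n,
			 const char *const *ids, int idnum)
{
	int start, j;

	if (idnum <= 0)
		return -1;

	for (start = 0; start + idnum <= n; start++) {
		for (j = 0; j < idnum; j++)
			if (strcmp(names[start + j], ids[j]))
				break;
		if (j == idnum)
			return start;
	}
	return -1;
}
```

The first entry of the run plays the role of the event under which the
metric is later looked up and printed.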

The actual JSON metrics are in a separate pull request.

  % perf stat -M Summary --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Instructions   CLKS          CPU_Utilization  GFLOPs   SMT_2T_Utilization   Kernel_Utilization
  317614222.0    1392930775.0  0.0              0.0      0.2                  0.1

       1.001497549 seconds time elapsed

  % perf stat -M GFLOPs flops

   Performance counter stats for 'flops':

     3,999,541,471  fp_comp_ops_exe.sse_scalar_single #  1.2 GFLOPs   (66.65%)
                14  fp_comp_ops_exe.sse_scalar_double                 (66.65%)
                 0  fp_comp_ops_exe.sse_packed_double                 (66.67%)
                 0  fp_comp_ops_exe.sse_packed_single                 (66.70%)
                 0  simd_fp_256.packed_double                         (66.70%)
                 0  simd_fp_256.packed_single                         (66.67%)
                 0  duration_time

       3.238372845 seconds time elapsed

v2: Add missing header file
v3: Move find_map to pmu.c

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-stat.txt |   7 +
 tools/perf/builtin-stat.c              |  18 +-
 tools/perf/util/Build                  |   1 +
 tools/perf/util/metricgroup.c          | 313 +++++++++++++++++++++++++++++++++
 tools/perf/util/metricgroup.h          |  31 ++++
 tools/perf/util/pmu.c                  |   5 +-
 tools/perf/util/stat-shadow.c          |  22 ++-
 tools/perf/util/stat.h                 |   4 +-
 8 files changed, 395 insertions(+), 6 deletions(-)
 create mode 100644 tools/perf/util/metricgroup.c
 create mode 100644 tools/perf/util/metricgroup.h

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index c37d61682dfb..823fce7674bb 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -199,6 +199,13 @@ Aggregate counts per processor socket for system-wide mode measurements.
 --per-core::
 Aggregate counts per physical processor for system-wide mode measurements.
 
+-M::
+--metrics::
+Print metrics or metricgroups specified in a comma separated list.
+For a group all metrics from the group are added.
+The events from the metrics are automatically measured.
+See perf list output for the possible metrics and metricgroups.
+
 -A::
 --no-aggr::
 Do not aggregate counts across all monitored CPUs.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7cc61eb0d83b..874bc6dd8d60 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -65,6 +65,7 @@
 #include "util/tool.h"
 #include "util/group.h"
 #include "util/string2.h"
+#include "util/metricgroup.h"
 #include "asm/bug.h"
 
 #include <linux/time64.h>
@@ -133,6 +134,8 @@ static const char *smi_cost_attrs = {
 
 static struct perf_evlist	*evsel_list;
 
+static struct rblist		 metric_events;
+
 static struct target target = {
 	.uid	= UINT_MAX,
 };
@@ -1234,7 +1237,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	perf_stat__print_shadow_stats(counter, uval,
 				first_shadow_cpu(counter, id),
-				&out);
+				&out, &metric_events);
 	if (!csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
@@ -1565,7 +1568,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 		os.evsel = counter;
 		perf_stat__print_shadow_stats(counter, 0,
 					      0,
-					      &out);
+					      &out,
+					      &metric_events);
 	}
 	fputc('\n', stat_config.output);
 }
@@ -1789,6 +1793,13 @@ static int enable_metric_only(const struct option *opt __maybe_unused,
 	return 0;
 }
 
+static int parse_metric_groups(const struct option *opt,
+			       const char *str,
+			       int unset __maybe_unused)
+{
+	return metricgroup__parse_groups(opt, str, &metric_events);
+}
+
 static const struct option stat_options[] = {
 	OPT_BOOLEAN('T', "transaction", &transaction_run,
 		    "hardware transaction statistics"),
@@ -1854,6 +1865,9 @@ static const struct option stat_options[] = {
 			"measure topdown level 1 statistics"),
 	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
 			"measure SMI cost"),
+	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
+		     "monitor specified metrics or metric groups (separated by ,)",
+		     parse_metric_groups),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 94518c1bf8b6..71ab8466714d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -34,6 +34,7 @@ libperf-y += dso.o
 libperf-y += symbol.o
 libperf-y += symbol_fprintf.o
 libperf-y += color.o
+libperf-y += metricgroup.o
 libperf-y += header.o
 libperf-y += callchain.o
 libperf-y += values.o
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
new file mode 100644
index 000000000000..7516b1746594
--- /dev/null
+++ b/tools/perf/util/metricgroup.c
@@ -0,0 +1,313 @@
+/*
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+/* Manage metrics and groups of metrics from JSON files */
+
+#include "metricgroup.h"
+#include "evlist.h"
+#include "strbuf.h"
+#include "pmu.h"
+#include "expr.h"
+#include "rblist.h"
+#include "pmu.h"
+#include <string.h>
+#include <stdbool.h>
+#include <errno.h>
+#include "pmu-events/pmu-events.h"
+#include "strbuf.h"
+#include "strlist.h"
+#include <assert.h>
+#include <ctype.h>
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create)
+{
+	struct rb_node *nd;
+	struct metric_event me = {
+		.evsel = evsel
+	};
+	nd = rblist__find(metric_events, &me);
+	if (nd)
+		return container_of(nd, struct metric_event, nd);
+	if (create) {
+		rblist__add_node(metric_events, &me);
+		nd = rblist__find(metric_events, &me);
+		if (nd)
+			return container_of(nd, struct metric_event, nd);
+	}
+	return NULL;
+}
+
+static int metric_event_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct metric_event *a = container_of(rb_node,
+					      struct metric_event,
+					      nd);
+	const struct metric_event *b = entry;
+
+	if (a->evsel == b->evsel)
+		return 0;
+	if ((char *)a->evsel < (char *)b->evsel)
+		return -1;
+	return +1;
+}
+
+static struct rb_node *metric_event_new(struct rblist *rblist __maybe_unused,
+					const void *entry)
+{
+	struct metric_event *me = malloc(sizeof(struct metric_event));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct metric_event));
+	me->evsel = ((struct metric_event *)entry)->evsel;
+	INIT_LIST_HEAD(&me->head);
+	return &me->nd;
+}
+
+static void metricgroup__rblist_init(struct rblist *metric_events)
+{
+	rblist__init(metric_events);
+	metric_events->node_cmp = metric_event_cmp;
+	metric_events->node_new = metric_event_new;
+}
+
+struct egroup {
+	struct list_head nd;
+	int idnum;
+	const char **ids;
+	const char *metric_name;
+	const char *metric_expr;
+};
+
+static struct perf_evsel *find_evsel(struct perf_evlist *perf_evlist,
+				     const char **ids,
+				     int idnum,
+				     struct perf_evsel **metric_events)
+{
+	struct perf_evsel *ev, *start = NULL;
+	int ind = 0;
+
+	evlist__for_each_entry (perf_evlist, ev) {
+		if (!strcmp(ev->name, ids[ind])) {
+			metric_events[ind] = ev;
+			if (ind == 0)
+				start = ev;
+			if (++ind == idnum) {
+				metric_events[ind] = NULL;
+				return start;
+			}
+		} else {
+			ind = 0;
+			start = NULL;
+		}
+	}
+	/*
+	 * This can happen when an alias expands to multiple
+	 * events, like for uncore events.
+	 * We don't support this case for now.
+	 */
+	return NULL;
+}
+
+static int metricgroup__setup_events(struct list_head *groups,
+				     struct perf_evlist *perf_evlist,
+				     struct rblist *metric_events_list)
+{
+	struct metric_event *me;
+	struct metric_expr *expr;
+	int i = 0;
+	int ret = 0;
+	struct egroup *eg;
+	struct perf_evsel *evsel;
+
+	list_for_each_entry (eg, groups, nd) {
+		struct perf_evsel **metric_events;
+
+		metric_events = calloc(sizeof(void *), eg->idnum + 1);
+		if (!metric_events) {
+			ret = -ENOMEM;
+			break;
+		}
+		evsel = find_evsel(perf_evlist, eg->ids, eg->idnum,
+				   metric_events);
+		if (!evsel) {
+			pr_debug("Cannot resolve %s: %s\n",
+					eg->metric_name, eg->metric_expr);
+			continue;
+		}
+		for (i = 0; i < eg->idnum; i++)
+			metric_events[i]->collect_stat = true;
+		me = metricgroup__lookup(metric_events_list, evsel, true);
+		if (!me) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr = malloc(sizeof(struct metric_expr));
+		if (!expr) {
+			ret = -ENOMEM;
+			break;
+		}
+		expr->metric_expr = eg->metric_expr;
+		expr->metric_name = eg->metric_name;
+		expr->metric_events = metric_events;
+		list_add(&expr->nd, &me->head);
+	}
+	return ret;
+}
+
+static bool match_metric(const char *n, const char *list)
+{
+	int len;
+	char *m;
+
+	if (!list)
+		return false;
+	if (!strcmp(list, "all"))
+		return true;
+	if (!n)
+		return !strcasecmp(list, "No_group");
+	len = strlen(list);
+	m = strcasestr(n, list);
+	if (!m)
+		return false;
+	if ((m == n || m[-1] == ';' || m[-1] == ' ') &&
+	    (m[len] == 0 || m[len] == ';'))
+		return true;
+	return false;
+}
+
+static int metricgroup__add_metric(const char *metric, struct strbuf *events,
+				   struct list_head *group_list)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int ret = -EINVAL;
+	int i, j;
+
+	strbuf_init(events, 100);
+	strbuf_addf(events, "%s", "");
+
+	if (!map)
+		return 0;
+
+	for (i = 0; ; i++) {
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		if (match_metric(pe->metric_group, metric) ||
+		    match_metric(pe->metric_name, metric)) {
+			const char **ids;
+			int idnum;
+			struct egroup *eg;
+
+			pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name);
+
+			if (expr__find_other(pe->metric_expr,
+					     NULL, &ids, &idnum) < 0)
+				continue;
+			if (events->len > 0)
+				strbuf_addf(events, ",");
+			for (j = 0; j < idnum; j++) {
+				pr_debug("found event %s\n", ids[j]);
+				strbuf_addf(events, "%s%s",
+					j == 0 ? "{" : ",",
+					ids[j]);
+			}
+			strbuf_addf(events, "}:W");
+
+			eg = malloc(sizeof(struct egroup));
+			if (!eg) {
+				ret = -ENOMEM;
+				break;
+			}
+			eg->ids = ids;
+			eg->idnum = idnum;
+			eg->metric_name = pe->metric_name;
+			eg->metric_expr = pe->metric_expr;
+			list_add_tail(&eg->nd, group_list);
+			ret = 0;
+		}
+	}
+	return ret;
+}
+
+static int metricgroup__add_metric_list(const char *list, struct strbuf *events,
+				        struct list_head *group_list)
+{
+	char *llist, *nlist, *p;
+	int ret = -EINVAL;
+
+	nlist = strdup(list);
+	if (!nlist)
+		return -ENOMEM;
+	llist = nlist;
+	while ((p = strsep(&llist, ",")) != NULL) {
+		ret = metricgroup__add_metric(p, events, group_list);
+		if (ret == -EINVAL) {
+			fprintf(stderr, "Cannot find metric or group `%s'\n",
+					p);
+			break;
+		}
+	}
+	free(nlist);
+	return ret;
+}
+
+static void metricgroup__free_egroups(struct list_head *group_list)
+{
+	struct egroup *eg, *egtmp;
+	int i;
+
+	list_for_each_entry_safe (eg, egtmp, group_list, nd) {
+		for (i = 0; i < eg->idnum; i++)
+			free((char *)eg->ids[i]);
+		free(eg->ids);
+		free(eg);
+	}
+}
+
+int metricgroup__parse_groups(const struct option *opt,
+			   const char *str,
+			   struct rblist *metric_events)
+{
+	struct parse_events_error parse_error;
+	struct perf_evlist *perf_evlist = *(struct perf_evlist **)opt->value;
+	struct strbuf extra_events;
+	LIST_HEAD(group_list);
+	int ret;
+
+	if (metric_events->nr_entries == 0)
+		metricgroup__rblist_init(metric_events);
+	ret = metricgroup__add_metric_list(str, &extra_events, &group_list);
+	if (ret)
+		return ret;
+	pr_debug("adding %s\n", extra_events.buf);
+	memset(&parse_error, 0, sizeof(struct parse_events_error));
+	ret = parse_events(perf_evlist, extra_events.buf, &parse_error);
+	if (ret) {
+		pr_err("Cannot set up events %s\n", extra_events.buf);
+		goto out;
+	}
+	strbuf_release(&extra_events);
+	ret = metricgroup__setup_events(&group_list, perf_evlist,
+					metric_events);
+out:
+	metricgroup__free_egroups(&group_list);
+	return ret;
+}
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
new file mode 100644
index 000000000000..06854e125ee7
--- /dev/null
+++ b/tools/perf/util/metricgroup.h
@@ -0,0 +1,31 @@
+#ifndef METRICGROUP_H
+#define METRICGROUP_H 1
+
+#include "linux/list.h"
+#include "rblist.h"
+#include <subcmd/parse-options.h>
+#include "evlist.h"
+#include "strbuf.h"
+
+struct metric_event {
+	struct rb_node nd;
+	struct perf_evsel *evsel;
+	struct list_head head; /* list of metric_expr */
+};
+
+struct metric_expr {
+	struct list_head nd;
+	const char *metric_expr;
+	const char *metric_name;
+	struct perf_evsel **metric_events;
+};
+
+struct metric_event *metricgroup__lookup(struct rblist *metric_events,
+					 struct perf_evsel *evsel,
+					 bool create);
+int metricgroup__parse_groups(const struct option *opt,
+			const char *str,
+			struct rblist *metric_events);
+
+void metricgroup__print(bool metrics, bool groups, char *filter, bool raw);
+#endif
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ed25d7f88731..7070638ab600 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -580,8 +580,11 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 		const char *pname;
 
 		pe = &map->table[i++];
-		if (!pe->name)
+		if (!pe->name) {
+			if (pe->metric_group || pe->metric_name)
+				continue;
 			break;
+		}
 
 		pname = pe->pmu ? pe->pmu : "cpu";
 		if (strncmp(pname, name, strlen(pname)))
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 8c7ab29169b9..42e6c17be7ff 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -6,6 +6,7 @@
 #include "rblist.h"
 #include "evlist.h"
 #include "expr.h"
+#include "metricgroup.h"
 
 enum {
 	CTX_BIT_USER	= 1 << 0,
@@ -671,13 +672,16 @@ static void generic_metric(const char *metric_expr,
 
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out)
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events)
 {
 	void *ctxp = out->ctx;
 	print_metric_t print_metric = out->print_metric;
 	double total, ratio = 0.0, total2;
 	const char *color = NULL;
 	int ctx = evsel_context(evsel);
+	struct metric_event *me;
+	int num = 1;
 
 	if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
 		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
@@ -880,6 +884,20 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
 		print_smi_cost(cpu, evsel, out);
 	} else {
-		print_metric(ctxp, NULL, NULL, NULL, 0);
+		num = 0;
 	}
+
+	if ((me = metricgroup__lookup(metric_events, evsel, false)) != NULL) {
+		struct metric_expr *mexp;
+
+		list_for_each_entry (mexp, &me->head, nd) {
+			if (num++ > 0)
+				out->new_line(ctxp);
+			generic_metric(mexp->metric_expr, mexp->metric_events,
+					evsel->name, mexp->metric_name,
+					avg, cpu, ctx, out);
+		}
+	}
+	if (num == 0)
+		print_metric(ctxp, NULL, NULL, NULL, 0);
 }
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eacaf958e19d..47915df346fb 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -91,9 +91,11 @@ struct perf_stat_output_ctx {
 	bool force_header;
 };
 
+struct rblist;
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
-				   struct perf_stat_output_ctx *out);
+				   struct perf_stat_output_ctx *out,
+				   struct rblist *metric_events);
 void perf_stat__collect_metric_expr(struct perf_evlist *);
 
 int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
-- 
2.13.5

* [PATCH 08/44] perf list: Add metric groups to perf list
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 07/44] perf stat: Support JSON metrics in perf stat Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 09/44] perf stat: Don't use ctx for saved values lookup Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Add code to perf list to print metric groups, and metrics
that don't have an event name. The metricgroup code collects
the metric groups and metrics into an rblist, and then prints
them according to the configured filters.

The metric groups are printed by default, but the output can be
limited with 'perf list metric' or 'perf list metricgroup':

  % perf list metricgroup
  ..
  Metric Groups:

  DSB:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  FLOPS:
    GFLOPs
          [Giga Floating Point Operations Per Second]
  Frontend:
    IFetch_Line_Utilization
          [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
  Frontend_Bandwidth:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  Memory_BW:
    MLP
          [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
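
The group under which a table entry is listed follows a small rule. A
standalone sketch of it, assuming the entry has already passed the
metric_expr check (demo_listing_group() is a made-up helper, not part
of this patch):

```c
#include <stddef.h>

/*
 * Decide under which group an entry is listed:
 *  - its metric_group, if it has one;
 *  - "No_group" for a named metric without a group and without an
 *    event name;
 *  - NULL (not listed here) otherwise, e.g. for a metric that is tied
 *    to an event.
 */
static const char *demo_listing_group(const char *event_name,
				      const char *metric_group,
				      const char *metric_name)
{
	if (metric_group)
		return metric_group;
	if (metric_name && !event_name)
		return "No_group";
	return NULL;
}
```

Metrics that are tied to an event and have no group are thus not listed
by this code.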

v2: Check return value of asprintf to fix warning on FC26
Fix key in lookup/addition for the groups list

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt |   7 +-
 tools/perf/builtin-list.c              |   7 ++
 tools/perf/util/metricgroup.c          | 176 +++++++++++++++++++++++++++++++++
 tools/perf/util/parse-events.c         |   3 +
 4 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 75fc17f47298..24679aed90b7 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,7 +8,8 @@ perf-list - List all symbolic event types
 SYNOPSIS
 --------
 [verse]
-'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|sdt|event_glob]
+'perf list' [--no-desc] [--long-desc]
+            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]
 
 DESCRIPTION
 -----------
@@ -248,6 +249,10 @@ To limit the list use:
 
 . 'sdt' to list all Statically Defined Tracepoint events.
 
+. 'metric' to list metrics
+
+. 'metricgroup' to list metricgroups with metrics.
+
 . If none of the above is matched, it will apply the supplied glob to all
   events, printing the ones that match.
 
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 4bf2cb4d25aa..b2d2ad3dd478 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -15,6 +15,7 @@
 #include "util/cache.h"
 #include "util/pmu.h"
 #include "util/debug.h"
+#include "util/metricgroup.h"
 #include <subcmd/parse-options.h>
 
 static bool desc_flag = true;
@@ -79,6 +80,10 @@ int cmd_list(int argc, const char **argv)
 						long_desc_flag, details_flag);
 		else if (strcmp(argv[i], "sdt") == 0)
 			print_sdt_events(NULL, NULL, raw_dump);
+		else if (strcmp(argv[i], "metric") == 0)
+			metricgroup__print(true, false, NULL, raw_dump);
+		else if (strcmp(argv[i], "metricgroup") == 0)
+			metricgroup__print(false, true, NULL, raw_dump);
 		else if ((sep = strchr(argv[i], ':')) != NULL) {
 			int sep_idx;
 
@@ -96,6 +101,7 @@ int cmd_list(int argc, const char **argv)
 			s[sep_idx] = '\0';
 			print_tracepoint_events(s, s + sep_idx + 1, raw_dump);
 			print_sdt_events(s, s + sep_idx + 1, raw_dump);
+			metricgroup__print(true, true, s, raw_dump);
 			free(s);
 		} else {
 			if (asprintf(&s, "*%s*", argv[i]) < 0) {
@@ -112,6 +118,7 @@ int cmd_list(int argc, const char **argv)
 						details_flag);
 			print_tracepoint_events(NULL, s, raw_dump);
 			print_sdt_events(NULL, s, raw_dump);
+			metricgroup__print(true, true, NULL, raw_dump);
 			free(s);
 		}
 	}
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 7516b1746594..2d60114f1870 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -189,6 +189,182 @@ static bool match_metric(const char *n, const char *list)
 	return false;
 }
 
+struct mep {
+	struct rb_node nd;
+	const char *name;
+	struct strlist *metrics;
+};
+
+static int mep_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct mep *a = container_of(rb_node, struct mep, nd);
+	struct mep *b = (struct mep *)entry;
+
+	return strcmp(a->name, b->name);
+}
+
+static struct rb_node *mep_new(struct rblist *rl __maybe_unused,
+					const void *entry)
+{
+	struct mep *me = malloc(sizeof(struct mep));
+
+	if (!me)
+		return NULL;
+	memcpy(me, entry, sizeof(struct mep));
+	me->name = strdup(me->name);
+	if (!me->name)
+		goto out_me;
+	me->metrics = strlist__new(NULL, NULL);
+	if (!me->metrics)
+		goto out_name;
+	return &me->nd;
+out_name:
+	free((char *)me->name);
+out_me:
+	free(me);
+	return NULL;
+}
+
+static struct mep *mep_lookup(struct rblist *groups, const char *name)
+{
+	struct rb_node *nd;
+	struct mep me = {
+		.name = name
+	};
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	rblist__add_node(groups, &me);
+	nd = rblist__find(groups, &me);
+	if (nd)
+		return container_of(nd, struct mep, nd);
+	return NULL;
+}
+
+static void mep_delete(struct rblist *rl __maybe_unused,
+		       struct rb_node *nd)
+{
+	struct mep *me = container_of(nd, struct mep, nd);
+
+	strlist__delete(me->metrics);
+	free((void *)me->name);
+	free(me);
+}
+
+static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
+{
+	struct str_node *sn;
+	int n = 0;
+
+	strlist__for_each_entry (sn, metrics) {
+		if (raw)
+			printf("%s%s", n > 0 ? " " : "", sn->s);
+		else
+			printf("  %s\n", sn->s);
+		n++;
+	}
+	if (raw)
+		putchar('\n');
+}
+
+void metricgroup__print(bool metrics, bool metricgroups, char *filter,
+			bool raw)
+{
+	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_event *pe;
+	int i;
+	struct rblist groups;
+	struct rb_node *node, *next;
+	struct strlist *metriclist = NULL;
+
+	if (!map)
+		return;
+
+	if (!metricgroups) {
+		metriclist = strlist__new(NULL, NULL);
+		if (!metriclist)
+			return;
+	}
+
+	rblist__init(&groups);
+	groups.node_new = mep_new;
+	groups.node_cmp = mep_cmp;
+	groups.node_delete = mep_delete;
+	for (i = 0; ; i++) {
+		const char *g;
+		pe = &map->table[i];
+
+		if (!pe->name && !pe->metric_group && !pe->metric_name)
+			break;
+		if (!pe->metric_expr)
+			continue;
+		g = pe->metric_group;
+		if (!g && pe->metric_name) {
+			if (pe->name)
+				continue;
+			g = "No_group";
+		}
+		if (g) {
+			char *omg;
+			char *mg = strdup(g);
+
+			if (!mg)
+				return;
+			omg = mg;
+			while ((g = strsep(&mg, ";")) != NULL) {
+				struct mep *me;
+				char *s;
+
+				if (*g == 0)
+					g = "No_group";
+				while (isspace(*g))
+					g++;
+				if (filter && !strstr(g, filter))
+					continue;
+				if (raw)
+					s = (char *)pe->metric_name;
+				else {
+					if (asprintf(&s, "%s\n\t[%s]",
+						     pe->metric_name, pe->desc) < 0)
+						return;
+				}
+
+				if (!s)
+					continue;
+
+				if (!metricgroups) {
+					strlist__add(metriclist, s);
+				} else {
+					me = mep_lookup(&groups, g);
+					if (!me)
+						continue;
+					strlist__add(me->metrics, s);
+				}
+			}
+			free(omg);
+		}
+	}
+
+	if (metricgroups && !raw)
+		printf("\nMetric Groups:\n\n");
+	else if (metrics && !raw)
+		printf("\nMetrics:\n\n");
+
+	for (node = rb_first(&groups.entries); node; node = next) {
+		struct mep *me = container_of(node, struct mep, nd);
+
+		if (metricgroups)
+			printf("%s%s%s", me->name, metrics ? ":" : "", raw ? " " : "\n");
+		if (metrics)
+			metricgroup__print_strlist(me->metrics, raw);
+		next = rb_next(node);
+		rblist__remove_node(&groups, node);
+	}
+	if (!metricgroups)
+		metricgroup__print_strlist(metriclist, raw);
+	strlist__delete(metriclist);
+}
+
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 57d7acf890e0..75588920fccc 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -28,6 +28,7 @@
 #include "probe-file.h"
 #include "asm/bug.h"
 #include "util/parse-branch-options.h"
+#include "metricgroup.h"
 
 #define MAX_NAME_LEN 100
 
@@ -2380,6 +2381,8 @@ void print_events(const char *event_glob, bool name_only, bool quiet_flag,
 	print_tracepoint_events(NULL, NULL, name_only);
 
 	print_sdt_events(NULL, NULL, name_only);
+
+	metricgroup__print(true, true, NULL, name_only);
 }
 
 int parse_events__is_hardcoded_term(struct parse_events_term *term)
-- 
2.13.5

* [PATCH 09/44] perf stat: Don't use ctx for saved values lookup
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 08/44] perf list: Add metric groups to perf list Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 10/44] perf stat: Support duration_time for metrics Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

We don't need to use ctx to look up events for saved values.  The
context is already part of the evsel pointer, which is the primary key.
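
As a rough illustration of the resulting lookup key (not the perf code
itself), the comparator below orders entries by CPU and then by event
pointer. The names struct sv_key and sv_key_cmp() are hypothetical; they
only mirror the two key fields that remain once ctx is dropped, and the
comparison uses explicit tests rather than the subtraction seen in the diff.

```c
#include <stdint.h>

/* Hypothetical stand-in for the key fields of perf's struct saved_value. */
struct sv_key {
	const void *evsel;	/* identifies the event; assumed to imply the context */
	int cpu;
};

/* Order by cpu first, then by event pointer; returns <0, 0 or >0. */
static int sv_key_cmp(const struct sv_key *a, const struct sv_key *b)
{
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->evsel == b->evsel)
		return 0;
	return (uintptr_t)a->evsel < (uintptr_t)b->evsel ? -1 : 1;
}
```

Two entries for the same event on the same CPU compare equal, so no
separate context field is needed to tell them apart.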

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-9-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 42e6c17be7ff..664f49a9b012 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -56,7 +56,6 @@ struct saved_value {
 	struct rb_node rb_node;
 	struct perf_evsel *evsel;
 	int cpu;
-	int ctx;
 	struct stats stats;
 };
 
@@ -67,8 +66,6 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 					     rb_node);
 	const struct saved_value *b = entry;
 
-	if (a->ctx != b->ctx)
-		return a->ctx - b->ctx;
 	if (a->cpu != b->cpu)
 		return a->cpu - b->cpu;
 	if (a->evsel == b->evsel)
@@ -90,13 +87,12 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 }
 
 static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
-					      int cpu, int ctx,
+					      int cpu,
 					      bool create)
 {
 	struct rb_node *nd;
 	struct saved_value dm = {
 		.cpu = cpu,
-		.ctx = ctx,
 		.evsel = evsel,
 	};
 	nd = rblist__find(&runtime_saved_values, &dm);
@@ -232,8 +228,7 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
 		update_stats(&runtime_aperf_stats[ctx][cpu], count[0]);
 
 	if (counter->collect_stat) {
-		struct saved_value *v = saved_value_lookup(counter, cpu, ctx,
-							   true);
+		struct saved_value *v = saved_value_lookup(counter, cpu, true);
 		update_stats(&v->stats, count[0]);
 	}
 }
@@ -634,7 +629,6 @@ static void generic_metric(const char *metric_expr,
 			   const char *metric_name,
 			   double avg,
 			   int cpu,
-			   int ctx,
 			   struct perf_stat_output_ctx *out)
 {
 	print_metric_t print_metric = out->print_metric;
@@ -648,7 +642,7 @@ static void generic_metric(const char *metric_expr,
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
 
-		v = saved_value_lookup(metric_events[i], cpu, ctx, false);
+		v = saved_value_lookup(metric_events[i], cpu, false);
 		if (!v)
 			break;
 		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
@@ -866,7 +860,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
 		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
-				evsel->metric_name, avg, cpu, ctx, out);
+				evsel->metric_name, avg, cpu, out);
 	} else if (runtime_nsecs_stats[cpu].n != 0) {
 		char unit = 'M';
 		char unit_buf[10];
@@ -895,7 +889,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				out->new_line(ctxp);
 			generic_metric(mexp->metric_expr, mexp->metric_events,
 					evsel->name, mexp->metric_name,
-					avg, cpu, ctx, out);
+					avg, cpu, out);
 		}
 	}
 	if (num == 0)
-- 
2.13.5

* [PATCH 10/44] perf stat: Support duration_time for metrics
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 09/44] perf stat: Don't use ctx for saved values lookup Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 11/44] perf stat: Hide internal duration_time counter Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Some of the metrics formulas (like GFLOPs) need to know how long the
measurement period is. Support an internal event called duration_time,
which reports time in seconds. It maps to the dummy event, but is special
cased for statistics to report the walltime duration.

So far it is not printed, but only used internally for metrics.
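
A minimal sketch of the special case, assuming the wall time average is
kept in nanoseconds as in the diff below. The helpers metric_input() and
gflops() are hypothetical, and the GFLOPs formula is a simplified
assumption used only to show why the value has to be in seconds.

```c
#include <string.h>

/*
 * Hypothetical helper: pick the value fed to a metric expression.
 * duration_time is taken from the wall time (ns) and scaled to seconds;
 * every other event uses its own averaged count unchanged.
 */
static double metric_input(const char *event_name, double avg_count,
			   double avg_walltime_ns)
{
	if (!strcmp(event_name, "duration_time"))
		return avg_walltime_ns * 1e-9;
	return avg_count;
}

/* Assumed, simplified formula: floating point operations per second / 1e9. */
static double gflops(double flops, double duration_s)
{
	return flops / duration_s / 1e9;
}
```

With 4e9 floating point operations over a 2 second run this gives about
2 GFLOPs; without the 1e-9 scale the result would be off by nine orders of
magnitude.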

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.l |  1 +
 tools/perf/util/stat-shadow.c  | 17 +++++++++++++----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index fdb5bb52f01f..ea2426daf7e8 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -288,6 +288,7 @@ cpu-migrations|migrations			{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COU
 alignment-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
 emulation-faults				{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
 dummy						{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
+duration_time					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
 bpf-output					{ return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
 
 	/*
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 664f49a9b012..a2c12d1ef32a 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -641,11 +641,20 @@ static void generic_metric(const char *metric_expr,
 	expr__add_id(&pctx, name, avg);
 	for (i = 0; metric_events[i]; i++) {
 		struct saved_value *v;
+		struct stats *stats;
+		double scale;
 
-		v = saved_value_lookup(metric_events[i], cpu, false);
-		if (!v)
-			break;
-		expr__add_id(&pctx, metric_events[i]->name, avg_stats(&v->stats));
+		if (!strcmp(metric_events[i]->name, "duration_time")) {
+			stats = &walltime_nsecs_stats;
+			scale = 1e-9;
+		} else {
+			v = saved_value_lookup(metric_events[i], cpu, false);
+			if (!v)
+				break;
+			stats = &v->stats;
+			scale = 1.0;
+		}
+		expr__add_id(&pctx, metric_events[i]->name, avg_stats(stats)*scale);
 	}
 	if (!metric_events[i]) {
 		const char *p = metric_expr;
-- 
2.13.5

* [PATCH 11/44] perf stat: Hide internal duration_time counter
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 10/44] perf stat: Support duration_time for metrics Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 12/44] perf stat: Update walltime_nsecs_stats in interval mode Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Some perf stat metrics use an internal "duration_time" event. It is not
printed correctly, however, so hide it during output to avoid confusing
users with 0 counts.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 874bc6dd8d60..855890e0b70b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -195,6 +195,11 @@ static struct perf_stat_config stat_config = {
 	.scale		= true,
 };
 
+static bool is_duration_time(struct perf_evsel *evsel)
+{
+	return !strcmp(evsel->name, "duration_time");
+}
+
 static inline void diff_timespec(struct timespec *r, struct timespec *a,
 				 struct timespec *b)
 {
@@ -1363,6 +1368,9 @@ static void print_aggr(char *prefix)
 		ad.id = id = aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
+
 			ad.val = ad.ena = ad.run = 0;
 			ad.nr = 0;
 			if (!collect_data(counter, aggr_cb, &ad))
@@ -1506,6 +1514,8 @@ static void print_no_aggr_metric(char *prefix)
 		if (prefix)
 			fputs(prefix, stat_config.output);
 		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			if (first) {
 				aggr_printout(counter, cpu, 0);
 				first = false;
@@ -1560,6 +1570,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 
 	/* Print metrics headers only */
 	evlist__for_each_entry(evsel_list, counter) {
+		if (is_duration_time(counter))
+			continue;
 		os.evsel = counter;
 		out.ctx = &os;
 		out.print_metric = print_metric_header;
@@ -1707,12 +1719,18 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		print_aggr(prefix);
 		break;
 	case AGGR_THREAD:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_aggr_thread(counter, prefix);
+		}
 		break;
 	case AGGR_GLOBAL:
-		evlist__for_each_entry(evsel_list, counter)
+		evlist__for_each_entry(evsel_list, counter) {
+			if (is_duration_time(counter))
+				continue;
 			print_counter_aggr(counter, prefix);
+		}
 		if (metric_only)
 			fputc('\n', stat_config.output);
 		break;
@@ -1720,8 +1738,11 @@ static void print_counters(struct timespec *ts, int argc, const char **argv)
 		if (metric_only)
 			print_no_aggr_metric(prefix);
 		else {
-			evlist__for_each_entry(evsel_list, counter)
+			evlist__for_each_entry(evsel_list, counter) {
+				if (is_duration_time(counter))
+					continue;
 				print_counter(counter, prefix);
+			}
 		}
 		break;
 	case AGGR_UNSET:
-- 
2.13.5

* [PATCH 12/44] perf stat: Update walltime_nsecs_stats in interval mode
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 11/44] perf stat: Hide internal duration_time counter Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 13/44] perf record: Support direct --user-regs arguments Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Some metrics (like GFLOPs) need walltime_nsecs_stats for each interval.
Compute it for each interval instead of only at the end.
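
The per-interval value is the interval length converted from milliseconds
to nanoseconds. The sketch below shows that conversion in isolation;
interval_ms_to_ns() is a hypothetical helper, and it assumes the interval
is held in a 32-bit unsigned field (its type is not visible in this
patch), in which case widening before the multiply avoids wraparound for
intervals above roughly 4.2 seconds.

```c
#include <stdint.h>

/* Hypothetical helper: convert an interval in ms to ns without 32-bit overflow. */
static uint64_t interval_ms_to_ns(unsigned int interval_ms)
{
	return (uint64_t)interval_ms * 1000000ULL;
}
```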

Pointed out by Jiri.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-12-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 855890e0b70b..88f1d5fbdb48 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -415,6 +415,8 @@ static void process_interval(void)
 			pr_err("failed to write stat round event\n");
 	}
 
+	init_stats(&walltime_nsecs_stats);
+	update_stats(&walltime_nsecs_stats, stat_config.interval * 1000000);
 	print_counters(&rs, 0, NULL);
 }
 
-- 
2.13.5

* [PATCH 13/44] perf record: Support direct --user-regs arguments
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 12/44] perf stat: Update walltime_nsecs_stats in interval mode Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 14/44] perf script: Support user regs Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Jiri Olsa,
	Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

USER_REGS can currently only be collected implicitly with call graph
recording. Sometimes it is useful to see them separately, and filter
them. Add a new --user-regs option to record that is similar to
--intr-regs, but acts on user regs.
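
Since two code paths can now request user registers, each has to OR its
bits into the mask rather than assign it. The sketch below illustrates
that with made-up values; struct demo_attr, DEMO_CALLCHAIN_MASK and both
functions are hypothetical and do not correspond to real perf definitions.

```c
#include <stdint.h>

#define DEMO_CALLCHAIN_MASK 0x0ffULL	/* made-up mask wanted by callchain unwinding */

struct demo_attr {
	uint64_t sample_regs_user;	/* one bit per sampled user register */
};

/* Callchain setup adds its registers without discarding earlier requests. */
static void demo_config_callchain(struct demo_attr *attr)
{
	attr->sample_regs_user |= DEMO_CALLCHAIN_MASK;
}

/* An explicit register request is merged in the same way. */
static void demo_config_user_regs(struct demo_attr *attr, uint64_t requested)
{
	attr->sample_regs_user |= requested;
}
```

With OR in both places the final mask is the union of the two requests,
whichever function runs first.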

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170905170029.19722-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt | 2 ++
 tools/perf/builtin-record.c              | 3 +++
 tools/perf/perf.h                        | 1 +
 tools/perf/util/evsel.c                  | 7 ++++++-
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e397453e5a46..68a1ffb0a8a5 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -377,6 +377,8 @@ symbolic names, e.g. on x86, ax, si. To list the available registers use
 --intr-regs=\?. To name registers, pass a comma separated list such as
 --intr-regs=ax,bx. The list of register is architecture dependent.
 
+--user-regs::
+Capture user registers at sample time. Same arguments as -I.
 
 --running-time::
 Record running and enabled time for read events (:S)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56f8142ff97f..9b379f3a3d99 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1643,6 +1643,9 @@ static struct option __record_options[] = {
 	OPT_CALLBACK_OPTARG('I', "intr-regs", &record.opts.sample_intr_regs, NULL, "any register",
 		    "sample selected machine registers on interrupt,"
 		    " use -I ? to list register names", parse_regs),
+	OPT_CALLBACK_OPTARG(0, "user-regs", &record.opts.sample_user_regs, NULL, "any register",
+		    "sample selected machine registers on interrupt,"
+		    " use -I ? to list register names", parse_regs),
 	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
 		    "Record running/enabled time of read (:S) events"),
 	OPT_CALLBACK('k', "clockid", &record.opts,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index dc442ba21bf6..fbb0a9cd0ac6 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -65,6 +65,7 @@ struct record_opts {
 	unsigned int user_freq;
 	u64          branch_stack;
 	u64	     sample_intr_regs;
+	u64	     sample_user_regs;
 	u64	     default_interval;
 	u64	     user_interval;
 	size_t	     auxtrace_snapshot_size;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4bb89373eb52..7389746c0dc4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -678,7 +678,7 @@ void perf_evsel__config_callchain(struct perf_evsel *evsel,
 		if (!function) {
 			perf_evsel__set_sample_bit(evsel, REGS_USER);
 			perf_evsel__set_sample_bit(evsel, STACK_USER);
-			attr->sample_regs_user = PERF_REGS_MASK;
+			attr->sample_regs_user |= PERF_REGS_MASK;
 			attr->sample_stack_user = param->dump_size;
 			attr->exclude_callchain_user = 1;
 		} else {
@@ -931,6 +931,11 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 		perf_evsel__set_sample_bit(evsel, REGS_INTR);
 	}
 
+	if (opts->sample_user_regs) {
+		attr->sample_regs_user |= opts->sample_user_regs;
+		perf_evsel__set_sample_bit(evsel, REGS_USER);
+	}
+
 	if (target__has_cpu(&opts->target) || opts->sample_cpu)
 		perf_evsel__set_sample_bit(evsel, CPU);
 
-- 
2.13.5

* [PATCH 14/44] perf script: Support user regs
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 13/44] perf record: Support direct --user-regs arguments Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 15/44] perf tools: Add python-clean target Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Teach perf script to print user regs.

  % perf record --user-regs=ip,sp ...
  % perf script -F ip,sym,uregs
  ...
   ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
   ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
   ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
   ffffffff9e060c24 native_write_msr ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
   ffffffff9e00cc12 intel_pmu_handle_irq ABI:2    SP:0x7ffd0ea06c38    IP:0x7fe77f55b637
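
The register values in a sample are stored packed, one per set bit of the
register mask, lowest bit first. A minimal sketch of walking such a mask
follows; format_regs() is a hypothetical helper that prints a placeholder
name "r<bit>" where perf would look up the architecture register name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: vals[] holds one value per set bit of mask, in
 * ascending bit order. Returns the number of characters formatted.
 */
static int format_regs(char *buf, size_t len, uint64_t mask,
		       const uint64_t *vals)
{
	int off = 0;
	unsigned int bit, i = 0;

	for (bit = 0; bit < 64 && (size_t)off < len; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		/* The value index advances only for registers present in the mask. */
		off += snprintf(buf + off, len - off, "r%u:0x%" PRIx64 " ",
				bit, vals[i++]);
	}
	return off;
}
```

The point of the separate counter i is that the position of a value in the
array depends on how many lower bits are set, not on the register number.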

v2: Rebased on top of phys-addr patches

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20170905184057.26135-1-andi@firstfloor.org
[ Use PRIu64 for regs->abi in print_sample_uregs() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  4 ++--
 tools/perf/builtin-script.c              | 30 +++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 18dfcfa38454..bcc1ba35a2d8 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,8 +116,8 @@ OPTIONS
 --fields::
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
-        srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn, brstackoff,
-        callindent, insn, insnlen, synth, phys_addr.
+        srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
+        brstackoff, callindent, insn, insnlen, synth, phys_addr.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3d4c3b5e1868..725dbd3dd104 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -88,6 +88,7 @@ enum perf_output_field {
 	PERF_OUTPUT_BRSTACKOFF	    = 1U << 24,
 	PERF_OUTPUT_SYNTH           = 1U << 25,
 	PERF_OUTPUT_PHYS_ADDR       = 1U << 26,
+	PERF_OUTPUT_UREGS	    = 1U << 27,
 };
 
 struct output_option {
@@ -109,6 +110,7 @@ struct output_option {
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
 	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 	{.str = "iregs", .field = PERF_OUTPUT_IREGS},
+	{.str = "uregs", .field = PERF_OUTPUT_UREGS},
 	{.str = "brstack", .field = PERF_OUTPUT_BRSTACK},
 	{.str = "brstacksym", .field = PERF_OUTPUT_BRSTACKSYM},
 	{.str = "data_src", .field = PERF_OUTPUT_DATA_SRC},
@@ -385,6 +387,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_IREGS))
 		return -EINVAL;
 
+	if (PRINT_FIELD(UREGS) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_USER, "UREGS",
+					PERF_OUTPUT_UREGS))
+		return -EINVAL;
+
 	if (PRINT_FIELD(PHYS_ADDR) &&
 		perf_evsel__check_stype(evsel, PERF_SAMPLE_PHYS_ADDR, "PHYS_ADDR",
 					PERF_OUTPUT_PHYS_ADDR))
@@ -509,6 +516,24 @@ static void print_sample_iregs(struct perf_sample *sample,
 	}
 }
 
+static void print_sample_uregs(struct perf_sample *sample,
+			  struct perf_event_attr *attr)
+{
+	struct regs_dump *regs = &sample->user_regs;
+	uint64_t mask = attr->sample_regs_user;
+	unsigned i = 0, r;
+
+	if (!regs || !regs->regs)
+		return;
+
+	printf(" ABI:%" PRIu64 " ", regs->abi);
+
+	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
+		u64 val = regs->regs[i++];
+		printf("%5s:0x%"PRIx64" ", perf_reg_name(r), val);
+	}
+}
+
 static void print_sample_start(struct perf_sample *sample,
 			       struct thread *thread,
 			       struct perf_evsel *evsel)
@@ -1444,6 +1469,9 @@ static void process_event(struct perf_script *script,
 	if (PRINT_FIELD(IREGS))
 		print_sample_iregs(sample, attr);
 
+	if (PRINT_FIELD(UREGS))
+		print_sample_uregs(sample, attr);
+
 	if (PRINT_FIELD(BRSTACK))
 		print_sample_brstack(sample, thread, attr);
 	else if (PRINT_FIELD(BRSTACKSYM))
@@ -2739,7 +2767,7 @@ int cmd_script(int argc, const char **argv)
 		     "+field to add and -field to remove."
 		     "Valid types: hw,sw,trace,raw,synth. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff,period,iregs,brstack,brstacksym,flags,"
+		     "addr,symoff,period,iregs,uregs,brstack,brstacksym,flags,"
 		     "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
-- 
2.13.5

* [PATCH 15/44] perf tools: Add python-clean target
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 14/44] perf script: Support user regs Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 16/44] perf ui progress: Add ui specific init function Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

To be able to clean up only python related binaries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170908084621.31595-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.perf | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 91ef44bfaf3e..1df93b4c4648 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -173,7 +173,7 @@ AWK     = awk
 # non-config cases
 config := 1
 
-NON_CONFIG_TARGETS := clean TAGS tags cscope help install-doc install-man install-html install-info install-pdf doc man html info pdf
+NON_CONFIG_TARGETS := clean python-clean TAGS tags cscope help install-doc install-man install-html install-info install-pdf doc man html info pdf
 
 ifdef MAKECMDGOALS
 ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
@@ -802,7 +802,10 @@ config-clean:
 	$(call QUIET_CLEAN, config)
 	$(Q)$(MAKE) -C $(srctree)/tools/build/feature/ $(if $(OUTPUT),OUTPUT=$(OUTPUT)feature/,) clean >/dev/null
 
-clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean config-clean fixdep-clean
+python-clean:
+	$(python-clean)
+
+clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean config-clean fixdep-clean python-clean
 	$(call QUIET_CLEAN, core-objs)  $(RM) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS)
 	$(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
@@ -819,7 +822,6 @@ clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clea
 		$(OUTPUT)$(vhost_virtio_ioctl_array) \
 		$(OUTPUT)$(perf_ioctl_array)
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
-	$(python-clean)
 
 #
 # To provide FEATURE-DUMP into $(FEATURE_DUMP_COPY)
-- 
2.13.5

* [PATCH 16/44] perf ui progress: Add ui specific init function
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 15/44] perf tools: Add python-clean target Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 17/44] perf ui progress: Add size info into progress bar Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Adding a UI-specific init function that allows setting up the progress bar
width based on the current screen size.

Adding a TUI init function to get finer-grained updates of the progress bar.
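
The step computation can be sketched in portable C as below. The function
progress_step() is a hypothetical helper; it assumes two columns are
reserved for the bar frame, as the diff suggests, and spells out the
fallback to 1 explicitly instead of relying on the GNU "?:" shorthand.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many units of work should pass between two
 * redraws so that each redraw advances the bar by about one column.
 */
static uint64_t progress_step(uint64_t total, int screen_cols)
{
	/* Guard against tiny screens so the division is always defined. */
	uint64_t usable = screen_cols > 2 ? (uint64_t)(screen_cols - 2) : 1;
	uint64_t step = total / usable;

	/* Never return 0, otherwise the bar would never be considered due. */
	return step ? step : 1;
}
```

For a total of 1000 on an 82 column screen this yields a step of 12, so
the bar is refreshed roughly once per column.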

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170908120510.22515-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/progress.c     | 2 ++
 tools/perf/ui/progress.h     | 1 +
 tools/perf/ui/tui/progress.c | 9 +++++++--
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/progress.c b/tools/perf/ui/progress.c
index ae91c8148edf..3e2b5d64c55e 100644
--- a/tools/perf/ui/progress.c
+++ b/tools/perf/ui/progress.c
@@ -34,6 +34,8 @@ void ui_progress__init(struct ui_progress *p, u64 total, const char *title)
 	p->total = total;
 	p->title = title;
 
+	if (ui_progress__ops->init)
+		ui_progress__ops->init(p);
 }
 
 void ui_progress__finish(void)
diff --git a/tools/perf/ui/progress.h b/tools/perf/ui/progress.h
index 717d39d3052b..e5f434a2070b 100644
--- a/tools/perf/ui/progress.h
+++ b/tools/perf/ui/progress.h
@@ -14,6 +14,7 @@ void ui_progress__init(struct ui_progress *p, u64 total, const char *title);
 void ui_progress__update(struct ui_progress *p, u64 adv);
 
 struct ui_progress_ops {
+	void (*init)(struct ui_progress *p);
 	void (*update)(struct ui_progress *p);
 	void (*finish)(void);
 };
diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
index c4b99008e2c9..f6b8f52aad7e 100644
--- a/tools/perf/ui/tui/progress.c
+++ b/tools/perf/ui/tui/progress.c
@@ -5,6 +5,11 @@
 #include "tui.h"
 #include "../browser.h"
 
+static void __tui_progress__init(struct ui_progress *p)
+{
+	p->next = p->step = p->total / (SLtt_Screen_Cols - 2) ?: 1;
+}
+
 static void tui_progress__update(struct ui_progress *p)
 {
 	int bar, y;
@@ -49,8 +54,8 @@ static void tui_progress__finish(void)
 	pthread_mutex_unlock(&ui__lock);
 }
 
-static struct ui_progress_ops tui_progress__ops =
-{
+static struct ui_progress_ops tui_progress__ops = {
+	.init   = __tui_progress__init,
 	.update = tui_progress__update,
 	.finish = tui_progress__finish,
 };
-- 
2.13.5

* [PATCH 17/44] perf ui progress: Add size info into progress bar
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 16/44] perf ui progress: Add ui specific init function Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 18/44] perf tools: Use scandir() to replace readdir() Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

Add the size values '[current/total]' to the progress bar, to show more
detailed progress of data reading.

Add a new ui_progress__init_size() function to specify that we want to
display the size.
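
A minimal sketch of how such a '[current/total]' title could be built,
assuming a hypothetical format_units() helper in place of perf's
unit_number__scnprintf(), whose exact output format may differ:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: scale n down by powers of 1024 and append a
 * single-letter unit (B, K, M or G). The value is truncated, not rounded.
 */
static int format_units(char *buf, size_t size, uint64_t n)
{
	static const char units[] = "BKMG";
	int i = 0;

	while (n >= 1024 && units[i + 1] != '\0') {
		n /= 1024;
		i++;
	}
	return snprintf(buf, size, "%llu%c", (unsigned long long)n, units[i]);
}

/* Build "<title> [<current>/<total>]" into buf. */
static int format_title(char *buf, size_t size, const char *title,
			uint64_t curr, uint64_t total)
{
	char cur[20], tot[20];

	format_units(cur, sizeof(cur), curr);
	format_units(tot, sizeof(tot), total);
	return snprintf(buf, size, "%s [%s/%s]", title, cur, tot);
}
```

Formatting into a local buffer on each update leaves the caller's title
string untouched, which matches the approach taken in the patch below.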

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170908120510.22515-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/progress.c     |  4 +++-
 tools/perf/ui/progress.h     | 11 ++++++++++-
 tools/perf/ui/tui/progress.c | 23 ++++++++++++++++++++++-
 tools/perf/util/session.c    |  2 +-
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/tools/perf/ui/progress.c b/tools/perf/ui/progress.c
index 3e2b5d64c55e..7ade387d511c 100644
--- a/tools/perf/ui/progress.c
+++ b/tools/perf/ui/progress.c
@@ -27,12 +27,14 @@ void ui_progress__update(struct ui_progress *p, u64 adv)
 	}
 }
 
-void ui_progress__init(struct ui_progress *p, u64 total, const char *title)
+void __ui_progress__init(struct ui_progress *p, u64 total,
+			 const char *title, bool size)
 {
 	p->curr = 0;
 	p->next = p->step = total / 16 ?: 1;
 	p->total = total;
 	p->title = title;
+	p->size  = size;
 
 	if (ui_progress__ops->init)
 		ui_progress__ops->init(p);
diff --git a/tools/perf/ui/progress.h b/tools/perf/ui/progress.h
index e5f434a2070b..fbaa1507ebfe 100644
--- a/tools/perf/ui/progress.h
+++ b/tools/perf/ui/progress.h
@@ -8,9 +8,18 @@ void ui_progress__finish(void);
 struct ui_progress {
 	const char *title;
 	u64 curr, next, step, total;
+	bool size;
 };
 
-void ui_progress__init(struct ui_progress *p, u64 total, const char *title);
+void __ui_progress__init(struct ui_progress *p, u64 total,
+			 const char *title, bool size);
+
+#define ui_progress__init(p, total, title) \
+	__ui_progress__init(p, total, title, false)
+
+#define ui_progress__init_size(p, total, title) \
+	__ui_progress__init(p, total, title, true)
+
 void ui_progress__update(struct ui_progress *p, u64 adv);
 
 struct ui_progress_ops {
diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
index f6b8f52aad7e..68f6144ea603 100644
--- a/tools/perf/ui/tui/progress.c
+++ b/tools/perf/ui/tui/progress.c
@@ -1,8 +1,10 @@
+#include <linux/kernel.h>
 #include "../cache.h"
 #include "../progress.h"
 #include "../libslang.h"
 #include "../ui.h"
 #include "tui.h"
+#include "units.h"
 #include "../browser.h"
 
 static void __tui_progress__init(struct ui_progress *p)
@@ -10,8 +12,22 @@ static void __tui_progress__init(struct ui_progress *p)
 	p->next = p->step = p->total / (SLtt_Screen_Cols - 2) ?: 1;
 }
 
+static int get_title(struct ui_progress *p, char *buf, size_t size)
+{
+	char buf_cur[20];
+	char buf_tot[20];
+	int ret;
+
+	ret  = unit_number__scnprintf(buf_cur, sizeof(buf_cur), p->curr);
+	ret += unit_number__scnprintf(buf_tot, sizeof(buf_tot), p->total);
+
+	return ret + scnprintf(buf, size, "%s [%s/%s]",
+			       p->title, buf_cur, buf_tot);
+}
+
 static void tui_progress__update(struct ui_progress *p)
 {
+	char buf[100], *title = (char *) p->title;
 	int bar, y;
 	/*
 	 * FIXME: We should have a per UI backend way of showing progress,
@@ -23,13 +39,18 @@ static void tui_progress__update(struct ui_progress *p)
 	if (p->total == 0)
 		return;
 
+	if (p->size) {
+		get_title(p, buf, sizeof(buf));
+		title = buf;
+	}
+
 	ui__refresh_dimensions(false);
 	pthread_mutex_lock(&ui__lock);
 	y = SLtt_Screen_Rows / 2 - 2;
 	SLsmg_set_color(0);
 	SLsmg_draw_box(y, 0, 3, SLtt_Screen_Cols);
 	SLsmg_gotorc(y++, 1);
-	SLsmg_write_string((char *)p->title);
+	SLsmg_write_string(title);
 	SLsmg_fill_region(y, 1, 1, SLtt_Screen_Cols - 2, ' ');
 	SLsmg_set_color(HE_COLORSET_SELECTED);
 	bar = ((SLtt_Screen_Cols - 2) * p->curr) / p->total;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index a7ebd9fe8e40..ceac0848469d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1847,7 +1847,7 @@ static int __perf_session__process_events(struct perf_session *session,
 	if (data_offset + data_size < file_size)
 		file_size = data_offset + data_size;
 
-	ui_progress__init(&prog, file_size, "Processing events...");
+	ui_progress__init_size(&prog, file_size, "Processing events...");
 
 	mmap_size = MMAP_SIZE;
 	if (mmap_size > file_size) {
-- 
2.13.5

* [PATCH 18/44] perf tools: Use scandir() to replace readdir()
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 17/44] perf ui progress: Add size info into progress bar Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 19/44] perf config: Write a config file just once Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Kan Liang, Adrian Hunter,
	Andi Kleen, Jiri Olsa, Lukasz Odzioba, Namhyung Kim,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Kan Liang <kan.liang@intel.com>

In perf_event__synthesize_threads(), perf goes through all /proc entries
serially using readdir().

scandir() takes a snapshot of /proc, which is multithreading friendly.

It's possible that some threads are added during event synthesis and are
therefore missed. But the number of lost threads should be small, so they
should not impact the final analysis.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1504806954-150842-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1c905ba3641b..17c21ea68a72 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -683,12 +683,14 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 				   bool mmap_data,
 				   unsigned int proc_map_timeout)
 {
-	DIR *proc;
-	char proc_path[PATH_MAX];
-	struct dirent *dirent;
 	union perf_event *comm_event, *mmap_event, *fork_event;
 	union perf_event *namespaces_event;
+	char proc_path[PATH_MAX];
+	struct dirent **dirent;
 	int err = -1;
+	char *end;
+	pid_t pid;
+	int n, i;
 
 	if (machine__is_default_guest(machine))
 		return 0;
@@ -712,29 +714,32 @@ int perf_event__synthesize_threads(struct perf_tool *tool,
 		goto out_free_fork;
 
 	snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
-	proc = opendir(proc_path);
+	n = scandir(proc_path, &dirent, 0, alphasort);
 
-	if (proc == NULL)
+	if (n < 0)
 		goto out_free_namespaces;
 
-	while ((dirent = readdir(proc)) != NULL) {
-		char *end;
-		pid_t pid = strtol(dirent->d_name, &end, 10);
-
-		if (*end) /* only interested in proper numerical dirents */
+	for (i = 0; i < n; i++) {
+		if (!isdigit(dirent[i]->d_name[0]))
 			continue;
-		/*
- 		 * We may race with exiting thread, so don't stop just because
- 		 * one thread couldn't be synthesized.
- 		 */
-		__event__synthesize_thread(comm_event, mmap_event, fork_event,
-					   namespaces_event, pid, 1, process,
-					   tool, machine, mmap_data,
-					   proc_map_timeout);
-	}
 
+		pid = (pid_t)strtol(dirent[i]->d_name, &end, 10);
+		/* only interested in proper numerical dirents */
+		if (!*end) {
+			/*
+			 * We may race with exiting thread, so don't stop just because
+			 * one thread couldn't be synthesized.
+			 */
+			__event__synthesize_thread(comm_event, mmap_event, fork_event,
+						   namespaces_event, pid, 1, process,
+						   tool, machine, mmap_data,
+						   proc_map_timeout);
+		}
+		free(dirent[i]);
+	}
+	free(dirent);
 	err = 0;
-	closedir(proc);
+
 out_free_namespaces:
 	free(namespaces_event);
 out_free_fork:
-- 
2.13.5

* [PATCH 19/44] perf config: Write a config file just once
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 18/44] perf tools: Use scandir() to replace readdir() Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 14:41 ` [PATCH 20/44] perf config: Allow creating empty config set for config file autogeneration Arnaldo Carvalho de Melo
  2017-09-22 16:26 ` [GIT PULL 00/44] perf/core improvements and fixes Ingo Molnar
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Taeung Song, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Taeung Song <treeze.taeung@gmail.com>

Currently set_config() can be called repeatedly, once for each input
config, in the case below:

  $ perf config kmem.default=slab report.children=false ...

But that's wasteful, so write the config file only once, after gathering
all the given key=value pairs.
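
The collect-then-write-once pattern can be sketched as follows. All names
here (struct kv_set, kv_set_collect(), kv_set_write(), apply_args()) are
hypothetical and much simpler than perf's config set; the 'writes' counter
exists only to make the single write observable.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 16

struct kv {
	char key[64];
	char value[64];
};

struct kv_set {
	struct kv items[MAX_ITEMS];
	int nr;
	int writes;	/* how many times the set was written out */
};

/* Parse "key=value" and keep it in memory; no file I/O happens here. */
static int kv_set_collect(struct kv_set *set, const char *arg)
{
	const char *eq = strchr(arg, '=');
	size_t klen;

	if (!eq || eq == arg || set->nr >= MAX_ITEMS)
		return -1;
	klen = (size_t)(eq - arg);
	if (klen >= sizeof(set->items[0].key) ||
	    strlen(eq + 1) >= sizeof(set->items[0].value))
		return -1;

	memcpy(set->items[set->nr].key, arg, klen);
	set->items[set->nr].key[klen] = '\0';
	strcpy(set->items[set->nr].value, eq + 1);
	set->nr++;
	return 0;
}

/* Write everything collected so far in a single pass. */
static int kv_set_write(struct kv_set *set, FILE *fp)
{
	int i;

	for (i = 0; i < set->nr; i++)
		if (fprintf(fp, "%s = %s\n", set->items[i].key,
			    set->items[i].value) < 0)
			return -1;
	set->writes++;
	return 0;
}

/* Collect all arguments first, then write once if anything changed. */
static int apply_args(struct kv_set *set, int argc, const char **argv,
		      FILE *fp)
{
	int i, changed = 0;

	for (i = 0; i < argc; i++) {
		if (kv_set_collect(set, argv[i]) < 0)
			return -1;
		changed = 1;
	}
	return changed ? kv_set_write(set, fp) : 0;
}
```

A bad argument aborts before anything is written, so a partially valid
command line does not leave a half-updated file behind in this sketch.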

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1504754331-9776-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-config.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-config.c b/tools/perf/builtin-config.c
index a1d82e33282c..b89417d9305e 100644
--- a/tools/perf/builtin-config.c
+++ b/tools/perf/builtin-config.c
@@ -34,8 +34,7 @@ static struct option config_options[] = {
 	OPT_END()
 };
 
-static int set_config(struct perf_config_set *set, const char *file_name,
-		      const char *var, const char *value)
+static int set_config(struct perf_config_set *set, const char *file_name)
 {
 	struct perf_config_section *section = NULL;
 	struct perf_config_item *item = NULL;
@@ -49,7 +48,6 @@ static int set_config(struct perf_config_set *set, const char *file_name,
 	if (!fp)
 		return -1;
 
-	perf_config_set__collect(set, file_name, var, value);
 	fprintf(fp, "%s\n", first_line);
 
 	/* overwrite configvariables */
@@ -161,6 +159,7 @@ int cmd_config(int argc, const char **argv)
 	struct perf_config_set *set;
 	char *user_config = mkpath("%s/.perfconfig", getenv("HOME"));
 	const char *config_filename;
+	bool changed = false;
 
 	argc = parse_options(argc, argv, config_options, config_usage,
 			     PARSE_OPT_STOP_AT_NON_OPTION);
@@ -231,15 +230,26 @@ int cmd_config(int argc, const char **argv)
 					goto out_err;
 				}
 			} else {
-				if (set_config(set, config_filename, var, value) < 0) {
-					pr_err("Failed to set '%s=%s' on %s\n",
-					       var, value, config_filename);
+				if (perf_config_set__collect(set, config_filename,
+							     var, value) < 0) {
+					pr_err("Failed to add '%s=%s'\n",
+					       var, value);
 					free(arg);
 					goto out_err;
 				}
+				changed = true;
 			}
 			free(arg);
 		}
+
+		if (!changed)
+			break;
+
+		if (set_config(set, config_filename) < 0) {
+			pr_err("Failed to set the configs on %s\n",
+			       config_filename);
+			goto out_err;
+		}
 	}
 
 	ret = 0;
-- 
2.13.5

* [PATCH 20/44] perf config: Allow creating empty config set for config file autogeneration
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 19/44] perf config: Write a config file just once Arnaldo Carvalho de Melo
@ 2017-09-22 14:41 ` Arnaldo Carvalho de Melo
  2017-09-22 16:26 ` [GIT PULL 00/44] perf/core improvements and fixes Ingo Molnar
  20 siblings, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Taeung Song, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Taeung Song <treeze.taeung@gmail.com>

When there is no config file (e.g. ~/.perfconfig), or it is empty,
the config set isn't created.

If the config set does not exist, a config file can't be autogenerated.

So allow creating an empty config set in the above case, so that
config file autogeneration can be supported.

Before:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  cat: /root/.perfconfig: No such file or directory

But I think it should work even if there isn't a config file.

After:

  $ rm -f ~/.perfconfig
  $ perf config --user report.children=false

  $ cat ~/.perfconfig
  # this file is auto-generated.
  [report]
      children = false

NOTE:

As a result, if perf_config_set__init() fails, it looks as if the config
set isn't freed. But that isn't a problem, because the config set will be
freed by perf_config_set__delete() at the end of cmd_config().

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1504754336-9824-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/config.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index bc75596f9e79..d2b6983b1779 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -700,10 +700,7 @@ struct perf_config_set *perf_config_set__new(void)
 
 	if (set) {
 		INIT_LIST_HEAD(&set->sections);
-		if (perf_config_set__init(set) < 0) {
-			perf_config_set__delete(set);
-			set = NULL;
-		}
+		perf_config_set__init(set);
 	}
 
 	return set;
-- 
2.13.5

* Re: [GIT PULL 00/44] perf/core improvements and fixes
  2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2017-09-22 14:41 ` [PATCH 20/44] perf config: Allow creating empty config set for config file autogeneration Arnaldo Carvalho de Melo
@ 2017-09-22 16:26 ` Ingo Molnar
  20 siblings, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2017-09-22 16:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, David Ahern, Fenghua Yu,
	Jiri Olsa, Kan Liang, Li Zhijian, Lukasz Odzioba,
	Martin Kepplinger, Matt Fleming, Mike Kravetz, Namhyung Kim,
	Pei P Jia, Peter Zijlstra, Philip Li, Rik van Riel, Taeung Song,
	Tony Luck, Vikas Shivappa, Wang Nan, Xiaochen Shen,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> 
> The following changes since commit b130a699c07155a1d6ef7d971a5f3bf0e3818d5a:
> 
>   Merge tag 'perf-urgent-for-mingo-4.14-20170912' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2017-09-13 09:25:10 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.15-20170922
> 
> for you to fetch changes up to 0a7c74eae307894c6c95316c382f118aef8481e8:
> 
>   perf tools: Provide mutex wrappers for pthreads rwlocks (2017-09-21 13:28:06 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> - Support direct --user-regs arguments in 'perf record', previously the
>   only way to sample PERF_SAMPLE_REGS_USER was implicitly selecting it
>   when recording callchains (Andi Kleen)
> 
> - Support showing sampled user regs in 'perf script' (Andi Kleen)
> 
> - Introduce the concept of weak groups in 'perf stat': try to set up a
>   group, but if it's not schedulable fallback to not using a group. That
>   gives us the best of both worlds: groups if they work, but still a
>   usable fallback if they don't. E.g: (Andi Kleen)
> 
>   % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
> 
>     125,366,055  branches                                    (80.02%)
>       9,208,402  branch-misses       # 7.35% of all branches (80.01%)
>      24,560,249  l1d.replacement                             (80.00%)
>      43,174,971  l2_lines_in.all                             (80.05%)
>      31,891,457  l2_rqsts.all_code_rd                        (79.92%)
> 
> - Support metrics in 'stat' and 'list'. A metric is a formula that
>   uses multiple events to compute a higher level result (e.g. IPC). (Andi Kleen)
> 
> - Add Intel processors vendor event metrics JSON files (Andi Kleen)
> 
> - Add 'pid' and 'tid' options to 'perf sched timehist' (David Ahern)
> 
> - Generate 'behavior' string table from kernel headers, helps getting
>   new parameters when synchronizing kernel headers, like MADV_WIPEONFORK
>   and MADV_KEEPONFORK, that are now beautied (Arnaldo Carvalho de Melo)
> 
> - Improve TUI progress bar by showing how many bytes from a total were
>   processed (Jiri Olsa)
> 
> - Use scandir() to replace readdir(), prep work to have the synthesizing
>   of PERF_RECORD_ entries for existing threads be multithreaded, making
>   'perf top' bearable on high core count systems such as Intel's Knights
>   Landing/Mill  (Kan Liang)
> 
> - Allow creating a ~/.perfconfig file when setting a variable to its
>   default value, previously it would bail out and not write such a
>   file (Taeung Song)
> 
> - Introduce wrapper for allowing purely single threaded apps to avoid
>   the costs of locking (Arnaldo Carvalho de Melo)
> 
> - Introduce hashtable to reduce the cost of thread lookup
> 
> - Fix build C++ build wrt poison.h using void pointer arithmetic,
>   affects only the embedded clang/llvm case, that is disabled by
>   default (Arnaldo Carvalho de Melo)
> 
> - Fix leaking rec_argv in error cases (Martin Kepplinger)
> 
> - Remove Intel CQM perf test, that infrastructure was nuked (Xiaochen Shen)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Andi Kleen (27):
>       perf tools: Support weak groups in 'perf stat'
>       perf vendor events: Support metric_group and no event name in JSON parser
>       perf stat: Factor out generic metric printing
>       perf stat: Print generic metric header even for failed expressions
>       perf pmu: Extract function to get JSON alias map
>       perf stat: Support JSON metrics in perf stat
>       perf list: Add metric groups to perf list
>       perf stat: Don't use ctx for saved values lookup
>       perf stat: Support duration_time for metrics
>       perf stat: Hide internal duration_time counter
>       perf stat: Update walltime_nsecs_stats in interval mode
>       perf record: Support direct --user-regs arguments
>       perf script: Support user regs
>       perf stat: Fall weak group back even for EBADF
>       perf vendor events: Add JSON metrics for Broadwell
>       perf vendor events: Add JSON metrics for Skylake
>       perf vendor events: Add JSON metrics for Sandy Bridge
>       perf vendor events: Add JSON metrics for Sandy Bridge EP
>       perf vendor events: Add JSON metrics for Ivy Bridge
>       perf vendor events: Add JSON metrics for Haswell
>       perf vendor events: Add JSON metrics for Ivy Town
>       perf vendor events: Add JSON metrics for Haswell EP
>       perf vendor events: Add JSON metrics for Broadwell Server
>       perf vendor events: Add JSON metrics for Broadwell DE
>       perf vendor events: Add JSON metrics for Skylake server
>       perf pmu: Improve error messages for missing PMUs
>       perf stat: Fix adding multiple event groups
> 
> Arnaldo Carvalho de Melo (7):
>       perf tools: Make copyfile_offset() static
>       perf machine: Optimize a bit the machine__findnew_thread() methods
>       perf trace beauty madvise: Generate 'behavior' string table from kernel headers
>       tools: Update asm-generic/mman-common.h copy from the kernel
>       perf tools: Get all of tools/{arch,include}/ in the MANIFEST
>       tools include: Do not use poison with C++
>       perf tools: Provide mutex wrappers for pthreads rwlocks
> 
> David Ahern (1):
>       perf sched timehist: Add pid and tid options
> 
> Jiri Olsa (3):
>       perf tools: Add python-clean target
>       perf ui progress: Add ui specific init function
>       perf ui progress: Add size info into progress bar
> 
> Kan Liang (2):
>       perf tools: Use scandir() to replace readdir()
>       perf machine: Use hashtable for machine threads
> 
> Martin Kepplinger (1):
>       perf tools: Fix leaking rec_argv in error cases
> 
> Taeung Song (2):
>       perf config: Write a config file just once
>       perf config: Allow creating empty config set for config file autogeneration
> 
> Xiaochen Shen (1):
>       perf tests: Remove Intel CQM perf test
> 
>  tools/include/linux/poison.h                       |   5 +
>  tools/include/uapi/asm-generic/mman-common.h       |  14 +-
>  tools/perf/Documentation/perf-list.txt             |   9 +-
>  tools/perf/Documentation/perf-record.txt           |   2 +
>  tools/perf/Documentation/perf-sched.txt            |   8 +
>  tools/perf/Documentation/perf-script.txt           |   4 +-
>  tools/perf/Documentation/perf-stat.txt             |   7 +
>  tools/perf/MANIFEST                                |  87 +---
>  tools/perf/Makefile.perf                           |  17 +-
>  tools/perf/arch/x86/include/arch-tests.h           |   1 -
>  tools/perf/arch/x86/tests/Build                    |   1 -
>  tools/perf/arch/x86/tests/arch-tests.c             |   4 -
>  tools/perf/arch/x86/tests/intel-cqm.c              | 127 ------
>  tools/perf/builtin-c2c.c                           |   1 +
>  tools/perf/builtin-config.c                        |  22 +-
>  tools/perf/builtin-kvm.c                           |   1 -
>  tools/perf/builtin-list.c                          |   7 +
>  tools/perf/builtin-mem.c                           |   1 +
>  tools/perf/builtin-record.c                        |   3 +
>  tools/perf/builtin-sched.c                         |   4 +
>  tools/perf/builtin-script.c                        |  32 +-
>  tools/perf/builtin-stat.c                          |  82 +++-
>  tools/perf/builtin-timechart.c                     |   4 +-
>  tools/perf/builtin-trace.c                         |  20 +-
>  tools/perf/perf.h                                  |   1 +
>  .../pmu-events/arch/x86/broadwell/bdw-metrics.json | 164 +++++++
>  .../arch/x86/broadwellde/bdwde-metrics.json        | 164 +++++++
>  .../arch/x86/broadwellx/bdx-metrics.json           | 164 +++++++
>  .../pmu-events/arch/x86/haswell/hsw-metrics.json   | 158 +++++++
>  .../pmu-events/arch/x86/haswellx/hsx-metrics.json  | 158 +++++++
>  .../pmu-events/arch/x86/ivybridge/ivb-metrics.json | 164 +++++++
>  .../pmu-events/arch/x86/ivytown/ivt-metrics.json   | 164 +++++++
>  .../pmu-events/arch/x86/jaketown/jkt-metrics.json  | 140 ++++++
>  .../arch/x86/sandybridge/snb-metrics.json          | 140 ++++++
>  .../pmu-events/arch/x86/skylake/skl-metrics.json   | 164 +++++++
>  .../pmu-events/arch/x86/skylakex/skx-metrics.json  | 182 ++++++++
>  tools/perf/pmu-events/jevents.c                    |  24 +-
>  tools/perf/pmu-events/jevents.h                    |   2 +-
>  tools/perf/pmu-events/pmu-events.h                 |   1 +
>  tools/perf/tests/builtin-test.c                    |   1 +
>  tools/perf/trace/beauty/madvise_behavior.sh        |  10 +
>  tools/perf/trace/beauty/mmap.c                     |  38 +-
>  tools/perf/ui/progress.c                           |   6 +-
>  tools/perf/ui/progress.h                           |  12 +-
>  tools/perf/ui/tui/progress.c                       |  32 +-
>  tools/perf/util/Build                              |   2 +
>  tools/perf/util/config.c                           |   5 +-
>  tools/perf/util/data.c                             |   1 +
>  tools/perf/util/dso.c                              |  13 +-
>  tools/perf/util/dso.h                              |   4 +-
>  tools/perf/util/event.c                            |  46 +-
>  tools/perf/util/evlist.h                           |   1 +
>  tools/perf/util/evsel.c                            |   7 +-
>  tools/perf/util/evsel.h                            |   1 +
>  tools/perf/util/machine.c                          | 155 ++++---
>  tools/perf/util/machine.h                          |  24 +-
>  tools/perf/util/map.c                              |  34 +-
>  tools/perf/util/map.h                              |   3 +-
>  tools/perf/util/metricgroup.c                      | 490 +++++++++++++++++++++
>  tools/perf/util/metricgroup.h                      |  31 ++
>  tools/perf/util/namespaces.c                       |   1 +
>  tools/perf/util/parse-events.c                     |  29 +-
>  tools/perf/util/parse-events.h                     |   3 +
>  tools/perf/util/parse-events.l                     |   3 +-
>  tools/perf/util/pmu.c                              |  55 ++-
>  tools/perf/util/pmu.h                              |   2 +
>  tools/perf/util/probe-file.c                       |   1 +
>  tools/perf/util/rb_resort.h                        |   5 +-
>  tools/perf/util/rwsem.c                            |  32 ++
>  tools/perf/util/rwsem.h                            |  19 +
>  tools/perf/util/session.c                          |   2 +-
>  tools/perf/util/stat-shadow.c                      | 110 +++--
>  tools/perf/util/stat.h                             |   4 +-
>  tools/perf/util/symbol.c                           |   8 +-
>  tools/perf/util/thread.c                           |   4 +-
>  tools/perf/util/trace-event-info.c                 |   1 -
>  tools/perf/util/trace-event-read.c                 |   1 -
>  tools/perf/util/util.c                             |  16 +-
>  tools/perf/util/util.h                             |   7 +-
>  tools/perf/util/vdso.c                             |   4 +-
>  tools/perf/util/zlib.c                             |   1 +
>  81 files changed, 2988 insertions(+), 489 deletions(-)
>  delete mode 100644 tools/perf/arch/x86/tests/intel-cqm.c
>  create mode 100644 tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/broadwellde/bdwde-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/haswell/hsw-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/ivybridge/ivb-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/ivytown/ivt-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/jaketown/jkt-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
>  create mode 100644 tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
>  create mode 100755 tools/perf/trace/beauty/madvise_behavior.sh
>  create mode 100644 tools/perf/util/metricgroup.c
>  create mode 100644 tools/perf/util/metricgroup.h
>  create mode 100644 tools/perf/util/rwsem.c
>  create mode 100644 tools/perf/util/rwsem.h

Pulled, thanks a lot Arnaldo!

	Ingo

* Re: [GIT PULL 00/44] perf/core improvements and fixes
  2018-08-09 14:57 Arnaldo Carvalho de Melo
@ 2018-08-09 15:27 ` Kim Phillips
  0 siblings, 0 replies; 26+ messages in thread
From: Kim Phillips @ 2018-08-09 15:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Clark Williams, linux-kernel, linux-perf-users,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, Andrew Morton,
	Andriy Shevchenko, David Ahern, Dmitry Torokhov,
	Ganapatrao Kulkarni, Heiko Carstens, Hendrik Brueckner, Jin Yao,
	Jiri Olsa, John Garry, Kate Stewart, Konstantin Khlebnikov,
	linux-arm-kernel, Martin Schwidefsky, Matthew Wilcox,
	Michael Ellerman, Mike Snitzer, Namhyung Kim, Peter Zijlstra,
	Philippe Ombredanne, Sean V Kelley, Stephane Eranian,
	Steven Rostedt, Thomas Richter, Wang Nan, Will Deacon,
	William Cohen, Yury Norov, Arnaldo Carvalho de Melo

On Thu,  9 Aug 2018 11:57:38 -0300
Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

Hi Arnaldo,

> Arch specific:
> 
> arm64: (Sean V Kelley)
> 
> - Enable JSON events for Ampere Computing eMAG processor
> 

Did this one get missed?:

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1745454.html

Thanks,

Kim

* [GIT PULL 00/44] perf/core improvements and fixes
@ 2018-08-09 14:57 Arnaldo Carvalho de Melo
  2018-08-09 15:27 ` Kim Phillips
  0 siblings, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-09 14:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Andrew Morton, Andriy Shevchenko, David Ahern,
	Dmitry Torokhov, Ganapatrao Kulkarni, Heiko Carstens,
	Hendrik Brueckner, Jin Yao, Jiri Olsa, John Garry, Kate Stewart,
	Konstantin Khlebnikov, linux-arm-kernel, Martin Schwidefsky,
	Matthew Wilcox, Michael Ellerman, Mike Snitzer, Namhyung Kim,
	Peter Zijlstra, Philippe Ombredanne, Sean V Kelley,
	Stephane Eranian, Steven Rostedt, Thomas Richter, Wang Nan,
	Will Deacon, William Cohen, Yury Norov, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

Several new test environments, consisting of building with/without
elfutils 0.173 ELF and DWARF libraries cross built for many
architectures, some appearing in this container based test
environment for the first time:

  56 ubuntu:18.04-x-m68k     : Ok  m68k-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  60 ubuntu:18.04-x-riscv64  : Ok  riscv64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  62 ubuntu:18.04-x-sh4      : Ok  sh4-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  63 ubuntu:18.04-x-sparc64  : Ok  sparc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0

The following changes since commit ec2cb7a526d49b65576301e183448fb51ee543a6:

  Merge tag 'perf-core-for-mingo-4.19-20180801' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-08-02 09:59:41 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.19-20180809

for you to fetch changes up to 6a9405b56c274024564f9014bba97b92c91b34d6:

  perf map: Optimize maps__fixup_overlappings() (2018-08-08 15:56:00 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

perf annotate: (Jiri Olsa)

- Show percentage based on global or local hits or period, adding hotkeys ('p':
  local/global, 'b': hit/period, use 'h' to see all hotkeys in the TUI) to
  toggle those modes in the TUI, as well as a command line option to select one, i.e.

      perf report/annotate --percent-type global-period,local-period,global-hits,local-hits

  to help understand the impact an annotated line has globally or just on the
  function's total number of hits or its total period.

  Try using it from the dynamic annotation interface from 'perf top', i.e.
  fire up 'perf top', then press 'a' on a kernel function then try pressing 'p'
  to see, dynamically, the global/local percentage for the lines with samples.

  This was based on a suggestion made by Stephane Eranian.
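
  As a rough illustration of the four --percent-type values (a sketch only,
  not perf's implementation; the function name, its arguments and all the
  numbers are made up for the example), each percentage divides a line's hits
  or period by either the function's total (local) or the overall total
  (global):

```python
def percent(line_value, local_total, global_total, scope):
    """Return line_value as a percentage of the chosen total.

    scope is "local" (total of the annotated function) or "global"
    (total over all samples). Hypothetical helper, for illustration only.
    """
    total = local_total if scope == "local" else global_total
    if total == 0:
        return 0.0
    return 100.0 * line_value / total


# Made-up numbers for one annotated line inside one function.
line_hits, func_hits, all_hits = 5, 20, 200
line_period, func_period, all_period = 3000, 10000, 50000

modes = {
    "local-hits": percent(line_hits, func_hits, all_hits, "local"),
    "global-hits": percent(line_hits, func_hits, all_hits, "global"),
    "local-period": percent(line_period, func_period, all_period, "local"),
    "global-period": percent(line_period, func_period, all_period, "global"),
}
for name, value in modes.items():
    print(f"{name:14s} {value:6.2f}%")
```

  With these invented numbers the same line looks hot locally but small
  globally, which is the distinction the 'p' hotkey toggles.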

perf trace: (Arnaldo Carvalho de Melo)

- Process syscalls:sys_enter_SYSCALLNAME tracepoints using the strace-like
  beautifiers used with raw_syscalls:sys_enter, paving the way to use
  those beautifiers with whatever event carries that payload in its
  PERF_SAMPLE_RAW area (Arnaldo Carvalho de Melo)

- Add more wrappers for BPF functions to be used in eBPF programs, together
  with examples on how to use them, for instance, a "hello, world" like
  program attached to the 'openat' syscall entry tracepoint using a stdio.h
  like puts() function that abstracts access to the bpf_perf_event_output
  eBPF function associated with an eBPF map associated with a "bpf-output"
  (PERF_COUNT_SW_BPF_OUTPUT software event) that gets what is passed into
  the perf ring buffer to finally appear in the 'perf trace' output:

    $ cd tools/perf/examples/bpf/
    $ cat hello.c
    #include <stdio.h>

    int syscall_enter(openat)(void *args)
    {
	    puts("Hello, world\n");
	    return 0;
    }

    license(GPL);
    $
    # perf trace -e hello.c cat /etc/passwd > /dev/null
      0.000 __bpf_stdout__:Hello, world
      0.033 __bpf_stdout__:Hello, world
      0.358 __bpf_stdout__:Hello, world
    #

- Add another example (augmented_syscalls.c) that copies the syscall tracepoint
  payload + the first 64 bytes of 'openat''s 'filename' pointer parameter,
  using eBPF's probe_read, probe_read_str and perf_event_output, sending to
  an eBPF map associated with a bpf-output perf event that then gets passed to
  the existing raw_syscalls:sys_enter beautifier.

  The changesets were done very granularly so that we can see that payload
  first processed by the generic bpf_output formatter, where we can see that the
  filename is being copied, together with the raw_syscalls:sys_enter formatter,
  to make sure both agree, e.g.:

  # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
     0.000 (         ): __augmented_syscalls__:X?.C......................`\..................../etc/ld.so.cache..#......,....ao.k...............k......1.".........
     0.006 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC
     0.008 ( 0.005 ms): cat/31292 openat(dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC) = 3
     0.036 (         ): __augmented_syscalls__:X?.C.......................\..................../lib64/libc.so.6......... .\....#........?.......=.C..../.".........
     0.037 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC
     0.039 ( 0.007 ms): cat/31292 openat(dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC) = 3
     0.323 (         ): __augmented_syscalls__:X?.C.....................P....................../etc/passwd......>.C....@................>.C.....,....ao.>.C........
     0.325 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xe8be50d6
     0.327 ( 0.004 ms): cat/31292 openat(dfd: CWD, filename: 0xe8be50d6) = 3
  #

  The next step is to improve the beautifiers to use that filename so that we
  can show it just like 'perf trace' + probe:vfs_getname (a getname_flags kprobe supported
  by 'perf trace' to show the pathname in syscalls like open, openat, rename, etc.) and
  strace (via ptrace) do.

  This now requires having clang installed to turn augmented_syscalls.c into
  an eBPF object to feed the kernel. In upcoming patches this need will be removed by
  making 'perf trace' generate the object directly, or by linking libclang into
  perf's binary, code that is already merged.

  strace's '-s strsize' will be implemented to state how many bytes we should copy,
  so that the familiar 'strace' workflow can be mimicked some more. Per-event
  terms should also be used to state how many bytes for each pointer arg should be
  copied and subsequently beautified, something like:

     # perf trace -e openat/filename:16/,read/buf:256/
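
  The per-event term syntax above is only a proposal ("something like"), so
  what follows is merely a sketch of how such a spec could be split into
  per-argument byte limits. The helper names and the returned structure are
  assumptions made for illustration, not perf code, and only a single
  'arg:nbytes' term per event is handled, as in the example:

```python
import re

# One event: syscall name followed by /arg:nbytes/ (hypothetical syntax).
_EVENT_RE = re.compile(r"^(?P<syscall>\w+)/(?P<arg>\w+):(?P<nbytes>\d+)/$")


def parse_copy_terms(spec):
    """Map 'openat/filename:16/,read/buf:256/' to {(syscall, arg): nbytes}."""
    limits = {}
    for event in spec.split(","):
        m = _EVENT_RE.match(event.strip())
        if m is None:
            raise ValueError(f"cannot parse event spec: {event!r}")
        limits[(m.group("syscall"), m.group("arg"))] = int(m.group("nbytes"))
    return limits


def truncate_arg(data, nbytes):
    """Keep at most nbytes of a payload, in the spirit of strace's -s."""
    return data[:nbytes]


limits = parse_copy_terms("openat/filename:16/,read/buf:256/")
print(limits)
# The path below is an arbitrary example string.
print(truncate_arg(b"/usr/lib64/libc.so.6", limits[("openat", "filename")]))
```

  Keying the limits by (syscall, argument) keeps the lookup trivial at the
  point where a pointer argument would be copied and beautified.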

Infrastructure: (Konstantin Khlebnikov)

- Optimize the synthesization of maps for pre-existing threads, synthesizing
  maps just for the thread group leader

- Optimize maps__fixup_overlappings()
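
The first item relies on threads in a thread group sharing one address
space, so the leader's maps describe all of them. A minimal sketch of that
selection (not the perf code; the Task tuple and the sample ids are
invented for the example):

```python
from collections import namedtuple

# tid: thread id, tgid: thread group id (equal to the leader's tid).
Task = namedtuple("Task", ["tid", "tgid"])


def tasks_needing_maps(tasks):
    """Return only the thread group leaders, i.e. tasks with tid == tgid."""
    return [t for t in tasks if t.tid == t.tgid]


# Made-up tasks: one process with three threads, one single-threaded process.
tasks = [Task(100, 100), Task(101, 100), Task(102, 100), Task(200, 200)]
leaders = tasks_needing_maps(tasks)
print([t.tid for t in leaders])
```

The saving grows with the number of threads per process, since each
non-leader thread no longer causes its own set of map events.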

Arch specific:

arm64: (Sean V Kelley)

- Enable JSON events for Ampere Computing eMAG processor

s/390: (Thomas Richter)

- Support auxiliary trace on 'perf report'

Cleanups:

- Drop unneeded bitmap_zero() calls (Yury Norov)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (17):
      perf trace: Associate vfs_getname()'ed pathname with fd returned from 'openat'
      perf trace: Use beautifiers on syscalls:sys_enter_ handlers
      perf trace: Rename some syscall_tp methods to raw_syscall
      perf trace: Allow setting up a syscall_tp struct without a format_field
      perf trace: Setup struct syscall_tp for syscalls:sys_{enter,exit}_NAME events
      perf trace: Use perf_evsel__sc_tp_{uint,ptr} for "id"/"args" handling syscalls:* events
      perf bpf: Add 'syscall_enter' probe helper for syscall enter tracepoints
      perf bpf: Add struct bpf_map struct
      perf bpf: Add bpf/stdio.h wrapper to bpf_perf_event_output function
      perf bpf: Make bpf__for_each_stdout_map() generic
      perf bpf: Generalize bpf__setup_stdout()
      perf bpf: Add bpf__setup_output_event() strerror() counterpart
      perf bpf: Add wrappers to BPF_FUNC_probe_read(_str) functions
      perf trace: Handle "bpf-output" events associated with "__augmented_syscalls__" BPF map
      perf bpf: Make bpf__setup_output_event() return the bpf-output event
      perf trace: Setup the augmented syscalls bpf-output event fields
      perf trace: Wire up the augmented syscalls with the syscalls:sys_enter_FOO beautifier

Jiri Olsa (20):
      perf annotate: Make symbol__annotate_fprintf2() local
      perf annotate: Make annotation_line__max_percent static
      perf annotate: Get rid of annotation__scnprintf_samples_period()
      perf annotate: Rename struct annotation_line::samples* to data*
      perf annotate: Rename local sample variables to data
      perf annotate: Rename hist to sym_hist in annotation__calc_percent
      perf annotate: Loop group events directly in annotation__calc_percent()
      perf annotate: Switch struct annotation_data::percent to array
      perf annotate: Add PERCENT_HITS_GLOBAL percent value
      perf annotate: Add PERCENT_PERIOD_LOCAL percent value
      perf annotate: Add PERCENT_PERIOD_GLOBAL percent value
      perf annotate: Add percent_type to struct annotation_options
      perf annotate: Pass struct annotation_options to symbol__calc_lines()
      perf annotate: Pass 'struct annotation_options' to map_symbol__annotation_dump()
      perf annotate: Pass browser percent_type in annotate_browser__calc_percent()
      perf annotate: Add support to toggle percent type
      perf annotate: Make local period the default percent type
      perf annotate: Display percent type in stdio output
      perf annotate: Add --percent-type option
      perf report: Add --percent-type option

Konstantin Khlebnikov (2):
      perf map: Synthesize maps only for thread group leader
      perf map: Optimize maps__fixup_overlappings()

Sean V Kelley (1):
      perf vendor events arm64: Enable JSON events for eMAG

Thomas Richter (3):
      perf auxtrace: Support for perf report -D for s390
      perf report: Add raw report support for s390 auxiliary trace
      perf report: Add GUI report support for s390 auxiliary trace

Yury Norov (1):
      perf tools: Drop unneeded bitmap_zero() calls

 tools/perf/Documentation/perf-annotate.txt         |   9 +
 tools/perf/Documentation/perf-report.txt           |   9 +
 tools/perf/arch/s390/util/auxtrace.c               |   1 +
 tools/perf/builtin-annotate.c                      |   4 +
 tools/perf/builtin-report.c                        |   3 +
 tools/perf/builtin-trace.c                         | 191 ++++-
 tools/perf/examples/bpf/augmented_syscalls.c       |  55 ++
 tools/perf/examples/bpf/hello.c                    |   9 +
 tools/perf/examples/bpf/sys_enter_openat.c         |  33 +
 tools/perf/include/bpf/bpf.h                       |  20 +
 tools/perf/include/bpf/stdio.h                     |  19 +
 .../arch/arm64/ampere/emag/core-imp-def.json       |  32 +
 tools/perf/pmu-events/arch/arm64/mapfile.csv       |   1 +
 tools/perf/tests/bitmap.c                          |   2 -
 tools/perf/tests/mem2node.c                        |   2 -
 tools/perf/ui/browsers/annotate.c                  |  76 +-
 tools/perf/util/Build                              |   1 +
 tools/perf/util/annotate.c                         | 301 ++++---
 tools/perf/util/annotate.h                         |  54 +-
 tools/perf/util/auxtrace.c                         |   3 +
 tools/perf/util/auxtrace.h                         |   1 +
 tools/perf/util/bpf-loader.c                       |  48 +-
 tools/perf/util/bpf-loader.h                       |  23 +-
 tools/perf/util/event.c                            |  13 +-
 tools/perf/util/evsel.h                            |   7 +
 tools/perf/util/header.c                           |   3 -
 tools/perf/util/map.c                              |  44 +-
 tools/perf/util/map.h                              |   1 -
 tools/perf/util/s390-cpumsf-kernel.h               |  71 ++
 tools/perf/util/s390-cpumsf.c                      | 945 +++++++++++++++++++++
 tools/perf/util/s390-cpumsf.h                      |  21 +
 31 files changed, 1773 insertions(+), 229 deletions(-)
 create mode 100644 tools/perf/examples/bpf/augmented_syscalls.c
 create mode 100644 tools/perf/examples/bpf/hello.c
 create mode 100644 tools/perf/examples/bpf/sys_enter_openat.c
 create mode 100644 tools/perf/include/bpf/stdio.h
 create mode 100644 tools/perf/pmu-events/arch/arm64/ampere/emag/core-imp-def.json
 create mode 100644 tools/perf/util/s390-cpumsf-kernel.h
 create mode 100644 tools/perf/util/s390-cpumsf.c
 create mode 100644 tools/perf/util/s390-cpumsf.h

Test results:

The first ones are container (docker) based builds of tools/perf with
and without libelf support.  Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an HTTP server to
build them inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications and then intercept the
sys_perf_event_open syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.

  # dm
   1 alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2 alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3 alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4 alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5 alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   6 amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
   7 amazonlinux:2                 : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
   8 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   9 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10 centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  11 centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
  12 centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
  13 debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  14 debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  15 debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  16 debian:experimental           : Ok   gcc (Debian 8.2.0-1) 8.2.0
  17 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
  18 debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
  19 debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 7.3.0-18) 7.3.0
  20 debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 8.1.0-12) 8.1.0
  21 fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  22 fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  23 fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  24 fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  25 fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  26 fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  27 fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  28 fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  29 fedora:27                     : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
  30 fedora:28                     : Ok   gcc (GCC) 8.1.1 20180502 (Red Hat 8.1.1-1)
  31 fedora:rawhide                : Ok   gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
  32 gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
  33 mageia:5                      : Ok   gcc (GCC) 4.9.2
  34 mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  35 opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  36 opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  37 opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  38 opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
  39 oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
  40 oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28.0.1)
  41 ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  42 ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  43 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
  44 ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  45 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  46 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  47 ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  48 ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  49 ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  50 ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  51 ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  52 ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
  53 ubuntu:18.04                  : Ok   gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  54 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
  55 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-16ubuntu3) 7.3.0
  56 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  57 ubuntu:18.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  58 ubuntu:18.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  59 ubuntu:18.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  60 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  61 ubuntu:18.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  62 ubuntu:18.04-x-sh4            : Ok   sh4-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  63 ubuntu:18.04-x-sparc64        : Ok   sparc64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  64 ubuntu:18.10                  : Ok   gcc (Ubuntu 8.2.0-1ubuntu2) 8.2.0
  #
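
A listing like the one above is easy to summarize mechanically. The sketch
below is a hypothetical helper, not part of the test setup; its regular
expression is an assumption about the column layout shown (index, image,
status, compiler banner):

```python
import re

# index, image name, ':' separator, status, then the compiler banner.
ROW_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s*:\s*(\S+)\s+(.*)$")


def parse_rows(text):
    """Return (index, image, status, compiler) tuples for each result row."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:  # prompt lines such as '# dm' and '#' do not match
            rows.append((int(m.group(1)), m.group(2), m.group(3),
                         m.group(4).strip()))
    return rows


sample = """\
  # dm
  56 ubuntu:18.04-x-m68k           : Ok   m68k-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  60 ubuntu:18.04-x-riscv64        : Ok   riscv64-linux-gnu-gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
  #
"""
rows = parse_rows(sample)
failed = [r for r in rows if r[2] != "Ok"]
print(len(rows), "rows,", len(failed), "not Ok")
```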

  # uname -a
  Linux jouet 4.18.0-rc8-00002-g1236568ee3cb #12 SMP Tue Aug 7 14:08:26 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # git log --oneline -1
  6a9405b56c27 (HEAD -> perf/core, tag: perf-core-for-mingo-4.19-20180809, acme.korg/perf/core) perf map: Optimize maps__fixup_overlappings()
  # perf version --build-options
  perf version 4.18.rc7.g6a9405
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Ok
  22: Number of exit events of a simple workload            : Ok
  23: Software clock events period values                   : Ok
  24: Object code reading                                   : Ok
  25: Sample parsing                                        : Ok
  26: Use a dummy software event to keep tracking           : Ok
  27: Parse with no sample_id_all bit set                   : Ok
  28: Filter hist entries                                   : Ok
  29: Lookup mmap thread                                    : Ok
  30: Share thread mg                                       : Ok
  31: Sort output of hist entries                           : Ok
  32: Cumulate child hist entries                           : Ok
  33: Track with sched_switch                               : Ok
  34: Filter fds with revents mask in a fdarray             : Ok
  35: Add fd to a fdarray, making it autogrow               : Ok
  36: kmod_path__parse                                      : Ok
  37: Thread map                                            : Ok
  38: LLVM search and compile                               :
  38.1: Basic BPF llvm compile                              : Ok
  38.2: kbuild searching                                    : Ok
  38.3: Compile source for BPF prologue generation          : Ok
  38.4: Compile source for BPF relocation                   : Ok
  39: Session topology                                      : Ok
  40: BPF filter                                            :
  40.1: Basic BPF filtering                                 : Ok
  40.2: BPF pinning                                         : Ok
  40.3: BPF prologue generation                             : Ok
  40.4: BPF relocation checker                              : Ok
  41: Synthesize thread map                                 : Ok
  42: Remove thread map                                     : Ok
  43: Synthesize cpu map                                    : Ok
  44: Synthesize stat config                                : Ok
  45: Synthesize stat                                       : Ok
  46: Synthesize stat round                                 : Ok
  47: Synthesize attr update                                : Ok
  48: Event times                                           : Ok
  49: Read backward ring buffer                             : Ok
  50: Print cpu map                                         : Ok
  51: Probe SDT events                                      : Ok
  52: is_printable_array                                    : Ok
  53: Print bitmap                                          : Ok
  54: perf hooks                                            : Ok
  55: builtin clang support                                 : Skip (not compiled in)
  56: unit_number__scnprintf                                : Ok
  57: mem2node                                              : Ok
  58: x86 rdpmc                                             : Ok
  59: Convert perf time to TSC                              : Ok
  60: DWARF unwind                                          : Ok
  61: x86 instruction decoder - new instructions            : Ok
  62: Use vfs_getname probe to get syscall args filenames   : Ok
  63: Check open filename arg using perf trace + vfs_getname: Ok
  64: probe libc's inet_pton & backtrace it with ping       : Ok
  65: Add vfs_getname probe to get syscall args filenames   : Ok
  #

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
              make_clean_all_O: make clean all
                   make_help_O: make help
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                  make_debug_O: make DEBUG=1
           make_no_backtrace_O: make NO_BACKTRACE=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
              make_no_libelf_O: make NO_LIBELF=1
                 make_perf_o_O: make perf.o
            make_install_bin_O: make install-bin
         make_install_prefix_O: make install prefix=/tmp/krava
                make_no_gtk2_O: make NO_GTK2=1
              make_no_libbpf_O: make NO_LIBBPF=1
                 make_static_O: make LDFLAGS=-static
             make_no_libperl_O: make NO_LIBPERL=1
                 make_cscope_O: make cscope
                make_no_newt_O: make NO_NEWT=1
            make_no_demangle_O: make NO_DEMANGLE=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                   make_pure_O: make
               make_no_slang_O: make NO_SLANG=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
                    make_doc_O: make doc
             make_no_libnuma_O: make NO_LIBNUMA=1
                   make_tags_O: make tags
            make_no_auxtrace_O: make NO_AUXTRACE=1
                make_install_O: make install
        make_with_babeltrace_O: make LIBBABELTRACE=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
             make_util_map_o_O: make util/map.o
           make_no_libpython_O: make NO_LIBPYTHON=1
           make_no_libbionic_O: make NO_LIBBIONIC=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $


* Re: [GIT PULL 00/44] perf/core improvements and fixes
  2018-03-24 20:01 Arnaldo Carvalho de Melo
@ 2018-03-25  8:40 ` Ingo Molnar
  0 siblings, 0 replies; 26+ messages in thread
From: Ingo Molnar @ 2018-03-25  8:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Adrian Hunter,
	Alexander Shishkin, Andi Kleen, David Ahern, Jin Yao, Jiri Olsa,
	Kim Phillips, Linus Torvalds, Martin Vuille, Namhyung Kim,
	Peter Zijlstra, Petr Machata, Wang Nan, Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Mostly a 'perf annotate' refactoring to allow reusing the TUI
> formatting routines in a --stdio2 mode for 'perf annotate' that at some
> point should replace --stdio, leaving that old code deprecated for a
> while, then ditching it.
> 
> 	That will take a while yet because there is some stuff in the
> --stdio code that needs to be done in the annotation UI agnostic core to
> then get used in --tui and --stdio2.
> 
> 	There are also some improvements for issues Linus reported in the
> TUI annotation code for ASM functions.
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit ecd380b8dead1bad67e3af87e2ddfe826c3da79d:
> 
>   Merge tag 'perf-core-for-mingo-4.17-20180319' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-03-19 20:37:48 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.17-20180323
> 
> for you to fetch changes up to 980b68ec0694f250e967cb18c5705ef5de10fdd5:
> 
>   perf annotate: Use absolute addresses to calculate jump target offsets (2018-03-23 16:46:53 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> - Move non-TUI specific annotation routines out of the TUI browser so
>   that it can be used in other UIs, and to demonstrate that introduce
>   a 'perf annotate --stdio2' option that will apply those formatting
>   routines to provide a non-interactive annotation mode (Arnaldo Carvalho de Melo)
> 
> - Add 'P' hotkey to the annotation TUI, to dump the current annotated
>   symbol to a file, easing report thru e-mail, by getting rid of the
>   spaces + right hand side scrollbar chars (Arnaldo Carvalho de Melo)
> 
> - Support --ignore-vmlinux to 'perf report' and 'perf annotate', that
>   was already present in 'perf top', to use /proc/{kcore,kallsyms},
>   allowing to see what is in fact running (patched stuff, alternatives,
>   ftrace, etc), not the initial state of the kernel (vmlinux) (Arnaldo Carvalho de Melo)
> 
> - Support 'jump' instructions to a different function, treating them
>   as 'call' instructions (Arnaldo Carvalho de Melo)
> 
> - Fix some jump artifacts when using vmlinux + ASM functions, where
>   the ELF symtab for instance, for entry_SYSCALL_64 includes that and
>   what comes after the 'syscall_return_via_sysret' label, but the
>   objdump -dS prints the jump targets + offsets using the
>   syscall_return_via_sysret address, which was confusing 'perf annotate'.
>   See the cset comments for further info (Arnaldo Carvalho de Melo)
> 
> - Report error from dwfl_attach_state() in the unwind code (Martin Vuille)
> 
> - Reference Py_None before returning it in the python extension (Petr Machata)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (42):
>       perf annotate: Move annotation_options out of the TUI browser
>       perf annotate: Move cycles/IPC formatting width constants outside TUI
>       perf annotate tui: Use annotate_browser__cycles_width() mroe
>       perf annotate tui: Move have_cycles to struct annotation
>       perf annotate: Move annotation_line array from TUI to generic code
>       perf annotate: Move compute_ipc() to annotation library
>       perf annotate: Move nr_events from annotate_browser to annotation struct
>       perf annotate: Stop using a global config struct
>       perf annotate: Move pcnt_with() to the annotation library
>       perf annotate tui: Add browser__annotation() helper
>       perf annotate: Move max_jump_sources to struct annotation
>       perf annotate: Move jumps_percent_color to ui_browser
>       perf annotate: Move nr_jumps to struct annotation
>       perf annotate: Move mark_jump_targets from the TUI to the annotation library
>       perf annotate: Nuke struct browser_line
>       perf annotate: Move 'start' to struct annotation
>       perf annotate: Move nr_{asm_}entries to struct annotation
>       perf annotate: Introduce set_offsets() method out of TUI code
>       perf annotate: Move the column widths from the TUI to generic lib
>       perf annotate: Move update_column_widths() to the generic lib
>       perf annotate: Introduce init_column_widths() method out of TUI code
>       perf annotate: Introduce symbol__annotate2 method
>       perf annotate: Introduce annotation_line__max_percent()
>       perf ui browser: Add vprintf() method
>       perf annotate: Introduce annotation_line__print_start() out of TUI code
>       perf annotate: Finish the generalization of annotate_browser__write()
>       perf annotate: Use a ops table for annotation_line__write()
>       perf annotate: Introduce annotation_line__filter()
>       perf annotate: Introduce the --stdio2 output mode
>       perf annotate: Move the default annotate options to the library
>       perf annotate: Use the default annotation options for --stdio2
>       perf annotate: Add function header to --stdio2
>       perf annotate: Introduce --ignore-vmlinux command line option
>       perf report: Introduce --ignore-vmlinux command line option
>       perf annotate browser: Add 'P' hotkey to dump annotation to file
>       perf annotate: No need to calculate notes->start twice
>       perf annotate: Pass function descriptor to its instruction parsing routines
>       perf annotate: Mark jumps to outher functions with the call arrow
>       perf annotate: Add "_local" to jump/offset validation routines
>       perf annotate: Support jumping from one function to another
>       perf annotate: Defer searching for comma in raw line till it is needed
>       perf annotate: Use absolute addresses to calculate jump target offsets
> 
> Martin Vuille (1):
>       perf unwind: Report error from dwfl_attach_state
> 
> Petr Machata (1):
>       perf python: Reference Py_None before returning it
> 
>  tools/perf/Documentation/perf-annotate.txt   |   5 +
>  tools/perf/Documentation/perf-report.txt     |   3 +
>  tools/perf/arch/s390/annotate/instructions.c |   5 +-
>  tools/perf/builtin-annotate.c                |  27 +-
>  tools/perf/builtin-report.c                  |   3 +
>  tools/perf/builtin-top.c                     |   2 +
>  tools/perf/ui/browser.c                      |   9 +-
>  tools/perf/ui/browser.h                      |   3 +-
>  tools/perf/ui/browsers/annotate.c            | 686 ++++++---------------------
>  tools/perf/util/annotate.c                   | 679 +++++++++++++++++++++++++-
>  tools/perf/util/annotate.h                   | 102 +++-
>  tools/perf/util/python.c                     |   4 +-
>  tools/perf/util/unwind-libdw.c               |   3 +-
>  13 files changed, 944 insertions(+), 587 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo


* [GIT PULL 00/44] perf/core improvements and fixes
@ 2018-03-24 20:01 Arnaldo Carvalho de Melo
  2018-03-25  8:40 ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-03-24 20:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Shishkin, Andi Kleen, David Ahern,
	Jin Yao, Jiri Olsa, Kim Phillips, Linus Torvalds, Martin Vuille,
	Namhyung Kim, Peter Zijlstra, Petr Machata, Wang Nan,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Mostly a 'perf annotate' refactoring to allow reusing the TUI
formatting routines in a --stdio2 mode for 'perf annotate' that at some
point should replace --stdio, leaving that old code deprecated for a
while, then ditching it.

	That will take a while yet because there is some stuff in the
--stdio code that needs to be done in the annotation UI agnostic core to
then get used in --tui and --stdio2.

	There are also some improvements for issues Linus reported in the
TUI annotation code for ASM functions.

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit ecd380b8dead1bad67e3af87e2ddfe826c3da79d:

  Merge tag 'perf-core-for-mingo-4.17-20180319' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2018-03-19 20:37:48 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.17-20180323

for you to fetch changes up to 980b68ec0694f250e967cb18c5705ef5de10fdd5:

  perf annotate: Use absolute addresses to calculate jump target offsets (2018-03-23 16:46:53 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

- Move non-TUI specific annotation routines out of the TUI browser so
  that they can be used in other UIs, and to demonstrate that, introduce
  a 'perf annotate --stdio2' option that applies those formatting
  routines to provide a non-interactive annotation mode (Arnaldo Carvalho de Melo)

- Add 'P' hotkey to the annotation TUI, to dump the current annotated
  symbol to a file, easing reporting through e-mail by getting rid of the
  spaces + right hand side scrollbar chars (Arnaldo Carvalho de Melo)

- Support --ignore-vmlinux to 'perf report' and 'perf annotate', that
  was already present in 'perf top', to use /proc/{kcore,kallsyms},
  allowing to see what is in fact running (patched stuff, alternatives,
  ftrace, etc), not the initial state of the kernel (vmlinux) (Arnaldo Carvalho de Melo)

- Support 'jump' instructions to a different function, treating them
  as 'call' instructions (Arnaldo Carvalho de Melo)

- Fix some jump artifacts when using vmlinux + ASM functions, where
  the ELF symtab for instance, for entry_SYSCALL_64 includes that and
  what comes after the 'syscall_return_via_sysret' label, but the
  objdump -dS prints the jump targets + offsets using the
  syscall_return_via_sysret address, which was confusing 'perf annotate'.
  See the cset comments for further info (Arnaldo Carvalho de Melo)

- Report error from dwfl_attach_state() in the unwind code (Martin Vuille)

- Reference Py_None before returning it in the python extension (Petr Machata)
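
A minimal sketch of the idea behind the two jump-related items above,
not the code in tools/perf/util/annotate.c: work with the absolute
target address, treat a jump as local only when the target falls inside
the function's own address range, and otherwise handle it like a call.
The names func_range, jump_is_local() and jump_offset() are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a function's address range, [start, end). */
struct func_range {
	uint64_t start;
	uint64_t end;
};

/*
 * A jump is local only if its absolute target lies inside the function.
 * Anything else would be shown like a call to another function.
 */
static bool jump_is_local(const struct func_range *fn, uint64_t target)
{
	return target >= fn->start && target < fn->end;
}

/*
 * Offset of the target relative to the function start, derived from the
 * absolute address rather than from a "label+offset" string, which may
 * be relative to a label in the middle of the function.
 */
static uint64_t jump_offset(const struct func_range *fn, uint64_t target)
{
	assert(target >= fn->start);
	return target - fn->start;
}
```

With a function covering 0x1000..0x1080, a jump to 0x1040 is local at
offset 0x40, while a jump to 0x2000 is not local and would get the call
treatment.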

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (42):
      perf annotate: Move annotation_options out of the TUI browser
      perf annotate: Move cycles/IPC formatting width constants outside TUI
      perf annotate tui: Use annotate_browser__cycles_width() mroe
      perf annotate tui: Move have_cycles to struct annotation
      perf annotate: Move annotation_line array from TUI to generic code
      perf annotate: Move compute_ipc() to annotation library
      perf annotate: Move nr_events from annotate_browser to annotation struct
      perf annotate: Stop using a global config struct
      perf annotate: Move pcnt_with() to the annotation library
      perf annotate tui: Add browser__annotation() helper
      perf annotate: Move max_jump_sources to struct annotation
      perf annotate: Move jumps_percent_color to ui_browser
      perf annotate: Move nr_jumps to struct annotation
      perf annotate: Move mark_jump_targets from the TUI to the annotation library
      perf annotate: Nuke struct browser_line
      perf annotate: Move 'start' to struct annotation
      perf annotate: Move nr_{asm_}entries to struct annotation
      perf annotate: Introduce set_offsets() method out of TUI code
      perf annotate: Move the column widths from the TUI to generic lib
      perf annotate: Move update_column_widths() to the generic lib
      perf annotate: Introduce init_column_widths() method out of TUI code
      perf annotate: Introduce symbol__annotate2 method
      perf annotate: Introduce annotation_line__max_percent()
      perf ui browser: Add vprintf() method
      perf annotate: Introduce annotation_line__print_start() out of TUI code
      perf annotate: Finish the generalization of annotate_browser__write()
      perf annotate: Use a ops table for annotation_line__write()
      perf annotate: Introduce annotation_line__filter()
      perf annotate: Introduce the --stdio2 output mode
      perf annotate: Move the default annotate options to the library
      perf annotate: Use the default annotation options for --stdio2
      perf annotate: Add function header to --stdio2
      perf annotate: Introduce --ignore-vmlinux command line option
      perf report: Introduce --ignore-vmlinux command line option
      perf annotate browser: Add 'P' hotkey to dump annotation to file
      perf annotate: No need to calculate notes->start twice
      perf annotate: Pass function descriptor to its instruction parsing routines
      perf annotate: Mark jumps to outher functions with the call arrow
      perf annotate: Add "_local" to jump/offset validation routines
      perf annotate: Support jumping from one function to another
      perf annotate: Defer searching for comma in raw line till it is needed
      perf annotate: Use absolute addresses to calculate jump target offsets

Martin Vuille (1):
      perf unwind: Report error from dwfl_attach_state

Petr Machata (1):
      perf python: Reference Py_None before returning it

 tools/perf/Documentation/perf-annotate.txt   |   5 +
 tools/perf/Documentation/perf-report.txt     |   3 +
 tools/perf/arch/s390/annotate/instructions.c |   5 +-
 tools/perf/builtin-annotate.c                |  27 +-
 tools/perf/builtin-report.c                  |   3 +
 tools/perf/builtin-top.c                     |   2 +
 tools/perf/ui/browser.c                      |   9 +-
 tools/perf/ui/browser.h                      |   3 +-
 tools/perf/ui/browsers/annotate.c            | 686 ++++++---------------------
 tools/perf/util/annotate.c                   | 679 +++++++++++++++++++++++++-
 tools/perf/util/annotate.h                   | 102 +++-
 tools/perf/util/python.c                     |   4 +-
 tools/perf/util/unwind-libdw.c               |   3 +-
 13 files changed, 944 insertions(+), 587 deletions(-)

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support.  Where clang is available, it is also used to build
perf with/without libelf.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event_open syscall to check that the perf_event_attr fields are set up
as expected, among a variety of other unit tests.

Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.

  # dm
   1  alpine:3.4                    : Ok   gcc (Alpine 5.3.0) 5.3.0
   2  alpine:3.5                    : Ok   gcc (Alpine 6.2.1) 6.2.1 20160822
   3  alpine:3.6                    : Ok   gcc (Alpine 6.3.0) 6.3.0
   4  alpine:3.7                    : Ok   gcc (Alpine 6.4.0) 6.4.0
   5  alpine:edge                   : Ok   gcc (Alpine 6.4.0) 6.4.0
   6  amazonlinux:1                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
   7  amazonlinux:2                 : Ok   gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2)
   8  android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
   9  android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
  10  centos:5                      : Ok   gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
  11  centos:6                      : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  12  centos:7                      : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
  13  debian:7                      : Ok   gcc (Debian 4.7.2-5) 4.7.2
  14  debian:8                      : Ok   gcc (Debian 4.9.2-10+deb8u1) 4.9.2
  15  debian:9                      : Ok   gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
  16  debian:experimental           : Ok   gcc (Debian 7.3.0-12) 7.3.0
  17  debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 7.3.0-12) 7.3.0
  18  debian:experimental-x-mips    : Ok   mips-linux-gnu-gcc (Debian 7.3.0-12) 7.3.0
  19  debian:experimental-x-mips64  : Ok   mips64-linux-gnuabi64-gcc (Debian 7.3.0-11) 7.3.0
  20  debian:experimental-x-mipsel  : Ok   mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
  21  fedora:20                     : Ok   gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
  22  fedora:21                     : Ok   gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
  23  fedora:22                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  24  fedora:23                     : Ok   gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
  25  fedora:24                     : Ok   gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
  26  fedora:24-x-ARC-uClibc        : Ok   arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
  27  fedora:25                     : Ok   gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
  28  fedora:26                     : Ok   gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
  29  fedora:27                     : Ok   gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
  30  fedora:rawhide                : Ok   gcc (GCC) 8.0.1 20180222 (Red Hat 8.0.1-0.16)
  31  gentoo-stage3-amd64:latest    : Ok   gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0
  32  mageia:5                      : Ok   gcc (GCC) 4.9.2
  33  mageia:6                      : Ok   gcc (Mageia 5.5.0-1.mga6) 5.5.0
  34  opensuse:42.1                 : Ok   gcc (SUSE Linux) 4.8.5
  35  opensuse:42.2                 : Ok   gcc (SUSE Linux) 4.8.5
  36  opensuse:42.3                 : Ok   gcc (SUSE Linux) 4.8.5
  37  opensuse:tumbleweed           : Ok   gcc (SUSE Linux) 7.3.0
  38  oraclelinux:6                 : Ok   gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
  39  oraclelinux:7                 : Ok   gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16.0.3)
  40  ubuntu:12.04.5                : Ok   gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  41  ubuntu:14.04.4                : Ok   gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
  42  ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.4-2017.05) 5.4.1 20170404
  43  ubuntu:15.04                  : Ok   gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
  44  ubuntu:16.04                  : Ok   gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  45  ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  46  ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  47  ubuntu:16.04-x-powerpc        : Ok   powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
  48  ubuntu:16.04-x-powerpc64      : Ok   powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  49  ubuntu:16.04-x-powerpc64el    : Ok   powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  50  ubuntu:16.04-x-s390           : Ok   s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
  51  ubuntu:16.10                  : Ok   gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
  52  ubuntu:17.04                  : Ok   gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
  53  ubuntu:17.10                  : Ok   gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
  54  ubuntu:18.04                  : Ok   gcc (Ubuntu 7.2.0-16ubuntu1) 7.2.0

  # uname -a
  Linux jouet 4.16.0-rc5-00086-gdf09348f78dc #1 SMP Fri Mar 16 09:46:40 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
  # perf test
   1: vmlinux symtab matches kallsyms                       : Ok
   2: Detect openat syscall event                           : Ok
   3: Detect openat syscall event on all cpus               : Ok
   4: Read samples using the mmap interface                 : Ok
   5: Test data source output                               : Ok
   6: Parse event definition strings                        : Ok
   7: Simple expression parser                              : Ok
   8: PERF_RECORD_* events & perf_sample fields             : Ok
   9: Parse perf pmu format                                 : Ok
  10: DSO data read                                         : Ok
  11: DSO data cache                                        : Ok
  12: DSO data reopen                                       : Ok
  13: Roundtrip evsel->name                                 : Ok
  14: Parse sched tracepoints fields                        : Ok
  15: syscalls:sys_enter_openat event fields                : Ok
  16: Setup struct perf_event_attr                          : Ok
  17: Match and link multiple hists                         : Ok
  18: 'import perf' in python                               : Ok
  19: Breakpoint overflow signal handler                    : Ok
  20: Breakpoint overflow sampling                          : Ok
  21: Breakpoint accounting                                 : Skip
  22: Number of exit events of a simple workload            : Ok
  23: Software clock events period values                   : Ok
  24: Object code reading                                   : Ok
  25: Sample parsing                                        : Ok
  26: Use a dummy software event to keep tracking           : Ok
  27: Parse with no sample_id_all bit set                   : Ok
  28: Filter hist entries                                   : Ok
  29: Lookup mmap thread                                    : Ok
  30: Share thread mg                                       : Ok
  31: Sort output of hist entries                           : Ok
  32: Cumulate child hist entries                           : Ok
  33: Track with sched_switch                               : Ok
  34: Filter fds with revents mask in a fdarray             : Ok
  35: Add fd to a fdarray, making it autogrow               : Ok
  36: kmod_path__parse                                      : Ok
  37: Thread map                                            : Ok
  38: LLVM search and compile                               :
  38.1: Basic BPF llvm compile                              : Ok
  38.2: kbuild searching                                    : Ok
  38.3: Compile source for BPF prologue generation          : Ok
  38.4: Compile source for BPF relocation                   : Ok
  39: Session topology                                      : Ok
  40: BPF filter                                            :
  40.1: Basic BPF filtering                                 : Ok
  40.2: BPF pinning                                         : Ok
  40.3: BPF prologue generation                             : Ok
  40.4: BPF relocation checker                              : Ok
  41: Synthesize thread map                                 : Ok
  42: Remove thread map                                     : Ok
  43: Synthesize cpu map                                    : Ok
  44: Synthesize stat config                                : Ok
  45: Synthesize stat                                       : Ok
  46: Synthesize stat round                                 : Ok
  47: Synthesize attr update                                : Ok
  48: Event times                                           : Ok
  49: Read backward ring buffer                             : Ok
  50: Print cpu map                                         : Ok
  51: Probe SDT events                                      : Ok
  52: is_printable_array                                    : Ok
  53: Print bitmap                                          : Ok
  54: perf hooks                                            : Ok
  55: builtin clang support                                 : Skip (not compiled in)
  56: unit_number__scnprintf                                : Ok
  57: mem2node                                              : Ok
  58: x86 rdpmc                                             : Ok
  59: Convert perf time to TSC                              : Ok
  60: DWARF unwind                                          : Ok
  61: x86 instruction decoder - new instructions            : Ok
  62: Use vfs_getname probe to get syscall args filenames   : Ok
  63: probe libc's inet_pton & backtrace it with ping       : Ok
  64: Check open filename arg using perf trace + vfs_getname: Ok
  65: probe libc's inet_pton & backtrace it with ping       : Ok
  66: Add vfs_getname probe to get syscall args filenames   : Ok
  #
  
  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
           make_no_libpython_O: make NO_LIBPYTHON=1
              make_no_libelf_O: make NO_LIBELF=1
                 make_perf_o_O: make perf.o
                    make_doc_O: make doc
           make_no_libunwind_O: make NO_LIBUNWIND=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
            make_install_bin_O: make install-bin
           make_no_libbionic_O: make NO_LIBBIONIC=1
                  make_debug_O: make DEBUG=1
                 make_static_O: make LDFLAGS=-static
               make_no_slang_O: make NO_SLANG=1
              make_no_libbpf_O: make NO_LIBBPF=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
                make_no_newt_O: make NO_NEWT=1
                   make_tags_O: make tags
             make_util_map_o_O: make util/map.o
            make_no_auxtrace_O: make NO_AUXTRACE=1
         make_with_clangllvm_O: make LIBCLANGLLVM=1
         make_install_prefix_O: make install prefix=/tmp/krava
             make_no_libnuma_O: make NO_LIBNUMA=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
                make_install_O: make install
                   make_pure_O: make
            make_no_libaudit_O: make NO_LIBAUDIT=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
            make_no_demangle_O: make NO_DEMANGLE=1
                   make_help_O: make help
       make_util_pmu_bison_o_O: make util/pmu-bison.o
                make_no_gtk2_O: make NO_GTK2=1
           make_no_backtrace_O: make NO_BACKTRACE=1
             make_no_libperl_O: make NO_LIBPERL=1
              make_clean_all_O: make clean all
  OK
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $


end of thread, other threads:[~2018-08-09 15:27 UTC | newest]

Thread overview: 26+ messages
2017-09-22 14:41 [GIT PULL 00/44] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 01/44] perf sched timehist: Add pid and tid options Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 02/44] perf tools: Support weak groups in 'perf stat' Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 03/44] perf vendor events: Support metric_group and no event name in JSON parser Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 04/44] perf stat: Factor out generic metric printing Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 05/44] perf stat: Print generic metric header even for failed expressions Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 06/44] perf pmu: Extract function to get JSON alias map Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 07/44] perf stat: Support JSON metrics in perf stat Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 08/44] perf list: Add metric groups to perf list Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 09/44] perf stat: Don't use ctx for saved values lookup Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 10/44] perf stat: Support duration_time for metrics Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 11/44] perf stat: Hide internal duration_time counter Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 12/44] perf stat: Update walltime_nsecs_stats in interval mode Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 13/44] perf record: Support direct --user-regs arguments Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 14/44] perf script: Support user regs Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 15/44] perf tools: Add python-clean target Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 16/44] perf ui progress: Add ui specific init function Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 17/44] perf ui progress: Add size info into progress bar Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 18/44] perf tools: Use scandir() to replace readdir() Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 19/44] perf config: Write a config file just once Arnaldo Carvalho de Melo
2017-09-22 14:41 ` [PATCH 20/44] perf config: Allow creating empty config set for config file autogeneration Arnaldo Carvalho de Melo
2017-09-22 16:26 ` [GIT PULL 00/44] perf/core improvements and fixes Ingo Molnar
2018-03-24 20:01 Arnaldo Carvalho de Melo
2018-03-25  8:40 ` Ingo Molnar
2018-08-09 14:57 Arnaldo Carvalho de Melo
2018-08-09 15:27 ` Kim Phillips
