All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL 00/25] perf/core improvements and fixes
@ 2017-06-21 18:02 Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 01/25] perf evsel: Adopt find_process() Arnaldo Carvalho de Melo
                   ` (25 more replies)
  0 siblings, 26 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Andi Kleen, David Ahern, Jiri Olsa, Kan Liang, linuxppc-dev,
	Milian Wolff, Namhyung Kim, Naveen N . Rao, Paolo Bonzini,
	Peter Zijlstra, Ravi Bangoria, Robert Elliott, Stephane Eranian,
	Thomas Gleixner, Wang Nan, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 007b811b4041989ec2dc91b9614aa2c41332723e:

  Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-20 10:49:08 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170621

for you to fetch changes up to 701516ae3dec801084bc913d21e03fce15c61a0b:

  perf script: Fix message because field list option is -F not -f (2017-06-21 11:35:53 -0300)

----------------------------------------------------------------
perf/core improvements ad fixes:

New features:

- Add support to measure SMI cost in 'perf stat' (Kan Liang)

- Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)

Fixes:

- Fix message: cpu list option is -C not -c (Adrian Hunter)

- Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)

- Intel PT fixes: (Adrian Hunter)

  o Fix missing stack clear
  o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
  o Fix last_ip usage
  o Ensure never to set 'last_ip' when packet 'count' is zero
  o Clear FUP flag on error
  o Fix transactions_sample_type

Infrastructure:

- Intel PT cleanups/refactorings (Adrian Hunter)

  o Use FUP always when scanning for an IP
  o Add missing __fallthrough
  o Remove redundant initial_skip checks
  o Allow decoding with branch tracing disabled
  o Add default config for pass-through branch enable
  o Add documentation for new config terms
  o Add decoder support for ptwrite and power event packets
  o Add reserved byte to CBR packet payload
  o Add decoder support for CBR events

- Move  find_process() to the only place that uses it, skimming some
  more fat from util.[ch] (Arnaldo Carvalho de Melo)

- Do parameter validation earlier on fetch_kernel_version() (Arnaldo Carvalho de Melo)

- Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)

- Add sysfs__write_int function (Kan Liang)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (19):
      perf intel-pt: Move decoder error setting into one condition
      perf intel-pt: Improve sample timestamp
      perf intel-pt: Fix missing stack clear
      perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
      perf intel-pt: Fix last_ip usage
      perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
      perf intel-pt: Use FUP always when scanning for an IP
      perf intel-pt: Clear FUP flag on error
      perf intel-pt: Add missing __fallthrough
      perf intel-pt: Allow decoding with branch tracing disabled
      perf intel-pt: Add default config for pass-through branch enable
      perf intel-pt: Add documentation for new config terms
      perf intel-pt: Add decoder support for ptwrite and power event packets
      perf intel-pt: Add reserved byte to CBR packet payload
      perf intel-pt: Add decoder support for CBR events
      perf intel-pt: Remove redundant initial_skip checks
      perf intel-pt: Fix transactions_sample_type
      perf tools: Fix message because cpu list option is -C not -c
      perf script: Fix message because field list option is -F not -f

Arnaldo Carvalho de Melo (3):
      perf evsel: Adopt find_process()
      perf tools: Do parameter validation earlier on fetch_kernel_version()
      perf tools: Remove unused _ALL_SOURCE define

Kan Liang (2):
      tools lib api fs: Add sysfs__write_int function
      perf stat: Add support to measure SMI cost

Paolo Bonzini (1):
      perf unwind: Support for powerpc

 tools/lib/api/fs/fs.c                              |  30 +++
 tools/lib/api/fs/fs.h                              |   4 +
 tools/perf/Documentation/intel-pt.txt              |  36 +++
 tools/perf/Documentation/perf-stat.txt             |  14 +
 tools/perf/Makefile.config                         |   2 +-
 tools/perf/arch/powerpc/util/Build                 |   2 +
 tools/perf/arch/powerpc/util/unwind-libdw.c        |  73 ++++++
 tools/perf/arch/x86/util/intel-pt.c                |   5 +
 tools/perf/builtin-script.c                        |   2 +-
 tools/perf/builtin-stat.c                          |  49 ++++
 tools/perf/util/evsel.c                            |  39 +++
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 290 +++++++++++++++++++--
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  13 +
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 110 +++++++-
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |   7 +
 tools/perf/util/intel-pt.c                         |  23 +-
 tools/perf/util/session.c                          |   2 +-
 tools/perf/util/stat-shadow.c                      |  33 +++
 tools/perf/util/stat.c                             |   2 +
 tools/perf/util/stat.h                             |   2 +
 tools/perf/util/util.c                             |  52 +---
 tools/perf/util/util.h                             |   3 -
 22 files changed, 710 insertions(+), 83 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/unwind-libdw.c

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.

Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

  # dm
   1 alpine:3.4: Ok
   2 alpine:3.5: Ok
   3 alpine:3.6: Ok
   4 alpine:edge: Ok
   5 android-ndk:r12b-arm: Ok
   6 archlinux:latest: Ok
   7 centos:5: Ok
   8 centos:6: Ok
   9 centos:7: Ok
  10 debian:7: Ok
  11 debian:8: Ok
  12 debian:9: Ok
  13 debian:experimental: Ok
  14 debian:experimental-x-arm64: Ok
  15 debian:experimental-x-mips: Ok
  16 debian:experimental-x-mips64: Ok
  17 debian:experimental-x-mipsel: Ok
  18 fedora:20: Ok
  19 fedora:21: Ok
  20 fedora:22: Ok
  21 fedora:23: Ok
  22 fedora:24: Ok
  23 fedora:24-x-ARC-uClibc: Ok
  24 fedora:25: Ok
  25 fedora:rawhide: Ok
  26 mageia:5: Ok
  27 opensuse:13.2: Ok
  28 opensuse:42.1: Ok
  29 opensuse:tumbleweed: Ok
  30 ubuntu:12.04.5: Ok
  31 ubuntu:14.04.4: Ok
  32 ubuntu:14.04.4-x-linaro-arm64: Ok
  33 ubuntu:15.10: Ok
  34 ubuntu:16.04: Ok
  35 ubuntu:16.04-x-arm: Ok
  36 ubuntu:16.04-x-arm64: Ok
  37 ubuntu:16.04-x-powerpc: Ok
  38 ubuntu:16.04-x-powerpc64: Ok
  39 ubuntu:16.04-x-powerpc64el: Ok
  40 ubuntu:16.04-x-s390: Ok
  41 ubuntu:16.10: Ok
  42 ubuntu:17.04: Ok
  # 

  # uname -a
  Linux jouet 4.12.0-rc4+ #1 SMP Fri Jun 9 12:59:23 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
  # perf test
   1: vmlinux symtab matches kallsyms            : Ok
   2: Detect openat syscall event                : Ok
   3: Detect openat syscall event on all cpus    : Ok
   4: Read samples using the mmap interface      : Ok
   5: Parse event definition strings             : Ok
   6: Simple expression parser                   : Ok
   7: PERF_RECORD_* events & perf_sample fields  : Ok
   8: Parse perf pmu format                      : Ok
   9: DSO data read                              : Ok
  10: DSO data cache                             : Ok
  11: DSO data reopen                            : Ok
  12: Roundtrip evsel->name                      : Ok
  13: Parse sched tracepoints fields             : Ok
  14: syscalls:sys_enter_openat event fields     : Ok
  15: Setup struct perf_event_attr               : Ok
  16: Match and link multiple hists              : Ok
  17: 'import perf' in python                    : Ok
  18: Breakpoint overflow signal handler         : Ok
  19: Breakpoint overflow sampling               : Ok
  20: Number of exit events of a simple workload : Ok
  21: Software clock events period values        : Ok
  22: Object code reading                        : Ok
  23: Sample parsing                             : Ok
  24: Use a dummy software event to keep tracking: Ok
  25: Parse with no sample_id_all bit set        : Ok
  26: Filter hist entries                        : Ok
  27: Lookup mmap thread                         : Ok
  28: Share thread mg                            : Ok
  29: Sort output of hist entries                : Ok
  30: Cumulate child hist entries                : Ok
  31: Track with sched_switch                    : Ok
  32: Filter fds with revents mask in a fdarray  : Ok
  33: Add fd to a fdarray, making it autogrow    : Ok
  34: kmod_path__parse                           : Ok
  35: Thread map                                 : Ok
  36: LLVM search and compile                    :
  36.1: Basic BPF llvm compile                    : Ok
  36.2: kbuild searching                          : Ok
  36.3: Compile source for BPF prologue generation: Ok
  36.4: Compile source for BPF relocation         : Ok
  37: Session topology                           : Ok
  38: BPF filter                                 :
  38.1: Basic BPF filtering                      : Ok
  38.2: BPF pinning                              : Ok
  38.3: BPF prologue generation                  : Ok
  38.4: BPF relocation checker                   : Ok
  39: Synthesize thread map                      : Ok
  40: Remove thread map                          : Ok
  41: Synthesize cpu map                         : Ok
  42: Synthesize stat config                     : Ok
  43: Synthesize stat                            : Ok
  44: Synthesize stat round                      : Ok
  45: Synthesize attr update                     : Ok
  46: Event times                                : Ok
  47: Read backward ring buffer                  : Ok
  48: Print cpu map                              : Ok
  49: Probe SDT events                           : Ok
  50: is_printable_array                         : Ok
  51: Print bitmap                               : Ok
  52: perf hooks                                 : Ok
  53: builtin clang support                      : Skip (not compiled in)
  54: unit_number__scnprintf                     : Ok
  55: x86 rdpmc                                  : Ok
  56: Convert perf time to TSC                   : Ok
  57: DWARF unwind                               : Ok
  58: x86 instruction decoder - new instructions : Ok
  59: Intel cqm nmi context read                 : Skip
  #

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
               make_no_slang_O: make NO_SLANG=1
              make_clean_all_O: make clean all
                make_no_gtk2_O: make NO_GTK2=1
             make_no_libperl_O: make NO_LIBPERL=1
            make_no_auxtrace_O: make NO_AUXTRACE=1
                    make_doc_O: make doc
           make_no_libbionic_O: make NO_LIBBIONIC=1
           make_no_backtrace_O: make NO_BACKTRACE=1
              make_no_libelf_O: make NO_LIBELF=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
             make_no_libnuma_O: make NO_LIBNUMA=1
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
       make_util_pmu_bison_o_O: make util/pmu-bison.o
            make_no_demangle_O: make NO_DEMANGLE=1
             make_util_map_o_O: make util/map.o
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
         make_with_clangllvm_O: make LIBCLANGLLVM=1
           make_no_libunwind_O: make NO_LIBUNWIND=1
           make_no_libpython_O: make NO_LIBPYTHON=1
                 make_perf_o_O: make perf.o
                   make_tags_O: make tags
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
            make_install_bin_O: make install-bin
                  make_debug_O: make DEBUG=1
              make_no_libbpf_O: make NO_LIBBPF=1
        make_with_babeltrace_O: make LIBBABELTRACE=1
                   make_pure_O: make
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
         make_install_prefix_O: make install prefix=/tmp/krava
                make_install_O: make install
                   make_help_O: make help
                 make_static_O: make LDFLAGS=-static
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
                make_no_newt_O: make NO_NEWT=1
  OK
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $ 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 01/25] perf evsel: Adopt find_process()
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 02/25] perf tools: Do parameter validation earlier on fetch_kernel_version() Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

And make it static, nobody else uses it, if we ever need it in more
places we can carve a new source file for process related methods,
for now lets reduce util.{c,h} a tad more.

Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 39 +++++++++++++++++++++++++++++++++++++++
 tools/perf/util/util.c  | 37 -------------------------------------
 tools/perf/util/util.h  |  2 --
 3 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7f78f27f5382..6f4882f8d61f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -11,6 +11,7 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <linux/bitops.h>
+#include <api/fs/fs.h>
 #include <api/fs/tracing_path.h>
 #include <traceevent/event-parse.h>
 #include <linux/hw_breakpoint.h>
@@ -19,6 +20,8 @@
 #include <linux/err.h>
 #include <sys/ioctl.h>
 #include <sys/resource.h>
+#include <sys/types.h>
+#include <dirent.h>
 #include "asm/bug.h"
 #include "callchain.h"
 #include "cgroup.h"
@@ -2472,6 +2475,42 @@ bool perf_evsel__fallback(struct perf_evsel *evsel, int err,
 	return false;
 }
 
+static bool find_process(const char *name)
+{
+	size_t len = strlen(name);
+	DIR *dir;
+	struct dirent *d;
+	int ret = -1;
+
+	dir = opendir(procfs__mountpoint());
+	if (!dir)
+		return false;
+
+	/* Walk through the directory. */
+	while (ret && (d = readdir(dir)) != NULL) {
+		char path[PATH_MAX];
+		char *data;
+		size_t size;
+
+		if ((d->d_type != DT_DIR) ||
+		     !strcmp(".", d->d_name) ||
+		     !strcmp("..", d->d_name))
+			continue;
+
+		scnprintf(path, sizeof(path), "%s/%s/comm",
+			  procfs__mountpoint(), d->d_name);
+
+		if (filename__read_str(path, &data, &size))
+			continue;
+
+		ret = strncmp(name, data, len);
+		free(data);
+	}
+
+	closedir(dir);
+	return ret ? false : true;
+}
+
 int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 			      int err, char *msg, size_t size)
 {
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 28c9f335006c..3cd42995ac6f 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -343,43 +343,6 @@ int perf_event_paranoid(void)
 
 	return value;
 }
-
-bool find_process(const char *name)
-{
-	size_t len = strlen(name);
-	DIR *dir;
-	struct dirent *d;
-	int ret = -1;
-
-	dir = opendir(procfs__mountpoint());
-	if (!dir)
-		return false;
-
-	/* Walk through the directory. */
-	while (ret && (d = readdir(dir)) != NULL) {
-		char path[PATH_MAX];
-		char *data;
-		size_t size;
-
-		if ((d->d_type != DT_DIR) ||
-		     !strcmp(".", d->d_name) ||
-		     !strcmp("..", d->d_name))
-			continue;
-
-		scnprintf(path, sizeof(path), "%s/%s/comm",
-			  procfs__mountpoint(), d->d_name);
-
-		if (filename__read_str(path, &data, &size))
-			continue;
-
-		ret = strncmp(name, data, len);
-		free(data);
-	}
-
-	closedir(dir);
-	return ret ? false : true;
-}
-
 static int
 fetch_ubuntu_kernel_version(unsigned int *puint)
 {
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 21c6db173bcc..9ae7e6e35f9a 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -49,8 +49,6 @@ int hex2u64(const char *ptr, u64 *val);
 extern unsigned int page_size;
 extern int cacheline_size;
 
-bool find_process(const char *name);
-
 int fetch_kernel_version(unsigned int *puint,
 			 char *str, size_t str_sz);
 #define KVER_VERSION(x)		(((x) >> 16) & 0xff)
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 02/25] perf tools: Do parameter validation earlier on fetch_kernel_version()
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 01/25] perf evsel: Adopt find_process() Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 03/25] perf tools: Remove unused _ALL_SOURCE define Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

While trying to reduce util.[ch] I noticed that fetch_kernel_version()
and fetch_ubuntu_kernel_version() do lots of operations only to check if
they are needed, i.e. it checks if the pointer where to return the
kernel version is NULL only after obtaining the kernel version from
/proc/version_signature or by parsing the results from uname().

Do it earlier not to confuse people reading this code in the future :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/util.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 3cd42995ac6f..988111e0bab5 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -350,8 +350,12 @@ fetch_ubuntu_kernel_version(unsigned int *puint)
 	size_t line_len = 0;
 	char *ptr, *line = NULL;
 	int version, patchlevel, sublevel, err;
-	FILE *vsig = fopen("/proc/version_signature", "r");
+	FILE *vsig;
 
+	if (!puint)
+		return 0;
+
+	vsig = fopen("/proc/version_signature", "r");
 	if (!vsig) {
 		pr_debug("Open /proc/version_signature failed: %s\n",
 			 strerror(errno));
@@ -381,8 +385,7 @@ fetch_ubuntu_kernel_version(unsigned int *puint)
 		goto errout;
 	}
 
-	if (puint)
-		*puint = (version << 16) + (patchlevel << 8) + sublevel;
+	*puint = (version << 16) + (patchlevel << 8) + sublevel;
 	err = 0;
 errout:
 	free(line);
@@ -409,6 +412,9 @@ fetch_kernel_version(unsigned int *puint, char *str,
 		str[str_size - 1] = '\0';
 	}
 
+	if (!puint || int_ver_ready)
+		return 0;
+
 	err = sscanf(utsname.release, "%d.%d.%d",
 		     &version, &patchlevel, &sublevel);
 
@@ -418,8 +424,7 @@ fetch_kernel_version(unsigned int *puint, char *str,
 		return -1;
 	}
 
-	if (puint && !int_ver_ready)
-		*puint = (version << 16) + (patchlevel << 8) + sublevel;
+	*puint = (version << 16) + (patchlevel << 8) + sublevel;
 	return 0;
 }
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 03/25] perf tools: Remove unused _ALL_SOURCE define
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 01/25] perf evsel: Adopt find_process() Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 02/25] perf tools: Do parameter validation earlier on fetch_kernel_version() Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 04/25] tools lib api fs: Add sysfs__write_int function Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Jiri Olsa, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Curious as to what this was for I looked at /usr/include/ and only some
python headers define this, and it ends up being to enable "extensions"
on some old OSes:

  /* Enable extensions on AIX 3, Interix */

I guess we can remove this one safely.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/util.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 9ae7e6e35f9a..978572dfeb14 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -1,7 +1,6 @@
 #ifndef GIT_COMPAT_UTIL_H
 #define GIT_COMPAT_UTIL_H
 
-#define _ALL_SOURCE 1
 #define _BSD_SOURCE 1
 /* glibc 2.20 deprecates _BSD_SOURCE in favour of _DEFAULT_SOURCE */
 #define _DEFAULT_SOURCE 1
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 04/25] tools lib api fs: Add sysfs__write_int function
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 03/25] perf tools: Remove unused _ALL_SOURCE define Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 05/25] perf stat: Add support to measure SMI cost Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Kan Liang, Andi Kleen, Kan Liang, Peter Zijlstra,
	Robert Elliott, Stephane Eranian, Thomas Gleixner,
	Arnaldo Carvalho de Melo

From: Kan Liang <Kan.liang@intel.com>

Add sysfs__write_int() to ease up writing int to sysfs.  New interface
is:

  int sysfs__write_int(const char *entry, int value);

Also, introducing filename__write_int() which is useful for new helpers
to write sysctl values.

Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/fs/fs.c | 30 ++++++++++++++++++++++++++++++
 tools/lib/api/fs/fs.h |  4 ++++
 2 files changed, 34 insertions(+)

diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 809c7721cd24..a7ecf8f469f4 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -387,6 +387,22 @@ int filename__read_str(const char *filename, char **buf, size_t *sizep)
 	return err;
 }
 
+int filename__write_int(const char *filename, int value)
+{
+	int fd = open(filename, O_WRONLY), err = -1;
+	char buf[64];
+
+	if (fd < 0)
+		return err;
+
+	sprintf(buf, "%d", value);
+	if (write(fd, buf, sizeof(buf)) == sizeof(buf))
+		err = 0;
+
+	close(fd);
+	return err;
+}
+
 int procfs__read_str(const char *entry, char **buf, size_t *sizep)
 {
 	char path[PATH_MAX];
@@ -480,3 +496,17 @@ int sysctl__read_int(const char *sysctl, int *value)
 
 	return filename__read_int(path, value);
 }
+
+int sysfs__write_int(const char *entry, int value)
+{
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+
+	if (!sysfs)
+		return -1;
+
+	if (snprintf(path, sizeof(path), "%s/%s", sysfs, entry) >= PATH_MAX)
+		return -1;
+
+	return filename__write_int(path, value);
+}
diff --git a/tools/lib/api/fs/fs.h b/tools/lib/api/fs/fs.h
index 956c21127d1e..45605348461e 100644
--- a/tools/lib/api/fs/fs.h
+++ b/tools/lib/api/fs/fs.h
@@ -31,6 +31,8 @@ int filename__read_int(const char *filename, int *value);
 int filename__read_ull(const char *filename, unsigned long long *value);
 int filename__read_str(const char *filename, char **buf, size_t *sizep);
 
+int filename__write_int(const char *filename, int value);
+
 int procfs__read_str(const char *entry, char **buf, size_t *sizep);
 
 int sysctl__read_int(const char *sysctl, int *value);
@@ -38,4 +40,6 @@ int sysfs__read_int(const char *entry, int *value);
 int sysfs__read_ull(const char *entry, unsigned long long *value);
 int sysfs__read_str(const char *entry, char **buf, size_t *sizep);
 int sysfs__read_bool(const char *entry, bool *value);
+
+int sysfs__write_int(const char *entry, int value);
 #endif /* __API_FS__ */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 05/25] perf stat: Add support to measure SMI cost
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 04/25] tools lib api fs: Add sysfs__write_int function Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 06/25] perf unwind: Support for powerpc Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Kan Liang, Andi Kleen, Kan Liang, Peter Zijlstra,
	Robert Elliott, Stephane Eranian, Thomas Gleixner,
	Arnaldo Carvalho de Melo

From: Kan Liang <Kan.liang@intel.com>

Implementing a new --smi-cost mode in perf stat to measure SMI cost.

During the measurement, the /sys/device/cpu/freeze_on_smi will be set.

The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).

In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.

SMI cycles% = (aperf - unhalted core cycles) / aperf

Here is an example output.

 Performance counter stats for 'sudo echo ':

SMI cycles%          SMI#
    0.1%              1

       0.010858678 seconds time elapsed

Users who wants to get the actual value can apply additional
--no-metric-only.

Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-stat.txt | 14 ++++++++++
 tools/perf/builtin-stat.c              | 49 ++++++++++++++++++++++++++++++++++
 tools/perf/util/stat-shadow.c          | 33 +++++++++++++++++++++++
 tools/perf/util/stat.c                 |  2 ++
 tools/perf/util/stat.h                 |  2 ++
 5 files changed, 100 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index bd0e4417f2be..698076313606 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -239,6 +239,20 @@ taskset.
 --no-merge::
 Do not merge results from same PMUs.
 
+--smi-cost::
+Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.
+
+During the measurement, the /sys/device/cpu/freeze_on_smi will be set to
+freeze core counters on SMI.
+The aperf counter will not be effected by the setting.
+The cost of SMI can be measured by (aperf - unhalted core cycles).
+
+In practice, the percentages of SMI cycles is very useful for performance
+oriented analysis. --metric_only will be applied by default.
+The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf
+
+Users who wants to get the actual value can apply --no-metric-only.
+
 EXAMPLES
 --------
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index ad9324d1daf9..324363054c3f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -86,6 +86,7 @@
 #define DEFAULT_SEPARATOR	" "
 #define CNTR_NOT_SUPPORTED	"<not supported>"
 #define CNTR_NOT_COUNTED	"<not counted>"
+#define FREEZE_ON_SMI_PATH	"devices/cpu/freeze_on_smi"
 
 static void print_counters(struct timespec *ts, int argc, const char **argv);
 
@@ -122,6 +123,14 @@ static const char * topdown_attrs[] = {
 	NULL,
 };
 
+static const char *smi_cost_attrs = {
+	"{"
+	"msr/aperf/,"
+	"msr/smi/,"
+	"cycles"
+	"}"
+};
+
 static struct perf_evlist	*evsel_list;
 
 static struct target target = {
@@ -137,6 +146,8 @@ static bool			null_run			=  false;
 static int			detailed_run			=  0;
 static bool			transaction_run;
 static bool			topdown_run			= false;
+static bool			smi_cost			= false;
+static bool			smi_reset			= false;
 static bool			big_num				=  true;
 static int			big_num_opt			=  -1;
 static const char		*csv_sep			= NULL;
@@ -1782,6 +1793,8 @@ static const struct option stat_options[] = {
 			"Only print computed metrics. No raw values", enable_metric_only),
 	OPT_BOOLEAN(0, "topdown", &topdown_run,
 			"measure topdown level 1 statistics"),
+	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
+			"measure SMI cost"),
 	OPT_END()
 };
 
@@ -2160,6 +2173,39 @@ static int add_default_attributes(void)
 		return 0;
 	}
 
+	if (smi_cost) {
+		int smi;
+
+		if (sysfs__read_int(FREEZE_ON_SMI_PATH, &smi) < 0) {
+			fprintf(stderr, "freeze_on_smi is not supported.\n");
+			return -1;
+		}
+
+		if (!smi) {
+			if (sysfs__write_int(FREEZE_ON_SMI_PATH, 1) < 0) {
+				fprintf(stderr, "Failed to set freeze_on_smi.\n");
+				return -1;
+			}
+			smi_reset = true;
+		}
+
+		if (pmu_have_event("msr", "aperf") &&
+		    pmu_have_event("msr", "smi")) {
+			if (!force_metric_only)
+				metric_only = true;
+			err = parse_events(evsel_list, smi_cost_attrs, NULL);
+		} else {
+			fprintf(stderr, "To measure SMI cost, it needs "
+				"msr/aperf/, msr/smi/ and cpu/cycles/ support\n");
+			return -1;
+		}
+		if (err) {
+			fprintf(stderr, "Cannot set up SMI cost events\n");
+			return -1;
+		}
+		return 0;
+	}
+
 	if (topdown_run) {
 		char *str = NULL;
 		bool warn = false;
@@ -2742,6 +2788,9 @@ int cmd_stat(int argc, const char **argv)
 	perf_stat__exit_aggr_mode();
 	perf_evlist__free_stats(evsel_list);
 out:
+	if (smi_cost && smi_reset)
+		sysfs__write_int(FREEZE_ON_SMI_PATH, 0);
+
 	perf_evlist__delete(evsel_list);
 	return status;
 }
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index ac10cc675d39..719d6cb86952 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -44,6 +44,8 @@ static struct stats runtime_topdown_slots_issued[NUM_CTX][MAX_NR_CPUS];
 static struct stats runtime_topdown_slots_retired[NUM_CTX][MAX_NR_CPUS];
 static struct stats runtime_topdown_fetch_bubbles[NUM_CTX][MAX_NR_CPUS];
 static struct stats runtime_topdown_recovery_bubbles[NUM_CTX][MAX_NR_CPUS];
+static struct stats runtime_smi_num_stats[NUM_CTX][MAX_NR_CPUS];
+static struct stats runtime_aperf_stats[NUM_CTX][MAX_NR_CPUS];
 static struct rblist runtime_saved_values;
 static bool have_frontend_stalled;
 
@@ -157,6 +159,8 @@ void perf_stat__reset_shadow_stats(void)
 	memset(runtime_topdown_slots_issued, 0, sizeof(runtime_topdown_slots_issued));
 	memset(runtime_topdown_fetch_bubbles, 0, sizeof(runtime_topdown_fetch_bubbles));
 	memset(runtime_topdown_recovery_bubbles, 0, sizeof(runtime_topdown_recovery_bubbles));
+	memset(runtime_smi_num_stats, 0, sizeof(runtime_smi_num_stats));
+	memset(runtime_aperf_stats, 0, sizeof(runtime_aperf_stats));
 
 	next = rb_first(&runtime_saved_values.entries);
 	while (next) {
@@ -217,6 +221,10 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
 		update_stats(&runtime_dtlb_cache_stats[ctx][cpu], count[0]);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
 		update_stats(&runtime_itlb_cache_stats[ctx][cpu], count[0]);
+	else if (perf_stat_evsel__is(counter, SMI_NUM))
+		update_stats(&runtime_smi_num_stats[ctx][cpu], count[0]);
+	else if (perf_stat_evsel__is(counter, APERF))
+		update_stats(&runtime_aperf_stats[ctx][cpu], count[0]);
 
 	if (counter->collect_stat) {
 		struct saved_value *v = saved_value_lookup(counter, cpu, ctx,
@@ -592,6 +600,29 @@ static double td_be_bound(int ctx, int cpu)
 	return sanitize_val(1.0 - sum);
 }
 
+static void print_smi_cost(int cpu, struct perf_evsel *evsel,
+			   struct perf_stat_output_ctx *out)
+{
+	double smi_num, aperf, cycles, cost = 0.0;
+	int ctx = evsel_context(evsel);
+	const char *color = NULL;
+
+	smi_num = avg_stats(&runtime_smi_num_stats[ctx][cpu]);
+	aperf = avg_stats(&runtime_aperf_stats[ctx][cpu]);
+	cycles = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+
+	if ((cycles == 0) || (aperf == 0))
+		return;
+
+	if (smi_num)
+		cost = (aperf - cycles) / aperf * 100.00;
+
+	if (cost > 10)
+		color = PERF_COLOR_RED;
+	out->print_metric(out->ctx, color, "%8.1f%%", "SMI cycles%", cost);
+	out->print_metric(out->ctx, NULL, "%4.0f", "SMI#", smi_num);
+}
+
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out)
@@ -825,6 +856,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		}
 		snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
 		print_metric(ctxp, NULL, "%8.3f", unit_buf, ratio);
+	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
+		print_smi_cost(cpu, evsel, out);
 	} else {
 		print_metric(ctxp, NULL, NULL, NULL, 0);
 	}
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index c58174443dc1..53b9a994a3dc 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -86,6 +86,8 @@ static const char *id_str[PERF_STAT_EVSEL_ID__MAX] = {
 	ID(TOPDOWN_SLOTS_RETIRED, topdown-slots-retired),
 	ID(TOPDOWN_FETCH_BUBBLES, topdown-fetch-bubbles),
 	ID(TOPDOWN_RECOVERY_BUBBLES, topdown-recovery-bubbles),
+	ID(SMI_NUM, msr/smi/),
+	ID(APERF, msr/aperf/),
 };
 #undef ID
 
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 0a65ae23f495..7522bf10b03e 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -22,6 +22,8 @@ enum perf_stat_evsel_id {
 	PERF_STAT_EVSEL_ID__TOPDOWN_SLOTS_RETIRED,
 	PERF_STAT_EVSEL_ID__TOPDOWN_FETCH_BUBBLES,
 	PERF_STAT_EVSEL_ID__TOPDOWN_RECOVERY_BUBBLES,
+	PERF_STAT_EVSEL_ID__SMI_NUM,
+	PERF_STAT_EVSEL_ID__APERF,
 	PERF_STAT_EVSEL_ID__MAX,
 };
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 06/25] perf unwind: Support for powerpc
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 05/25] perf stat: Add support to measure SMI cost Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 07/25] perf intel-pt: Move decoder error setting into one condition Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Paolo Bonzini, Naveen N . Rao, linuxppc-dev,
	Arnaldo Carvalho de Melo

From: Paolo Bonzini <pbonzini@redhat.com>

Porting PPC to libdw only needs an architecture-specific hook to move
the register state from perf to libdw.

The ARM and x86 architectures already use libdw, and it is useful to
have as much common code for the unwinder as possible.  Mark Wielaard
has contributed a frame-based unwinder to libdw, so that unwinding works
even for binaries that do not have CFI information.  In addition,
libunwind is always preferred to libdw by the build machinery so this
cannot introduce regressions on machines that have both libunwind and
libdw installed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1496312681-20133-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config                  |  2 +-
 tools/perf/arch/powerpc/util/Build          |  2 +
 tools/perf/arch/powerpc/util/unwind-libdw.c | 73 +++++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/powerpc/util/unwind-libdw.c

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 1f4fbc9a3292..bdf0e87f9b29 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -61,7 +61,7 @@ endif
 # Disable it on all other architectures in case libdw unwind
 # support is detected in system. Add supported architectures
 # to the check.
-ifneq ($(SRCARCH),$(filter $(SRCARCH),x86 arm))
+ifneq ($(SRCARCH),$(filter $(SRCARCH),x86 arm powerpc))
   NO_LIBDW_DWARF_UNWIND := 1
 endif
 
diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index 90ad64b231cd..2e6595310420 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -5,4 +5,6 @@ libperf-y += perf_regs.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
+
 libperf-$(CONFIG_LIBUNWIND) += unwind-libunwind.o
+libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/powerpc/util/unwind-libdw.c b/tools/perf/arch/powerpc/util/unwind-libdw.c
new file mode 100644
index 000000000000..3a24b3c43273
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/unwind-libdw.c
@@ -0,0 +1,73 @@
+#include <elfutils/libdwfl.h>
+#include "../../util/unwind-libdw.h"
+#include "../../util/perf_regs.h"
+#include "../../util/event.h"
+
+/* See backends/ppc_initreg.c and backends/ppc_regs.c in elfutils.  */
+static const int special_regs[3][2] = {
+	{ 65, PERF_REG_POWERPC_LINK },
+	{ 101, PERF_REG_POWERPC_XER },
+	{ 109, PERF_REG_POWERPC_CTR },
+};
+
+bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
+{
+	struct unwind_info *ui = arg;
+	struct regs_dump *user_regs = &ui->sample->user_regs;
+	Dwarf_Word dwarf_regs[32], dwarf_nip;
+	size_t i;
+
+#define REG(r) ({						\
+	Dwarf_Word val = 0;					\
+	perf_reg_value(&val, user_regs, PERF_REG_POWERPC_##r);	\
+	val;							\
+})
+
+	dwarf_regs[0]  = REG(R0);
+	dwarf_regs[1]  = REG(R1);
+	dwarf_regs[2]  = REG(R2);
+	dwarf_regs[3]  = REG(R3);
+	dwarf_regs[4]  = REG(R4);
+	dwarf_regs[5]  = REG(R5);
+	dwarf_regs[6]  = REG(R6);
+	dwarf_regs[7]  = REG(R7);
+	dwarf_regs[8]  = REG(R8);
+	dwarf_regs[9]  = REG(R9);
+	dwarf_regs[10] = REG(R10);
+	dwarf_regs[11] = REG(R11);
+	dwarf_regs[12] = REG(R12);
+	dwarf_regs[13] = REG(R13);
+	dwarf_regs[14] = REG(R14);
+	dwarf_regs[15] = REG(R15);
+	dwarf_regs[16] = REG(R16);
+	dwarf_regs[17] = REG(R17);
+	dwarf_regs[18] = REG(R18);
+	dwarf_regs[19] = REG(R19);
+	dwarf_regs[20] = REG(R20);
+	dwarf_regs[21] = REG(R21);
+	dwarf_regs[22] = REG(R22);
+	dwarf_regs[23] = REG(R23);
+	dwarf_regs[24] = REG(R24);
+	dwarf_regs[25] = REG(R25);
+	dwarf_regs[26] = REG(R26);
+	dwarf_regs[27] = REG(R27);
+	dwarf_regs[28] = REG(R28);
+	dwarf_regs[29] = REG(R29);
+	dwarf_regs[30] = REG(R30);
+	dwarf_regs[31] = REG(R31);
+	if (!dwfl_thread_state_registers(thread, 0, 32, dwarf_regs))
+		return false;
+
+	dwarf_nip = REG(NIP);
+	dwfl_thread_state_register_pc(thread, dwarf_nip);
+	for (i = 0; i < ARRAY_SIZE(special_regs); i++) {
+		Dwarf_Word val = 0;
+		perf_reg_value(&val, user_regs, special_regs[i][1]);
+		if (!dwfl_thread_state_registers(thread,
+						 special_regs[i][0], 1,
+						 &val))
+			return false;
+	}
+
+	return true;
+}
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/25] perf intel-pt: Move decoder error setting into one condition
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 06/25] perf unwind: Support for powerpc Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 08/25] perf intel-pt: Improve sample timestamp Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Move decoder error setting into one condition.

Cc'ed to stable because later fixes depend on it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 7cf7f7aca4d2..5a9676c6e23f 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2130,15 +2130,18 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 		}
 	} while (err == -ENOLINK);
 
-	decoder->state.err = err ? intel_pt_ext_err(err) : 0;
+	if (err) {
+		decoder->state.err = intel_pt_ext_err(err);
+		decoder->state.from_ip = decoder->ip;
+	} else {
+		decoder->state.err = 0;
+	}
+
 	decoder->state.timestamp = decoder->timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
 	decoder->state.cr3 = decoder->cr3;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
 
-	if (err)
-		decoder->state.from_ip = decoder->ip;
-
 	return &decoder->state;
 }
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/25] perf intel-pt: Improve sample timestamp
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 07/25] perf intel-pt: Move decoder error setting into one condition Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 09/25] perf intel-pt: Fix missing stack clear Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached. Improve that situation by using the pkt_state to
determine when to use the current or previous timestamp.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 34 ++++++++++++++++++++--
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 5a9676c6e23f..d5c69e822282 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -64,6 +64,25 @@ enum intel_pt_pkt_state {
 	INTEL_PT_STATE_FUP_NO_TIP,
 };
 
+static inline bool intel_pt_sample_time(enum intel_pt_pkt_state pkt_state)
+{
+	switch (pkt_state) {
+	case INTEL_PT_STATE_NO_PSB:
+	case INTEL_PT_STATE_NO_IP:
+	case INTEL_PT_STATE_ERR_RESYNC:
+	case INTEL_PT_STATE_IN_SYNC:
+	case INTEL_PT_STATE_TNT:
+		return true;
+	case INTEL_PT_STATE_TIP:
+	case INTEL_PT_STATE_TIP_PGD:
+	case INTEL_PT_STATE_FUP:
+	case INTEL_PT_STATE_FUP_NO_TIP:
+		return false;
+	default:
+		return true;
+	};
+}
+
 #ifdef INTEL_PT_STRICT
 #define INTEL_PT_STATE_ERR1	INTEL_PT_STATE_NO_PSB
 #define INTEL_PT_STATE_ERR2	INTEL_PT_STATE_NO_PSB
@@ -99,6 +118,7 @@ struct intel_pt_decoder {
 	uint64_t timestamp;
 	uint64_t tsc_timestamp;
 	uint64_t ref_timestamp;
+	uint64_t sample_timestamp;
 	uint64_t ret_addr;
 	uint64_t ctc_timestamp;
 	uint64_t ctc_delta;
@@ -139,6 +159,7 @@ struct intel_pt_decoder {
 	unsigned int fup_tx_flags;
 	unsigned int tx_flags;
 	uint64_t timestamp_insn_cnt;
+	uint64_t sample_insn_cnt;
 	uint64_t stuck_ip;
 	int no_progress;
 	int stuck_ip_prd;
@@ -898,6 +919,7 @@ static int intel_pt_walk_insn(struct intel_pt_decoder *decoder,
 
 	decoder->tot_insn_cnt += insn_cnt;
 	decoder->timestamp_insn_cnt += insn_cnt;
+	decoder->sample_insn_cnt += insn_cnt;
 	decoder->period_insn_cnt += insn_cnt;
 
 	if (err) {
@@ -2069,7 +2091,7 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 
 static uint64_t intel_pt_est_timestamp(struct intel_pt_decoder *decoder)
 {
-	uint64_t est = decoder->timestamp_insn_cnt << 1;
+	uint64_t est = decoder->sample_insn_cnt << 1;
 
 	if (!decoder->cbr || !decoder->max_non_turbo_ratio)
 		goto out;
@@ -2077,7 +2099,7 @@ static uint64_t intel_pt_est_timestamp(struct intel_pt_decoder *decoder)
 	est *= decoder->max_non_turbo_ratio;
 	est /= decoder->cbr;
 out:
-	return decoder->timestamp + est;
+	return decoder->sample_timestamp + est;
 }
 
 const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
@@ -2133,11 +2155,17 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 	if (err) {
 		decoder->state.err = intel_pt_ext_err(err);
 		decoder->state.from_ip = decoder->ip;
+		decoder->sample_timestamp = decoder->timestamp;
+		decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
 	} else {
 		decoder->state.err = 0;
+		if (intel_pt_sample_time(decoder->pkt_state)) {
+			decoder->sample_timestamp = decoder->timestamp;
+			decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+		}
 	}
 
-	decoder->state.timestamp = decoder->timestamp;
+	decoder->state.timestamp = decoder->sample_timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
 	decoder->state.cr3 = decoder->cr3;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/25] perf intel-pt: Fix missing stack clear
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 08/25] perf intel-pt: Improve sample timestamp Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 10/25] perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

The return compression stack must be cleared whenever there is a PSB. Fix
one case where that was not happening.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d5c69e822282..28b16c3acdde 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1932,6 +1932,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PSB:
+			intel_pt_clear_stack(&decoder->stack);
 			err = intel_pt_walk_psb(decoder);
 			if (err)
 				return err;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/25] perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 09/25] perf intel-pt: Fix missing stack clear Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 11/25] perf intel-pt: Fix last_ip usage Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

A value of zero is used to indicate that there is no IP. Ensure the
value is zero when the state is INTEL_PT_STATE_NO_IP.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 28b16c3acdde..f9cd3aaef488 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2117,6 +2117,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			break;
 		case INTEL_PT_STATE_NO_IP:
 			decoder->last_ip = 0;
+			decoder->ip = 0;
 			/* Fall through */
 		case INTEL_PT_STATE_ERR_RESYNC:
 			err = intel_pt_sync_ip(decoder);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 11/25] perf intel-pt: Fix last_ip usage
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 10/25] perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 12/25] perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Intel PT uses IP compression based on the last IP. For decoding
purposes, 'last IP' is considered to be reset to zero whenever there is
a synchronization packet (PSB). The decoder wasn't doing that, and was
treating the zero value to mean that there was no last IP, whereas
compression can be done against the zero value. Fix by setting last_ip
to zero when a PSB is received and keep track of have_last_ip.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index f9cd3aaef488..62e2c2ad3f1d 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -111,6 +111,7 @@ struct intel_pt_decoder {
 	bool have_tma;
 	bool have_cyc;
 	bool fixup_last_mtc;
+	bool have_last_ip;
 	uint64_t pos;
 	uint64_t last_ip;
 	uint64_t ip;
@@ -419,6 +420,7 @@ static uint64_t intel_pt_calc_ip(const struct intel_pt_pkt *packet,
 static inline void intel_pt_set_last_ip(struct intel_pt_decoder *decoder)
 {
 	decoder->last_ip = intel_pt_calc_ip(&decoder->packet, decoder->last_ip);
+	decoder->have_last_ip = true;
 }
 
 static inline void intel_pt_set_ip(struct intel_pt_decoder *decoder)
@@ -1672,6 +1674,8 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PSB:
+			decoder->last_ip = 0;
+			decoder->have_last_ip = true;
 			intel_pt_clear_stack(&decoder->stack);
 			err = intel_pt_walk_psbend(decoder);
 			if (err == -EAGAIN)
@@ -1752,7 +1756,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 static inline bool intel_pt_have_ip(struct intel_pt_decoder *decoder)
 {
-	return decoder->last_ip || decoder->packet.count == 0 ||
+	return decoder->have_last_ip || decoder->packet.count == 0 ||
 	       decoder->packet.count == 3 || decoder->packet.count == 6;
 }
 
@@ -1882,7 +1886,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 				if (decoder->ip)
 					return 0;
 			}
-			if (decoder->packet.count)
+			if (decoder->packet.count && decoder->have_last_ip)
 				intel_pt_set_last_ip(decoder);
 			break;
 
@@ -1932,6 +1936,8 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PSB:
+			decoder->last_ip = 0;
+			decoder->have_last_ip = true;
 			intel_pt_clear_stack(&decoder->stack);
 			err = intel_pt_walk_psb(decoder);
 			if (err)
@@ -2066,6 +2072,7 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 
 	decoder->pge = false;
 	decoder->continuous_period = false;
+	decoder->have_last_ip = false;
 	decoder->last_ip = 0;
 	decoder->ip = 0;
 	intel_pt_clear_stack(&decoder->stack);
@@ -2074,6 +2081,7 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 	if (err)
 		return err;
 
+	decoder->have_last_ip = true;
 	decoder->pkt_state = INTEL_PT_STATE_NO_IP;
 
 	err = intel_pt_walk_psb(decoder);
@@ -2116,6 +2124,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			err = intel_pt_sync(decoder);
 			break;
 		case INTEL_PT_STATE_NO_IP:
+			decoder->have_last_ip = false;
 			decoder->last_ip = 0;
 			decoder->ip = 0;
 			/* Fall through */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 12/25] perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 11/25] perf intel-pt: Fix last_ip usage Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 13/25] perf intel-pt: Use FUP always when scanning for an IP Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Intel PT uses IP compression based on the last IP. For decoding purposes,
'last IP' is not updated when a branch target has been suppressed, which is
indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
ensure never to set 'last_ip' when packet 'count' is zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 62e2c2ad3f1d..72b9dc8135a2 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1470,7 +1470,8 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_FUP:
 			decoder->pge = true;
-			intel_pt_set_last_ip(decoder);
+			if (decoder->packet.count)
+				intel_pt_set_last_ip(decoder);
 			break;
 
 		case INTEL_PT_MODE_TSX:
@@ -1756,8 +1757,9 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 static inline bool intel_pt_have_ip(struct intel_pt_decoder *decoder)
 {
-	return decoder->have_last_ip || decoder->packet.count == 0 ||
-	       decoder->packet.count == 3 || decoder->packet.count == 6;
+	return decoder->packet.count &&
+	       (decoder->have_last_ip || decoder->packet.count == 3 ||
+		decoder->packet.count == 6);
 }
 
 /* Walk PSB+ packets to get in sync. */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 13/25] perf intel-pt: Use FUP always when scanning for an IP
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 12/25] perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 14/25] perf intel-pt: Clear FUP flag on error Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

The decoder will try to use branch packets to find an IP to start decoding
or to recover from errors. Currently the FUP packet is used only in the
case of an overflow, however there is no reason for that to be a special
case. So just use FUP always when scanning for an IP.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 72b9dc8135a2..923c60efda26 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1882,14 +1882,10 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_FUP:
-			if (decoder->overflow) {
-				if (intel_pt_have_ip(decoder))
-					intel_pt_set_ip(decoder);
-				if (decoder->ip)
-					return 0;
-			}
-			if (decoder->packet.count && decoder->have_last_ip)
-				intel_pt_set_last_ip(decoder);
+			if (intel_pt_have_ip(decoder))
+				intel_pt_set_ip(decoder);
+			if (decoder->ip)
+				return 0;
 			break;
 
 		case INTEL_PT_MTC:
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 14/25] perf intel-pt: Clear FUP flag on error
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 13/25] perf intel-pt: Use FUP always when scanning for an IP Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 15/25] perf intel-pt: Add missing __fallthrough Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, stable,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Sometimes a FUP packet is associated with a TSX transaction and a flag is
set to indicate that. Ensure that flag is cleared on any error condition
because at that point the decoder can no longer assume it is correct.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 923c60efda26..a97710162e07 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1962,6 +1962,8 @@ static int intel_pt_sync_ip(struct intel_pt_decoder *decoder)
 {
 	int err;
 
+	decoder->set_fup_tx_flags = false;
+
 	intel_pt_log("Scanning for full IP\n");
 	err = intel_pt_walk_to_ip(decoder);
 	if (err)
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 15/25] perf intel-pt: Add missing __fallthrough
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 14/25] perf intel-pt: Clear FUP flag on error Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 16/25] perf intel-pt: Allow decoding with branch tracing disabled Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

perf tools uses __fallthrough. Add missing  __fallthrough to a switch
statement.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index a97710162e07..cad40fe93bd2 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2127,7 +2127,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			decoder->have_last_ip = false;
 			decoder->last_ip = 0;
 			decoder->ip = 0;
-			/* Fall through */
+			__fallthrough;
 		case INTEL_PT_STATE_ERR_RESYNC:
 			err = intel_pt_sync_ip(decoder);
 			break;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 16/25] perf intel-pt: Allow decoding with branch tracing disabled
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 15/25] perf intel-pt: Add missing __fallthrough Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 17/25] perf intel-pt: Add default config for pass-through branch enable Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

The kernel now supports the disabling of branch tracing, however the
decoder assumes branch tracing is always enabled. Pass through a parameter
to indicate whether branch tracing is enabled and use it to avoid cases
when the decoder is expecting branch packets. There are 2 such cases.
First, FUP packets which can bind to an IP even when there is no branch
tracing. Secondly, the decoder will try to use branch packets to find an IP
to start decoding or to recover from errors.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 13 +++++++++++++
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h |  1 +
 tools/perf/util/intel-pt.c                          | 14 ++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index cad40fe93bd2..dacb9223e743 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -106,6 +106,7 @@ struct intel_pt_decoder {
 	const unsigned char *buf;
 	size_t len;
 	bool return_compression;
+	bool branch_enable;
 	bool mtc_insn;
 	bool pge;
 	bool have_tma;
@@ -214,6 +215,7 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
 	decoder->pgd_ip             = params->pgd_ip;
 	decoder->data               = params->data;
 	decoder->return_compression = params->return_compression;
+	decoder->branch_enable      = params->branch_enable;
 
 	decoder->period             = params->period;
 	decoder->period_type        = params->period_type;
@@ -1650,6 +1652,10 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 				break;
 			}
 			intel_pt_set_last_ip(decoder);
+			if (!decoder->branch_enable) {
+				decoder->ip = decoder->last_ip;
+				break;
+			}
 			err = intel_pt_walk_fup(decoder);
 			if (err != -EAGAIN) {
 				if (err)
@@ -1964,6 +1970,13 @@ static int intel_pt_sync_ip(struct intel_pt_decoder *decoder)
 
 	decoder->set_fup_tx_flags = false;
 
+	if (!decoder->branch_enable) {
+		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
+		decoder->overflow = false;
+		decoder->state.type = 0; /* Do not have a sample */
+		return 0;
+	}
+
 	intel_pt_log("Scanning for full IP\n");
 	err = intel_pt_walk_to_ip(decoder);
 	if (err)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index e90619a43c0c..add3bed58349 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -87,6 +87,7 @@ struct intel_pt_params {
 	bool (*pgd_ip)(uint64_t ip, void *data);
 	void *data;
 	bool return_compression;
+	bool branch_enable;
 	uint64_t period;
 	enum intel_pt_period_type period_type;
 	unsigned max_non_turbo_ratio;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 4c7718f87a08..5c59b8c6a719 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -668,6 +668,19 @@ static bool intel_pt_return_compression(struct intel_pt *pt)
 	return true;
 }
 
+static bool intel_pt_branch_enable(struct intel_pt *pt)
+{
+	struct perf_evsel *evsel;
+	u64 config;
+
+	evlist__for_each_entry(pt->session->evlist, evsel) {
+		if (intel_pt_get_config(pt, &evsel->attr, &config) &&
+		    (config & 1) && !(config & 0x2000))
+			return false;
+	}
+	return true;
+}
+
 static unsigned int intel_pt_mtc_period(struct intel_pt *pt)
 {
 	struct perf_evsel *evsel;
@@ -799,6 +812,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 	params.walk_insn = intel_pt_walk_next_insn;
 	params.data = ptq;
 	params.return_compression = intel_pt_return_compression(pt);
+	params.branch_enable = intel_pt_branch_enable(pt);
 	params.max_non_turbo_ratio = pt->max_non_turbo_ratio;
 	params.mtc_period = intel_pt_mtc_period(pt);
 	params.tsc_ctc_ratio_n = pt->tsc_ctc_ratio_n;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 17/25] perf intel-pt: Add default config for pass-through branch enable
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 16/25] perf intel-pt: Allow decoding with branch tracing disabled Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 18/25] perf intel-pt: Add documentation for new config terms Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Branch tracing is enabled by default, so a fake config bit called 'pt'
(pass-through) was added to allow the 'branch enable' bit to have affect.
Add default config 'pt,branch' which will allow users to disable branch
tracing using 'branch=0' instead of having to specify 'pt,branch=0'.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/intel-pt.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 6fe667b3269e..9535be57033f 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -192,6 +192,7 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
 	int psb_cyc, psb_periods, psb_period;
 	int pos = 0;
 	u64 config;
+	char c;
 
 	pos += scnprintf(buf + pos, sizeof(buf) - pos, "tsc");
 
@@ -225,6 +226,10 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
 		}
 	}
 
+	if (perf_pmu__scan_file(intel_pt_pmu, "format/pt", "%c", &c) == 1 &&
+	    perf_pmu__scan_file(intel_pt_pmu, "format/branch", "%c", &c) == 1)
+		pos += scnprintf(buf + pos, sizeof(buf) - pos, ",pt,branch");
+
 	pr_debug2("%s default config: %s\n", intel_pt_pmu->name, buf);
 
 	intel_pt_parse_terms(&intel_pt_pmu->format, buf, &config);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 18/25] perf intel-pt: Add documentation for new config terms
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 17/25] perf intel-pt: Add default config for pass-through branch enable Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 19/25] perf intel-pt: Add decoder support for ptwrite and power event packets Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add documentation for new config terms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/intel-pt.txt | 36 +++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index b0b3007d3c9c..d157dee7a4ec 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -364,6 +364,42 @@ cyc_thresh	Specifies how frequently CYC packets are produced - see cyc
 
 		CYC packets are not requested by default.
 
+pt		Specifies pass-through which enables the 'branch' config term.
+
+		The default config selects 'pt' if it is available, so a user will
+		never need to specify this term.
+
+branch		Enable branch tracing.  Branch tracing is enabled by default so to
+		disable branch tracing use 'branch=0'.
+
+		The default config selects 'branch' if it is available.
+
+ptw		Enable PTWRITE packets which are produced when a ptwrite instruction
+		is executed.
+
+		Support for this feature is indicated by:
+
+			/sys/bus/event_source/devices/intel_pt/caps/ptwrite
+
+		which contains "1" if the feature is supported and
+		"0" otherwise.
+
+fup_on_ptw	Enable a FUP packet to follow the PTWRITE packet.  The FUP packet
+		provides the address of the ptwrite instruction.  In the absence of
+		fup_on_ptw, the decoder will use the address of the previous branch
+		if branch tracing is enabled, otherwise the address will be zero.
+		Note that fup_on_ptw will work even when branch tracing is disabled.
+
+pwr_evt		Enable power events.  The power events provide information about
+		changes to the CPU C-state.
+
+		Support for this feature is indicated by:
+
+			/sys/bus/event_source/devices/intel_pt/caps/power_event_trace
+
+		which contains "1" if the feature is supported and
+		"0" otherwise.
+
 
 new snapshot option
 -------------------
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 19/25] perf intel-pt: Add decoder support for ptwrite and power event packets
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 18/25] perf intel-pt: Add documentation for new config terms Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 20/25] perf intel-pt: Add reserved byte to CBR packet payload Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This
patch only affects the decoder, so the tools still do not select or consume
the new information. That is added in subsequent patches.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 176 ++++++++++++++++++++-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  10 ++
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 108 +++++++++++++
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |   7 +
 4 files changed, 293 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index dacb9223e743..e42804d10001 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -158,8 +158,15 @@ struct intel_pt_decoder {
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
+	bool set_fup_ptw;
+	bool set_fup_mwait;
+	bool set_fup_pwre;
+	bool set_fup_exstop;
 	unsigned int fup_tx_flags;
 	unsigned int tx_flags;
+	uint64_t fup_ptw_payload;
+	uint64_t fup_mwait_payload;
+	uint64_t fup_pwre_payload;
 	uint64_t timestamp_insn_cnt;
 	uint64_t sample_insn_cnt;
 	uint64_t stuck_ip;
@@ -660,6 +667,8 @@ static int intel_pt_calc_cyc_cb(struct intel_pt_pkt_info *pkt_info)
 	case INTEL_PT_PAD:
 	case INTEL_PT_VMCS:
 	case INTEL_PT_MNT:
+	case INTEL_PT_PTWRITE:
+	case INTEL_PT_PTWRITE_IP:
 		return 0;
 
 	case INTEL_PT_MTC:
@@ -758,6 +767,11 @@ static int intel_pt_calc_cyc_cb(struct intel_pt_pkt_info *pkt_info)
 
 	case INTEL_PT_TIP_PGD:
 	case INTEL_PT_TRACESTOP:
+	case INTEL_PT_EXSTOP:
+	case INTEL_PT_EXSTOP_IP:
+	case INTEL_PT_MWAIT:
+	case INTEL_PT_PWRE:
+	case INTEL_PT_PWRX:
 	case INTEL_PT_OVF:
 	case INTEL_PT_BAD: /* Does not happen */
 	default:
@@ -1016,6 +1030,57 @@ static int intel_pt_walk_insn(struct intel_pt_decoder *decoder,
 	return err;
 }
 
+static bool intel_pt_fup_event(struct intel_pt_decoder *decoder)
+{
+	bool ret = false;
+
+	if (decoder->set_fup_tx_flags) {
+		decoder->set_fup_tx_flags = false;
+		decoder->tx_flags = decoder->fup_tx_flags;
+		decoder->state.type = INTEL_PT_TRANSACTION;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		decoder->state.flags = decoder->fup_tx_flags;
+		return true;
+	}
+	if (decoder->set_fup_ptw) {
+		decoder->set_fup_ptw = false;
+		decoder->state.type = INTEL_PT_PTW;
+		decoder->state.flags |= INTEL_PT_FUP_IP;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		decoder->state.ptw_payload = decoder->fup_ptw_payload;
+		return true;
+	}
+	if (decoder->set_fup_mwait) {
+		decoder->set_fup_mwait = false;
+		decoder->state.type = INTEL_PT_MWAIT_OP;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		decoder->state.mwait_payload = decoder->fup_mwait_payload;
+		ret = true;
+	}
+	if (decoder->set_fup_pwre) {
+		decoder->set_fup_pwre = false;
+		decoder->state.type |= INTEL_PT_PWR_ENTRY;
+		decoder->state.type &= ~INTEL_PT_BRANCH;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		decoder->state.pwre_payload = decoder->fup_pwre_payload;
+		ret = true;
+	}
+	if (decoder->set_fup_exstop) {
+		decoder->set_fup_exstop = false;
+		decoder->state.type |= INTEL_PT_EX_STOP;
+		decoder->state.type &= ~INTEL_PT_BRANCH;
+		decoder->state.flags |= INTEL_PT_FUP_IP;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		ret = true;
+	}
+	return ret;
+}
+
 static int intel_pt_walk_fup(struct intel_pt_decoder *decoder)
 {
 	struct intel_pt_insn intel_pt_insn;
@@ -1029,15 +1094,8 @@ static int intel_pt_walk_fup(struct intel_pt_decoder *decoder)
 		if (err == INTEL_PT_RETURN)
 			return 0;
 		if (err == -EAGAIN) {
-			if (decoder->set_fup_tx_flags) {
-				decoder->set_fup_tx_flags = false;
-				decoder->tx_flags = decoder->fup_tx_flags;
-				decoder->state.type = INTEL_PT_TRANSACTION;
-				decoder->state.from_ip = decoder->ip;
-				decoder->state.to_ip = 0;
-				decoder->state.flags = decoder->fup_tx_flags;
+			if (intel_pt_fup_event(decoder))
 				return 0;
-			}
 			return err;
 		}
 		decoder->set_fup_tx_flags = false;
@@ -1443,6 +1501,13 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 		case INTEL_PT_TRACESTOP:
 		case INTEL_PT_BAD:
 		case INTEL_PT_PSB:
+		case INTEL_PT_PTWRITE:
+		case INTEL_PT_PTWRITE_IP:
+		case INTEL_PT_EXSTOP:
+		case INTEL_PT_EXSTOP_IP:
+		case INTEL_PT_MWAIT:
+		case INTEL_PT_PWRE:
+		case INTEL_PT_PWRX:
 			decoder->have_tma = false;
 			intel_pt_log("ERROR: Unexpected packet\n");
 			return -EAGAIN;
@@ -1524,6 +1589,13 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 		case INTEL_PT_MODE_TSX:
 		case INTEL_PT_BAD:
 		case INTEL_PT_PSBEND:
+		case INTEL_PT_PTWRITE:
+		case INTEL_PT_PTWRITE_IP:
+		case INTEL_PT_EXSTOP:
+		case INTEL_PT_EXSTOP_IP:
+		case INTEL_PT_MWAIT:
+		case INTEL_PT_PWRE:
+		case INTEL_PT_PWRX:
 			intel_pt_log("ERROR: Missing TIP after FUP\n");
 			decoder->pkt_state = INTEL_PT_STATE_ERR3;
 			return -ENOENT;
@@ -1654,8 +1726,13 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			intel_pt_set_last_ip(decoder);
 			if (!decoder->branch_enable) {
 				decoder->ip = decoder->last_ip;
+				if (intel_pt_fup_event(decoder))
+					return 0;
+				no_tip = false;
 				break;
 			}
+			if (decoder->set_fup_mwait)
+				no_tip = true;
 			err = intel_pt_walk_fup(decoder);
 			if (err != -EAGAIN) {
 				if (err)
@@ -1755,6 +1832,71 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 		case INTEL_PT_PAD:
 			break;
 
+		case INTEL_PT_PTWRITE_IP:
+			decoder->fup_ptw_payload = decoder->packet.payload;
+			err = intel_pt_get_next_packet(decoder);
+			if (err)
+				return err;
+			if (decoder->packet.type == INTEL_PT_FUP) {
+				decoder->set_fup_ptw = true;
+				no_tip = true;
+			} else {
+				intel_pt_log_at("ERROR: Missing FUP after PTWRITE",
+						decoder->pos);
+			}
+			goto next;
+
+		case INTEL_PT_PTWRITE:
+			decoder->state.type = INTEL_PT_PTW;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			decoder->state.ptw_payload = decoder->packet.payload;
+			return 0;
+
+		case INTEL_PT_MWAIT:
+			decoder->fup_mwait_payload = decoder->packet.payload;
+			decoder->set_fup_mwait = true;
+			break;
+
+		case INTEL_PT_PWRE:
+			if (decoder->set_fup_mwait) {
+				decoder->fup_pwre_payload =
+							decoder->packet.payload;
+				decoder->set_fup_pwre = true;
+				break;
+			}
+			decoder->state.type = INTEL_PT_PWR_ENTRY;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			decoder->state.pwrx_payload = decoder->packet.payload;
+			return 0;
+
+		case INTEL_PT_EXSTOP_IP:
+			err = intel_pt_get_next_packet(decoder);
+			if (err)
+				return err;
+			if (decoder->packet.type == INTEL_PT_FUP) {
+				decoder->set_fup_exstop = true;
+				no_tip = true;
+			} else {
+				intel_pt_log_at("ERROR: Missing FUP after EXSTOP",
+						decoder->pos);
+			}
+			goto next;
+
+		case INTEL_PT_EXSTOP:
+			decoder->state.type = INTEL_PT_EX_STOP;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			return 0;
+
+		case INTEL_PT_PWRX:
+			decoder->state.type = INTEL_PT_PWR_EXIT;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			decoder->state.pwrx_payload = decoder->packet.payload;
+			return 0;
+
 		default:
 			return intel_pt_bug(decoder);
 		}
@@ -1784,6 +1926,13 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 			__fallthrough;
 		case INTEL_PT_TIP_PGE:
 		case INTEL_PT_TIP:
+		case INTEL_PT_PTWRITE:
+		case INTEL_PT_PTWRITE_IP:
+		case INTEL_PT_EXSTOP:
+		case INTEL_PT_EXSTOP_IP:
+		case INTEL_PT_MWAIT:
+		case INTEL_PT_PWRE:
+		case INTEL_PT_PWRX:
 			intel_pt_log("ERROR: Unexpected packet\n");
 			return -ENOENT;
 
@@ -1958,6 +2107,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 		case INTEL_PT_VMCS:
 		case INTEL_PT_MNT:
 		case INTEL_PT_PAD:
+		case INTEL_PT_PTWRITE:
+		case INTEL_PT_PTWRITE_IP:
+		case INTEL_PT_EXSTOP:
+		case INTEL_PT_EXSTOP_IP:
+		case INTEL_PT_MWAIT:
+		case INTEL_PT_PWRE:
+		case INTEL_PT_PWRX:
 		default:
 			break;
 		}
@@ -1969,6 +2125,10 @@ static int intel_pt_sync_ip(struct intel_pt_decoder *decoder)
 	int err;
 
 	decoder->set_fup_tx_flags = false;
+	decoder->set_fup_ptw = false;
+	decoder->set_fup_mwait = false;
+	decoder->set_fup_pwre = false;
+	decoder->set_fup_exstop = false;
 
 	if (!decoder->branch_enable) {
 		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index add3bed58349..414c88e9e0da 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -25,11 +25,17 @@
 #define INTEL_PT_IN_TX		(1 << 0)
 #define INTEL_PT_ABORT_TX	(1 << 1)
 #define INTEL_PT_ASYNC		(1 << 2)
+#define INTEL_PT_FUP_IP		(1 << 3)
 
 enum intel_pt_sample_type {
 	INTEL_PT_BRANCH		= 1 << 0,
 	INTEL_PT_INSTRUCTION	= 1 << 1,
 	INTEL_PT_TRANSACTION	= 1 << 2,
+	INTEL_PT_PTW		= 1 << 3,
+	INTEL_PT_MWAIT_OP	= 1 << 4,
+	INTEL_PT_PWR_ENTRY	= 1 << 5,
+	INTEL_PT_EX_STOP	= 1 << 6,
+	INTEL_PT_PWR_EXIT	= 1 << 7,
 };
 
 enum intel_pt_period_type {
@@ -63,6 +69,10 @@ struct intel_pt_state {
 	uint64_t timestamp;
 	uint64_t est_timestamp;
 	uint64_t trace_nr;
+	uint64_t ptw_payload;
+	uint64_t mwait_payload;
+	uint64_t pwre_payload;
+	uint64_t pwrx_payload;
 	uint32_t flags;
 	enum intel_pt_insn_op insn_op;
 	int insn_len;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
index 7528ae4f7e28..accdb646a03d 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
@@ -64,6 +64,13 @@ static const char * const packet_name[] = {
 	[INTEL_PT_PIP]		= "PIP",
 	[INTEL_PT_OVF]		= "OVF",
 	[INTEL_PT_MNT]		= "MNT",
+	[INTEL_PT_PTWRITE]	= "PTWRITE",
+	[INTEL_PT_PTWRITE_IP]	= "PTWRITE",
+	[INTEL_PT_EXSTOP]	= "EXSTOP",
+	[INTEL_PT_EXSTOP_IP]	= "EXSTOP",
+	[INTEL_PT_MWAIT]	= "MWAIT",
+	[INTEL_PT_PWRE]		= "PWRE",
+	[INTEL_PT_PWRX]		= "PWRX",
 };
 
 const char *intel_pt_pkt_name(enum intel_pt_pkt_type type)
@@ -217,12 +224,80 @@ static int intel_pt_get_3byte(const unsigned char *buf, size_t len,
 	}
 }
 
+static int intel_pt_get_ptwrite(const unsigned char *buf, size_t len,
+				struct intel_pt_pkt *packet)
+{
+	packet->count = (buf[1] >> 5) & 0x3;
+	packet->type = buf[1] & BIT(7) ? INTEL_PT_PTWRITE_IP :
+					 INTEL_PT_PTWRITE;
+
+	switch (packet->count) {
+	case 0:
+		if (len < 6)
+			return INTEL_PT_NEED_MORE_BYTES;
+		packet->payload = le32_to_cpu(*(uint32_t *)(buf + 2));
+		return 6;
+	case 1:
+		if (len < 10)
+			return INTEL_PT_NEED_MORE_BYTES;
+		packet->payload = le64_to_cpu(*(uint64_t *)(buf + 2));
+		return 10;
+	default:
+		return INTEL_PT_BAD_PACKET;
+	}
+}
+
+static int intel_pt_get_exstop(struct intel_pt_pkt *packet)
+{
+	packet->type = INTEL_PT_EXSTOP;
+	return 2;
+}
+
+static int intel_pt_get_exstop_ip(struct intel_pt_pkt *packet)
+{
+	packet->type = INTEL_PT_EXSTOP_IP;
+	return 2;
+}
+
+static int intel_pt_get_mwait(const unsigned char *buf, size_t len,
+			      struct intel_pt_pkt *packet)
+{
+	if (len < 10)
+		return INTEL_PT_NEED_MORE_BYTES;
+	packet->type = INTEL_PT_MWAIT;
+	packet->payload = le64_to_cpu(*(uint64_t *)(buf + 2));
+	return 10;
+}
+
+static int intel_pt_get_pwre(const unsigned char *buf, size_t len,
+			     struct intel_pt_pkt *packet)
+{
+	if (len < 4)
+		return INTEL_PT_NEED_MORE_BYTES;
+	packet->type = INTEL_PT_PWRE;
+	memcpy_le64(&packet->payload, buf + 2, 2);
+	return 4;
+}
+
+static int intel_pt_get_pwrx(const unsigned char *buf, size_t len,
+			     struct intel_pt_pkt *packet)
+{
+	if (len < 7)
+		return INTEL_PT_NEED_MORE_BYTES;
+	packet->type = INTEL_PT_PWRX;
+	memcpy_le64(&packet->payload, buf + 2, 5);
+	return 7;
+}
+
 static int intel_pt_get_ext(const unsigned char *buf, size_t len,
 			    struct intel_pt_pkt *packet)
 {
 	if (len < 2)
 		return INTEL_PT_NEED_MORE_BYTES;
 
+	if ((buf[1] & 0x1f) == 0x12)
+		return intel_pt_get_ptwrite(buf, len, packet);
+
 	switch (buf[1]) {
 	case 0xa3: /* Long TNT */
 		return intel_pt_get_long_tnt(buf, len, packet);
@@ -244,6 +319,16 @@ static int intel_pt_get_ext(const unsigned char *buf, size_t len,
 		return intel_pt_get_tma(buf, len, packet);
 	case 0xC3: /* 3-byte header */
 		return intel_pt_get_3byte(buf, len, packet);
+	case 0x62: /* EXSTOP no IP */
+		return intel_pt_get_exstop(packet);
+	case 0xE2: /* EXSTOP with IP */
+		return intel_pt_get_exstop_ip(packet);
+	case 0xC2: /* MWAIT */
+		return intel_pt_get_mwait(buf, len, packet);
+	case 0x22: /* PWRE */
+		return intel_pt_get_pwre(buf, len, packet);
+	case 0xA2: /* PWRX */
+		return intel_pt_get_pwrx(buf, len, packet);
 	default:
 		return INTEL_PT_BAD_PACKET;
 	}
@@ -522,6 +607,29 @@ int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf,
 		ret = snprintf(buf, buf_len, "%s 0x%llx (NR=%d)",
 			       name, payload, nr);
 		return ret;
+	case INTEL_PT_PTWRITE:
+		return snprintf(buf, buf_len, "%s 0x%llx IP:0", name, payload);
+	case INTEL_PT_PTWRITE_IP:
+		return snprintf(buf, buf_len, "%s 0x%llx IP:1", name, payload);
+	case INTEL_PT_EXSTOP:
+		return snprintf(buf, buf_len, "%s IP:0", name);
+	case INTEL_PT_EXSTOP_IP:
+		return snprintf(buf, buf_len, "%s IP:1", name);
+	case INTEL_PT_MWAIT:
+		return snprintf(buf, buf_len, "%s 0x%llx Hints 0x%x Extensions 0x%x",
+				name, payload, (unsigned int)(payload & 0xff),
+				(unsigned int)((payload >> 32) & 0x3));
+	case INTEL_PT_PWRE:
+		return snprintf(buf, buf_len, "%s 0x%llx HW:%u CState:%u Sub-CState:%u",
+				name, payload, !!(payload & 0x80),
+				(unsigned int)((payload >> 12) & 0xf),
+				(unsigned int)((payload >> 8) & 0xf));
+	case INTEL_PT_PWRX:
+		return snprintf(buf, buf_len, "%s 0x%llx Last CState:%u Deepest CState:%u Wake Reason 0x%x",
+				name, payload,
+				(unsigned int)((payload >> 4) & 0xf),
+				(unsigned int)(payload & 0xf),
+				(unsigned int)((payload >> 8) & 0xf));
 	default:
 		break;
 	}
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
index 781bb79883bd..73ddc3a88d07 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
@@ -52,6 +52,13 @@ enum intel_pt_pkt_type {
 	INTEL_PT_PIP,
 	INTEL_PT_OVF,
 	INTEL_PT_MNT,
+	INTEL_PT_PTWRITE,
+	INTEL_PT_PTWRITE_IP,
+	INTEL_PT_EXSTOP,
+	INTEL_PT_EXSTOP_IP,
+	INTEL_PT_MWAIT,
+	INTEL_PT_PWRE,
+	INTEL_PT_PWRX,
 };
 
 struct intel_pt_pkt {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 20/25] perf intel-pt: Add reserved byte to CBR packet payload
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 19/25] perf intel-pt: Add decoder support for ptwrite and power event packets Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 21/25] perf intel-pt: Add decoder support for CBR events Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Future proof CBR packet decoding by passing through also the undefined
'reserved' byte in the packet payload.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c     | 2 +-
 tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index e42804d10001..96bf8d8e83c0 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1444,7 +1444,7 @@ static void intel_pt_calc_mtc_timestamp(struct intel_pt_decoder *decoder)
 
 static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
 {
-	unsigned int cbr = decoder->packet.payload;
+	unsigned int cbr = decoder->packet.payload & 0xff;
 
 	if (decoder->cbr == cbr)
 		return;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
index accdb646a03d..ba4c9dd18643 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
@@ -130,7 +130,7 @@ static int intel_pt_get_cbr(const unsigned char *buf, size_t len,
 	if (len < 4)
 		return INTEL_PT_NEED_MORE_BYTES;
 	packet->type = INTEL_PT_CBR;
-	packet->payload = buf[2];
+	packet->payload = le16_to_cpu(*(uint16_t *)(buf + 2));
 	return 4;
 }
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 21/25] perf intel-pt: Add decoder support for CBR events
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 20/25] perf intel-pt: Add reserved byte to CBR packet payload Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 22/25] perf intel-pt: Remove redundant initial_skip checks Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Add decoder support for informing the tools of changes to the core-to-bus
ratio (CBR).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 19 +++++++++++++++++++
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 96bf8d8e83c0..5dea06289db5 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -141,6 +141,7 @@ struct intel_pt_decoder {
 	int pkt_len;
 	int last_packet_type;
 	unsigned int cbr;
+	unsigned int cbr_seen;
 	unsigned int max_non_turbo_ratio;
 	double max_non_turbo_ratio_fp;
 	double cbr_cyc_to_tsc;
@@ -167,6 +168,7 @@ struct intel_pt_decoder {
 	uint64_t fup_ptw_payload;
 	uint64_t fup_mwait_payload;
 	uint64_t fup_pwre_payload;
+	uint64_t cbr_payload;
 	uint64_t timestamp_insn_cnt;
 	uint64_t sample_insn_cnt;
 	uint64_t stuck_ip;
@@ -1446,6 +1448,8 @@ static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
 {
 	unsigned int cbr = decoder->packet.payload & 0xff;
 
+	decoder->cbr_payload = decoder->packet.payload;
+
 	if (decoder->cbr == cbr)
 		return;
 
@@ -1806,6 +1810,16 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_CBR:
 			intel_pt_calc_cbr(decoder);
+			if (!decoder->branch_enable &&
+			    decoder->cbr != decoder->cbr_seen) {
+				decoder->cbr_seen = decoder->cbr;
+				decoder->state.type = INTEL_PT_CBR_CHG;
+				decoder->state.from_ip = decoder->ip;
+				decoder->state.to_ip = 0;
+				decoder->state.cbr_payload =
+							decoder->packet.payload;
+				return 0;
+			}
 			break;
 
 		case INTEL_PT_MODE_EXEC:
@@ -2343,6 +2357,11 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 		decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
 	} else {
 		decoder->state.err = 0;
+		if (decoder->cbr != decoder->cbr_seen && decoder->state.type) {
+			decoder->cbr_seen = decoder->cbr;
+			decoder->state.type |= INTEL_PT_CBR_CHG;
+			decoder->state.cbr_payload = decoder->cbr_payload;
+		}
 		if (intel_pt_sample_time(decoder->pkt_state)) {
 			decoder->sample_timestamp = decoder->timestamp;
 			decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 414c88e9e0da..921b22e8ca0e 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -36,6 +36,7 @@ enum intel_pt_sample_type {
 	INTEL_PT_PWR_ENTRY	= 1 << 5,
 	INTEL_PT_EX_STOP	= 1 << 6,
 	INTEL_PT_PWR_EXIT	= 1 << 7,
+	INTEL_PT_CBR_CHG	= 1 << 8,
 };
 
 enum intel_pt_period_type {
@@ -73,6 +74,7 @@ struct intel_pt_state {
 	uint64_t mwait_payload;
 	uint64_t pwre_payload;
 	uint64_t pwrx_payload;
+	uint64_t cbr_payload;
 	uint32_t flags;
 	enum intel_pt_insn_op insn_op;
 	int insn_len;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 22/25] perf intel-pt: Remove redundant initial_skip checks
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 21/25] perf intel-pt: Add decoder support for CBR events Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 23/25] perf intel-pt: Fix transactions_sample_type Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

'initial_skip' is checked inside the sample synthesis functions which means
it is actually being done twice for 'instructions' and 'transactions'
samples. Remove the redundant checks.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 5c59b8c6a719..3ae03f920253 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1322,18 +1322,14 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	ptq->have_sample = false;
 
 	if (pt->sample_instructions &&
-	    (state->type & INTEL_PT_INSTRUCTION) &&
-	    (!pt->synth_opts.initial_skip ||
-	     pt->num_events++ >= pt->synth_opts.initial_skip)) {
+	    (state->type & INTEL_PT_INSTRUCTION)) {
 		err = intel_pt_synth_instruction_sample(ptq);
 		if (err)
 			return err;
 	}
 
 	if (pt->sample_transactions &&
-	    (state->type & INTEL_PT_TRANSACTION) &&
-	    (!pt->synth_opts.initial_skip ||
-	     pt->num_events++ >= pt->synth_opts.initial_skip)) {
+	    (state->type & INTEL_PT_TRANSACTION)) {
 		err = intel_pt_synth_transaction_sample(ptq);
 		if (err)
 			return err;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 23/25] perf intel-pt: Fix transactions_sample_type
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 22/25] perf intel-pt: Remove redundant initial_skip checks Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 24/25] perf tools: Fix message because cpu list option is -C not -c Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

'transactions_sample_type' is needed to correctly inject transactions
samples but it was not being set. Set it from the event sample type.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 3ae03f920253..6df836469f2b 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2035,6 +2035,7 @@ static int intel_pt_synth_events(struct intel_pt *pt,
 			return err;
 		}
 		pt->sample_transactions = true;
+		pt->transactions_sample_type = attr.sample_type;
 		pt->transactions_id = id;
 		id += 1;
 		evlist__for_each_entry(evlist, evsel) {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 24/25] perf tools: Fix message because cpu list option is -C not -c
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 23/25] perf intel-pt: Fix transactions_sample_type Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:02 ` [PATCH 25/25] perf script: Fix message because field list option is -F not -f Arnaldo Carvalho de Melo
  2017-06-21 18:13 ` [GIT PULL 00/25] perf/core improvements and fixes Ingo Molnar
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Fix message because cpu list option is -C not -c

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7dc1096264c5..d19c40a81040 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2035,7 +2035,7 @@ int perf_session__cpu_bitmap(struct perf_session *session,
 
 		if (!(evsel->attr.sample_type & PERF_SAMPLE_CPU)) {
 			pr_err("File does not contain CPU events. "
-			       "Remove -c option to proceed.\n");
+			       "Remove -C option to proceed.\n");
 			return -1;
 		}
 	}
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 25/25] perf script: Fix message because field list option is -F not -f
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 24/25] perf tools: Fix message because cpu list option is -C not -c Arnaldo Carvalho de Melo
@ 2017-06-21 18:02 ` Arnaldo Carvalho de Melo
  2017-06-21 18:13 ` [GIT PULL 00/25] perf/core improvements and fixes Ingo Molnar
  25 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-21 18:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Fix message because field list option is -F not -f.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index db5261c3f719..4bce7d8679cb 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -385,7 +385,7 @@ static int perf_session__check_output_opt(struct perf_session *session)
 		 */
 		if (!evsel && output[j].user_set && !output[j].wildcard_set) {
 			pr_err("%s events do not exist. "
-			       "Remove corresponding -f option to proceed.\n",
+			       "Remove corresponding -F option to proceed.\n",
 			       event_type(j));
 			return -1;
 		}
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [GIT PULL 00/25] perf/core improvements and fixes
  2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2017-06-21 18:02 ` [PATCH 25/25] perf script: Fix message because field list option is -F not -f Arnaldo Carvalho de Melo
@ 2017-06-21 18:13 ` Ingo Molnar
  25 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2017-06-21 18:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, David Ahern, Jiri Olsa,
	Kan Liang, linuxppc-dev, Milian Wolff, Namhyung Kim,
	Naveen N . Rao, Paolo Bonzini, Peter Zijlstra, Ravi Bangoria,
	Robert Elliott, Stephane Eranian, Thomas Gleixner, Wang Nan,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 007b811b4041989ec2dc91b9614aa2c41332723e:
> 
>   Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-06-20 10:49:08 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.13-20170621
> 
> for you to fetch changes up to 701516ae3dec801084bc913d21e03fce15c61a0b:
> 
>   perf script: Fix message because field list option is -F not -f (2017-06-21 11:35:53 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements ad fixes:
> 
> New features:
> 
> - Add support to measure SMI cost in 'perf stat' (Kan Liang)
> 
> - Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)
> 
> Fixes:
> 
> - Fix message: cpu list option is -C not -c (Adrian Hunter)
> 
> - Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)
> 
> - Intel PT fixes: (Adrian Hunter)
> 
>   o Fix missing stack clear
>   o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>   o Fix last_ip usage
>   o Ensure never to set 'last_ip' when packet 'count' is zero
>   o Clear FUP flag on error
>   o Fix transactions_sample_type
> 
> Infrastructure:
> 
> - Intel PT cleanups/refactorings (Adrian Hunter)
> 
>   o Use FUP always when scanning for an IP
>   o Add missing __fallthrough
>   o Remove redundant initial_skip checks
>   o Allow decoding with branch tracing disabled
>   o Add default config for pass-through branch enable
>   o Add documentation for new config terms
>   o Add decoder support for ptwrite and power event packets
>   o Add reserved byte to CBR packet payload
>   o Add decoder support for CBR events
> 
> - Move  find_process() to the only place that uses it, skimming some
>   more fat from util.[ch] (Arnaldo Carvalho de Melo)
> 
> - Do parameter validation earlier on fetch_kernel_version() (Arnaldo Carvalho de Melo)
> 
> - Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)
> 
> - Add sysfs__write_int function (Kan Liang)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (19):
>       perf intel-pt: Move decoder error setting into one condition
>       perf intel-pt: Improve sample timestamp
>       perf intel-pt: Fix missing stack clear
>       perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
>       perf intel-pt: Fix last_ip usage
>       perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
>       perf intel-pt: Use FUP always when scanning for an IP
>       perf intel-pt: Clear FUP flag on error
>       perf intel-pt: Add missing __fallthrough
>       perf intel-pt: Allow decoding with branch tracing disabled
>       perf intel-pt: Add default config for pass-through branch enable
>       perf intel-pt: Add documentation for new config terms
>       perf intel-pt: Add decoder support for ptwrite and power event packets
>       perf intel-pt: Add reserved byte to CBR packet payload
>       perf intel-pt: Add decoder support for CBR events
>       perf intel-pt: Remove redundant initial_skip checks
>       perf intel-pt: Fix transactions_sample_type
>       perf tools: Fix message because cpu list option is -C not -c
>       perf script: Fix message because field list option is -F not -f
> 
> Arnaldo Carvalho de Melo (3):
>       perf evsel: Adopt find_process()
>       perf tools: Do parameter validation earlier on fetch_kernel_version()
>       perf tools: Remove unused _ALL_SOURCE define
> 
> Kan Liang (2):
>       tools lib api fs: Add sysfs__write_int function
>       perf stat: Add support to measure SMI cost
> 
> Paolo Bonzini (1):
>       perf unwind: Support for powerpc
> 
>  tools/lib/api/fs/fs.c                              |  30 +++
>  tools/lib/api/fs/fs.h                              |   4 +
>  tools/perf/Documentation/intel-pt.txt              |  36 +++
>  tools/perf/Documentation/perf-stat.txt             |  14 +
>  tools/perf/Makefile.config                         |   2 +-
>  tools/perf/arch/powerpc/util/Build                 |   2 +
>  tools/perf/arch/powerpc/util/unwind-libdw.c        |  73 ++++++
>  tools/perf/arch/x86/util/intel-pt.c                |   5 +
>  tools/perf/builtin-script.c                        |   2 +-
>  tools/perf/builtin-stat.c                          |  49 ++++
>  tools/perf/util/evsel.c                            |  39 +++
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 290 +++++++++++++++++++--
>  .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  13 +
>  .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 110 +++++++-
>  .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |   7 +
>  tools/perf/util/intel-pt.c                         |  23 +-
>  tools/perf/util/session.c                          |   2 +-
>  tools/perf/util/stat-shadow.c                      |  33 +++
>  tools/perf/util/stat.c                             |   2 +
>  tools/perf/util/stat.h                             |   2 +
>  tools/perf/util/util.c                             |  52 +---
>  tools/perf/util/util.h                             |   3 -
>  22 files changed, 710 insertions(+), 83 deletions(-)
>  create mode 100644 tools/perf/arch/powerpc/util/unwind-libdw.c

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-06-21 18:13 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-21 18:02 [GIT PULL 00/25] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 01/25] perf evsel: Adopt find_process() Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 02/25] perf tools: Do parameter validation earlier on fetch_kernel_version() Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 03/25] perf tools: Remove unused _ALL_SOURCE define Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 04/25] tools lib api fs: Add sysfs__write_int function Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 05/25] perf stat: Add support to measure SMI cost Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 06/25] perf unwind: Support for powerpc Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 07/25] perf intel-pt: Move decoder error setting into one condition Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 08/25] perf intel-pt: Improve sample timestamp Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 09/25] perf intel-pt: Fix missing stack clear Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 10/25] perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 11/25] perf intel-pt: Fix last_ip usage Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 12/25] perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 13/25] perf intel-pt: Use FUP always when scanning for an IP Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 14/25] perf intel-pt: Clear FUP flag on error Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 15/25] perf intel-pt: Add missing __fallthrough Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 16/25] perf intel-pt: Allow decoding with branch tracing disabled Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 17/25] perf intel-pt: Add default config for pass-through branch enable Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 18/25] perf intel-pt: Add documentation for new config terms Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 19/25] perf intel-pt: Add decoder support for ptwrite and power event packets Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 20/25] perf intel-pt: Add reserved byte to CBR packet payload Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 21/25] perf intel-pt: Add decoder support for CBR events Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 22/25] perf intel-pt: Remove redundant initial_skip checks Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 23/25] perf intel-pt: Fix transactions_sample_type Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 24/25] perf tools: Fix message because cpu list option is -C not -c Arnaldo Carvalho de Melo
2017-06-21 18:02 ` [PATCH 25/25] perf script: Fix message because field list option is -F not -f Arnaldo Carvalho de Melo
2017-06-21 18:13 ` [GIT PULL 00/25] perf/core improvements and fixes Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.