LKML Archive on lore.kernel.org
* [GIT PULL 0/7] perf/urgent callchain fixes
@ 2017-05-24  6:21 Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin

Hi Ingo,

Please consider pulling the perf tooling changes below.  Build tested
on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
but it seems unrelated to this series.  I will take a look at it later.

Thanks,
Namhyung


The following changes since commit 88b0193d9418c00340e45e0a913a0813bc6c8c96:

  perf/callchain: Force USER_DS when invoking perf_callchain_user() (2017-05-10 07:54:00 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf tags/perf-urgent-for-mingo-4.12-20170524

for you to fetch changes up to 37d4e1b6ba56773cef96122dff4436c2c534c381:

  perf tools: Fix to put caller above callee in children mode (2017-05-24 08:51:11 +0900)

----------------------------------------------------------------
perf/urgent fixes

Fixes:

 - Fix a segfault in `perf report -g srcline` when no map can be found
   for a callchain address.  The srcline sorting mode needs a DSO to
   resolve line numbers, and the DSO is accessed via the map, so check
   that a map is available for the address first.  (Milian Wolff)

 - Fix an off-by-one in srcline output.  The unwound address was passed
   as-is to resolve the srcline for callchains, but that address is the
   return address of the call, which points to the next instruction.
   Pass "address - 1" instead to get the correct srcline.  A "signal
   frame" already holds the exact address, so the address is passed
   unchanged in that case.  (Milian Wolff)

 - Fix a missing inlined function.  The current code failed to display
   the last inlined function in the chain.  This was found when
   comparing the output of addr2line and perf script.  (Milian Wolff)
   
 
User Visible:

 - `perf script` gained an `--inline` option to show inlined
   functions with callchains.  This helped to find a bug in the
   current inline code.  (Namhyung Kim)

 - Fix missed callchain ordering with `-g callee/caller` when libbfd
   is not available.  (Milian Wolff)

 - Reorder output entries in `perf report --children` so that parent
   entries are put above their children.  This used to work, but broke
   when the callchain display order was changed with `-g caller`, which
   is now the default when children mode is enabled.  (Namhyung Kim)


----------------------------------------------------------------

Milian Wolff (5):
      perf report: don't crash on invalid maps in `-g srcline` mode
      perf report: fix memory leak in addr2line when called by addr2inlines
      perf report: fix off-by-one for non-activation frames
      perf report: always honor callchain order for inlined nodes
      perf report: do not drop last inlined frame

Namhyung Kim (2):
      perf script: Add --inline option
      perf tools: Fix to put caller above callee in children mode

 tools/perf/Documentation/perf-script.txt |  4 +++
 tools/perf/builtin-script.c              |  2 ++
 tools/perf/ui/hist.c                     |  2 ++
 tools/perf/util/callchain.c              | 13 ++++++---
 tools/perf/util/evsel_fprintf.c          | 33 +++++++++++++++++++++
 tools/perf/util/srcline.c                | 49 +++++++++++++++++---------------
 tools/perf/util/unwind-libdw.c           |  6 +++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++
 8 files changed, 92 insertions(+), 28 deletions(-)


* [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:03   ` [tip:perf/urgent] perf report: Don't " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern,
	Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:

==8359== Invalid read of size 8
==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359==    by 0x2FF719: callchain_append (callchain.c:944)
==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
==8359==    by 0x2B9A80: run_builtin (perf.c:296)
==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd

This patch fixes the issue.
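
The guard can be sketched outside of perf as follows. This is only an
illustration, not the perf code: the `demo_*` types and functions are
hypothetical stand-ins, and treating two unresolved frames as equal is
an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_match { DEMO_MATCH_EQ, DEMO_MATCH_LT, DEMO_MATCH_GT };

/* Hypothetical stand-ins for perf's dso, map and callchain node types. */
struct demo_dso { const char *srcline; };
struct demo_map { struct demo_dso *dso; };
struct demo_frame { struct demo_map *map; };

/* Resolve a srcline only when the frame actually has a map. */
const char *demo_srcline(const struct demo_frame *frame)
{
	if (!frame->map || !frame->map->dso)
		return NULL;
	return frame->map->dso->srcline;
}

/* Compare two frames by srcline; an unresolved frame sorts after a resolved one. */
enum demo_match demo_match_srcline(const struct demo_frame *l,
				   const struct demo_frame *r)
{
	const char *left, *right;
	int cmp;

	assert(l != NULL && r != NULL);
	left = demo_srcline(l);
	right = demo_srcline(r);

	if (left && right)
		cmp = strcmp(left, right);
	else if (!left && right)
		cmp = 1;
	else if (left && !right)
		cmp = -1;
	else
		cmp = 0;	/* neither side resolved: treated as equal here */

	if (cmp == 0)
		return DEMO_MATCH_EQ;
	return cmp < 0 ? DEMO_MATCH_LT : DEMO_MATCH_GT;
}
```

The point is that the map pointer is tested before anything reachable
through it is read, so a frame without a map yields a NULL srcline
instead of an invalid read.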

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[namhyung@kernel.org: remove dependency from another change]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/callchain.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29ac798f..b4204b43ed58 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -621,14 +621,19 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 					     struct callchain_list *cnode)
 {
-	char *left = get_srcline(cnode->ms.map->dso,
+	char *left = NULL;
+	char *right = NULL;
+	enum match_result ret = MATCH_EQ;
+	int cmp;
+
+	if (cnode->ms.map)
+		left = get_srcline(cnode->ms.map->dso,
 				 map__rip_2objdump(cnode->ms.map, cnode->ip),
 				 cnode->ms.sym, true, false);
-	char *right = get_srcline(node->map->dso,
+	if (node->map)
+		right = get_srcline(node->map->dso,
 				  map__rip_2objdump(node->map, node->ip),
 				  node->sym, true, false);
-	enum match_result ret = MATCH_EQ;
-	int cmp;
 
 	if (left && right)
 		cmp = strcmp(left, right);
-- 
2.13.0


* [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:04   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern,
	Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

When a filename was found in addr2line it was duplicated via strdup
but never freed. Now we pass NULL and handle this gracefully in
addr2line.
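
The idea of optional out-parameters can be sketched like this. It is a
simplified illustration under stated assumptions: `demo_a2l`,
`demo_strdup` and `demo_addr2line` are hypothetical names, and the
return value when no out-parameter is given is chosen for the sketch
rather than taken from perf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup result, standing in for perf's struct a2l_data. */
struct demo_a2l {
	const char *filename;
	unsigned int line;
	int found;
};

/* Portable strdup replacement so the sketch needs only ISO C. */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Copy the result into the caller's out-parameters only when they are
 * given.  A caller that passes NULL never receives an allocation, so
 * there is nothing it can forget to free.
 * Returns 1 on success, 0 when nothing was found or the copy failed.
 */
int demo_addr2line(const struct demo_a2l *a2l, char **file,
		   unsigned int *line)
{
	int ret = 1;

	assert(a2l != NULL);
	if (!a2l->found)
		return 0;

	if (file) {
		*file = a2l->filename ? demo_strdup(a2l->filename) : NULL;
		ret = *file ? 1 : 0;
	}

	if (line)
		*line = a2l->line;

	return ret;
}
```

A caller that wants the filename passes `&file` and later calls
`free(file)`; a caller that only needs the side effects passes NULL
for both out-parameters.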

Detected by Valgrind:

==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331==    at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331==    by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331==    by 0x52769F: addr2line (srcline.c:256)
==16331==    by 0x52769F: addr2inlines (srcline.c:294)
==16331==    by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331==    by 0x574D7A: inline__fprintf (hist.c:41)
==16331==    by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331==    by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331==    by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331==    by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331==    by 0x57738E: hists__fprintf (hist.c:882)
==16331==    by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331==    by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331==    by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331==    by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331==    by 0x4A49CE: run_builtin (perf.c:296)
==16331==    by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331==    by 0x434371: run_argv (perf.c:392)
==16331==    by 0x434371: main (perf.c:530)

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/srcline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index df051a52393c..5e376d64d59e 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -230,7 +230,10 @@ static int addr2line(const char *dso_name, u64 addr,
 
 	bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);
 
-	if (a2l->found && unwind_inlines) {
+	if (!a2l->found)
+		return 0;
+
+	if (unwind_inlines) {
 		int cnt = 0;
 
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -243,6 +246,8 @@ static int addr2line(const char *dso_name, u64 addr,
 							a2l->line, node,
 							dso) != 0)
 					return 0;
+				// found at least one inline frame
+				ret = 1;
 			}
 		}
 
@@ -252,14 +257,14 @@ static int addr2line(const char *dso_name, u64 addr,
 		}
 	}
 
-	if (a2l->found && a2l->filename) {
-		*file = strdup(a2l->filename);
-		*line = a2l->line;
-
-		if (*file)
-			ret = 1;
+	if (file) {
+		*file = a2l->filename ? strdup(a2l->filename) : NULL;
+		ret = *file ? 1 : 0;
 	}
 
+	if (line)
+		*line = a2l->line;
+
 	return ret;
 }
 
@@ -278,8 +283,6 @@ void dso__free_a2l(struct dso *dso)
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	struct dso *dso)
 {
-	char *file = NULL;
-	unsigned int line = 0;
 	struct inline_node *node;
 
 	node = zalloc(sizeof(*node));
@@ -291,7 +294,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	INIT_LIST_HEAD(&node->val);
 	node->addr = addr;
 
-	if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
 		goto out_free_inline_node;
 
 	if (list_empty(&node->val))
-- 
2.13.0


* [PATCH 3/7] perf report: fix off-by-one for non-activation frames
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:05   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern,
	Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

As the documentation for dwfl_frame_pc says, frames that
are not activation frames need to have their program counter
decremented by one to properly find the function of the caller.
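
The effect of the decrement can be sketched with a toy line table. The
`demo_*` names and the address ranges are made up for illustration;
real line tables come from DWARF and are not this simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line-table entry covering the half-open range [start, end). */
struct demo_line_range {
	uint64_t start;
	uint64_t end;
	int line;
};

/*
 * For a caller frame the unwound PC is a return address, i.e. the
 * instruction after the call.  Step back by one so that the lookup
 * lands inside the call instruction itself.  An activation frame
 * (for example a signal frame) already holds the exact address.
 */
uint64_t demo_lookup_addr(uint64_t pc, bool is_activation)
{
	return is_activation ? pc : pc - 1;
}

/* Return the source line for addr, or -1 when no range contains it. */
int demo_addr_to_line(const struct demo_line_range *tbl, size_t n,
		      uint64_t addr)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].line;
	}
	return -1;
}
```

With a call occupying 0x100..0x104 on line 8 and the next statement
starting at 0x105 on line 9, looking up the raw return address 0x105
gives line 9, while looking up 0x104 gives line 8.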

This fixes many cases where perf report currently attributes
the cost to the next line. For example, consider code like this:

~~~~~~~~~~~~~~~
  #include <thread>
  #include <chrono>

  using namespace std;

  int main()
  {
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));

    return 0;
  }
~~~~~~~~~~~~~~~

Now compile and record it:

~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
    --event sched:sched_stat_sleep \
    --event sched:sched_process_exit \
    --event sched:sched_switch --call-graph=dwarf \
    --output perf.data.raw \
    ./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:10
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:13
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

With this patch here applied, the issue is fixed. The report becomes
much more usable:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:8
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:10
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

Similarly it works for signal frames:

~~~~~~~~~~~~~~~
/* Preamble reconstructed for completeness; the original message
   elided it. */
#include <signal.h>

#define __noinline __attribute__((noinline))

__noinline void bar(void)
{
  volatile long cnt = 0;

  for (cnt = 0; cnt < 100000000; cnt++);
}

__noinline void foo(void)
{
  bar();
}

void sig_handler(int sig)
{
  foo();
}

int main(void)
{
  signal(SIGUSR1, sig_handler);
  raise(SIGUSR1);

  foo();
  return 0;
}
~~~~~~~~~~~~~~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
   100.00%  signal.c:11
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:29
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

With this patch in, the issue is fixed and we instead get:

~~~~~~~~~~~~~~~~
   100.00%  signal   signal            [.] bar
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:27
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straightforward thanks
to dwfl_frame_pc. For libunwind, we replicate the functionality via
unw_is_signal_frame for any but the very first frame.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/unwind-libdw.c           |  6 +++++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index f90e11a555b2..943a06291587 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -168,12 +168,16 @@ frame_callback(Dwfl_Frame *state, void *arg)
 {
 	struct unwind_info *ui = arg;
 	Dwarf_Addr pc;
+	bool isactivation;
 
-	if (!dwfl_frame_pc(state, &pc, NULL)) {
+	if (!dwfl_frame_pc(state, &pc, &isactivation)) {
 		pr_err("%s", dwfl_errmsg(-1));
 		return DWARF_CB_ABORT;
 	}
 
+	if (!isactivation)
+		--pc;
+
 	return entry(pc, ui) || !(--ui->max_stack) ?
 	       DWARF_CB_ABORT : DWARF_CB_OK;
 }
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index f8455bed6e65..84d553898e2a 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -692,6 +692,17 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 
 		while (!ret && (unw_step(&c) > 0) && i < max_stack) {
 			unw_get_reg(&c, UNW_REG_IP, &ips[i]);
+
+			/*
+			 * Decrement the IP for any non-activation frames.
+			 * this is required to properly find the srcline
+			 * for caller frames.
+			 * See also the documentation for dwfl_frame_pc,
+			 * which this code tries to replicate.
+			 */
+			if (unw_is_signal_frame(&c) <= 0)
+				--ips[i];
+
 			++i;
 		}
 
-- 
2.13.0


* [PATCH 4/7] perf script: Add --inline option
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (2 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  6:38   ` Ingo Molnar
  2017-05-24  7:05   ` [tip:perf/urgent] perf script: Add --inline option for debugging tip-bot for Namhyung Kim
  2017-05-24  6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin

The --inline option shows inlined functions in callchains.

For example,

  $ perf script
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

  $ perf script --inline
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                         std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         main
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

Cc: Jin Yao <yao.jin@linux.intel.com>
Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-script.txt |  4 ++++
 tools/perf/builtin-script.c              |  2 ++
 tools/perf/util/evsel_fprintf.c          | 33 ++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index cb0eda3925e6..3517e204a2b3 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -311,6 +311,10 @@ include::itrace.txt[]
 	Set the maximum number of program blocks to print with brstackasm for
 	each sample.
 
+--inline::
+	If a callgraph address belongs to an inlined function, the inline stack
+	will be printed. Each entry has function name and file/line.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d05aec491cff..4761b0d7fcb5 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2494,6 +2494,8 @@ int cmd_script(int argc, const char **argv)
 			"Enable kernel symbol demangling"),
 	OPT_STRING(0, "time", &script.time_str, "str",
 		   "Time span of interest (start,stop)"),
+	OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
+		    "Show inline function"),
 	OPT_END()
 	};
 	const char * const script_subcommands[] = { "record", "report", NULL };
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index e415aee6a245..583f3a602506 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -7,6 +7,7 @@
 #include "map.h"
 #include "strlist.h"
 #include "symbol.h"
+#include "srcline.h"
 
 static int comma_fprintf(FILE *fp, bool *first, const char *fmt, ...)
 {
@@ -168,6 +169,38 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 
+			if (symbol_conf.inline_name && node->map) {
+				struct inline_node *inode;
+
+				addr = map__rip_2objdump(node->map, node->ip);
+				inode = dso__parse_addr_inlines(node->map->dso, addr);
+
+				if (inode) {
+					struct inline_list *ilist;
+
+					list_for_each_entry(ilist, &inode->val, list) {
+						if (print_arrow)
+							printed += fprintf(fp, " <-");
+
+						/* IP is same, just skip it */
+						if (print_ip)
+							printed += fprintf(fp, "%c%16s",
+									   s, "");
+						if (print_sym)
+							printed += fprintf(fp, " %s",
+									   ilist->funcname);
+						if (print_srcline)
+							printed += fprintf(fp, "\n  %s:%d",
+									   ilist->filename,
+									   ilist->line_nr);
+						if (!print_oneline)
+							printed += fprintf(fp, "\n");
+					}
+
+					inline_node__delete(inode);
+				}
+			}
+
 			if (symbol_conf.bt_stop_list &&
 			    node->sym &&
 			    strlist__has_entry(symbol_conf.bt_stop_list,
-- 
2.13.0


* [PATCH 5/7] perf report: always honor callchain order for inlined nodes
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (3 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:06   ` [tip:perf/urgent] perf report: Always " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern,
	Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

So far, the inlined nodes were only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.

Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
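
The idea of choosing the insertion end instead of reversing afterwards
can be sketched with a plain singly linked list. This is not the
kernel list API; `demo_list`, `demo_frame` and `demo_list_insert` are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum demo_order { DEMO_ORDER_CALLEE, DEMO_ORDER_CALLER };

struct demo_frame {
	const char *name;
	struct demo_frame *next;
};

struct demo_list {
	struct demo_frame *head;
	struct demo_frame *tail;
};

/*
 * Frames are discovered innermost (callee) first.  Appending keeps
 * that order; prepending yields caller-first order directly, so no
 * separate reversal pass is needed.
 */
void demo_list_insert(struct demo_list *l, struct demo_frame *f,
		      enum demo_order order)
{
	assert(l != NULL && f != NULL);
	f->next = NULL;

	if (!l->head) {
		l->head = l->tail = f;
		return;
	}

	if (order == DEMO_ORDER_CALLEE) {
		l->tail->next = f;	/* append at the tail */
		l->tail = f;
	} else {
		f->next = l->head;	/* prepend at the head */
		l->head = f;
	}
}
```

Because the ordering decision lives in the single insert helper, every
code path that builds the list gets the same order regardless of which
backend found the frames.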

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/srcline.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 5e376d64d59e..6af0364cad06 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -56,7 +56,10 @@ static int inline_list__append(char *filename, char *funcname, int line_nr,
 		}
 	}
 
-	list_add_tail(&ilist->list, &node->val);
+	if (callchain_param.order == ORDER_CALLEE)
+		list_add_tail(&ilist->list, &node->val);
+	else
+		list_add(&ilist->list, &node->val);
 
 	return 0;
 }
@@ -200,14 +203,6 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
-static void inline_list__reverse(struct inline_node *node)
-{
-	struct inline_list *ilist, *n;
-
-	list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
-		list_move_tail(&ilist->list, &node->val);
-}
-
 static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line, struct dso *dso,
 		     bool unwind_inlines, struct inline_node *node)
@@ -250,11 +245,6 @@ static int addr2line(const char *dso_name, u64 addr,
 				ret = 1;
 			}
 		}
-
-		if ((node != NULL) &&
-		    (callchain_param.order != ORDER_CALLEE)) {
-			inline_list__reverse(node);
-		}
 	}
 
 	if (file) {
-- 
2.13.0


* [PATCH 6/7] perf report: do not drop last inlined frame
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (4 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:06   ` [tip:perf/urgent] perf report: Do " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern,
	Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:

~~~~~
$ perf script --inline
...
a.out 26722 80836.309329:      72425 cycles:
                   21561 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                     a4a main (a.out)
                         std::abs<double>
                         std::_Norm_helper<true>::_S_do_it<double>
                         std::norm<double>
                         main
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (a.out)

$ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
0x0000000000000a4a
std::__complex_abs(doublecomplex )
/usr/include/c++/6.3.1/complex:589
double std::abs<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:597
double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:654
double std::norm<double>(std::complex<double> const&)
/usr/include/c++/6.3.1/complex:664
main
/tmp/inlining.cpp:14
~~~~~

Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`. The patch
here fixes this issue, and the output becomes:

~~~~~
a.out 26722 80836.309329:      72425 cycles:
                   21561 __hypot_finite (/usr/lib/libm-2.25.so)
                    ace3 hypot (/usr/lib/libm-2.25.so)
                     a4a main (a.out)
                         std::__complex_abs
                         std::abs<double>
                         std::_Norm_helper<true>::_S_do_it<double>
                         std::norm<double>
                         main
                   20510 __libc_start_main (/usr/lib/libc-2.25.so)
                     bd9 _start (a.out)
~~~~~
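
The pattern behind the fix can be sketched as follows: record the
result of the initial lookup before asking for enclosing inliners.
The resolver below is a hypothetical stand-in written for this sketch;
it only mimics the "first lookup, then step outward" shape of the
interface and is not libbfd.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical resolver state: frames[0] is the innermost inlined
 * function, the last entry is the non-inlined caller.
 */
struct demo_resolver {
	const char *const *frames;
	size_t n;
	size_t pos;
};

/* Initial lookup: yields the innermost frame for the address. */
const char *demo_find_first(struct demo_resolver *r)
{
	r->pos = 0;
	return r->n ? r->frames[0] : NULL;
}

/* Step outward to the function the current frame was inlined into. */
const char *demo_find_inliner(struct demo_resolver *r)
{
	if (r->pos + 1 >= r->n)
		return NULL;
	return r->frames[++r->pos];
}

/* Collect all frames, including the one found by the initial lookup. */
size_t demo_collect(struct demo_resolver *r, const char **out, size_t max)
{
	size_t cnt = 0;
	const char *name;

	assert(r != NULL);
	name = demo_find_first(r);
	if (!name)
		return 0;

	/* Record the innermost frame before walking outward. */
	if (cnt < max)
		out[cnt++] = name;

	while (cnt < max && (name = demo_find_inliner(r)) != NULL)
		out[cnt++] = name;

	return cnt;
}
```

Leaving out the first `out[cnt++] = name` reproduces the symptom
described above: every frame except the innermost one is reported.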

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/srcline.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 6af0364cad06..ebc88a74e67b 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -203,6 +203,16 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
+static int inline_list__append_dso_a2l(struct dso *dso,
+				       struct inline_node *node)
+{
+	struct a2l_data *a2l = dso->a2l;
+	char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
+	char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
+
+	return inline_list__append(filename, funcname, a2l->line, node, dso);
+}
+
 static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line, struct dso *dso,
 		     bool unwind_inlines, struct inline_node *node)
@@ -231,15 +241,15 @@ static int addr2line(const char *dso_name, u64 addr,
 	if (unwind_inlines) {
 		int cnt = 0;
 
+		if (node && inline_list__append_dso_a2l(dso, node))
+			return 0;
+
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
 					     &a2l->funcname, &a2l->line) &&
 		       cnt++ < MAX_INLINE_NEST) {
 
 			if (node != NULL) {
-				if (inline_list__append(strdup(a2l->filename),
-							strdup(a2l->funcname),
-							a2l->line, node,
-							dso) != 0)
+				if (inline_list__append_dso_a2l(dso, node))
 					return 0;
 				// found at least one inline frame
 				ret = 1;
-- 
2.13.0


* [PATCH 7/7] perf tools: Fix to put caller above callee in children mode
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (5 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:07   ` [tip:perf/urgent] perf tools: Put caller above callee in --children mode tip-bot for Namhyung Kim
  2017-05-24  6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
  8 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Frederic Weisbecker

__hpp__sort_acc() sorts entries by callchain depth in order to put
callers above their callees in children mode.  But it assumed the
callchain order was callee-first.  The default (for children mode) is
now caller-first, so the order of entries ended up reversed.

For example, consider the following case.

  $ perf report --no-children
  ...
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ..........................
  #
      99.44%  a.out    a.out                [.] main
              |
              ---main
                 __libc_start_main
                 _start

Then children mode should show '_start' above '__libc_start_main' since
it's the caller (parent) of __libc_start_main.  But it's reversed:

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.61%     0.00%  a.out    a.out            [.] _start
      99.54%    99.44%  a.out    a.out            [.] main

This patch fixes it.

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    a.out            [.] _start
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.54%    99.44%  a.out    a.out            [.] main

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/hist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 59addd52d9cd..ddb2c6fbdf91 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -210,6 +210,8 @@ static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
 			return 0;
 
 		ret = b->callchain->max_depth - a->callchain->max_depth;
+		if (callchain_param.order == ORDER_CALLER)
+			ret = -ret;
 	}
 	return ret;
 }
-- 
2.13.0
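
The depth comparison changed by the patch above can be sketched in
isolation roughly as follows.  This is only an illustration: the enum and
depth_order() are hypothetical stand-ins, not perf's own definitions, and
the surrounding checks in __hpp__sort_acc() are left out.

```c
/* Hypothetical stand-in for perf's callchain order setting. */
enum chain_order { ORDER_CALLEE, ORDER_CALLER };

/*
 * Compare two entries by the depth of their callchains.  The result
 * follows the usual comparator convention (negative, zero, positive);
 * which sign means "shown first" depends on the caller and is not
 * modeled here.  For caller-first chains the sign is flipped, which
 * is the two-line change made by the patch.
 */
static int depth_order(int a_max_depth, int b_max_depth,
		       enum chain_order order)
{
	int ret = b_max_depth - a_max_depth;

	if (order == ORDER_CALLER)
		ret = -ret;
	return ret;
}
```

With depths 1 and 3 this gives 2 for ORDER_CALLEE and -2 for ORDER_CALLER,
so the same pair of entries is ordered the opposite way once the chain
order is caller-first.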


* Re: [PATCH 4/7] perf script: Add --inline option
  2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
@ 2017-05-24  6:38   ` Ingo Molnar
  2017-05-24  7:13     ` Namhyung Kim
  2017-05-24  7:05   ` [tip:perf/urgent] perf script: Add --inline option for debugging tip-bot for Namhyung Kim
  1 sibling, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-05-24  6:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin


* Namhyung Kim <namhyung@kernel.org> wrote:

> The --inline option is to show inlined functions in callchains.
> 
> For example,
> 
>   $ perf script
>   a.out  5644 11611.467597:     309961 cycles:u:
>                      790 main (/home/namhyung/tmp/perf/a.out)
>                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
>                      8ba _start (/home/namhyung/tmp/perf/a.out)
>   ...
> 
>   $ perf script --inline
>   a.out  5644 11611.467597:     309961 cycles:u:
>                      790 main (/home/namhyung/tmp/perf/a.out)
>                          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
>                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
>                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
>                          main
>                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
>                      8ba _start (/home/namhyung/tmp/perf/a.out)
>   ...

Shouldn't this be the default behavior, to make call chains more readable?

Thanks,

	Ingo


* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (6 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
@ 2017-05-24  6:53 ` Ingo Molnar
  2017-05-24  6:57   ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
  8 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-05-24  6:53 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin


* Namhyung Kim <namhyung@kernel.org> wrote:

> Hi Ingo,
> 
> Please consider pulling the perf tooling changes below.  Build tested
> on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
> but it seems unrelated to this series.  Will take a look it later.
> 
> Thanks,
> Namhyung
> 
> 
> The following changes since commit 88b0193d9418c00340e45e0a913a0813bc6c8c96:
> 
>   perf/callchain: Force USER_DS when invoking perf_callchain_user() (2017-05-10 07:54:00 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf tags/perf-urgent-for-mingo-4.12-20170524
> 
> for you to fetch changes up to 37d4e1b6ba56773cef96122dff4436c2c534c381:
> 
>   perf tools: Fix to put caller above callee in children mode (2017-05-24 08:51:11 +0900)
> 
> ----------------------------------------------------------------
> perf/urgent fixes
> 
> Fixes:
> 
>  - Fix segfault on `perf report -g srcline` if a callchain address
>    cannot find a map for some reason.  The srcline sorting mode needs
>    a DSO to resolve line numbers and it's accessed via a map.  But it
>    should check if map is available for the address first.  (Milian Wolff)
> 
>  - Fix off-by-one for srcline output.  It passed (unwound) address to
>    resolve srcline for callchains.  But it's a return address of the
>    function which points to a next instruction.  This leads to
>    off-by-one for srcline info.  So pass the "address - 1" instead to
>    get the correct srcline.  This also considers "signal frame" as
>    well which has the exact address, so pass the address directly in
>    this case.  (Milian Wolff)
> 
>  - Fix missing inlined function.  Current code missed to display
>    inlined functions at the end.  This was found when comparing the
>    output of addr2line and perf script.  (Milian Wolff)
>    
>  
> User Visible:
> 
>  - `perf script` also gained `--inline` option to show inlined
>    functions with callchains.  This helped to find a bug in the
>    current inline code.  (Namhyung Kim)
> 
>  - Fix missed callchain ordering with `-g callee/caller` when libbfd
>    is not available.  (Milian Wolff)
>  
>  - Reorder output entries in `perf report --children` so that it can
>    put parent entries above their children.  It worked like this but
>    missed when callchain display order was changed with `-g caller`.
>    Now default is `-g caller` if children mode enabled.  (Namhyung Kim)
> 
> 
> ----------------------------------------------------------------
> 
> Milian Wolff (5):
>       perf report: don't crash on invalid maps in `-g srcline` mode
>       perf report: fix memory leak in addr2line when called by addr2inlines
>       perf report: fix off-by-one for non-activation frames
>       perf report: always honor callchain order for inlined nodes
>       perf report: do not drop last inlined frame
> 
> Namhyung Kim (2):
>       perf script: Add --inline option
>       perf tools: Fix to put caller above callee in children mode
> 
>  tools/perf/Documentation/perf-script.txt |  4 +++
>  tools/perf/builtin-script.c              |  2 ++
>  tools/perf/ui/hist.c                     |  2 ++
>  tools/perf/util/callchain.c              | 13 ++++++---
>  tools/perf/util/evsel_fprintf.c          | 33 +++++++++++++++++++++
>  tools/perf/util/srcline.c                | 49 +++++++++++++++++---------------
>  tools/perf/util/unwind-libdw.c           |  6 +++-
>  tools/perf/util/unwind-libunwind-local.c | 11 +++++++
>  8 files changed, 92 insertions(+), 28 deletions(-)

Thanks, I've applied the fixes from email with some minor tweaks to the 
changelogs.

I also noticed that we now have a lot of warnings about out of sync headers:

Warning: include/uapi/linux/stat.h differs from kernel
Warning: arch/x86/include/asm/disabled-features.h differs from kernel
Warning: arch/x86/include/asm/required-features.h differs from kernel
Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel

... will post a separate patch for that.

Thanks,

	Ingo


* [PATCH] tools/include: Sync kernel ABI headers with tooling headers
  2017-05-24  6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
@ 2017-05-24  6:57   ` Ingo Molnar
  2017-05-24  7:07     ` [tip:perf/urgent] " tip-bot for Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-05-24  6:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin

Sync (copy) the following v4.12 kernel headers to the tooling headers:

  arch/x86/include/asm/disabled-features.h:
  arch/x86/include/uapi/asm/kvm.h:
  arch/powerpc/include/uapi/asm/kvm.h:
  arch/s390/include/uapi/asm/kvm.h:
  arch/arm/include/uapi/asm/kvm.h:
  arch/arm64/include/uapi/asm/kvm.h:

   - 'struct kvm_sync_regs' got changed in an ABI-incompatible way;
     fortunately none of the (in-kernel) tooling relied on it

   - new KVM_DEV calls added

  arch/x86/include/asm/required-features.h:

   - 5-level paging hardware ABI detail added

  arch/x86/include/asm/cpufeatures.h:

   - new CPU feature added

  arch/x86/include/uapi/asm/vmx.h:

   - new VMX exit conditions

None of the changes requires fixes in the tooling source code.
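
As a reading aid for the required/disabled feature masks: feature numbers
in cpufeatures.h are written as word*32+bit, and the mask definitions in
the diff below take "(feature & 31)" to get the bit inside that 32-bit
word.  A minimal sketch, assuming nothing beyond that encoding; the two
helper functions are hypothetical and exist in no kernel header:

```c
#include <stdint.h>

/* Value as it appears in the synced cpufeatures.h: word 7, bit 18. */
#define X86_FEATURE_MBA		(7 * 32 + 18)

/* Hypothetical helper: index of the 32-bit capability word. */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature >> 5;
}

/* Hypothetical helper: mask of the feature's bit inside its word. */
static inline uint32_t feature_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature & 31);
}
```

For X86_FEATURE_MBA this yields word 7 and mask 1<<18, which is why a
per-feature mask such as NEED_LA57 has to be OR-ed into the
REQUIRED_MASKn/DISABLED_MASKn entry whose index matches the feature's
word.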

This addresses the following warnings:

  Warning: include/uapi/linux/stat.h differs from kernel
  Warning: arch/x86/include/asm/disabled-features.h differs from kernel
  Warning: arch/x86/include/asm/required-features.h differs from kernel
  Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
  Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
  Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/arch/arm/include/uapi/asm/kvm.h          | 10 +++++++++-
 tools/arch/arm64/include/uapi/asm/kvm.h        | 10 +++++++++-
 tools/arch/powerpc/include/uapi/asm/kvm.h      |  3 +++
 tools/arch/s390/include/uapi/asm/kvm.h         | 26 ++++++++++++++++++++++++--
 tools/arch/x86/include/asm/cpufeatures.h       |  2 ++
 tools/arch/x86/include/asm/disabled-features.h |  8 +++++++-
 tools/arch/x86/include/asm/required-features.h |  8 +++++++-
 tools/arch/x86/include/uapi/asm/kvm.h          |  3 +++
 tools/arch/x86/include/uapi/asm/vmx.h          | 25 ++++++++++++++++++-------
 tools/include/uapi/linux/stat.h                |  8 ++------
 10 files changed, 84 insertions(+), 19 deletions(-)

diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h
index 6ebd3e6a1fd1..5e3c673fa3f4 100644
--- a/tools/arch/arm/include/uapi/asm/kvm.h
+++ b/tools/arch/arm/include/uapi/asm/kvm.h
@@ -27,6 +27,8 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_READONLY_MEM
 
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 #define KVM_REG_SIZE(id)						\
 	(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
 
@@ -114,6 +116,8 @@ struct kvm_debug_exit_arch {
 };
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_USER_IRQ */
+	__u64 device_irq_level;
 };
 
 struct kvm_arch_memory_slot {
@@ -192,13 +196,17 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO  7
+#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS	8
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT	10
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
 			(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
 #define VGIC_LEVEL_INFO_LINE_LEVEL	0
 
-#define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT		0
+#define   KVM_DEV_ARM_ITS_SAVE_TABLES		1
+#define   KVM_DEV_ARM_ITS_RESTORE_TABLES	2
+#define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index c2860358ae3e..70eea2ecc663 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -39,6 +39,8 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_READONLY_MEM
 
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 #define KVM_REG_SIZE(id)						\
 	(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
 
@@ -143,6 +145,8 @@ struct kvm_debug_exit_arch {
 #define KVM_GUESTDBG_USE_HW		(1 << 17)
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_USER_IRQ */
+	__u64 device_irq_level;
 };
 
 struct kvm_arch_memory_slot {
@@ -212,13 +216,17 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO  7
+#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT	10
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
 			(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK	0x3ff
 #define VGIC_LEVEL_INFO_LINE_LEVEL	0
 
-#define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT		0
+#define   KVM_DEV_ARM_ITS_SAVE_TABLES           1
+#define   KVM_DEV_ARM_ITS_RESTORE_TABLES        2
+#define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 
 /* Device Control API on vcpu fd */
 #define KVM_ARM_VCPU_PMU_V3_CTRL	0
diff --git a/tools/arch/powerpc/include/uapi/asm/kvm.h b/tools/arch/powerpc/include/uapi/asm/kvm.h
index 4edbe4bb0e8b..07fbeb927834 100644
--- a/tools/arch/powerpc/include/uapi/asm/kvm.h
+++ b/tools/arch/powerpc/include/uapi/asm/kvm.h
@@ -29,6 +29,9 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_GUEST_DEBUG
 
+/* Not always available, but if it is, this is the correct offset.  */
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 struct kvm_regs {
 	__u64 pc;
 	__u64 cr;
diff --git a/tools/arch/s390/include/uapi/asm/kvm.h b/tools/arch/s390/include/uapi/asm/kvm.h
index 7f4fd65e9208..3dd2a1d308dd 100644
--- a/tools/arch/s390/include/uapi/asm/kvm.h
+++ b/tools/arch/s390/include/uapi/asm/kvm.h
@@ -26,6 +26,8 @@
 #define KVM_DEV_FLIC_ADAPTER_REGISTER	6
 #define KVM_DEV_FLIC_ADAPTER_MODIFY	7
 #define KVM_DEV_FLIC_CLEAR_IO_IRQ	8
+#define KVM_DEV_FLIC_AISM		9
+#define KVM_DEV_FLIC_AIRQ_INJECT	10
 /*
  * We can have up to 4*64k pending subchannels + 8 adapter interrupts,
  * as well as up  to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts.
@@ -41,7 +43,14 @@ struct kvm_s390_io_adapter {
 	__u8 isc;
 	__u8 maskable;
 	__u8 swap;
-	__u8 pad;
+	__u8 flags;
+};
+
+#define KVM_S390_ADAPTER_SUPPRESSIBLE 0x01
+
+struct kvm_s390_ais_req {
+	__u8 isc;
+	__u16 mode;
 };
 
 #define KVM_S390_IO_ADAPTER_MASK 1
@@ -110,6 +119,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_CMMA	10
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
+#define KVM_S390_VM_CPU_FEAT_KSS	13
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
@@ -198,6 +208,10 @@ struct kvm_guest_debug_arch {
 #define KVM_SYNC_VRS    (1UL << 6)
 #define KVM_SYNC_RICCB  (1UL << 7)
 #define KVM_SYNC_FPRS   (1UL << 8)
+#define KVM_SYNC_GSCB   (1UL << 9)
+/* length and alignment of the sdnx as a power of two */
+#define SDNXC 8
+#define SDNXL (1UL << SDNXC)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 	__u64 prefix;	/* prefix register */
@@ -218,8 +232,16 @@ struct kvm_sync_regs {
 	};
 	__u8  reserved[512];	/* for future vector expansion */
 	__u32 fpc;		/* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */
-	__u8 padding[52];	/* riccb needs to be 64byte aligned */
+	__u8 padding1[52];	/* riccb needs to be 64byte aligned */
 	__u8 riccb[64];		/* runtime instrumentation controls block */
+	__u8 padding2[192];	/* sdnx needs to be 256byte aligned */
+	union {
+		__u8 sdnx[SDNXL];  /* state description annex */
+		struct {
+			__u64 reserved1[2];
+			__u64 gscb[4];
+		};
+	};
 };
 
 #define KVM_REG_S390_TODPR	(KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1)
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 0fe00446f9ca..2701e5f8145b 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -202,6 +202,8 @@
 #define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
 #define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
 
+#define X86_FEATURE_MBA         ( 7*32+18) /* Memory Bandwidth Allocation */
+
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW  ( 8*32+ 0) /* Intel TPR Shadow */
 #define X86_FEATURE_VNMI        ( 8*32+ 1) /* Intel Virtual NMI */
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 85599ad4d024..5dff775af7cd 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -36,6 +36,12 @@
 # define DISABLE_OSPKE		(1<<(X86_FEATURE_OSPKE & 31))
 #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
 
+#ifdef CONFIG_X86_5LEVEL
+# define DISABLE_LA57	0
+#else
+# define DISABLE_LA57	(1<<(X86_FEATURE_LA57 & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -55,7 +61,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/tools/arch/x86/include/asm/required-features.h b/tools/arch/x86/include/asm/required-features.h
index fac9a5c0abe9..d91ba04dd007 100644
--- a/tools/arch/x86/include/asm/required-features.h
+++ b/tools/arch/x86/include/asm/required-features.h
@@ -53,6 +53,12 @@
 # define NEED_MOVBE	0
 #endif
 
+#ifdef CONFIG_X86_5LEVEL
+# define NEED_LA57	(1<<(X86_FEATURE_LA57 & 31))
+#else
+# define NEED_LA57	0
+#endif
+
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_PARAVIRT
 /* Paravirtualized systems may not have PSE or PGE available */
@@ -98,7 +104,7 @@
 #define REQUIRED_MASK13	0
 #define REQUIRED_MASK14	0
 #define REQUIRED_MASK15	0
-#define REQUIRED_MASK16	0
+#define REQUIRED_MASK16	(NEED_LA57)
 #define REQUIRED_MASK17	0
 #define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
index 739c0c594022..c2824d02ba37 100644
--- a/tools/arch/x86/include/uapi/asm/kvm.h
+++ b/tools/arch/x86/include/uapi/asm/kvm.h
@@ -9,6 +9,9 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
+#define KVM_PIO_PAGE_OFFSET 1
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 2
+
 #define DE_VECTOR 0
 #define DB_VECTOR 1
 #define BP_VECTOR 3
diff --git a/tools/arch/x86/include/uapi/asm/vmx.h b/tools/arch/x86/include/uapi/asm/vmx.h
index 14458658e988..690a2dcf4078 100644
--- a/tools/arch/x86/include/uapi/asm/vmx.h
+++ b/tools/arch/x86/include/uapi/asm/vmx.h
@@ -76,7 +76,11 @@
 #define EXIT_REASON_WBINVD              54
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
+#define EXIT_REASON_RDRAND              57
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
+#define EXIT_REASON_ENCLS               60
+#define EXIT_REASON_RDSEED              61
 #define EXIT_REASON_PML_FULL            62
 #define EXIT_REASON_XSAVES              63
 #define EXIT_REASON_XRSTORS             64
@@ -90,6 +94,7 @@
 	{ EXIT_REASON_TASK_SWITCH,           "TASK_SWITCH" }, \
 	{ EXIT_REASON_CPUID,                 "CPUID" }, \
 	{ EXIT_REASON_HLT,                   "HLT" }, \
+	{ EXIT_REASON_INVD,                  "INVD" }, \
 	{ EXIT_REASON_INVLPG,                "INVLPG" }, \
 	{ EXIT_REASON_RDPMC,                 "RDPMC" }, \
 	{ EXIT_REASON_RDTSC,                 "RDTSC" }, \
@@ -108,6 +113,8 @@
 	{ EXIT_REASON_IO_INSTRUCTION,        "IO_INSTRUCTION" }, \
 	{ EXIT_REASON_MSR_READ,              "MSR_READ" }, \
 	{ EXIT_REASON_MSR_WRITE,             "MSR_WRITE" }, \
+	{ EXIT_REASON_INVALID_STATE,         "INVALID_STATE" }, \
+	{ EXIT_REASON_MSR_LOAD_FAIL,         "MSR_LOAD_FAIL" }, \
 	{ EXIT_REASON_MWAIT_INSTRUCTION,     "MWAIT_INSTRUCTION" }, \
 	{ EXIT_REASON_MONITOR_TRAP_FLAG,     "MONITOR_TRAP_FLAG" }, \
 	{ EXIT_REASON_MONITOR_INSTRUCTION,   "MONITOR_INSTRUCTION" }, \
@@ -115,20 +122,24 @@
 	{ EXIT_REASON_MCE_DURING_VMENTRY,    "MCE_DURING_VMENTRY" }, \
 	{ EXIT_REASON_TPR_BELOW_THRESHOLD,   "TPR_BELOW_THRESHOLD" }, \
 	{ EXIT_REASON_APIC_ACCESS,           "APIC_ACCESS" }, \
-	{ EXIT_REASON_GDTR_IDTR,	     "GDTR_IDTR" }, \
-	{ EXIT_REASON_LDTR_TR,		     "LDTR_TR" }, \
+	{ EXIT_REASON_EOI_INDUCED,           "EOI_INDUCED" }, \
+	{ EXIT_REASON_GDTR_IDTR,             "GDTR_IDTR" }, \
+	{ EXIT_REASON_LDTR_TR,               "LDTR_TR" }, \
 	{ EXIT_REASON_EPT_VIOLATION,         "EPT_VIOLATION" }, \
 	{ EXIT_REASON_EPT_MISCONFIG,         "EPT_MISCONFIG" }, \
 	{ EXIT_REASON_INVEPT,                "INVEPT" }, \
+	{ EXIT_REASON_RDTSCP,                "RDTSCP" }, \
 	{ EXIT_REASON_PREEMPTION_TIMER,      "PREEMPTION_TIMER" }, \
+	{ EXIT_REASON_INVVPID,               "INVVPID" }, \
 	{ EXIT_REASON_WBINVD,                "WBINVD" }, \
+	{ EXIT_REASON_XSETBV,                "XSETBV" }, \
 	{ EXIT_REASON_APIC_WRITE,            "APIC_WRITE" }, \
-	{ EXIT_REASON_EOI_INDUCED,           "EOI_INDUCED" }, \
-	{ EXIT_REASON_INVALID_STATE,         "INVALID_STATE" }, \
-	{ EXIT_REASON_MSR_LOAD_FAIL,         "MSR_LOAD_FAIL" }, \
-	{ EXIT_REASON_INVD,                  "INVD" }, \
-	{ EXIT_REASON_INVVPID,               "INVVPID" }, \
+	{ EXIT_REASON_RDRAND,                "RDRAND" }, \
 	{ EXIT_REASON_INVPCID,               "INVPCID" }, \
+	{ EXIT_REASON_VMFUNC,                "VMFUNC" }, \
+	{ EXIT_REASON_ENCLS,                 "ENCLS" }, \
+	{ EXIT_REASON_RDSEED,                "RDSEED" }, \
+	{ EXIT_REASON_PML_FULL,              "PML_FULL" }, \
 	{ EXIT_REASON_XSAVES,                "XSAVES" }, \
 	{ EXIT_REASON_XRSTORS,               "XRSTORS" }
 
diff --git a/tools/include/uapi/linux/stat.h b/tools/include/uapi/linux/stat.h
index d538897b8e08..17b10304c393 100644
--- a/tools/include/uapi/linux/stat.h
+++ b/tools/include/uapi/linux/stat.h
@@ -48,17 +48,13 @@
  * tv_sec holds the number of seconds before (negative) or after (positive)
  * 00:00:00 1st January 1970 UTC.
  *
- * tv_nsec holds a number of nanoseconds before (0..-999,999,999 if tv_sec is
- * negative) or after (0..999,999,999 if tv_sec is positive) the tv_sec time.
- *
- * Note that if both tv_sec and tv_nsec are non-zero, then the two values must
- * either be both positive or both negative.
+ * tv_nsec holds a number of nanoseconds (0..999,999,999) after the tv_sec time.
  *
  * __reserved is held in case we need a yet finer resolution.
  */
 struct statx_timestamp {
 	__s64	tv_sec;
-	__s32	tv_nsec;
+	__u32	tv_nsec;
 	__s32	__reserved;
 };
 

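
The stat.h hunk above makes tv_nsec unsigned, so a timestamp before 1970
is expressed as a negative tv_sec plus a forward nanosecond offset.  The
following is a small sketch of that normalization, assuming a signed
nanosecond count as input; struct ts and ts_from_ns() are hypothetical
names used only for illustration:

```c
#include <stdint.h>

/* Local mirror of the two relevant fields; names follow the diff. */
struct ts {
	int64_t  tv_sec;	/* seconds before (negative) or after the epoch */
	uint32_t tv_nsec;	/* 0..999,999,999, counted forward from tv_sec */
};

/* Hypothetical helper: split a signed nanosecond offset from the epoch. */
static struct ts ts_from_ns(int64_t ns)
{
	struct ts t;
	int64_t sec = ns / 1000000000;
	int64_t rem = ns % 1000000000;

	/* C division truncates toward zero; borrow a second if rem < 0. */
	if (rem < 0) {
		rem += 1000000000;
		sec -= 1;
	}
	t.tv_sec = sec;
	t.tv_nsec = (uint32_t)rem;
	return t;
}
```

For example, 1.5 seconds before the epoch comes out as tv_sec = -2 and
tv_nsec = 500000000, rather than the old "both negative" form.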

* [tip:perf/urgent] perf report: Don't crash on invalid maps in `-g srcline` mode
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
@ 2017-05-24  7:03   ` " tip-bot for Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, yao.jin, acme, torvalds, acme, milian.wolff,
	mingo, namhyung, jolsa, tglx, hpa, peterz, dsahern, a.p.zijlstra

Commit-ID:  7d4df089d77306914426a604c890175f91a9a459
Gitweb:     http://git.kernel.org/tip/7d4df089d77306914426a604c890175f91a9a459
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:23 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:47 +0200

perf report: Don't crash on invalid maps in `-g srcline` mode

I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:

  ==8359== Invalid read of size 8
  ==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
  ==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
  ==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
  ==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
  ==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
  ==8359==    by 0x2FF719: callchain_append (callchain.c:944)
  ==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
  ==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
  ==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
  ==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
  ==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
  ==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
  ==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
  ==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
  ==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
  ==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
  ==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
  ==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
  ==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
  ==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
  ==8359==    by 0x2B9A80: run_builtin (perf.c:296)
  ==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd

This patch fixes the issue.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/callchain.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29a..b4204b4 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -621,14 +621,19 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 					     struct callchain_list *cnode)
 {
-	char *left = get_srcline(cnode->ms.map->dso,
+	char *left = NULL;
+	char *right = NULL;
+	enum match_result ret = MATCH_EQ;
+	int cmp;
+
+	if (cnode->ms.map)
+		left = get_srcline(cnode->ms.map->dso,
 				 map__rip_2objdump(cnode->ms.map, cnode->ip),
 				 cnode->ms.sym, true, false);
-	char *right = get_srcline(node->map->dso,
+	if (node->map)
+		right = get_srcline(node->map->dso,
 				  map__rip_2objdump(node->map, node->ip),
 				  node->sym, true, false);
-	enum match_result ret = MATCH_EQ;
-	int cmp;
 
 	if (left && right)
 		cmp = strcmp(left, right);

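
The guard added by the commit above follows a common pattern: resolve
each side only when its map exists, then compare whatever was resolved.
A simplified sketch with hypothetical types (these are not perf's
struct map or callchain nodes), where the handling of a missing srcline
is an assumption made for the example:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for perf's map and chain node. */
struct map { const char *srcline; };
struct node { struct map *map; };

/* Returns NULL instead of dereferencing a missing map. */
static const char *node_srcline(const struct node *n)
{
	return n->map ? n->map->srcline : NULL;
}

/*
 * Compare two nodes by source line.  How perf orders a node without a
 * srcline is not modeled; here two missing lines compare equal and a
 * missing line sorts first, purely as an assumption.
 */
static int cmp_srcline(const struct node *a, const struct node *b)
{
	const char *left = node_srcline(a);
	const char *right = node_srcline(b);

	if (left && right)
		return strcmp(left, right);
	if (!left && !right)
		return 0;
	return left ? 1 : -1;
}
```

The point is only that no field of the map is read before the NULL
check, which is what the "Invalid read" at address 0x70 in the Valgrind
trace indicates was happening.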

* [tip:perf/urgent] perf report: Fix memory leak in addr2line when called by addr2inlines
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
@ 2017-05-24  7:04   ` " tip-bot for Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, peterz, jolsa, hpa, milian.wolff, jolsa, namhyung,
	acme, yao.jin, dsahern, mingo, acme, tglx, a.p.zijlstra,
	torvalds

Commit-ID:  b21cc97810932a551f7aac46f0b89c469c828b3f
Gitweb:     http://git.kernel.org/tip/b21cc97810932a551f7aac46f0b89c469c828b3f
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:24 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix memory leak in addr2line when called by addr2inlines

When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.

Detected by Valgrind:

  ==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
  ==16331==    at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==16331==    by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
  ==16331==    by 0x52769F: addr2line (srcline.c:256)
  ==16331==    by 0x52769F: addr2inlines (srcline.c:294)
  ==16331==    by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
  ==16331==    by 0x574D7A: inline__fprintf (hist.c:41)
  ==16331==    by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
  ==16331==    by 0x57518A: __callchain__fprintf_graph (hist.c:212)
  ==16331==    by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
  ==16331==    by 0x57738E: hist_entry__fprintf (hist.c:628)
  ==16331==    by 0x57738E: hists__fprintf (hist.c:882)
  ==16331==    by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
  ==16331==    by 0x44A20F: report__browse_hists (builtin-report.c:491)
  ==16331==    by 0x44A20F: __cmd_report (builtin-report.c:624)
  ==16331==    by 0x44A20F: cmd_report (builtin-report.c:1054)
  ==16331==    by 0x4A49CE: run_builtin (perf.c:296)
  ==16331==    by 0x4A4CC0: handle_internal_command (perf.c:348)
  ==16331==    by 0x434371: run_argv (perf.c:392)
  ==16331==    by 0x434371: main (perf.c:530)

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/srcline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index df051a5..5e376d6 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -230,7 +230,10 @@ static int addr2line(const char *dso_name, u64 addr,
 
 	bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);
 
-	if (a2l->found && unwind_inlines) {
+	if (!a2l->found)
+		return 0;
+
+	if (unwind_inlines) {
 		int cnt = 0;
 
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -243,6 +246,8 @@ static int addr2line(const char *dso_name, u64 addr,
 							a2l->line, node,
 							dso) != 0)
 					return 0;
+				// found at least one inline frame
+				ret = 1;
 			}
 		}
 
@@ -252,14 +257,14 @@ static int addr2line(const char *dso_name, u64 addr,
 		}
 	}
 
-	if (a2l->found && a2l->filename) {
-		*file = strdup(a2l->filename);
-		*line = a2l->line;
-
-		if (*file)
-			ret = 1;
+	if (file) {
+		*file = a2l->filename ? strdup(a2l->filename) : NULL;
+		ret = *file ? 1 : 0;
 	}
 
+	if (line)
+		*line = a2l->line;
+
 	return ret;
 }
 
@@ -278,8 +283,6 @@ void dso__free_a2l(struct dso *dso)
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	struct dso *dso)
 {
-	char *file = NULL;
-	unsigned int line = 0;
 	struct inline_node *node;
 
 	node = zalloc(sizeof(*node));
@@ -291,7 +294,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	INIT_LIST_HEAD(&node->val);
 	node->addr = addr;
 
-	if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
 		goto out_free_inline_node;
 
 	if (list_empty(&node->val))


* [tip:perf/urgent] perf report: Fix off-by-one for non-activation frames
  2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
@ 2017-05-24  7:05   ` " tip-bot for Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: a.p.zijlstra, jolsa, yao.jin, hpa, torvalds, linux-kernel,
	dsahern, jolsa, acme, milian.wolff, peterz, acme, namhyung,
	mingo, tglx

Commit-ID:  1982ad48fc82c284a5cc55697a012d3357e84d01
Gitweb:     http://git.kernel.org/tip/1982ad48fc82c284a5cc55697a012d3357e84d01
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:25 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix off-by-one for non-activation frames

As the documentation for dwfl_frame_pc() says, frames that are not
activation frames need to have their program counter decremented by
one to properly find the function of the caller.

This fixes many cases where perf report currently attributes
the cost to the next line. For example, given code like this:

~~~~~~~~~~~~~~~
  #include <thread>
  #include <chrono>

  using namespace std;

  int main()
  {
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));

    return 0;
  }
~~~~~~~~~~~~~~~

Now compile and record it:

~~~~~~~~~~~~~~~
  g++ -std=c++11 -g -O2 test.cpp
  echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
  perf record \
    --event sched:sched_stat_sleep \
    --event sched:sched_process_exit \
    --event sched:sched_switch --call-graph=dwarf \
    --output perf.data.raw \
    ./a.out
  echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
  perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:10
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:13
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

With this patch applied, the issue is fixed. The report becomes
much more usable:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:8
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:10
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

Similarly it works for signal frames:

~~~~~~~~~~~~~~~
  __noinline void bar(void)
  {
    volatile long cnt = 0;

    for (cnt = 0; cnt < 100000000; cnt++);
  }

  __noinline void foo(void)
  {
    bar();
  }

  void sig_handler(int sig)
  {
    foo();
  }

  int main(void)
  {
    signal(SIGUSR1, sig_handler);
    raise(SIGUSR1);

    foo();
    return 0;
  }
~~~~~~~~~~~~~~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~~~~~~~~~~~~~~
  $ perf report --stdio --no-children -g srcline -s srcline
  ...
   100.00%  signal.c:11
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:29
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

With this patch in, the issue is fixed and we instead get:

~~~~~~~~~~~~~~~~
   100.00%  signal   signal            [.] bar
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:27
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straightforward thanks
to dwfl_frame_pc(). For libunwind, we replicate the functionality via
unw_is_signal_frame() for every frame except the very first one.
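
To illustrate why the decrement matters, here is a small sketch using a
made-up line table.  The struct line_range, lookup_line() and
srcline_addr() names are hypothetical and the addresses are invented;
the point is only that a return address sits at the start of the
instruction after the call, which may already belong to the next line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a made-up line table: [start, end) maps to 'line'. */
struct line_range {
	uint64_t start;
	uint64_t end;
	int line;
};

/* Return the source line covering 'addr', or 0 if no range matches. */
static int lookup_line(const struct line_range *tbl, size_t n, uint64_t addr)
{
	size_t i;

	assert(tbl != NULL);

	for (i = 0; i < n; i++) {
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].line;
	}
	return 0;
}

/*
 * Address to use for symbolization.  For a caller frame the unwound
 * address is a return address, so step back by one byte to land
 * inside the call instruction.  Activation frames (the first frame,
 * signal frames) carry the exact address and are left alone.
 */
static uint64_t srcline_addr(uint64_t pc, int is_activation)
{
	return is_activation ? pc : pc - 1;
}
```

With a call for line 8 occupying [0x100, 0x105) and line 9 starting at
0x105, looking up the raw return address 0x105 yields line 9, while
looking up srcline_addr(0x105, 0) yields line 8.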

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/unwind-libdw.c           |  6 +++++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index f90e11a..943a0629 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -168,12 +168,16 @@ frame_callback(Dwfl_Frame *state, void *arg)
 {
 	struct unwind_info *ui = arg;
 	Dwarf_Addr pc;
+	bool isactivation;
 
-	if (!dwfl_frame_pc(state, &pc, NULL)) {
+	if (!dwfl_frame_pc(state, &pc, &isactivation)) {
 		pr_err("%s", dwfl_errmsg(-1));
 		return DWARF_CB_ABORT;
 	}
 
+	if (!isactivation)
+		--pc;
+
 	return entry(pc, ui) || !(--ui->max_stack) ?
 	       DWARF_CB_ABORT : DWARF_CB_OK;
 }
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index f8455be..672c2ad 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -692,6 +692,17 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 
 		while (!ret && (unw_step(&c) > 0) && i < max_stack) {
 			unw_get_reg(&c, UNW_REG_IP, &ips[i]);
+
+			/*
+			 * Decrement the IP for any non-activation frames.
+			 * this is required to properly find the srcline
+			 * for caller frames.
+			 * See also the documentation for dwfl_frame_pc(),
+			 * which this code tries to replicate.
+			 */
+			if (unw_is_signal_frame(&c) <= 0)
+				--ips[i];
+
 			++i;
 		}
 


* [tip:perf/urgent] perf script: Add --inline option for debugging
  2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
  2017-05-24  6:38   ` Ingo Molnar
@ 2017-05-24  7:05   ` tip-bot for Namhyung Kim
  1 sibling, 0 replies; 27+ messages in thread
From: tip-bot for Namhyung Kim @ 2017-05-24  7:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, tglx, linux-kernel, torvalds, peterz, milian.wolff, acme,
	namhyung, mingo, hpa, acme, jolsa, yao.jin

Commit-ID:  325fbff51f961491adff4037d0e0a94d6132bd9b
Gitweb:     http://git.kernel.org/tip/325fbff51f961491adff4037d0e0a94d6132bd9b
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Wed, 24 May 2017 15:21:26 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf script: Add --inline option for debugging

The --inline option shows inlined functions in callchains.

For example:

  $ perf script
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

  $ perf script --inline
  a.out  5644 11611.467597:     309961 cycles:u:
                     790 main (/home/namhyung/tmp/perf/a.out)
                         std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
                         main
                   20511 __libc_start_main (/usr/lib/libc-2.25.so)
                     8ba _start (/home/namhyung/tmp/perf/a.out)
  ...

Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/Documentation/perf-script.txt |  4 ++++
 tools/perf/builtin-script.c              |  2 ++
 tools/perf/util/evsel_fprintf.c          | 33 ++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index cb0eda3..3517e20 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -311,6 +311,10 @@ include::itrace.txt[]
 	Set the maximum number of program blocks to print with brstackasm for
 	each sample.
 
+--inline::
+	If a callgraph address belongs to an inlined function, the inline stack
+	will be printed. Each entry has function name and file/line.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d05aec4..4761b0d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2494,6 +2494,8 @@ int cmd_script(int argc, const char **argv)
 			"Enable kernel symbol demangling"),
 	OPT_STRING(0, "time", &script.time_str, "str",
 		   "Time span of interest (start,stop)"),
+	OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
+		    "Show inline function"),
 	OPT_END()
 	};
 	const char * const script_subcommands[] = { "record", "report", NULL };
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index e415aee..583f3a6 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -7,6 +7,7 @@
 #include "map.h"
 #include "strlist.h"
 #include "symbol.h"
+#include "srcline.h"
 
 static int comma_fprintf(FILE *fp, bool *first, const char *fmt, ...)
 {
@@ -168,6 +169,38 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 
+			if (symbol_conf.inline_name && node->map) {
+				struct inline_node *inode;
+
+				addr = map__rip_2objdump(node->map, node->ip),
+				inode = dso__parse_addr_inlines(node->map->dso, addr);
+
+				if (inode) {
+					struct inline_list *ilist;
+
+					list_for_each_entry(ilist, &inode->val, list) {
+						if (print_arrow)
+							printed += fprintf(fp, " <-");
+
+						/* IP is same, just skip it */
+						if (print_ip)
+							printed += fprintf(fp, "%c%16s",
+									   s, "");
+						if (print_sym)
+							printed += fprintf(fp, " %s",
+									   ilist->funcname);
+						if (print_srcline)
+							printed += fprintf(fp, "\n  %s:%d",
+									   ilist->filename,
+									   ilist->line_nr);
+						if (!print_oneline)
+							printed += fprintf(fp, "\n");
+					}
+
+					inline_node__delete(inode);
+				}
+			}
+
 			if (symbol_conf.bt_stop_list &&
 			    node->sym &&
 			    strlist__has_entry(symbol_conf.bt_stop_list,


* [tip:perf/urgent] perf report: Always honor callchain order for inlined nodes
  2017-05-24  6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim
@ 2017-05-24  7:06   ` " tip-bot for Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, hpa, jolsa, yao.jin, milian.wolff, torvalds, tglx,
	dsahern, linux-kernel, acme, jolsa, a.p.zijlstra, namhyung,
	mingo, acme

Commit-ID:  28071f51839e393f697d0d1df0b223a4bc373606
Gitweb:     http://git.kernel.org/tip/28071f51839e393f697d0d1df0b223a4bc373606
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:27 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Always honor callchain order for inlined nodes

So far, the inlined nodes were only reversed when we built perf
against libbfd. If that was not available, the addr2line fallback
code path was missing the inline_list__reverse call.

Now we always add the nodes in the correct order within
inline_list__append. This removes the need to reverse the list
and also ensures that all callers construct the list in the right
order.
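
The idea can be sketched with a tiny hand-rolled list, assuming frames
arrive innermost (callee) first.  The frame, frame_list and frame_order
names below are hypothetical and do not use the kernel list API; the
sketch only shows that choosing head or tail insertion up front makes a
later reversal unnecessary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordering flag, standing in for callchain_param.order. */
enum frame_order { FRAMES_CALLEE_FIRST, FRAMES_CALLER_FIRST };

struct frame {
	const char *funcname;
	struct frame *next;
};

struct frame_list {
	struct frame *head;
	struct frame *tail;
};

/*
 * Frames are handed in callee first.  Append at the tail for
 * callee-first output and prepend at the head for caller-first
 * output, so the list is already in display order when complete.
 */
static void frame_list_add(struct frame_list *l, struct frame *f,
			   enum frame_order order)
{
	assert(l != NULL && f != NULL);

	if (!l->head) {
		f->next = NULL;
		l->head = l->tail = f;
	} else if (order == FRAMES_CALLEE_FIRST) {
		f->next = NULL;
		l->tail->next = f;
		l->tail = f;
	} else {
		f->next = l->head;
		l->head = f;
	}
}
```

Because every producer goes through the same insertion helper, a code
path cannot forget the reversal step the way the fallback path did.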

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/srcline.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 5e376d6..6af0364 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -56,7 +56,10 @@ static int inline_list__append(char *filename, char *funcname, int line_nr,
 		}
 	}
 
-	list_add_tail(&ilist->list, &node->val);
+	if (callchain_param.order == ORDER_CALLEE)
+		list_add_tail(&ilist->list, &node->val);
+	else
+		list_add(&ilist->list, &node->val);
 
 	return 0;
 }
@@ -200,14 +203,6 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
-static void inline_list__reverse(struct inline_node *node)
-{
-	struct inline_list *ilist, *n;
-
-	list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
-		list_move_tail(&ilist->list, &node->val);
-}
-
 static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line, struct dso *dso,
 		     bool unwind_inlines, struct inline_node *node)
@@ -250,11 +245,6 @@ static int addr2line(const char *dso_name, u64 addr,
 				ret = 1;
 			}
 		}
-
-		if ((node != NULL) &&
-		    (callchain_param.order != ORDER_CALLEE)) {
-			inline_list__reverse(node);
-		}
 	}
 
 	if (file) {


* [tip:perf/urgent] perf report: Do not drop last inlined frame
  2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
@ 2017-05-24  7:06   ` " tip-bot for Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, dsahern, a.p.zijlstra, linux-kernel, hpa, jolsa, acme,
	tglx, namhyung, peterz, acme, torvalds, yao.jin, jolsa,
	milian.wolff

Commit-ID:  4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Gitweb:     http://git.kernel.org/tip/4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:28 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Do not drop last inlined frame

The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:

~~~~~
  $ perf script --inline
  ...
  a.out 26722 80836.309329:      72425 cycles:
                     21561 __hypot_finite (/usr/lib/libm-2.25.so)
                      ace3 hypot (/usr/lib/libm-2.25.so)
                       a4a main (a.out)
                           std::abs<double>
                           std::_Norm_helper<true>::_S_do_it<double>
                           std::norm<double>
                           main
                     20510 __libc_start_main (/usr/lib/libc-2.25.so)
                       bd9 _start (a.out)

  $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
  0x0000000000000a4a
  std::__complex_abs(doublecomplex )
  /usr/include/c++/6.3.1/complex:589
  double std::abs<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:597
  double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:654
  double std::norm<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:664
  main
  /tmp/inlining.cpp:14
~~~~~

Note how `std::__complex_abs` is missing from the `perf script`
output. The same problem shows up in `perf report`. This patch
fixes the issue, and the output becomes:

~~~~~
  a.out 26722 80836.309329:      72425 cycles:
                     21561 __hypot_finite (/usr/lib/libm-2.25.so)
                      ace3 hypot (/usr/lib/libm-2.25.so)
                       a4a main (a.out)
                           std::__complex_abs
                           std::abs<double>
                           std::_Norm_helper<true>::_S_do_it<double>
                           std::norm<double>
                           main
                     20510 __libc_start_main (/usr/lib/libc-2.25.so)
                       bd9 _start (a.out)
~~~~~
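
The shape of the bug can be sketched with a hypothetical cursor over an
inline chain.  The inline_cursor type and the helper names below are
made up for illustration and are not the libbfd API; they only model a
lookup that already describes the innermost frame before the first
"step to the inliner" call is made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over an inline chain, innermost function first. */
struct inline_cursor {
	const char *const *names;
	size_t count;
	size_t pos;	/* index of the frame the cursor describes now */
};

/* Step to the enclosing (outer) frame; returns 0 when there is none. */
static int cursor_next_inliner(struct inline_cursor *c)
{
	if (c->pos + 1 >= c->count)
		return 0;
	c->pos++;
	return 1;
}

/*
 * Collect every frame of the chain into 'out' (capacity 'max').
 * The frame found by the initial lookup is recorded before the loop;
 * recording only inside the loop would silently skip it.
 */
static size_t collect_inlines(struct inline_cursor *c,
			      const char **out, size_t max)
{
	size_t n = 0;

	assert(c != NULL && out != NULL);

	if (c->count == 0 || max == 0)
		return 0;

	out[n++] = c->names[c->pos];
	while (n < max && cursor_next_inliner(c))
		out[n++] = c->names[c->pos];

	return n;
}
```

For a chain of std::__complex_abs, std::abs<double> and main, this
returns all three entries, whereas a loop that only records after each
successful step would start at std::abs<double>.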

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/srcline.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 6af0364..ebc88a7 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -203,6 +203,16 @@ static void addr2line_cleanup(struct a2l_data *a2l)
 
 #define MAX_INLINE_NEST 1024
 
+static int inline_list__append_dso_a2l(struct dso *dso,
+				       struct inline_node *node)
+{
+	struct a2l_data *a2l = dso->a2l;
+	char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL;
+	char *filename = a2l->filename ? strdup(a2l->filename) : NULL;
+
+	return inline_list__append(filename, funcname, a2l->line, node, dso);
+}
+
 static int addr2line(const char *dso_name, u64 addr,
 		     char **file, unsigned int *line, struct dso *dso,
 		     bool unwind_inlines, struct inline_node *node)
@@ -231,15 +241,15 @@ static int addr2line(const char *dso_name, u64 addr,
 	if (unwind_inlines) {
 		int cnt = 0;
 
+		if (node && inline_list__append_dso_a2l(dso, node))
+			return 0;
+
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
 					     &a2l->funcname, &a2l->line) &&
 		       cnt++ < MAX_INLINE_NEST) {
 
 			if (node != NULL) {
-				if (inline_list__append(strdup(a2l->filename),
-							strdup(a2l->funcname),
-							a2l->line, node,
-							dso) != 0)
+				if (inline_list__append_dso_a2l(dso, node))
 					return 0;
 				// found at least one inline frame
 				ret = 1;


* [tip:perf/urgent] perf tools: Put caller above callee in --children mode
  2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
@ 2017-05-24  7:07   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Namhyung Kim @ 2017-05-24  7:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, jolsa, milian.wolff, mingo, peterz, yao.jin, tglx,
	torvalds, fweisbec, namhyung, linux-kernel, acme, jolsa

Commit-ID:  7111ffff60a68f55d864200cd6c7677319e5c242
Gitweb:     http://git.kernel.org/tip/7111ffff60a68f55d864200cd6c7677319e5c242
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Wed, 24 May 2017 15:21:29 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:49 +0200

perf tools: Put caller above callee in --children mode

__hpp__sort_acc() sorts entries by callchain depth in order to put
callers above callees in children mode.  But it assumed the callchain
order was callee-first.  The default (for children mode) is now
caller-first, so the order of entries came out reversed.

For example, consider the following case:

  $ perf report --no-children
  ...
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ..........................
  #
      99.44%  a.out    a.out                [.] main
              |
              ---main
                 __libc_start_main
                 _start

Then children mode should show '_start' above '__libc_start_main' since
it's the caller (parent) of __libc_start_main.  But it's reversed:

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.61%     0.00%  a.out    a.out            [.] _start
      99.54%    99.44%  a.out    a.out            [.] main

This patch fixes it.

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    a.out            [.] _start
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.54%    99.44%  a.out    a.out            [.] main
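
A minimal sketch of the tie-break, assuming a simplified entry type.
The depth_entry struct, the sort_order enum and cmp_depth() are
hypothetical names; only the sign flip mirrors what the patch does.
A negative return value means 'a' is listed before 'b'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordering flag, standing in for callchain_param.order. */
enum sort_order { SORT_CALLEE_FIRST, SORT_CALLER_FIRST };

struct depth_entry {
	const char *sym;
	int max_depth;	/* depth of the deepest callchain of this entry */
};

/*
 * Compare two entries by callchain depth.  The base comparison lists
 * deeper chains first; with caller-first chains the relation between
 * depth and "who is the parent" is inverted, so the result is negated.
 */
static int cmp_depth(const struct depth_entry *a, const struct depth_entry *b,
		     enum sort_order order)
{
	int ret;

	assert(a != NULL && b != NULL);

	ret = b->max_depth - a->max_depth;
	if (order == SORT_CALLER_FIRST)
		ret = -ret;

	return ret;
}
```

With made-up depths of 1 for _start and 2 for __libc_start_main under
caller-first ordering, cmp_depth() returns a negative value, so _start
is listed first as in the fixed output.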

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/ui/hist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 59addd5..ddb2c6f 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -210,6 +210,8 @@ static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
 			return 0;
 
 		ret = b->callchain->max_depth - a->callchain->max_depth;
+		if (callchain_param.order == ORDER_CALLER)
+			ret = -ret;
 	}
 	return ret;
 }


* [tip:perf/urgent] tools/include: Sync kernel ABI headers with tooling headers
  2017-05-24  6:57   ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar
@ 2017-05-24  7:07     ` " tip-bot for Ingo Molnar
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Ingo Molnar @ 2017-05-24  7:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, acme, namhyung, linux-kernel, torvalds, milian.wolff,
	yao.jin, tglx, hpa, mingo, jolsa, peterz, jolsa

Commit-ID:  6e30437bd42c4d4e9cfc4c40efda00eb83a11cde
Gitweb:     http://git.kernel.org/tip/6e30437bd42c4d4e9cfc4c40efda00eb83a11cde
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Wed, 24 May 2017 08:57:21 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 09:00:21 +0200

tools/include: Sync kernel ABI headers with tooling headers

Sync (copy) the following v4.12 kernel headers to the tooling headers:

  arch/x86/include/asm/disabled-features.h:
  arch/x86/include/uapi/asm/kvm.h:
  arch/powerpc/include/uapi/asm/kvm.h:
  arch/s390/include/uapi/asm/kvm.h:
  arch/arm/include/uapi/asm/kvm.h:
  arch/arm64/include/uapi/asm/kvm.h:

   - 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
     fortunately none of the (in-kernel) tooling relied on it

   - new KVM_DEV calls added

  arch/x86/include/asm/required-features.h:

   - 5-level paging hardware ABI detail added

  arch/x86/include/asm/cpufeatures.h:

   - new CPU feature added

  arch/x86/include/uapi/asm/vmx.h:

   - new VMX exit conditions

None of the changes requires fixes in the tooling source code.

This addresses the following warnings:

  Warning: include/uapi/linux/stat.h differs from kernel
  Warning: arch/x86/include/asm/disabled-features.h differs from kernel
  Warning: arch/x86/include/asm/required-features.h differs from kernel
  Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
  Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
  Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/arch/arm/include/uapi/asm/kvm.h          | 10 +++++++++-
 tools/arch/arm64/include/uapi/asm/kvm.h        | 10 +++++++++-
 tools/arch/powerpc/include/uapi/asm/kvm.h      |  3 +++
 tools/arch/s390/include/uapi/asm/kvm.h         | 26 ++++++++++++++++++++++++--
 tools/arch/x86/include/asm/cpufeatures.h       |  2 ++
 tools/arch/x86/include/asm/disabled-features.h |  8 +++++++-
 tools/arch/x86/include/asm/required-features.h |  8 +++++++-
 tools/arch/x86/include/uapi/asm/kvm.h          |  3 +++
 tools/arch/x86/include/uapi/asm/vmx.h          | 25 ++++++++++++++++++-------
 tools/include/uapi/linux/stat.h                |  8 ++------
 10 files changed, 84 insertions(+), 19 deletions(-)

diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h
index 6ebd3e6..5e3c673 100644
--- a/tools/arch/arm/include/uapi/asm/kvm.h
+++ b/tools/arch/arm/include/uapi/asm/kvm.h
@@ -27,6 +27,8 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_READONLY_MEM
 
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 #define KVM_REG_SIZE(id)						\
 	(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
 
@@ -114,6 +116,8 @@ struct kvm_debug_exit_arch {
 };
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_USER_IRQ */
+	__u64 device_irq_level;
 };
 
 struct kvm_arch_memory_slot {
@@ -192,13 +196,17 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO  7
+#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS	8
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT	10
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
 			(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
 #define VGIC_LEVEL_INFO_LINE_LEVEL	0
 
-#define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT		0
+#define   KVM_DEV_ARM_ITS_SAVE_TABLES		1
+#define   KVM_DEV_ARM_ITS_RESTORE_TABLES	2
+#define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index c286035..70eea2e 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -39,6 +39,8 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_READONLY_MEM
 
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 #define KVM_REG_SIZE(id)						\
 	(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
 
@@ -143,6 +145,8 @@ struct kvm_debug_exit_arch {
 #define KVM_GUESTDBG_USE_HW		(1 << 17)
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_USER_IRQ */
+	__u64 device_irq_level;
 };
 
 struct kvm_arch_memory_slot {
@@ -212,13 +216,17 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO  7
+#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT	10
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
 			(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK	0x3ff
 #define VGIC_LEVEL_INFO_LINE_LEVEL	0
 
-#define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT		0
+#define   KVM_DEV_ARM_ITS_SAVE_TABLES           1
+#define   KVM_DEV_ARM_ITS_RESTORE_TABLES        2
+#define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 
 /* Device Control API on vcpu fd */
 #define KVM_ARM_VCPU_PMU_V3_CTRL	0
diff --git a/tools/arch/powerpc/include/uapi/asm/kvm.h b/tools/arch/powerpc/include/uapi/asm/kvm.h
index 4edbe4b..07fbeb9 100644
--- a/tools/arch/powerpc/include/uapi/asm/kvm.h
+++ b/tools/arch/powerpc/include/uapi/asm/kvm.h
@@ -29,6 +29,9 @@
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_GUEST_DEBUG
 
+/* Not always available, but if it is, this is the correct offset.  */
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+
 struct kvm_regs {
 	__u64 pc;
 	__u64 cr;
diff --git a/tools/arch/s390/include/uapi/asm/kvm.h b/tools/arch/s390/include/uapi/asm/kvm.h
index 7f4fd65..3dd2a1d 100644
--- a/tools/arch/s390/include/uapi/asm/kvm.h
+++ b/tools/arch/s390/include/uapi/asm/kvm.h
@@ -26,6 +26,8 @@
 #define KVM_DEV_FLIC_ADAPTER_REGISTER	6
 #define KVM_DEV_FLIC_ADAPTER_MODIFY	7
 #define KVM_DEV_FLIC_CLEAR_IO_IRQ	8
+#define KVM_DEV_FLIC_AISM		9
+#define KVM_DEV_FLIC_AIRQ_INJECT	10
 /*
  * We can have up to 4*64k pending subchannels + 8 adapter interrupts,
  * as well as up  to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts.
@@ -41,7 +43,14 @@ struct kvm_s390_io_adapter {
 	__u8 isc;
 	__u8 maskable;
 	__u8 swap;
-	__u8 pad;
+	__u8 flags;
+};
+
+#define KVM_S390_ADAPTER_SUPPRESSIBLE 0x01
+
+struct kvm_s390_ais_req {
+	__u8 isc;
+	__u16 mode;
 };
 
 #define KVM_S390_IO_ADAPTER_MASK 1
@@ -110,6 +119,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_CMMA	10
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
+#define KVM_S390_VM_CPU_FEAT_KSS	13
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
@@ -198,6 +208,10 @@ struct kvm_guest_debug_arch {
 #define KVM_SYNC_VRS    (1UL << 6)
 #define KVM_SYNC_RICCB  (1UL << 7)
 #define KVM_SYNC_FPRS   (1UL << 8)
+#define KVM_SYNC_GSCB   (1UL << 9)
+/* length and alignment of the sdnx as a power of two */
+#define SDNXC 8
+#define SDNXL (1UL << SDNXC)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 	__u64 prefix;	/* prefix register */
@@ -218,8 +232,16 @@ struct kvm_sync_regs {
 	};
 	__u8  reserved[512];	/* for future vector expansion */
 	__u32 fpc;		/* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */
-	__u8 padding[52];	/* riccb needs to be 64byte aligned */
+	__u8 padding1[52];	/* riccb needs to be 64byte aligned */
 	__u8 riccb[64];		/* runtime instrumentation controls block */
+	__u8 padding2[192];	/* sdnx needs to be 256byte aligned */
+	union {
+		__u8 sdnx[SDNXL];  /* state description annex */
+		struct {
+			__u64 reserved1[2];
+			__u64 gscb[4];
+		};
+	};
 };
 
 #define KVM_REG_S390_TODPR	(KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1)
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 0fe0044..2701e5f 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -202,6 +202,8 @@
 #define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
 #define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
 
+#define X86_FEATURE_MBA         ( 7*32+18) /* Memory Bandwidth Allocation */
+
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW  ( 8*32+ 0) /* Intel TPR Shadow */
 #define X86_FEATURE_VNMI        ( 8*32+ 1) /* Intel Virtual NMI */
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 85599ad..5dff775 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -36,6 +36,12 @@
 # define DISABLE_OSPKE		(1<<(X86_FEATURE_OSPKE & 31))
 #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
 
+#ifdef CONFIG_X86_5LEVEL
+# define DISABLE_LA57	0
+#else
+# define DISABLE_LA57	(1<<(X86_FEATURE_LA57 & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -55,7 +61,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/tools/arch/x86/include/asm/required-features.h b/tools/arch/x86/include/asm/required-features.h
index fac9a5c..d91ba04 100644
--- a/tools/arch/x86/include/asm/required-features.h
+++ b/tools/arch/x86/include/asm/required-features.h
@@ -53,6 +53,12 @@
 # define NEED_MOVBE	0
 #endif
 
+#ifdef CONFIG_X86_5LEVEL
+# define NEED_LA57	(1<<(X86_FEATURE_LA57 & 31))
+#else
+# define NEED_LA57	0
+#endif
+
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_PARAVIRT
 /* Paravirtualized systems may not have PSE or PGE available */
@@ -98,7 +104,7 @@
 #define REQUIRED_MASK13	0
 #define REQUIRED_MASK14	0
 #define REQUIRED_MASK15	0
-#define REQUIRED_MASK16	0
+#define REQUIRED_MASK16	(NEED_LA57)
 #define REQUIRED_MASK17	0
 #define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
 
diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
index 739c0c5..c2824d0 100644
--- a/tools/arch/x86/include/uapi/asm/kvm.h
+++ b/tools/arch/x86/include/uapi/asm/kvm.h
@@ -9,6 +9,9 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
+#define KVM_PIO_PAGE_OFFSET 1
+#define KVM_COALESCED_MMIO_PAGE_OFFSET 2
+
 #define DE_VECTOR 0
 #define DB_VECTOR 1
 #define BP_VECTOR 3
diff --git a/tools/arch/x86/include/uapi/asm/vmx.h b/tools/arch/x86/include/uapi/asm/vmx.h
index 1445865..690a2dc 100644
--- a/tools/arch/x86/include/uapi/asm/vmx.h
+++ b/tools/arch/x86/include/uapi/asm/vmx.h
@@ -76,7 +76,11 @@
 #define EXIT_REASON_WBINVD              54
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
+#define EXIT_REASON_RDRAND              57
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
+#define EXIT_REASON_ENCLS               60
+#define EXIT_REASON_RDSEED              61
 #define EXIT_REASON_PML_FULL            62
 #define EXIT_REASON_XSAVES              63
 #define EXIT_REASON_XRSTORS             64
@@ -90,6 +94,7 @@
 	{ EXIT_REASON_TASK_SWITCH,           "TASK_SWITCH" }, \
 	{ EXIT_REASON_CPUID,                 "CPUID" }, \
 	{ EXIT_REASON_HLT,                   "HLT" }, \
+	{ EXIT_REASON_INVD,                  "INVD" }, \
 	{ EXIT_REASON_INVLPG,                "INVLPG" }, \
 	{ EXIT_REASON_RDPMC,                 "RDPMC" }, \
 	{ EXIT_REASON_RDTSC,                 "RDTSC" }, \
@@ -108,6 +113,8 @@
 	{ EXIT_REASON_IO_INSTRUCTION,        "IO_INSTRUCTION" }, \
 	{ EXIT_REASON_MSR_READ,              "MSR_READ" }, \
 	{ EXIT_REASON_MSR_WRITE,             "MSR_WRITE" }, \
+	{ EXIT_REASON_INVALID_STATE,         "INVALID_STATE" }, \
+	{ EXIT_REASON_MSR_LOAD_FAIL,         "MSR_LOAD_FAIL" }, \
 	{ EXIT_REASON_MWAIT_INSTRUCTION,     "MWAIT_INSTRUCTION" }, \
 	{ EXIT_REASON_MONITOR_TRAP_FLAG,     "MONITOR_TRAP_FLAG" }, \
 	{ EXIT_REASON_MONITOR_INSTRUCTION,   "MONITOR_INSTRUCTION" }, \
@@ -115,20 +122,24 @@
 	{ EXIT_REASON_MCE_DURING_VMENTRY,    "MCE_DURING_VMENTRY" }, \
 	{ EXIT_REASON_TPR_BELOW_THRESHOLD,   "TPR_BELOW_THRESHOLD" }, \
 	{ EXIT_REASON_APIC_ACCESS,           "APIC_ACCESS" }, \
-	{ EXIT_REASON_GDTR_IDTR,	     "GDTR_IDTR" }, \
-	{ EXIT_REASON_LDTR_TR,		     "LDTR_TR" }, \
+	{ EXIT_REASON_EOI_INDUCED,           "EOI_INDUCED" }, \
+	{ EXIT_REASON_GDTR_IDTR,             "GDTR_IDTR" }, \
+	{ EXIT_REASON_LDTR_TR,               "LDTR_TR" }, \
 	{ EXIT_REASON_EPT_VIOLATION,         "EPT_VIOLATION" }, \
 	{ EXIT_REASON_EPT_MISCONFIG,         "EPT_MISCONFIG" }, \
 	{ EXIT_REASON_INVEPT,                "INVEPT" }, \
+	{ EXIT_REASON_RDTSCP,                "RDTSCP" }, \
 	{ EXIT_REASON_PREEMPTION_TIMER,      "PREEMPTION_TIMER" }, \
+	{ EXIT_REASON_INVVPID,               "INVVPID" }, \
 	{ EXIT_REASON_WBINVD,                "WBINVD" }, \
+	{ EXIT_REASON_XSETBV,                "XSETBV" }, \
 	{ EXIT_REASON_APIC_WRITE,            "APIC_WRITE" }, \
-	{ EXIT_REASON_EOI_INDUCED,           "EOI_INDUCED" }, \
-	{ EXIT_REASON_INVALID_STATE,         "INVALID_STATE" }, \
-	{ EXIT_REASON_MSR_LOAD_FAIL,         "MSR_LOAD_FAIL" }, \
-	{ EXIT_REASON_INVD,                  "INVD" }, \
-	{ EXIT_REASON_INVVPID,               "INVVPID" }, \
+	{ EXIT_REASON_RDRAND,                "RDRAND" }, \
 	{ EXIT_REASON_INVPCID,               "INVPCID" }, \
+	{ EXIT_REASON_VMFUNC,                "VMFUNC" }, \
+	{ EXIT_REASON_ENCLS,                 "ENCLS" }, \
+	{ EXIT_REASON_RDSEED,                "RDSEED" }, \
+	{ EXIT_REASON_PML_FULL,              "PML_FULL" }, \
 	{ EXIT_REASON_XSAVES,                "XSAVES" }, \
 	{ EXIT_REASON_XRSTORS,               "XRSTORS" }
 
diff --git a/tools/include/uapi/linux/stat.h b/tools/include/uapi/linux/stat.h
index d538897..17b1030 100644
--- a/tools/include/uapi/linux/stat.h
+++ b/tools/include/uapi/linux/stat.h
@@ -48,17 +48,13 @@
  * tv_sec holds the number of seconds before (negative) or after (positive)
  * 00:00:00 1st January 1970 UTC.
  *
- * tv_nsec holds a number of nanoseconds before (0..-999,999,999 if tv_sec is
- * negative) or after (0..999,999,999 if tv_sec is positive) the tv_sec time.
- *
- * Note that if both tv_sec and tv_nsec are non-zero, then the two values must
- * either be both positive or both negative.
+ * tv_nsec holds a number of nanoseconds (0..999,999,999) after the tv_sec time.
  *
  * __reserved is held in case we need a yet finer resolution.
  */
 struct statx_timestamp {
 	__s64	tv_sec;
-	__s32	tv_nsec;
+	__u32	tv_nsec;
 	__s32	__reserved;
 };
 

^ permalink raw reply	[flat|nested] 27+ messages in thread
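
A note on the statx_timestamp comment change in the patch above: with
tv_nsec now unsigned, a time before the epoch is expressed as a negative
tv_sec plus a forward nanosecond offset. The sketch below illustrates that
reading. The struct and function names (ts_example, ts_example_to_ns) are
made up for illustration and are not part of the header; only the field
meanings are taken from the updated comment.

```c
#include <stdint.h>

/* Illustrative stand-in for struct statx_timestamp after the change. */
struct ts_example {
	int64_t  tv_sec;   /* seconds before (negative) or after the epoch */
	uint32_t tv_nsec;  /* 0..999,999,999, always counted forward from tv_sec */
};

/* Signed nanoseconds relative to 00:00:00 1st January 1970 UTC. */
static int64_t ts_example_to_ns(const struct ts_example *ts)
{
	return ts->tv_sec * INT64_C(1000000000) + (int64_t)ts->tv_nsec;
}
```

Under this convention, half a second before the epoch would be
{ .tv_sec = -1, .tv_nsec = 500000000 } rather than a negative tv_nsec.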

* Re: [PATCH 4/7] perf script: Add --inline option
  2017-05-24  6:38   ` Ingo Molnar
@ 2017-05-24  7:13     ` Namhyung Kim
  2017-05-24  7:21       ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Namhyung Kim @ 2017-05-24  7:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin

On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote:
> 
> * Namhyung Kim <namhyung@kernel.org> wrote:
> 
> > The --inline option is to show inlined functions in callchains.
> > 
> > For example,
> > 
> >   $ perf script
> >   a.out  5644 11611.467597:     309961 cycles:u:
> >                      790 main (/home/namhyung/tmp/perf/a.out)
> >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> >   ...
> > 
> >   $ perf script --inline
> >   a.out  5644 11611.467597:     309961 cycles:u:
> >                      790 main (/home/namhyung/tmp/perf/a.out)
> >                          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
> >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> >                          main
> >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> >   ...
> 
> Shouldn't this be the default behavior, to make call chains more readable?

AFAIK perf report didn't make it the default due to the performance
impact, but I don't know how large it is.  Especially if perf was not
built with libbfd, it will run an external addr2line to get the inlined
functions for each callchain entry.

Thanks,
Namhyung

* Re: [PATCH 4/7] perf script: Add --inline option
  2017-05-24  7:13     ` Namhyung Kim
@ 2017-05-24  7:21       ` Ingo Molnar
  2017-05-24  7:53         ` Milian Wolff
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-05-24  7:21 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin


* Namhyung Kim <namhyung@kernel.org> wrote:

> On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote:
> > 
> > * Namhyung Kim <namhyung@kernel.org> wrote:
> > 
> > > The --inline option is to show inlined functions in callchains.
> > > 
> > > For example,
> > > 
> > >   $ perf script
> > >   a.out  5644 11611.467597:     309961 cycles:u:
> > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > >   ...
> > > 
> > >   $ perf script --inline
> > >   a.out  5644 11611.467597:     309961 cycles:u:
> > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > >                          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
> > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > >                          main
> > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > >   ...
> > 
> > Shouldn't this be the default behavior, to make call chains more readable?
> 
> AFAIK perf report didn't make it the default due to the performance
> impact, but I don't know how large it is.  Especially if perf was not
> built with libbfd, it will run an external addr2line to get the inlined
> functions for each callchain entry.

So then at least let's make it the default when all libraries are present. Not 
enabling something when the build is not 'complete' is fair game - distros will 
typically have all the libraries available.

We need to remember that roughly 99% of all our users will use as few perf command 
line options as they can get away with - myself included. Adding a non-debugging 
feature as a non-default command line option is really as if we didn't do 
anything: very few if any people will use it, and it might bitrot in the future 
without people noticing.

So we need to put some thought into making it available to two orders of magnitude 
more people! If someone types 'perf report' we should give the best selection of 
all the features we have available.

Thanks,

	Ingo

* Re: [PATCH 4/7] perf script: Add --inline option
  2017-05-24  7:21       ` Ingo Molnar
@ 2017-05-24  7:53         ` Milian Wolff
  2017-05-24  8:06           ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Milian Wolff @ 2017-05-24  7:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Namhyung Kim, LKML, kernel-team, Arnaldo Carvalho de Melo,
	Jiri Olsa, Yao Jin

[-- Attachment #1: Type: text/plain, Size: 3443 bytes --]

On Wednesday, May 24, 2017 9:21:42 AM CEST Ingo Molnar wrote:
> * Namhyung Kim <namhyung@kernel.org> wrote:
> > On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote:
> > > * Namhyung Kim <namhyung@kernel.org> wrote:
> > > > The --inline option is to show inlined functions in callchains.
> > > > 
> > > > For example,
> > > > 
> > > >   $ perf script
> > > >   a.out  5644 11611.467597:     309961 cycles:u:
> > > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > > >   ...
> > > > 
> > > >   $ perf script --inline
> > > >   a.out  5644 11611.467597:     309961 cycles:u:
> > > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > > >                          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
> > > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > > >                          main
> > > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > > >   ...
> > > 
> > > Shouldn't this be the default behavior, to make call chains more
> > > readable?
> > 
> > AFAIK perf report didn't make it the default due to the performance
> > impact, but I don't know how large it is.  Especially if perf was not
> > built with libbfd, it will run an external addr2line to get the inlined
> > functions for each callchain entry.
> 
> So then at least let's make it the default when all libraries are present.
> Not enabling something when the build is not 'complete' is fair game -
> distros will typically have all the libraries available.
> 
> We need to remember that roughly 99% of all our users will use as few perf
> command line options as they can get away with - myself included. Adding a
> non-debugging feature as a non-default command line option is really as if
> we didn't do anything: very few if any people will use it, and it might
> bitrot in the future without people noticing.
> 
> So we need to put some thought into making it available to two orders of
> magnitude more people! If someone types 'perf report' we should give the
> best selection of all the features we have available.

Just a suggestion: my larger patch set, which is in review now, adds some 
caching features that already speed up the whole process considerably. I would 
wait for that patch set to be integrated. Then we could enable --inline 
unconditionally, or at least whenever libbfd is available.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
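
To illustrate why caching helps here: callchains tend to hit the same
addresses over and over, so remembering the result of an expensive lookup
avoids repeating it for every callchain entry. The following is only a
generic memoization sketch and NOT the caching from the patch set mentioned
above, which is not shown in this thread. All names (srcline_cache,
srcline_cache_lookup, demo_resolve) are hypothetical, and demo_resolve is a
stand-in for a slow lookup such as spawning addr2line.

```c
#include <stdint.h>

#define CACHE_SLOTS 256

struct srcline_cache_entry {
	uint64_t addr;   /* address this slot currently describes */
	int      line;   /* cached result of the slow lookup */
	int      valid;  /* 0 until the slot has been filled once */
};

struct srcline_cache {
	struct srcline_cache_entry slots[CACHE_SLOTS];
	unsigned long misses;           /* how often the slow path ran */
	int (*resolve)(uint64_t addr);  /* hypothetical slow lookup */
};

/* Direct-mapped lookup: only call resolve() when the slot does not match. */
static int srcline_cache_lookup(struct srcline_cache *c, uint64_t addr)
{
	struct srcline_cache_entry *e = &c->slots[addr % CACHE_SLOTS];

	if (!e->valid || e->addr != addr) {
		e->line  = c->resolve(addr);
		e->addr  = addr;
		e->valid = 1;
		c->misses++;
	}
	return e->line;
}

/* Stand-in resolver for demonstration only; not a real addr2line call. */
static int demo_resolve(uint64_t addr)
{
	return (int)(addr & 0xff);
}
```

A direct-mapped table is used only because it keeps the sketch short; a
real implementation would more likely key a tree or hash table per DSO.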

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 3826 bytes --]

* Re: [PATCH 4/7] perf script: Add --inline option
  2017-05-24  7:53         ` Milian Wolff
@ 2017-05-24  8:06           ` Ingo Molnar
  0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2017-05-24  8:06 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Namhyung Kim, LKML, kernel-team, Arnaldo Carvalho de Melo,
	Jiri Olsa, Yao Jin


* Milian Wolff <milian.wolff@kdab.com> wrote:

> On Wednesday, May 24, 2017 9:21:42 AM CEST Ingo Molnar wrote:
> > * Namhyung Kim <namhyung@kernel.org> wrote:
> > > On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote:
> > > > * Namhyung Kim <namhyung@kernel.org> wrote:
> > > > > The --inline option is to show inlined functions in callchains.
> > > > > 
> > > > > For example,
> > > > > 
> > > > >   $ perf script
> > > > >   a.out  5644 11611.467597:     309961 cycles:u:
> > > > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > > > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > > > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > > > >   ...
> > > > > 
> > > > >   $ perf script --inline
> > > > >   a.out  5644 11611.467597:     309961 cycles:u:
> > > > >                      790 main (/home/namhyung/tmp/perf/a.out)
> > > > >                          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()
> > > > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > > > >                          std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >
> > > > >                          main
> > > > >                    20511 __libc_start_main (/usr/lib/libc-2.25.so)
> > > > >                      8ba _start (/home/namhyung/tmp/perf/a.out)
> > > > >   ...
> > > > 
> > > > Shouldn't this be the default behavior, to make call chains more
> > > > readable?
> > > 
> > > AFAIK perf report didn't make it the default due to the performance
> > > impact, but I don't know how large it is.  Especially if perf was not
> > > built with libbfd, it will run an external addr2line to get the inlined
> > > functions for each callchain entry.
> > 
> > So then at least let's make it the default when all libraries are present.
> > Not enabling something when the build is not 'complete' is fair game -
> > distros will typically have all the libraries available.
> > 
> > We need to remember that roughly 99% of all our users will use as few perf
> > command line options as they can get away with - myself included. Adding a
> > non-debugging feature as a non-default command line option is really as if
> > we didn't do anything: very few if any people will use it, and it might
> > bitrot in the future without people noticing.
> > 
> > So we need to put some thought into making it available to two orders of
> > magnitude more people! If someone types 'perf report' we should give the
> > best selection of all the features we have available.
> 
> Just a suggestion: my larger patch set, which is in review now, adds some 
> caching features that already speed up the whole process considerably. I would 
> wait for that patch set to be integrated. Then we could enable --inline 
> unconditionally, or at least whenever libbfd is available.

I'm fine with that - and please make the default-enabling part of your patch 
series, so it does not get forgotten.

Thanks,

	Ingo

* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (7 preceding siblings ...)
  2017-05-24  6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
@ 2017-06-08 13:15 ` Milian Wolff
  2017-06-08 13:59   ` Arnaldo Carvalho de Melo
  8 siblings, 1 reply; 27+ messages in thread
From: Milian Wolff @ 2017-06-08 13:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, LKML, kernel-team, Arnaldo Carvalho de Melo,
	Jiri Olsa, Yao Jin

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> Hi Ingo,
> 
> Please consider pulling the perf tooling changes below.  Build tested
> on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
> but it seems unrelated to this series.  Will take a look it later.

Hey guys,

I notice that these patches are not in acme's perf/core branch. Can they be 
applied there too please?

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
@ 2017-06-08 13:59   ` Arnaldo Carvalho de Melo
  2017-06-08 14:34     ` Milian Wolff
  0 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-08 13:59 UTC (permalink / raw)
  To: Milian Wolff
  Cc: Namhyung Kim, Ingo Molnar, LKML, kernel-team, Jiri Olsa, Yao Jin

Em Thu, Jun 08, 2017 at 03:15:32PM +0200, Milian Wolff escreveu:
> On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> > Hi Ingo,
> > 
> > Please consider pulling the perf tooling changes below.  Build tested
> > on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
> > but it seems unrelated to this series.  Will take a look it later.
> 
> Hey guys,
> 
> I notice that these patches are not in acme's perf/core branch. Can they be 
> applied there too please?

It is there now: Ingo merged tip/perf/urgent into tip/perf/core, and I
rebased my perf/core on top of that and just pushed it.

- Arnaldo

* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-06-08 13:59   ` Arnaldo Carvalho de Melo
@ 2017-06-08 14:34     ` Milian Wolff
  0 siblings, 0 replies; 27+ messages in thread
From: Milian Wolff @ 2017-06-08 14:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Ingo Molnar, LKML, kernel-team, Jiri Olsa, Yao Jin

[-- Attachment #1: Type: text/plain, Size: 909 bytes --]

On Thursday, June 8, 2017 3:59:31 PM CEST Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 08, 2017 at 03:15:32PM +0200, Milian Wolff escreveu:
> > On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> > > Hi Ingo,
> > > 
> > > Please consider pulling the perf tooling changes below.  Build tested
> > > on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
> > > but it seems unrelated to this series.  Will take a look it later.
> > 
> > Hey guys,
> > 
> > I notice that these patches are not in acme's perf/core branch. Can they
> > be
> > applied there too please?
> 
> It is there now: Ingo merged tip/perf/urgent into tip/perf/core, and I
> rebased my perf/core on top of that and just pushed it.

Excellent, thank you.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

* [PATCH] tools/include: Sync kernel ABI headers with tooling headers
  2017-09-12 19:24 [GIT PULL 0/9] perf/urgent fixes Arnaldo Carvalho de Melo
@ 2017-09-13  7:38 ` Ingo Molnar
  0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2017-09-13  7:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Adrian Hunter, David Ahern,
	Jiri Olsa, Milian Wolff, Namhyung Kim, Peter Zijlstra,
	Taeung Song, Wang Nan, Yao Jin, Arnaldo Carvalho de Melo

Time to sync the tooling copies of the ABI/uapi headers with the upcoming
v4.14 kernel.

None of the ABI changes require any source code level changes to our
existing in-kernel tooling code:

  - tools/arch/s390/include/uapi/asm/kvm.h:

      New KVM_S390_VM_TOD_EXT ABI, not used by in-kernel tooling.

  - tools/arch/x86/include/asm/cpufeatures.h:
    tools/arch/x86/include/asm/disabled-features.h:

      New PCID, SME and VGIF x86 CPU feature bits defined.

  - tools/include/asm-generic/hugetlb_encode.h:
    tools/include/uapi/asm-generic/mman-common.h:
    tools/include/uapi/linux/mman.h:

      Two new madvise() flags, plus a restructuring and extension of
      the hugetlb system call mmap flags.

  - tools/include/uapi/drm/drm.h:
    tools/include/uapi/drm/i915_drm.h:

      New drm_syncobj_create flags definitions, new drm_syncobj_wait
      and drm_syncobj_array ABIs. DRM_I915_PERF_* calls and a new
      I915_PARAM_HAS_EXEC_FENCE_ARRAY ABI for the Intel driver.

  - tools/include/uapi/linux/bpf.h:

      New bpf_sock fields (::mark and ::priority), new XDP_REDIRECT
      action.

  - tools/include/uapi/linux/kvm.h:

      New kvm_ppc_smmu_info fields (::data_keys and ::instr_keys).
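
For readers unfamiliar with how the x86 feature bits listed above are
encoded: each X86_FEATURE_* value is written as word*32 + bit, and the
DISABLE_* masks in disabled-features.h use (1<<(X86_FEATURE_x & 31)) to pick
the bit within its 32-bit word. The helper names below (feature_word,
feature_mask, EX_FEATURE_*) are illustrative only and do not exist in the
headers; the two feature values are copied from the cpufeatures.h hunks in
this patch.

```c
#include <stdint.h>

/* Word index into the capability array: features are numbered word*32 + bit. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit mask within that word, same expression as (1<<(X86_FEATURE_x & 31)). */
static uint32_t feature_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature & 31);
}

/* Values copied from the cpufeatures.h hunks in this patch. */
#define EX_FEATURE_SME   (7 * 32 + 10)
#define EX_FEATURE_VGIF  (15 * 32 + 16)
```

This is also why DISABLE_PCID ends up in DISABLED_MASK4 in the diff below:
the mask index has to match the word the feature bit lives in.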

Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/arch/s390/include/uapi/asm/kvm.h         |  6 +++
 tools/arch/x86/include/asm/cpufeatures.h       |  2 +
 tools/arch/x86/include/asm/disabled-features.h |  4 +-
 tools/include/asm-generic/hugetlb_encode.h     | 34 +++++++++++++++++
 tools/include/uapi/asm-generic/mman-common.h   | 14 ++-----
 tools/include/uapi/drm/drm.h                   | 22 +++++++++++
 tools/include/uapi/drm/i915_drm.h              | 51 +++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h                 | 32 ++++++++++------
 tools/include/uapi/linux/kvm.h                 |  3 +-
 tools/include/uapi/linux/mman.h                | 24 +++++++++++-
 10 files changed, 164 insertions(+), 28 deletions(-)

diff --git a/tools/arch/s390/include/uapi/asm/kvm.h b/tools/arch/s390/include/uapi/asm/kvm.h
index 69d09c39bbcd..cd7359e23d86 100644
--- a/tools/arch/s390/include/uapi/asm/kvm.h
+++ b/tools/arch/s390/include/uapi/asm/kvm.h
@@ -88,6 +88,12 @@ struct kvm_s390_io_adapter_req {
 /* kvm attributes for KVM_S390_VM_TOD */
 #define KVM_S390_VM_TOD_LOW		0
 #define KVM_S390_VM_TOD_HIGH		1
+#define KVM_S390_VM_TOD_EXT		2
+
+struct kvm_s390_vm_tod_clock {
+	__u8  epoch_idx;
+	__u64 tod;
+};
 
 /* kvm attributes for KVM_S390_VM_CPU_MODEL */
 /* processor related attributes are r/w */
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 8ea315a11fe0..2519c6c801c9 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -196,6 +196,7 @@
 
 #define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */
 #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
+#define X86_FEATURE_SME		( 7*32+10) /* AMD Secure Memory Encryption */
 
 #define X86_FEATURE_INTEL_PPIN	( 7*32+14) /* Intel Processor Inventory Number */
 #define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */
@@ -287,6 +288,7 @@
 #define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
 #define X86_FEATURE_AVIC	(15*32+13) /* Virtual Interrupt Controller */
 #define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
+#define X86_FEATURE_VGIF	(15*32+16) /* Virtual GIF */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */
 #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 5dff775af7cd..c10c9128f54e 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -21,11 +21,13 @@
 # define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))
 # define DISABLE_CYRIX_ARR	(1<<(X86_FEATURE_CYRIX_ARR & 31))
 # define DISABLE_CENTAUR_MCR	(1<<(X86_FEATURE_CENTAUR_MCR & 31))
+# define DISABLE_PCID		0
 #else
 # define DISABLE_VME		0
 # define DISABLE_K6_MTRR	0
 # define DISABLE_CYRIX_ARR	0
 # define DISABLE_CENTAUR_MCR	0
+# define DISABLE_PCID		(1<<(X86_FEATURE_PCID & 31))
 #endif /* CONFIG_X86_64 */
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
@@ -49,7 +51,7 @@
 #define DISABLED_MASK1	0
 #define DISABLED_MASK2	0
 #define DISABLED_MASK3	(DISABLE_CYRIX_ARR|DISABLE_CENTAUR_MCR|DISABLE_K6_MTRR)
-#define DISABLED_MASK4	0
+#define DISABLED_MASK4	(DISABLE_PCID)
 #define DISABLED_MASK5	0
 #define DISABLED_MASK6	0
 #define DISABLED_MASK7	0
diff --git a/tools/include/asm-generic/hugetlb_encode.h b/tools/include/asm-generic/hugetlb_encode.h
new file mode 100644
index 000000000000..e4732d3c2998
--- /dev/null
+++ b/tools/include/asm-generic/hugetlb_encode.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_GENERIC_HUGETLB_ENCODE_H_
+#define _ASM_GENERIC_HUGETLB_ENCODE_H_
+
+/*
+ * Several system calls take a flag to request "hugetlb" huge pages.
+ * Without further specification, these system calls will use the
+ * system's default huge page size.  If a system supports multiple
+ * huge page sizes, the desired huge page size can be specified in
+ * bits [26:31] of the flag arguments.  The value in these 6 bits
+ * will encode the log2 of the huge page size.
+ *
+ * The following definitions are associated with this huge page size
+ * encoding in flag arguments.  System call specific header files
+ * that use this encoding should include this file.  They can then
+ * provide definitions based on these with their own specific prefix.
+ * for example:
+ * #define MAP_HUGE_SHIFT HUGETLB_FLAG_ENCODE_SHIFT
+ */
+
+#define HUGETLB_FLAG_ENCODE_SHIFT	26
+#define HUGETLB_FLAG_ENCODE_MASK	0x3f
+
+#define HUGETLB_FLAG_ENCODE_64KB	(16 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_512KB	(19 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_1MB		(20 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_2MB		(21 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_8MB		(23 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16MB	(24 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_256MB	(28 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_1GB		(30 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_2GB		(31 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16GB	(34 << HUGETLB_FLAG_ENCODE_SHIFT)
+
+#endif /* _ASM_GENERIC_HUGETLB_ENCODE_H_ */
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 8c27db0c5c08..203268f9231e 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -58,20 +58,12 @@
 					   overrides the coredump filter bits */
 #define MADV_DODUMP	17		/* Clear the MADV_DONTDUMP flag */
 
+#define MADV_WIPEONFORK 18		/* Zero memory on fork, child only */
+#define MADV_KEEPONFORK 19		/* Undo MADV_WIPEONFORK */
+
 /* compatibility flags */
 #define MAP_FILE	0
 
-/*
- * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
- */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
-
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
 #define PKEY_ACCESS_MASK	(PKEY_DISABLE_ACCESS |\
diff --git a/tools/include/uapi/drm/drm.h b/tools/include/uapi/drm/drm.h
index 101593ab10ac..97677cd6964d 100644
--- a/tools/include/uapi/drm/drm.h
+++ b/tools/include/uapi/drm/drm.h
@@ -700,6 +700,7 @@ struct drm_prime_handle {
 
 struct drm_syncobj_create {
 	__u32 handle;
+#define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
 	__u32 flags;
 };
 
@@ -718,6 +719,24 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct drm_syncobj_wait {
+	__u64 handles;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+struct drm_syncobj_array {
+	__u64 handles;
+	__u32 count_handles;
+	__u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -840,6 +859,9 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY	DRM_IOWR(0xC0, struct drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET		DRM_IOWR(0xC4, struct drm_syncobj_array)
+#define DRM_IOCTL_SYNCOBJ_SIGNAL	DRM_IOWR(0xC5, struct drm_syncobj_array)
 
 /**
  * Device specific ioctls should only be in their respective headers
diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
index 7ccbd6a2bbe0..6598fb76d2c2 100644
--- a/tools/include/uapi/drm/i915_drm.h
+++ b/tools/include/uapi/drm/i915_drm.h
@@ -260,6 +260,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_CONTEXT_GETPARAM	0x34
 #define DRM_I915_GEM_CONTEXT_SETPARAM	0x35
 #define DRM_I915_PERF_OPEN		0x36
+#define DRM_I915_PERF_ADD_CONFIG	0x37
+#define DRM_I915_PERF_REMOVE_CONFIG	0x38
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -315,6 +317,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_GETPARAM, struct drm_i915_gem_context_param)
 #define DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_SETPARAM, struct drm_i915_gem_context_param)
 #define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
+#define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
+#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -431,6 +435,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST	 48
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
@@ -812,6 +821,17 @@ struct drm_i915_gem_exec_object2 {
 	__u64 rsvd2;
 };
 
+struct drm_i915_gem_exec_fence {
+	/**
+	 * User's handle for a drm_syncobj to wait on or signal.
+	 */
+	__u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+	__u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -826,7 +846,11 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 DR1;
 	__u32 DR4;
 	__u32 num_cliprects;
-	/** This is a struct drm_clip_rect *cliprects */
+	/**
+	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+	 * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
+	 * struct drm_i915_gem_exec_fence *fences.
+	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (7<<0)
 #define I915_EXEC_DEFAULT                (0<<0)
@@ -927,7 +951,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST		(1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of i915_gem_exec_fence structures which specify a set of
+ * dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1467,6 +1498,22 @@ enum drm_i915_perf_record_type {
 	DRM_I915_PERF_RECORD_MAX /* non-ABI */
 };
 
+/**
+ * Structure to upload perf dynamic configuration into the kernel.
+ */
+struct drm_i915_perf_oa_config {
+	/** String formatted like "%08x-%04x-%04x-%04x-%012x" */
+	char uuid[36];
+
+	__u32 n_mux_regs;
+	__u32 n_boolean_regs;
+	__u32 n_flex_regs;
+
+	__u64 __user mux_regs_ptr;
+	__u64 __user boolean_regs_ptr;
+	__u64 __user flex_regs_ptr;
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 461811e57140..43ab5c402f98 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -143,12 +143,6 @@ enum bpf_attach_type {
 
 #define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
 
-enum bpf_sockmap_flags {
-	BPF_SOCKMAP_UNSPEC,
-	BPF_SOCKMAP_STRPARSER,
-	__MAX_BPF_SOCKMAP_FLAG
-};
-
 /* If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
  * to the given target_fd cgroup the descendent cgroup will be able to
  * override effective bpf program that was inherited from this cgroup
@@ -368,9 +362,20 @@ union bpf_attr {
  * int bpf_redirect(ifindex, flags)
  *     redirect to another netdev
  *     @ifindex: ifindex of the net device
- *     @flags: bit 0 - if set, redirect to ingress instead of egress
- *             other bits - reserved
- *     Return: TC_ACT_REDIRECT
+ *     @flags:
+ *	  cls_bpf:
+ *          bit 0 - if set, redirect to ingress instead of egress
+ *          other bits - reserved
+ *	  xdp_bpf:
+ *	    all bits - reserved
+ *     Return: cls_bpf: TC_ACT_REDIRECT on success or TC_ACT_SHOT on error
+ *	       xdp_bfp: XDP_REDIRECT on success or XDP_ABORT on error
+ * int bpf_redirect_map(map, key, flags)
+ *     redirect to endpoint in map
+ *     @map: pointer to dev map
+ *     @key: index in map to lookup
+ *     @flags: --
+ *     Return: XDP_REDIRECT on success or XDP_ABORT on error
  *
  * u32 bpf_get_route_realm(skb)
  *     retrieve a dst's tclassid
@@ -632,7 +637,7 @@ union bpf_attr {
 	FN(skb_adjust_room),		\
 	FN(redirect_map),		\
 	FN(sk_redirect_map),		\
-	FN(sock_map_update),
+	FN(sock_map_update),		\
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -753,20 +758,23 @@ struct bpf_sock {
 	__u32 family;
 	__u32 type;
 	__u32 protocol;
+	__u32 mark;
+	__u32 priority;
 };
 
 #define XDP_PACKET_HEADROOM 256
 
 /* User return codes for XDP prog type.
  * A valid XDP program must return one of these defined values. All other
- * return codes are reserved for future use. Unknown return codes will result
- * in packet drop.
+ * return codes are reserved for future use. Unknown return codes will
+ * result in packet drops and a warning via bpf_warn_invalid_xdp_action().
  */
 enum xdp_action {
 	XDP_ABORTED = 0,
 	XDP_DROP,
 	XDP_PASS,
 	XDP_TX,
+	XDP_REDIRECT,
 };
 
 /* user accessible metadata for XDP packet hook
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 6cd63c18708a..838887587411 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -711,7 +711,8 @@ struct kvm_ppc_one_seg_page_size {
 struct kvm_ppc_smmu_info {
 	__u64 flags;
 	__u32 slb_size;
-	__u32 pad;
+	__u16 data_keys;	/* # storage keys supported for data */
+	__u16 instr_keys;	/* # storage keys supported for instructions */
 	struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
 };
 
diff --git a/tools/include/uapi/linux/mman.h b/tools/include/uapi/linux/mman.h
index 81d8edf11789..a937480d7cd3 100644
--- a/tools/include/uapi/linux/mman.h
+++ b/tools/include/uapi/linux/mman.h
@@ -1,7 +1,8 @@
 #ifndef _UAPI_LINUX_MMAN_H
 #define _UAPI_LINUX_MMAN_H
 
-#include <uapi/asm/mman.h>
+#include <asm/mman.h>
+#include <asm-generic/hugetlb_encode.h>
 
 #define MREMAP_MAYMOVE	1
 #define MREMAP_FIXED	2
@@ -10,4 +11,25 @@
 #define OVERCOMMIT_ALWAYS		1
 #define OVERCOMMIT_NEVER		2
 
+/*
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h.
+ * All known huge page size encodings are provided here.  It is the
+ * responsibility of the application to know which sizes are supported on
+ * the running system.  See mmap(2) man page for details.
+ */
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
+
+#define MAP_HUGE_64KB	HUGETLB_FLAG_ENCODE_64KB
+#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB
+#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB
+#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB
+#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB
+#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB
+#define MAP_HUGE_256MB	HUGETLB_FLAG_ENCODE_256MB
+#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB
+#define MAP_HUGE_2GB	HUGETLB_FLAG_ENCODE_2GB
+#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE_16GB
+
 #endif /* _UAPI_LINUX_MMAN_H */
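
A note on the huge page size encoding introduced by hugetlb_encode.h above: the comment says bits [26:31] of the flag argument carry the log2 of the huge page size, and the removed mman-common.h comment says a value of 0 selects the default size. The following is a minimal sketch of that arithmetic, not code from the patch. The two macro values are copied from the hunk; the helper names `hugetlb_encode_log2()`, `hugetlb_decode_log2()` and `hugetlb_page_bytes()` are made up for illustration.

```c
#include <assert.h>

/* Values copied from the hugetlb_encode.h hunk in the patch above. */
#define HUGETLB_FLAG_ENCODE_SHIFT	26
#define HUGETLB_FLAG_ENCODE_MASK	0x3f

/* Hypothetical helper: place log2(page size) into bits [26:31]. */
static inline unsigned int hugetlb_encode_log2(unsigned int log2_size)
{
	/* Only 6 bits are available for the encoded value. */
	assert(log2_size <= HUGETLB_FLAG_ENCODE_MASK);
	return log2_size << HUGETLB_FLAG_ENCODE_SHIFT;
}

/* Hypothetical helper: recover log2(page size) from a flags word. */
static inline unsigned int hugetlb_decode_log2(unsigned int flags)
{
	return (flags >> HUGETLB_FLAG_ENCODE_SHIFT) & HUGETLB_FLAG_ENCODE_MASK;
}

/* Hypothetical helper: page size in bytes, or 0 for "use the default size". */
static inline unsigned long long hugetlb_page_bytes(unsigned int flags)
{
	unsigned int log2_size = hugetlb_decode_log2(flags);

	return log2_size ? 1ULL << log2_size : 0;
}
```

With these helpers, `hugetlb_encode_log2(21)` yields the same bit pattern as HUGETLB_FLAG_ENCODE_2MB, and decoding it gives back 2097152 bytes. The sketch uses unsigned arithmetic so that the 16GB encoding (34 << 26) stays within 32 bits without signed overflow.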

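A note on the i915_drm.h hunk above, which moves __I915_EXEC_UNKNOWN_FLAGS from -(I915_EXEC_BATCH_FIRST<<1) to -(I915_EXEC_FENCE_ARRAY<<1): negating a single-bit value produces a mask with that bit and every higher bit set, so the change takes bit 19 out of the "unknown" set. The sketch below only illustrates that arithmetic. The flag values are copied from the hunk, while `UNKNOWN_FLAGS_OLD`, `UNKNOWN_FLAGS_NEW`, `exec_flags_known()` and `exec_flags_demo()` are hypothetical names, and the check is an assumption about how such a mask would typically be applied, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the i915_drm.h hunk in the patch above. */
#define I915_EXEC_BATCH_FIRST	(1<<18)
#define I915_EXEC_FENCE_ARRAY	(1<<19)

/* Hypothetical local names for the mask before and after the patch. */
#define UNKNOWN_FLAGS_OLD	(-(I915_EXEC_BATCH_FIRST<<1))
#define UNKNOWN_FLAGS_NEW	(-(I915_EXEC_FENCE_ARRAY<<1))

/*
 * Hypothetical check: returns 1 if no bit of 'flags' falls inside the
 * unknown mask.  Converting the negative int mask to uint64_t extends
 * it so that every bit up to bit 63 is covered.
 */
static inline int exec_flags_known(uint64_t flags, int unknown_mask)
{
	return (flags & (uint64_t)unknown_mask) == 0;
}

/* Usage sketch: bit 19 is rejected by the old mask and accepted by the new one. */
static inline void exec_flags_demo(void)
{
	assert(!exec_flags_known(I915_EXEC_FENCE_ARRAY, UNKNOWN_FLAGS_OLD));
	assert(exec_flags_known(I915_EXEC_FENCE_ARRAY, UNKNOWN_FLAGS_NEW));
	/* Bit 20 and above remain unknown under both masks. */
	assert(!exec_flags_known(1ULL << 20, UNKNOWN_FLAGS_NEW));
}
```
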

Thread overview: 27+ messages
2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
2017-05-24  7:03   ` [tip:perf/urgent] perf report: Don't " tip-bot for Milian Wolff
2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
2017-05-24  7:04   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
2017-05-24  7:05   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
2017-05-24  6:38   ` Ingo Molnar
2017-05-24  7:13     ` Namhyung Kim
2017-05-24  7:21       ` Ingo Molnar
2017-05-24  7:53         ` Milian Wolff
2017-05-24  8:06           ` Ingo Molnar
2017-05-24  7:05   ` [tip:perf/urgent] perf script: Add --inline option for debugging tip-bot for Namhyung Kim
2017-05-24  6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim
2017-05-24  7:06   ` [tip:perf/urgent] perf report: Always " tip-bot for Milian Wolff
2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
2017-05-24  7:06   ` [tip:perf/urgent] perf report: Do " tip-bot for Milian Wolff
2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
2017-05-24  7:07   ` [tip:perf/urgent] perf tools: Put caller above callee in --children mode tip-bot for Namhyung Kim
2017-05-24  6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
2017-05-24  6:57   ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar
2017-05-24  7:07     ` [tip:perf/urgent] " tip-bot for Ingo Molnar
2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
2017-06-08 13:59   ` Arnaldo Carvalho de Melo
2017-06-08 14:34     ` Milian Wolff
2017-09-12 19:24 [GIT PULL 0/9] perf/urgent fixes Arnaldo Carvalho de Melo
2017-09-13  7:38 ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar
