linux-kernel.vger.kernel.org archive mirror
* [GIT PULL 00/10] perf/core improvements and fixes
@ 2013-11-14 20:25 Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 01/10] perf trace: Tweak summary output Arnaldo Carvalho de Melo
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Azat Khuzhin, Bill Gray, David Ahern, Davidlohr Bueso,
	Don Zickus, Frederic Weisbecker, Jiri Olsa, Joe Mario,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Pekka Enberg,
	Peter Zijlstra, Richard Fowles, stable, Stephane Eranian,
	Sukadev Bhattiprolu, v.karpov, Waiman Long,
	Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>

Hi Ingo,

	Please consider pulling, done on top of tip/perf/urgent.

- Arnaldo

The following changes since commit e310718d0e83aeb9969264dc577c45db16d9104d:

  tools/perf/build: Fix feature-libunwind-debug-frame handling (2013-11-14 18:00:45 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo

for you to fetch changes up to 539e6bb71e350541105e67e3d6c31392d9da25ef:

  perf record: Add an option to force per-cpu mmaps (2013-11-14 16:10:27 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

. Synthesize anon MMAP records again, fix from Don Zickus.

. Add an option in 'perf record' to force per-cpu mmaps, from Adrian Hunter.

. Limit max callchain using max_stack on DWARF unwinding too.

. Fix segfault in the UI browser caused by off by one handling END key.

. Add '--demangle'/'--no-demangle' to perf probe, so that we can overcome
  current limitations in handling C++ symbols, from Azat Khuzhin.

. Tweak 'perf trace' summary output, from Pekka Enberg.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (1):
      perf record: Add an option to force per-cpu mmaps

Arnaldo Carvalho de Melo (4):
      perf tools: Use perf_evlist__{first,last}, perf_evsel__next
      perf evsel: Introduce perf_evsel__prev() method
      perf symbols: Limit max callchain using max_stack on DWARF unwinding too
      perf ui browser: Fix segfault caused by off by one handling END key

Azat Khuzhin (1):
      perf probe: Add '--demangle'/'--no-demangle'

Davidlohr Bueso (1):
      perf tools: Remove trivial extra semincolon

Don Zickus (1):
      perf tools: Synthesize anon MMAP records again

Ingo Molnar (1):
      perf top: Add missing newline if the 'uid' is invalid

Pekka Enberg (1):
      perf trace: Tweak summary output

 tools/perf/Documentation/perf-record.txt |  6 ++++++
 tools/perf/builtin-probe.c               |  2 ++
 tools/perf/builtin-record.c              |  2 ++
 tools/perf/builtin-top.c                 |  4 ++--
 tools/perf/builtin-trace.c               | 10 +++++-----
 tools/perf/tests/parse-events.c          |  3 +--
 tools/perf/ui/browser.c                  |  4 ++--
 tools/perf/ui/browsers/hists.c           | 11 +++++------
 tools/perf/util/event.c                  |  6 ++++--
 tools/perf/util/evlist.c                 |  6 ++++--
 tools/perf/util/evsel.c                  |  4 ++--
 tools/perf/util/evsel.h                  |  5 +++++
 tools/perf/util/machine.c                |  2 +-
 tools/perf/util/target.h                 |  1 +
 tools/perf/util/unwind.c                 |  9 +++++----
 tools/perf/util/unwind.h                 |  5 +++--
 16 files changed, 50 insertions(+), 30 deletions(-)


* [PATCH 01/10] perf trace: Tweak summary output
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 02/10] perf tools: Remove trivial extra semincolon Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Pekka Enberg, David Ahern, Ingo Molnar,
	Arnaldo Carvalho de Melo

From: Pekka Enberg <penberg@kernel.org>

Tweak the summary output as suggested by Ingo Molnar:

  [penberg@localhost ~]$ perf trace -a --duration 10000 --summary -- sleep 1
  ^C
   Summary of events:

   Xorg (817), 148 events, 0.0%, 0.000 msec

     syscall            calls      min       avg       max      stddev
                                 (msec)    (msec)    (msec)        (%)
     --------------- -------- --------- --------- ---------     ------
     read                   7     0.002     0.004     0.011     32.00%
     rt_sigprocmask        40     0.001     0.001     0.002      1.31%
     ioctl                  6     0.002     0.003     0.005     19.45%
     writev                 7     0.004     0.018     0.059     43.76%
     select                 9     0.000    74.513   507.869     74.61%
     setitimer              4     0.001     0.002     0.002     10.08%
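
For illustration, a minimal C sketch of how the new row format strings line
up with the columns shown above. The format_row() helper and its buffer
handling are hypothetical and not part of perf; only the format strings are
taken from this patch, joined here into a single call.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of perf: formats one summary row with the
 * format strings introduced by this patch. Returns the snprintf() result,
 * i.e. the number of characters the full row needs.
 */
static int format_row(char *buf, size_t size, const char *name,
		      uint64_t calls, double min, double avg, double max,
		      double pct)
{
	/* name: 15 wide, left aligned; calls: 8 wide; times: 9 wide, 3 decimals */
	return snprintf(buf, size,
			"   %-15s %8" PRIu64 " %9.3f %9.3f %9.3f %9.2f%%\n",
			name, calls, min, avg, max, pct);
}
```

Widening the time columns from 8 to 9 characters leaves room for values such
as 507.869 while keeping them under the "(msec)" sub-header, and the literal
"%%" prints the percent sign after the stddev value.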

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: http://lkml.kernel.org/r/1384345308-24404-1-git-send-email-penberg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 6b230af940e2..8be17fc462ba 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2112,9 +2112,9 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
 
 	printed += fprintf(fp, "\n");
 
-	printed += fprintf(fp, "                                                    msec/call\n");
-	printed += fprintf(fp, "   syscall            calls      min      avg      max stddev\n");
-	printed += fprintf(fp, "   --------------- -------- -------- -------- -------- ------\n");
+	printed += fprintf(fp, "   syscall            calls      min       avg       max      stddev\n");
+	printed += fprintf(fp, "                               (msec)    (msec)    (msec)        (%%)\n");
+	printed += fprintf(fp, "   --------------- -------- --------- --------- ---------     ------\n");
 
 	/* each int_node is a syscall */
 	while (inode) {
@@ -2131,9 +2131,9 @@ static size_t thread__dump_stats(struct thread_trace *ttrace,
 
 			sc = &trace->syscalls.table[inode->i];
 			printed += fprintf(fp, "   %-15s", sc->name);
-			printed += fprintf(fp, " %8" PRIu64 " %8.3f %8.3f",
+			printed += fprintf(fp, " %8" PRIu64 " %9.3f %9.3f",
 					   n, min, avg);
-			printed += fprintf(fp, " %8.3f %6.2f\n", max, pct);
+			printed += fprintf(fp, " %9.3f %9.2f%%\n", max, pct);
 		}
 
 		inode = intlist__next(inode);
-- 
1.8.1.4



* [PATCH 02/10] perf tools: Remove trivial extra semincolon
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 01/10] perf trace: Tweak summary output Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 03/10] perf top: Add missing newline if the 'uid' is invalid Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Davidlohr Bueso, Arnaldo Carvalho de Melo

From: Davidlohr Bueso <davidlohr@hp.com>

Accidentally ran into these, get rid of them.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Link: http://lkml.kernel.org/r/1384323864.2527.8.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browser.c  | 2 +-
 tools/perf/util/evlist.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index bbc782e364b0..3648d4ec041f 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -680,7 +680,7 @@ static void __ui_browser__line_arrow_down(struct ui_browser *browser,
 	if (end >= browser->top_idx + browser->height)
 		end_row = browser->height - 1;
 	else
-		end_row = end - browser->top_idx;;
+		end_row = end - browser->top_idx;
 
 	ui_browser__gotorc(browser, row, column);
 	SLsmg_draw_vline(end_row - row + 1);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc6fa3fbb180..5ce2ace2d6c1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1148,7 +1148,7 @@ size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 				   perf_evsel__name(evsel));
 	}
 
-	return printed + fprintf(fp, "\n");;
+	return printed + fprintf(fp, "\n");
 }
 
 int perf_evlist__strerror_tp(struct perf_evlist *evlist __maybe_unused,
-- 
1.8.1.4



* [PATCH 03/10] perf top: Add missing newline if the 'uid' is invalid
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 01/10] perf trace: Tweak summary output Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 02/10] perf tools: Remove trivial extra semincolon Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 04/10] perf tools: Synthesize anon MMAP records again Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Frederic Weisbecker, Adrian Hunter, David Ahern,
	Jiri Olsa, Namhyung Kim, Arnaldo Carvalho de Melo

From: Ingo Molnar <mingo@kernel.org>

Add missing newline if the 'uid' is invalid:

  hubble:~> perf top --stdio -u help
  Error:
  Invalid User: helphubble:~>

Fixed by this patch:

  comet:~/tip/tools/perf> perf top --stdio -u help
  Error:
  Invalid User: help
  comet:~/tip/tools/perf>

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131112232609.GA31474@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-top.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index b8f8e29db332..71e6402729a8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1172,7 +1172,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	status = target__validate(target);
 	if (status) {
 		target__strerror(target, status, errbuf, BUFSIZ);
-		ui__warning("%s", errbuf);
+		ui__warning("%s\n", errbuf);
 	}
 
 	status = target__parse_uid(target);
@@ -1180,7 +1180,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 		int saved_errno = errno;
 
 		target__strerror(target, status, errbuf, BUFSIZ);
-		ui__error("%s", errbuf);
+		ui__error("%s\n", errbuf);
 
 		status = -saved_errno;
 		goto out_delete_evlist;
-- 
1.8.1.4



* [PATCH 04/10] perf tools: Synthesize anon MMAP records again
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 03/10] perf top: Add missing newline if the 'uid' is invalid Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 05/10] perf tools: Use perf_evlist__{first,last}, perf_evsel__next Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Don Zickus, Bill Gray, Jiri Olsa, Joe Mario,
	Richard Fowles, Stephane Eranian, stable,
	Arnaldo Carvalho de Melo

From: Don Zickus <dzickus@redhat.com>

When introducing the PERF_RECORD_MMAP2 in:

5c5e854bc760 perf tools: Add attr->mmap2 support

A check for the number of entries parsed by sscanf was introduced that
assumed all of the 8 fields needed to be correctly parsed so that
particular /proc/pid/maps line would be considered synthesizable.

That broke the synthesizing of anon records, as those lines don't have the
'execname' field.

Fix it by keeping the sscanf return check, changing it to not require
that the 'execname' variable be parsed, so that the preexisting logic
can kick in and set it to '//anon'.

This should get things like JIT profiling working again.
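
As a rough illustration of the sscanf() behaviour described above, here is a
self-contained C sketch. The struct map_entry type and the parse_maps_line()
helper are hypothetical stand-ins, not the perf code; the format string is a
simplified one with five assigned conversions, chosen to match the counts in
the check this patch changes.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of one /proc/<pid>/maps line. */
struct map_entry {
	uint64_t start, end, pgoff;
	char prot[5];
	char execname[256];
};

/*
 * Parse one maps line. sscanf() returns the number of assigned
 * conversions: 5 for a file-backed mapping, 4 for an anonymous one,
 * because the trailing path is missing. Suppressed conversions such as
 * "%*x" are not counted. Returns 0 on success, -1 if the line is unusable.
 */
static int parse_maps_line(const char *line, struct map_entry *e)
{
	int n;

	e->execname[0] = '\0';
	n = sscanf(line,
		   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*x:%*x %*u %255s",
		   &e->start, &e->end, e->prot, &e->pgoff, e->execname);
	if (n < 4)
		return -1;	/* not even the numeric fields were parsed */
	if (n == 4)
		strcpy(e->execname, "//anon");	/* anon map: no path present */
	return 0;
}
```

Requiring exactly five conversions would reject every anonymous mapping,
which is the regression described above; accepting four lets the fallback
name be applied.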

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Bill Gray <bgray@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Richard Fowles <rfowles@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/n/tip-bo4akalno7579shpz29u867j@git.kernel.org
[ commit log message is mine, dzickus reported the problem with a patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 6e3a846aed0e..bb788c109fe6 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -209,8 +209,10 @@ static int perf_event__synthesize_mmap_events(struct perf_tool *tool,
 		       &event->mmap.start, &event->mmap.len, prot,
 		       &event->mmap.pgoff,
 		       execname);
-
-		if (n != 5)
+		/*
+ 		 * Anon maps don't have the execname.
+ 		 */
+		if (n < 4)
 			continue;
 		/*
 		 * Just like the kernel, see __perf_event_mmap in kernel/perf_event.c
-- 
1.8.1.4



* [PATCH 05/10] perf tools: Use perf_evlist__{first,last}, perf_evsel__next
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 04/10] perf tools: Synthesize anon MMAP records again Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 06/10] perf evsel: Introduce perf_evsel__prev() method Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Use them in the few remaining places where the equivalent open-coded
variant was still being used.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4vjnloi5fisilykwxalb5nel@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/parse-events.c | 3 +--
 tools/perf/ui/browsers/hists.c  | 9 ++++-----
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index ef671cd41bb3..3cbd10496087 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -441,9 +441,8 @@ static int test__checkevent_pmu_name(struct perf_evlist *evlist)
 
 static int test__checkevent_pmu_events(struct perf_evlist *evlist)
 {
-	struct perf_evsel *evsel;
+	struct perf_evsel *evsel = perf_evlist__first(evlist);
 
-	evsel = list_entry(evlist->entries.next, struct perf_evsel, node);
 	TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->nr_entries);
 	TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->attr.type);
 	TEST_ASSERT_VAL("wrong exclude_user",
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 16848bb4c418..089fd3713783 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1847,13 +1847,13 @@ browse_hists:
 			switch (key) {
 			case K_TAB:
 				if (pos->node.next == &evlist->entries)
-					pos = list_entry(evlist->entries.next, struct perf_evsel, node);
+					pos = perf_evlist__first(evlist);
 				else
-					pos = list_entry(pos->node.next, struct perf_evsel, node);
+					pos = perf_evsel__next(pos);
 				goto browse_hists;
 			case K_UNTAB:
 				if (pos->node.prev == &evlist->entries)
-					pos = list_entry(evlist->entries.prev, struct perf_evsel, node);
+					pos = perf_evlist__last(evlist);
 				else
 					pos = list_entry(pos->node.prev, struct perf_evsel, node);
 				goto browse_hists;
@@ -1943,8 +1943,7 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 
 single_entry:
 	if (nr_entries == 1) {
-		struct perf_evsel *first = list_entry(evlist->entries.next,
-						      struct perf_evsel, node);
+		struct perf_evsel *first = perf_evlist__first(evlist);
 		const char *ev_name = perf_evsel__name(first);
 
 		return perf_evsel__hists_browse(first, nr_entries, help,
-- 
1.8.1.4



* [PATCH 06/10] perf evsel: Introduce perf_evsel__prev() method
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 05/10] perf tools: Use perf_evlist__{first,last}, perf_evsel__next Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 07/10] perf symbols: Limit max callchain using max_stack on DWARF unwinding too Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Just one use so far, on the hists browser, for completeness since there
we use perf_evlist__{first,last} and perf_evsel__next() for handling the
TAB and UNTAB keys.
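
To show what these accessors wrap, here is a minimal, self-contained C sketch
of a circular doubly linked list with TAB/UNTAB style wraparound. All names
ending in _sketch, plus evlist_init(), evlist_add_tail(), tab() and untab(),
are hypothetical; the list_head and list_entry definitions are simplified
stand-ins for the kernel-style helpers that perf uses.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel-style list primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct evsel_sketch {
	const char *name;
	struct list_head node;
};

struct evlist_sketch {
	struct list_head entries;	/* list head, not an entry itself */
};

static void evlist_init(struct evlist_sketch *l)
{
	l->entries.next = l->entries.prev = &l->entries;
}

static void evlist_add_tail(struct evlist_sketch *l, struct evsel_sketch *e)
{
	e->node.prev = l->entries.prev;
	e->node.next = &l->entries;
	l->entries.prev->next = &e->node;
	l->entries.prev = &e->node;
}

/* Accessors in the spirit of perf_evlist__{first,last}, perf_evsel__{next,prev}. */
static struct evsel_sketch *evlist_first(struct evlist_sketch *l)
{
	return list_entry(l->entries.next, struct evsel_sketch, node);
}

static struct evsel_sketch *evlist_last(struct evlist_sketch *l)
{
	return list_entry(l->entries.prev, struct evsel_sketch, node);
}

static struct evsel_sketch *evsel_next(struct evsel_sketch *e)
{
	return list_entry(e->node.next, struct evsel_sketch, node);
}

static struct evsel_sketch *evsel_prev(struct evsel_sketch *e)
{
	return list_entry(e->node.prev, struct evsel_sketch, node);
}

/* TAB: advance, wrapping from the last entry back to the first. */
static struct evsel_sketch *tab(struct evlist_sketch *l, struct evsel_sketch *pos)
{
	if (pos->node.next == &l->entries)
		return evlist_first(l);
	return evsel_next(pos);
}

/* UNTAB: go back, wrapping from the first entry to the last. */
static struct evsel_sketch *untab(struct evlist_sketch *l, struct evsel_sketch *pos)
{
	if (pos->node.prev == &l->entries)
		return evlist_last(l);
	return evsel_prev(pos);
}
```

The wraparound checks are needed because the list head is embedded in the
evlist rather than in an evsel, so stepping onto it with a plain next or prev
accessor would yield a pointer that is not a valid entry.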

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-d09l4lejp5427enuf3igpckw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/hists.c | 2 +-
 tools/perf/util/evsel.h        | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 089fd3713783..a440e03cd8c2 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1855,7 +1855,7 @@ browse_hists:
 				if (pos->node.prev == &evlist->entries)
 					pos = perf_evlist__last(evlist);
 				else
-					pos = list_entry(pos->node.prev, struct perf_evsel, node);
+					pos = perf_evsel__prev(pos);
 				goto browse_hists;
 			case K_ESC:
 				if (!ui_browser__dialog_yesno(&menu->b,
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index f5029653dcd7..1ea7c92e6e33 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -279,6 +279,11 @@ static inline struct perf_evsel *perf_evsel__next(struct perf_evsel *evsel)
 	return list_entry(evsel->node.next, struct perf_evsel, node);
 }
 
+static inline struct perf_evsel *perf_evsel__prev(struct perf_evsel *evsel)
+{
+	return list_entry(evsel->node.prev, struct perf_evsel, node);
+}
+
 /**
  * perf_evsel__is_group_leader - Return whether given evsel is a leader event
  *
-- 
1.8.1.4



* [PATCH 07/10] perf symbols: Limit max callchain using max_stack on DWARF unwinding too
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 06/10] perf evsel: Introduce perf_evsel__prev() method Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 08/10] perf ui browser: Fix segfault caused by off by one handling END key Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Waiman Long

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Until now, max_stack was limiting only frame-pointer (fp) based callchain
processing.

Usage example:

  perf top --call-graph dwarf,1024 --max-stack 2

Works for any tool that does callchain resolving and provides a
--max-stack option.
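
The depth limit can be sketched in isolation as follows. This is a
self-contained C illustration with a mock cursor instead of libunwind:
struct mock_cursor, mock_step(), walk_frames() and count_cb() are
hypothetical names, and only the shape of the loop condition follows the
patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*frame_cb_t)(uint64_t ip, void *arg);

/* Mock of an unwind cursor: a fixed array of instruction pointers. */
struct mock_cursor {
	const uint64_t *ips;
	size_t nr;
	size_t pos;
};

/* Returns > 0 while another frame is available, 0 at the end of the stack. */
static int mock_step(struct mock_cursor *c)
{
	if (c->pos >= c->nr)
		return 0;
	c->pos++;
	return 1;
}

/*
 * Walk frames, calling cb for at most max_stack of them. The
 * post-decrement in the condition stops the loop once max_stack reaches
 * zero. Note that a negative max_stack would not stop the loop early.
 */
static int walk_frames(struct mock_cursor *c, frame_cb_t cb, void *arg,
		       int max_stack)
{
	int ret = 0;

	while (!ret && mock_step(c) > 0 && max_stack--)
		ret = cb(c->ips[c->pos - 1], arg);

	return ret;
}

/* Example callback: counts the frames it is handed. */
static int count_cb(uint64_t ip, void *arg)
{
	(void)ip;
	++*(int *)arg;
	return 0;
}
```

With a five-frame mock stack, a max_stack of 2 delivers two frames to the
callback, while a limit larger than the stack depth delivers all five.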

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Link: http://lkml.kernel.org/n/tip-eu45v8s3tq9ruay8tpfyon79@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/machine.c | 2 +-
 tools/perf/util/unwind.c  | 9 +++++----
 tools/perf/util/unwind.h  | 5 +++--
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0393912d8033..84cdb072ac83 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1368,7 +1368,7 @@ int machine__resolve_callchain(struct machine *machine,
 
 	return unwind__get_entries(unwind_entry, &callchain_cursor, machine,
 				   thread, evsel->attr.sample_regs_user,
-				   sample);
+				   sample, max_stack);
 
 }
 
diff --git a/tools/perf/util/unwind.c b/tools/perf/util/unwind.c
index 5390d0b8862a..0efd5393de85 100644
--- a/tools/perf/util/unwind.c
+++ b/tools/perf/util/unwind.c
@@ -559,7 +559,7 @@ static unw_accessors_t accessors = {
 };
 
 static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
-		       void *arg)
+		       void *arg, int max_stack)
 {
 	unw_addr_space_t addr_space;
 	unw_cursor_t c;
@@ -575,7 +575,7 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 	if (ret)
 		display_error(ret);
 
-	while (!ret && (unw_step(&c) > 0)) {
+	while (!ret && (unw_step(&c) > 0) && max_stack--) {
 		unw_word_t ip;
 
 		unw_get_reg(&c, UNW_REG_IP, &ip);
@@ -588,7 +588,8 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 
 int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 			struct machine *machine, struct thread *thread,
-			u64 sample_uregs, struct perf_sample *data)
+			u64 sample_uregs, struct perf_sample *data,
+			int max_stack)
 {
 	unw_word_t ip;
 	struct unwind_info ui = {
@@ -610,5 +611,5 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 	if (ret)
 		return -ENOMEM;
 
-	return get_entries(&ui, cb, arg);
+	return get_entries(&ui, cb, arg, max_stack);
 }
diff --git a/tools/perf/util/unwind.h b/tools/perf/util/unwind.h
index ec0c71a2ca2e..d5966f49e22c 100644
--- a/tools/perf/util/unwind.h
+++ b/tools/perf/util/unwind.h
@@ -18,7 +18,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 			struct machine *machine,
 			struct thread *thread,
 			u64 sample_uregs,
-			struct perf_sample *data);
+			struct perf_sample *data, int max_stack);
 int unwind__arch_reg_id(int regnum);
 #else
 static inline int
@@ -27,7 +27,8 @@ unwind__get_entries(unwind_entry_cb_t cb __maybe_unused,
 		    struct machine *machine __maybe_unused,
 		    struct thread *thread __maybe_unused,
 		    u64 sample_uregs __maybe_unused,
-		    struct perf_sample *data __maybe_unused)
+		    struct perf_sample *data __maybe_unused,
+		    int max_stack __maybe_unused)
 {
 	return 0;
 }
-- 
1.8.1.4



* [PATCH 08/10] perf ui browser: Fix segfault caused by off by one handling END key
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 07/10] perf symbols: Limit max callchain using max_stack on DWARF unwinding too Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 09/10] perf probe: Add '--demangle'/'--no-demangle' Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	David Ahern, Frederic Weisbecker, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

$ perf record ls
$ perf report

Press 'down enter end'

Result:

Program received signal SIGSEGV, Segmentation fault.

The UI browser, when used on an argv array, would access past the end of
the array on SEEK_END because it wasn't using 'nr_entries - 1'; fix it.
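
The off-by-one can be illustrated with a simplified, index-based C sketch.
The seek_index() helper is hypothetical and is not the perf implementation,
which works on pointers into the argv array; the clamping is also an addition
made here for safety, not something this patch does.

```c
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical helper: compute the new top index for an array of
 * nr_entries items. Valid indexes are 0 .. nr_entries - 1, so SEEK_END
 * with offset 0 must land on nr_entries - 1, not on nr_entries.
 * Returns -1 for an empty array.
 */
static long seek_index(long cur, long nr_entries, long offset, int whence)
{
	long idx;

	if (nr_entries <= 0)
		return -1;

	switch (whence) {
	case SEEK_SET:
		idx = offset;
		break;
	case SEEK_CUR:
		idx = cur + offset;
		break;
	case SEEK_END:
		idx = nr_entries - 1 + offset;	/* last valid entry */
		break;
	default:
		return cur;
	}

	/* Clamp into the valid range (an assumption of this sketch). */
	if (idx < 0)
		idx = 0;
	if (idx > nr_entries - 1)
		idx = nr_entries - 1;
	return idx;
}
```

Without the "- 1", seeking to the end of a three-entry array would yield
index 3, one past the last element, which is the kind of access that
triggered the segfault.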

Reported-by: v.karpov@samsung.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=59291
Link: http://lkml.kernel.org/n/tip-3g83ipasqi219ktv764xzzjs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index 3648d4ec041f..cbaa7af45513 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -569,7 +569,7 @@ void ui_browser__argv_seek(struct ui_browser *browser, off_t offset, int whence)
 		browser->top = browser->top + browser->top_idx + offset;
 		break;
 	case SEEK_END:
-		browser->top = browser->top + browser->nr_entries + offset;
+		browser->top = browser->top + browser->nr_entries - 1 + offset;
 		break;
 	default:
 		return;
-- 
1.8.1.4



* [PATCH 09/10] perf probe: Add '--demangle'/'--no-demangle'
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 08/10] perf ui browser: Fix segfault caused by off by one handling END key Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-14 20:25 ` [PATCH 10/10] perf record: Add an option to force per-cpu mmaps Arnaldo Carvalho de Melo
  2013-11-15  6:38 ` [GIT PULL 00/10] perf/core improvements and fixes Ingo Molnar
  10 siblings, 0 replies; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Azat Khuzhin, Ingo Molnar, Paul Mackerras,
	Peter Zijlstra, Arnaldo Carvalho de Melo

From: Azat Khuzhin <a3at.mail@gmail.com>

You can't pass a demangled name to "perf probe" because of special chars:
./perf probe -f -x /tmp/a.out 'foo(int)'
Semantic error :There is non-digit char in line number.

And you can't even pass the mangled name, because the symbol is searched
for in the DSO with demangle=true:
./perf probe -f -x /tmp/a.out _Z3fooi
no symbols found in /tmp/a.out, maybe install a debug package?

However:
nm /tmp/a.out | grep foo
000000000040056d T _Z3fooi

After this patch, the following command:
./perf probe -f --no-demangle -x /tmp/a.out _Z3fooi

successfully adds the probe.

Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382947464-31266-1-git-send-email-a3at.mail@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-probe.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 89acc17cf2a0..6ea9e85bdc00 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -325,6 +325,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
 		     opt_set_filter),
 	OPT_CALLBACK('x', "exec", NULL, "executable|path",
 			"target executable name or path", opt_set_target),
+	OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
+		    "Disable symbol demangling"),
 	OPT_END()
 	};
 	int ret;
-- 
1.8.1.4



* [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 09/10] perf probe: Add '--demangle'/'--no-demangle' Arnaldo Carvalho de Melo
@ 2013-11-14 20:25 ` Arnaldo Carvalho de Melo
  2013-11-15  6:06   ` Ingo Molnar
  2013-11-15  6:38 ` [GIT PULL 00/10] perf/core improvements and fixes Ingo Molnar
  10 siblings, 1 reply; 25+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-14 20:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jiri Olsa, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

By default, when tasks are specified (i.e. -p, -t or -u options)
per-thread mmaps are created.

Add an option to override that and force per-cpu mmaps.

Further comments by peterz:

So this option allows -t/-p/-u to create one buffer per cpu and attach
the counters of all the various thread/process/user tasks to that one
buffer?

As opposed to the current state where each such counter would have its
own buffer.
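
The selection described above can be summarized in a small C sketch. The
struct target_sketch type, the mmap_mode enum and choose_mmap_mode() are
hypothetical names used only for illustration. That a target with no task
specified uses per-cpu mmaps is an assumption of this sketch, inferred from
the description rather than taken from the patch.

```c
#include <stdbool.h>

enum mmap_mode {
	MMAP_PER_CPU,
	MMAP_PER_THREAD,
};

/* Hypothetical stand-in for the relevant bits of the target description. */
struct target_sketch {
	bool has_task;		/* -p, -t or -u was given */
	bool force_per_cpu;	/* --force-per-cpu was given */
};

/*
 * Pick the mmap layout: the force flag wins; otherwise specifying tasks
 * selects per-thread mmaps, and anything else is assumed to be per-cpu.
 */
static enum mmap_mode choose_mmap_mode(const struct target_sketch *t)
{
	if (t->force_per_cpu)
		return MMAP_PER_CPU;
	return t->has_task ? MMAP_PER_THREAD : MMAP_PER_CPU;
}
```

Checking the force flag first is what lets the new option override the
per-thread default for task targets.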

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt | 6 ++++++
 tools/perf/builtin-record.c              | 2 ++
 tools/perf/util/evlist.c                 | 4 +++-
 tools/perf/util/evsel.c                  | 4 ++--
 tools/perf/util/target.h                 | 1 +
 5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 052f7c4dc00c..43b42c4f4a91 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -201,6 +201,12 @@ abort events and some memory events in precise mode on modern Intel CPUs.
 --transaction::
 Record transaction flags for transaction related events.
 
+--force-per-cpu::
+Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
+-t or -u options) per-thread mmaps are created.  This option overrides that and
+forces per-cpu mmaps.  A side-effect of that is that inheritance is
+automatically enabled.  Add the -i option also to disable inheritance.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4d644fe2d5b7..7c8020a32784 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -888,6 +888,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
+	OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
+		    "force the use of per-cpu mmaps"),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5ce2ace2d6c1..bbc746aa5716 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -819,7 +819,9 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->threads == NULL)
 		return -1;
 
-	if (target__has_task(target))
+	if (target->force_per_cpu)
+		evlist->cpus = cpu_map__new(target->cpu_list);
+	else if (target__has_task(target))
 		evlist->cpus = cpu_map__dummy_new();
 	else if (!target__has_cpu(target) && !target->uses_mmap)
 		evlist->cpus = cpu_map__dummy_new();
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 18f7c188ff63..46dd4c2a41ce 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -645,7 +645,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 		}
 	}
 
-	if (target__has_cpu(&opts->target))
+	if (target__has_cpu(&opts->target) || opts->target.force_per_cpu)
 		perf_evsel__set_sample_bit(evsel, CPU);
 
 	if (opts->period)
@@ -653,7 +653,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 
 	if (!perf_missing_features.sample_id_all &&
 	    (opts->sample_time || !opts->no_inherit ||
-	     target__has_cpu(&opts->target)))
+	     target__has_cpu(&opts->target) || opts->target.force_per_cpu))
 		perf_evsel__set_sample_bit(evsel, TIME);
 
 	if (opts->raw_samples) {
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 89bab7129de4..2d0c50690892 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -12,6 +12,7 @@ struct target {
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
+	bool	     force_per_cpu;
 };
 
 enum target_errno {
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-14 20:25 ` [PATCH 10/10] perf record: Add an option to force per-cpu mmaps Arnaldo Carvalho de Melo
@ 2013-11-15  6:06   ` Ingo Molnar
  2013-11-15 11:00     ` Adrian Hunter
  0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2013-11-15  6:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, David Ahern, Frederic Weisbecker,
	Ingo Molnar, Jiri Olsa, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim


* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> +--force-per-cpu::
> + Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
> + -t or -u options) per-thread mmaps are created.  This option overrides that and
> + forces per-cpu mmaps.  A side-effect of that is that inheritance is
> + automatically enabled.  Add the -i option also to disable inheritance.

So I still haven't seen an explanation why it's called 'force' 
anything. AFAICS nothing is 'forced' really, this is simply another 
trace-ringbuffer setup method, right?

And I also asked why this shouldn't be the default event tracing 
method instead of a weird config option. Per-cpu tracing is cache 
compact, it is easier to size properly and in general it is pretty 
easy to reason about. (It also has fewer of the TSC timestamp ordering 
problems that per-thread tracing has, at least in theory.)

Is there something that makes per cpu tracing undesirable as the 
default?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [GIT PULL 00/10] perf/core improvements and fixes
  2013-11-14 20:25 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2013-11-14 20:25 ` [PATCH 10/10] perf record: Add an option to force per-cpu mmaps Arnaldo Carvalho de Melo
@ 2013-11-15  6:38 ` Ingo Molnar
  10 siblings, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2013-11-15  6:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Azat Khuzhin, Bill Gray, David Ahern, Davidlohr Bueso,
	Don Zickus, Frederic Weisbecker, Jiri Olsa, Joe Mario,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Pekka Enberg,
	Peter Zijlstra, Richard Fowles, stable, Stephane Eranian,
	Sukadev Bhattiprolu, v.karpov, Waiman Long,
	Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> From: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> 
> Hi Ingo,
> 
> 	Please consider pulling, done on top of tip/perf/urgent.
> 
> - Arnaldo
> 
> The following changes since commit e310718d0e83aeb9969264dc577c45db16d9104d:
> 
>   tools/perf/build: Fix feature-libunwind-debug-frame handling (2013-11-14 18:00:45 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo
> 
> for you to fetch changes up to 539e6bb71e350541105e67e3d6c31392d9da25ef:
> 
>   perf record: Add an option to force per-cpu mmaps (2013-11-14 16:10:27 -0300)
> 
> ----------------------------------------------------------------
> perf/core improvements and fixes:
> 
> . Synthesize anon MMAP records again, fix from Don Zickus.
> 
> . Add an option in 'perf record' to force per-cpu mmaps, from Adrian Hunter.
> 
> . Limit max callchain using max_stack on DWARF unwinding too.
> 
> . Fix segfault in the UI browser caused by off by one handling END key.
> 
> . Add '--demangle'/'--no-demangle' to perf probe, so that we can overcome
>   current limitations in handling C++ symbols, from Azat Khuzhin .
> 
> . Tweak 'perf trace' summary output, from Pekka Enberg.
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Adrian Hunter (1):
>       perf record: Add an option to force per-cpu mmaps
> 
> Arnaldo Carvalho de Melo (4):
>       perf tools: Use perf_evlist__{first,last}, perf_evsel__next
>       perf evsel: Introduce perf_evsel__prev() method
>       perf symbols: Limit max callchain using max_stack on DWARF unwinding too
>       perf ui browser: Fix segfault caused by off by one handling END key
> 
> Azat Khuzhin (1):
>       perf probe: Add '--demangle'/'--no-demangle'
> 
> Davidlohr Bueso (1):
>       perf tools: Remove trivial extra semincolon
> 
> Don Zickus (1):
>       perf tools: Synthesize anon MMAP records again
> 
> Ingo Molnar (1):
>       perf top: Add missing newline if the 'uid' is invalid
> 
> Pekka Enberg (1):
>       perf trace: Tweak summary output
> 
>  tools/perf/Documentation/perf-record.txt |  6 ++++++
>  tools/perf/builtin-probe.c               |  2 ++
>  tools/perf/builtin-record.c              |  2 ++
>  tools/perf/builtin-top.c                 |  4 ++--
>  tools/perf/builtin-trace.c               | 10 +++++-----
>  tools/perf/tests/parse-events.c          |  3 +--
>  tools/perf/ui/browser.c                  |  4 ++--
>  tools/perf/ui/browsers/hists.c           | 11 +++++------
>  tools/perf/util/event.c                  |  6 ++++--
>  tools/perf/util/evlist.c                 |  6 ++++--
>  tools/perf/util/evsel.c                  |  4 ++--
>  tools/perf/util/evsel.h                  |  5 +++++
>  tools/perf/util/machine.c                |  2 +-
>  tools/perf/util/target.h                 |  1 +
>  tools/perf/util/unwind.c                 |  9 +++++----
>  tools/perf/util/unwind.h                 |  5 +++--
>  16 files changed, 50 insertions(+), 30 deletions(-)

Pulled, thanks a lot Arnaldo!

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15  6:06   ` Ingo Molnar
@ 2013-11-15 11:00     ` Adrian Hunter
  2013-11-15 11:10       ` Ingo Molnar
  2013-11-15 13:52       ` [PATCH V2] perf record: Make per-cpu mmaps the default Adrian Hunter
  0 siblings, 2 replies; 25+ messages in thread
From: Adrian Hunter @ 2013-11-15 11:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On 15/11/13 08:06, Ingo Molnar wrote:
> 
> * Arnaldo Carvalho de Melo <acme@infradead.org> wrote:
> 
>> +--force-per-cpu::
>> + Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
>> + -t or -u options) per-thread mmaps are created.  This option overrides that and
>> + forces per-cpu mmaps.  A side-effect of that is that inheritance is
>> + automatically enabled.  Add the -i option also to disable inheritance.
> 
> So I still haven't seen an explanation why it's called 'force' 
> anything. AFAICS nothing is 'forced' really, this is simply another 
> trace-ringbuffer setup method, right?

The option itself does not determine whether or not per-cpu mmaps are used.
For example, you cannot get a per-thread mmap for a workload with:

    perf record --no-force-per-cpu ls

So the option, as implemented, is a modifier of other options, not an option
in itself.  That is why it's called 'force'.
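
The point can be illustrated with a small model of the if/else chain from
the evlist.c hunk of patch 10/10. This is only a sketch: struct map_target,
enum map_kind and choose_map() are made-up names, the has_task/has_cpu
booleans stand in for target__has_task()/target__has_cpu(), and the final
branch is not visible in the hunk, so it is assumed here to create a real
CPU map.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct target; only what the decision needs. */
struct map_target {
	bool has_task;      /* -p, -t or -u given (models target__has_task()) */
	bool has_cpu;       /* -a or -C given (models target__has_cpu()) */
	bool uses_mmap;     /* set to true by perf record */
	bool force_per_cpu; /* --force-per-cpu */
};

enum map_kind { MAP_PER_THREAD, MAP_PER_CPU };

/* Mirrors the order of the checks in perf_evlist__create_maps(). */
static enum map_kind choose_map(const struct map_target *t)
{
	if (t->force_per_cpu)
		return MAP_PER_CPU;    /* cpu_map__new() */
	if (t->has_task)
		return MAP_PER_THREAD; /* cpu_map__dummy_new() */
	if (!t->has_cpu && !t->uses_mmap)
		return MAP_PER_THREAD; /* cpu_map__dummy_new() */
	return MAP_PER_CPU;            /* assumed final branch: cpu_map__new() */
}
```

For a plain workload such as 'perf record ls' (no task, no CPU list,
uses_mmap set) this model returns MAP_PER_CPU whether or not force_per_cpu
is set, so negating the option has no way to select per-thread mmaps.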

To drop 'force':




From: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri, 15 Nov 2013 09:40:00 +0200
Subject: [PATCH] perf record: Drop 'force' from --force-per-cpu option

'force' is confusing so rename the option and
change the documentation accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-record.txt | 10 +++++-----
 tools/perf/builtin-record.c              |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 43b42c4..98a7d66 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -201,11 +201,11 @@ abort events and some memory events in precise mode on modern Intel CPUs.
 --transaction::
 Record transaction flags for transaction related events.
 
---force-per-cpu::
-Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
--t or -u options) per-thread mmaps are created.  This option overrides that and
-forces per-cpu mmaps.  A side-effect of that is that inheritance is
-automatically enabled.  Add the -i option also to disable inheritance.
+--per-cpu::
+Use per-cpu mmaps.  By default, when tasks are specified (i.e. -p, -t or -u
+options) per-thread mmaps are created.  This option overrides that and uses
+per-cpu mmaps.  A side-effect of that is that inheritance is automatically
+enabled.  Add the -i option also to disable inheritance.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7c8020a..56ca57d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -888,8 +888,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
-	OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
-		    "force the use of per-cpu mmaps"),
+	OPT_BOOLEAN(0, "per-cpu", &record.opts.target.force_per_cpu,
+		    "use per-cpu mmaps"),
 	OPT_END()
 };
 
-- 
1.7.11.7




But really then --no-per-cpu needs to be implemented:




From: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri, 15 Nov 2013 10:26:50 +0200
Subject: [PATCH 2/2] perf record: Allow --no-per-cpu to select per-thread
 mmaps

The effect of --no-per-cpu is:

	-p, -t, -u	no difference
	-C, -a		no difference (causes a
			warning)
	otherwise	record the workload as a
			single thread i.e.
			no-inheritance

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/builtin-record.c | 20 ++++++++++++++++++--
 tools/perf/util/evlist.c    |  2 +-
 tools/perf/util/target.c    | 11 ++++++++++-
 tools/perf/util/target.h    |  2 ++
 4 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56ca57d..d2dbb01 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -776,6 +776,22 @@ int record_callchain_opt(const struct option *opt,
 	return 0;
 }
 
+static int parse_per_cpu(const struct option *opt,
+			 const char *arg __maybe_unused, int unset)
+{
+	struct target *target = opt->value;
+
+	if (unset) {
+		target->force_per_cpu = false;
+		target->force_per_thread = true;
+	} else {
+		target->force_per_cpu = true;
+		target->force_per_thread = false;
+	}
+
+	return 0;
+}
+
 static const char * const record_usage[] = {
 	"perf record [<options>] [<command>]",
 	"perf record [<options>] -- <command> [<options>]",
@@ -888,8 +904,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
-	OPT_BOOLEAN(0, "per-cpu", &record.opts.target.force_per_cpu,
-		    "use per-cpu mmaps"),
+	OPT_CALLBACK_NOOPT(0, "per-cpu", &record.opts.target, "per-cpu",
+			   "use per-cpu mmaps", parse_per_cpu),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bbc746a..3ed4674 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -821,7 +821,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 
 	if (target->force_per_cpu)
 		evlist->cpus = cpu_map__new(target->cpu_list);
-	else if (target__has_task(target))
+	else if (target__has_task(target) || target->force_per_thread)
 		evlist->cpus = cpu_map__dummy_new();
 	else if (!target__has_cpu(target) && !target->uses_mmap)
 		evlist->cpus = cpu_map__dummy_new();
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index 3c778a0..11d4527 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -55,6 +55,13 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	if (target->force_per_thread &&
+	    (target->system_wide || target->cpu_list)) {
+		target->force_per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD;
+	}
+
 	return ret;
 }
 
@@ -100,6 +107,7 @@ static const char *target__error_str[] = {
 	"UID switch overriding CPU",
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
+	"SYSTEM/CPU switch overriding NO-PER-CPU",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -131,7 +139,8 @@ int target__strerror(struct target *target, int errnum,
 	msg = target__error_str[idx];
 
 	switch (errnum) {
-	case TARGET_ERRNO__PID_OVERRIDE_CPU ... TARGET_ERRNO__UID_OVERRIDE_SYSTEM:
+	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
+	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 2d0c506..65494c1b 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -13,6 +13,7 @@ struct target {
 	bool	     system_wide;
 	bool	     uses_mmap;
 	bool	     force_per_cpu;
+	bool	     force_per_thread;
 };
 
 enum target_errno {
@@ -33,6 +34,7 @@ enum target_errno {
 	TARGET_ERRNO__UID_OVERRIDE_CPU,
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
+	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,
-- 
1.7.11.7




> 
> And I also raised why this shouldn't be the default event tracing 
> method instead of a weird config option. Per-cpu tracing is cache 
> compact, it is easier to size properly and in general it is pretty 
> easy to think about. (It also has less of the TSC timestamp ordering 
> problems as per thread tracing, at least in theory.)
> 
> Is there something that makes per cpu tracing undesirable as the 
> default?

One reason is to avoid changing the meaning of existing options.

To flip it around, ignore the patches above and apply:



From: Adrian Hunter <adrian.hunter@intel.com>
Date: Fri, 15 Nov 2013 11:17:56 +0200
Subject: [PATCH] perf record: Make per-cpu mmaps the default.

This affects the -p, -t and -u options that
previously defaulted to per-thread mmaps.

Consequently add an option to select
per-thread mmaps to support the old
behaviour.

Note that --per-thread can be used with a
workload only (i.e. when none of -p, -t, -u,
-a or -C is selected) to get a per-thread
mmap with no inheritance.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-record.txt | 10 +++++-----
 tools/perf/builtin-record.c              |  5 +++--
 tools/perf/util/evlist.c                 |  6 ++++--
 tools/perf/util/evsel.c                  |  4 ++--
 tools/perf/util/target.c                 | 11 ++++++++++-
 tools/perf/util/target.h                 |  4 +++-
 6 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 43b42c4..6ac867e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -201,11 +201,11 @@ abort events and some memory events in precise mode on modern Intel CPUs.
 --transaction::
 Record transaction flags for transaction related events.
 
---force-per-cpu::
-Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
--t or -u options) per-thread mmaps are created.  This option overrides that and
-forces per-cpu mmaps.  A side-effect of that is that inheritance is
-automatically enabled.  Add the -i option also to disable inheritance.
+--per-thread::
+Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
+overrides that and uses per-thread mmaps.  A side-effect of that is that
+inheritance is automatically disabled.  --per-thread is ignored with a warning
+if combined with -a or -C options.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7c8020a..f5b18b8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -800,6 +800,7 @@ static struct perf_record record = {
 		.freq		     = 4000,
 		.target		     = {
 			.uses_mmap   = true,
+			.default_per_cpu = true,
 		},
 	},
 };
@@ -888,8 +889,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
-	OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
-		    "force the use of per-cpu mmaps"),
+	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
+		    "use per-thread mmaps"),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bbc746a..76fa764 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -819,8 +819,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->threads == NULL)
 		return -1;
 
-	if (target->force_per_cpu)
-		evlist->cpus = cpu_map__new(target->cpu_list);
+	if (target->default_per_cpu)
+		evlist->cpus = target->per_thread ?
+					cpu_map__dummy_new() :
+					cpu_map__new(target->cpu_list);
 	else if (target__has_task(target))
 		evlist->cpus = cpu_map__dummy_new();
 	else if (!target__has_cpu(target) && !target->uses_mmap)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 46dd4c2..18f7c18 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -645,7 +645,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 		}
 	}
 
-	if (target__has_cpu(&opts->target) || opts->target.force_per_cpu)
+	if (target__has_cpu(&opts->target))
 		perf_evsel__set_sample_bit(evsel, CPU);
 
 	if (opts->period)
@@ -653,7 +653,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 
 	if (!perf_missing_features.sample_id_all &&
 	    (opts->sample_time || !opts->no_inherit ||
-	     target__has_cpu(&opts->target) || opts->target.force_per_cpu))
+	     target__has_cpu(&opts->target)))
 		perf_evsel__set_sample_bit(evsel, TIME);
 
 	if (opts->raw_samples) {
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index 3c778a0..e74c596 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -55,6 +55,13 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	/* THREAD and SYSTEM/CPU are mutually exclusive */
+	if (target->per_thread && (target->system_wide || target->cpu_list)) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD;
+	}
+
 	return ret;
 }
 
@@ -100,6 +107,7 @@ static const char *target__error_str[] = {
 	"UID switch overriding CPU",
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
+	"SYSTEM/CPU switch overriding PER-THREAD",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -131,7 +139,8 @@ int target__strerror(struct target *target, int errnum,
 	msg = target__error_str[idx];
 
 	switch (errnum) {
-	case TARGET_ERRNO__PID_OVERRIDE_CPU ... TARGET_ERRNO__UID_OVERRIDE_SYSTEM:
+	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
+	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 2d0c506..31dd2e9 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -12,7 +12,8 @@ struct target {
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
-	bool	     force_per_cpu;
+	bool	     default_per_cpu;
+	bool	     per_thread;
 };
 
 enum target_errno {
@@ -33,6 +34,7 @@ enum target_errno {
 	TARGET_ERRNO__UID_OVERRIDE_CPU,
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
+	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,
-- 
1.7.11.7






^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 11:00     ` Adrian Hunter
@ 2013-11-15 11:10       ` Ingo Molnar
  2013-11-15 11:27         ` Adrian Hunter
  2013-11-15 13:52       ` [PATCH V2] perf record: Make per-cpu mmaps the default Adrian Hunter
  1 sibling, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2013-11-15 11:10 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim


* Adrian Hunter <adrian.hunter@intel.com> wrote:

> > And I also raised why this shouldn't be the default event tracing 
> > method instead of a weird config option. Per-cpu tracing is cache 
> > compact, it is easier to size properly and in general it is pretty 
> > easy to think about. (It also has less of the TSC timestamp 
> > ordering problems as per thread tracing, at least in theory.)
> > 
> > Is there something that makes per cpu tracing undesirable as the 
> > default?
> 
> One reason is to avoid changing the meaning of existing options.

Well, the way the tracing buffers are set up is a mostly tool internal 
matter so in that sense it should be just fine to change the default 
behavior - as long as output remains unchanged (which it should).

Or is there any material change in behavior somewhere?

> To flip it around, ignore the patches above and apply:

> Subject: [PATCH] perf record: Make per-cpu mmaps the default.

Yay!

> +--per-thread::
> +Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
> +overrides that and uses per-thread mmaps.  A side-effect of that is that
> +inheritance is automatically disabled.  --per-thread is ignored with a warning
> +if combined with -a or -C options.

I think this is the natural thing to do, --per-thread is the 'somewhat 
weird' option that cannot be used in all modes.
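
As a rough illustration of "cannot be used in all modes", the check added to
target__validate() in the quoted patch can be modelled as below. This is a
sketch, not the perf code: struct pt_target, pt_validate() and
pt_uses_per_cpu() are hypothetical names, and default_per_cpu is assumed to
be set, as perf record does in that patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of struct target, reusing field names from the patch. */
struct pt_target {
	const char *cpu_list; /* -C argument, NULL if absent */
	bool system_wide;     /* -a */
	bool per_thread;      /* --per-thread */
};

/* Returns true if --per-thread had to be dropped (the case that warns). */
static bool pt_validate(struct pt_target *t)
{
	if (t->per_thread && (t->system_wide || t->cpu_list)) {
		t->per_thread = false;
		return true;
	}
	return false;
}

/* With default_per_cpu assumed set, per-cpu maps are used unless per_thread. */
static bool pt_uses_per_cpu(const struct pt_target *t)
{
	return !t->per_thread;
}
```

In this model, combining --per-thread with -a or -C leaves per-cpu mmaps in
place and reports the override, while --per-thread on its own selects
per-thread mmaps.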

Acked-by: Ingo Molnar <mingo@kernel.org>

:-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 11:10       ` Ingo Molnar
@ 2013-11-15 11:27         ` Adrian Hunter
  2013-11-15 11:56           ` Ingo Molnar
  0 siblings, 1 reply; 25+ messages in thread
From: Adrian Hunter @ 2013-11-15 11:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On 15/11/13 13:10, Ingo Molnar wrote:
> 
> * Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>>> And I also raised why this shouldn't be the default event tracing 
>>> method instead of a weird config option. Per-cpu tracing is cache 
>>> compact, it is easier to size properly and in general it is pretty 
>>> easy to think about. (It also has less of the TSC timestamp 
>>> ordering problems as per thread tracing, at least in theory.)
>>>
>>> Is there something that makes per cpu tracing undesirable as the 
>>> default?
>>
>> One reason is to avoid changing the meaning of existing options.
> 
> Well, the way the tracing buffers are set up is a mostly tool internal 
> matter so in that sense it should be just fine to change the default 
> behavior - as long as output remains unchanged (which it should).
> 
> Or is there any material change in behavior somewhere?

Inheritance is enabled automatically with per-cpu mmaps,
although that is one of the reasons people want
per-cpu mmaps.
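
The coupling between the mmap mode and inheritance, as described in the
documentation hunks in this thread, can be sketched as a toy function. The
name effective_inherit() and its parameters are illustrative only; the
assumption is that per-thread mmaps always imply no inheritance, while
per-cpu mmaps inherit unless -i is given.

```c
#include <stdbool.h>

/*
 * Toy model of the documented behaviour:
 *   per-thread mmaps -> inheritance disabled
 *   per-cpu mmaps    -> inheritance enabled unless -i (no_inherit) is given
 */
static bool effective_inherit(bool per_cpu_mmaps, bool no_inherit_opt)
{
	if (!per_cpu_mmaps)
		return false;
	return !no_inherit_opt;
}
```

So switching -p/-t/-u to per-cpu mmaps changes the result from false to true
unless -i is also passed, which is the behaviour change in question.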

> 
>> To flip it around, ignore the patches above and apply:
> 
>> Subject: [PATCH] perf record: Make per-cpu mmaps the default.
> 
> Yay!
> 
>> +--per-thread::
>> +Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
>> +overrides that and uses per-thread mmaps.  A side-effect of that is that
>> +inheritance is automatically disabled.  --per-thread is ignored with a warning
>> +if combined with -a or -C options.
> 
> I think this is the natural thing to do, --per-thread is the 'somewhat 
> weird' option that cannot be used in all modes.
> 
> Acked-by: Ingo Molnar <mingo@kernel.org>
> 
> :-)
> 
> Thanks,
> 
> 	Ingo
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 11:27         ` Adrian Hunter
@ 2013-11-15 11:56           ` Ingo Molnar
  2013-11-15 12:03             ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2013-11-15 11:56 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim


* Adrian Hunter <adrian.hunter@intel.com> wrote:

> On 15/11/13 13:10, Ingo Molnar wrote:
> > 
> > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> > 
> >>> And I also raised why this shouldn't be the default event tracing 
> >>> method instead of a weird config option. Per-cpu tracing is cache 
> >>> compact, it is easier to size properly and in general it is pretty 
> >>> easy to think about. (It also has less of the TSC timestamp 
> >>> ordering problems as per thread tracing, at least in theory.)
> >>>
> >>> Is there something that makes per cpu tracing undesirable as the 
> >>> default?
> >>
> >> One reason is to avoid changing the meaning of existing options.
> > 
> > Well, the way the tracing buffers are set up is a mostly tool internal 
> > matter so in that sense it should be just fine to change the default 
> > behavior - as long as output remains unchanged (which it should).
> > 
> > Or is there any material change in behavior somewhere?
> 
> Inheritance is enabled automatically with per-cpu mmaps,
> although that is one of the reasons people want
> per-cpu mmaps.

So, here's the current status quo, there are 4 basic types of profiling 
that 99% of the people are using, in order of popularity:

	perf record <cmd>
	perf record -a sleep N
	perf record -p <PID>
	perf record -t <TID>

The first two (which I'd guess comprise about 95% of real-world usage) 
have inheritance enabled.

The last two (-p/-t) have inheritance disabled by default.

Correct?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 11:56           ` Ingo Molnar
@ 2013-11-15 12:03             ` Peter Zijlstra
  2013-11-15 12:05               ` Ingo Molnar
  2013-11-15 12:52               ` Adrian Hunter
  0 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2013-11-15 12:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jiri Olsa,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On Fri, Nov 15, 2013 at 12:56:29PM +0100, Ingo Molnar wrote:
> So, here's the current status quo, there's 4 basic types of profiling 
> that 99% of the people are using, in order of popularity:
> 
> 	perf record <cmd>
> 	perf record -a sleep N
> 	perf record -p <PID>
> 	perf record -t <TID>
> 
> The first two (which I'd guess comprise about 95% of real-world usage) 
> have inheritance enabled.
> 
> The last two (-p/-t) have inheritance disabled by default.

Yes, and I would expect it to be disabled for the TID option as you
explicitly select a single thread.

For the process wide thing it would make sense to enable inheritance by
default though.

So the big trade-off is that for single threaded processes which do not
fork you now have a single buffer, whereas with the inheritance option
you'll end up with nr_cpus buffers by default.

I suppose for most normal people that's not really an issue; and I
suppose all people with silly large machines already pay extra attention
-- but at least make it explicit and very clear that this is so.
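
To put the trade-off in numbers, a toy calculation is enough. The helper
below is hypothetical (ring_buffer_count() is not a perf function) and
assumes one buffer per CPU in per-cpu mode and one buffer per monitored
thread in per-thread mode, ignoring everything else that affects buffer
setup.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Rough buffer count under the stated assumption:
 * one ring buffer per CPU in per-cpu mode,
 * one per monitored thread in per-thread mode.
 */
static size_t ring_buffer_count(bool per_cpu_mmaps, size_t nr_cpus,
				size_t nr_threads)
{
	return per_cpu_mmaps ? nr_cpus : nr_threads;
}
```

For a single-threaded, non-forking process on a 64-CPU machine this gives 1
buffer in per-thread mode and 64 in per-cpu mode, which is the cost that
should be made explicit.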



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 12:03             ` Peter Zijlstra
@ 2013-11-15 12:05               ` Ingo Molnar
  2013-11-15 12:30                 ` Peter Zijlstra
  2013-11-15 12:33                 ` Adrian Hunter
  2013-11-15 12:52               ` Adrian Hunter
  1 sibling, 2 replies; 25+ messages in thread
From: Ingo Molnar @ 2013-11-15 12:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jiri Olsa,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Nov 15, 2013 at 12:56:29PM +0100, Ingo Molnar wrote:
> > So, here's the current status quo, there's 4 basic types of profiling 
> > that 99% of the people are using, in order of popularity:
> > 
> > 	perf record <cmd>
> > 	perf record -a sleep N
> > 	perf record -p <PID>
> > 	perf record -t <TID>
> > 
> > The first two (which I'd guess comprise about 95% of real-world usage) 
> > have inheritance enabled.
> > 
> > The last two (-p/-t) have inheritance disabled by default.
> 
> Yes, and I would expect it to be disabled for the TID option as you 
> explicitly select a single thread.

Correct.

> For the process wide thing it would make sense to enable inheritance 
> by default though.
> 
> So the big trade-off is that for single threaded processes which do 
> not fork you now have a single buffer, whereas with the inheritance 
> option you'll end up with nr_cpus buffers by default.
> 
> I suppose for most normal people that's not really an issue; and I 
> suppose all people with silly large machines already pay extra 
> attention -- but at least make it explicit and very clear that this 
> is so.

Does the first variant, 'perf record <cmd>', already use per-CPU
buffers?

Thanks,

	Ingo


* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 12:05               ` Ingo Molnar
@ 2013-11-15 12:30                 ` Peter Zijlstra
  2013-11-15 12:33                 ` Adrian Hunter
  1 sibling, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2013-11-15 12:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jiri Olsa,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On Fri, Nov 15, 2013 at 01:05:25PM +0100, Ingo Molnar wrote:
> Does the first variant, 'perf record <cmd>', already use per-CPU
> buffers?

Yes.. this is required for inheritance to work.
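
As background for why inheritance needs per-cpu buffers: to my understanding
the kernel refuses to mmap() an inherited per-task event that is not bound to
a CPU, since every child would then write into the same ring buffer. The
sketch below is only a model of that rule. The function name is hypothetical
and this is not the kernel's code.

```c
#include <stdbool.h>

/*
 * Model of the rule described above (an assumption, not kernel code):
 * returns true when mmap() of the event would be permitted.
 */
static bool mmap_allowed(int cpu, bool inherit)
{
	/* cpu == -1 means "follow the task on whichever CPU it runs". */
	return !(cpu == -1 && inherit);
}
```

So an inheriting session has to open one event per CPU and map each of them,
which is why 'perf record <cmd>' already ends up with per-cpu buffers.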


* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 12:05               ` Ingo Molnar
  2013-11-15 12:30                 ` Peter Zijlstra
@ 2013-11-15 12:33                 ` Adrian Hunter
  1 sibling, 0 replies; 25+ messages in thread
From: Adrian Hunter @ 2013-11-15 12:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Frederic Weisbecker, Ingo Molnar, Jiri Olsa,
	Mike Galbraith, Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On 15/11/13 14:05, Ingo Molnar wrote:
> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
>> On Fri, Nov 15, 2013 at 12:56:29PM +0100, Ingo Molnar wrote:
>>> So, here's the current status quo, there's 4 basic types of profiling 
>>> that 99% of the people are using, in order of popularity:
>>>
>>> 	perf record <cmd>
>>> 	perf record -a sleep N
>>> 	perf record -p <PID>
>>> 	perf record -t <TID>
>>>
>>> The first two (which I'd guess comprise about 95% of real-world usage) 
>>> have inheritance enabled.
>>>
>>> The last two (-p/-t) have inheritance disabled by default.
>>
>> Yes, and I would expect it to be disabled for the TID option as you 
>> explicitly select a single thread.
> 
> Correct.
> 
>> For the process wide thing it would make sense to enable inheritance 
>> by default though.
>>
>> So the big trade-off is that for single threaded processes which do 
>> not fork you now have a single buffer, whereas with the inheritance 
>> option you'll end up with nr_cpus buffers by default.
>>
>> I suppose for most normal people that's not really an issue; and I 
>> suppose all people with silly large machines already pay extra 
>> attention -- but at least make it explicit and very clear that this 
>> is so.
> 
> Does the first variant, 'perf record <cmd>', already use per-CPU
> buffers?

Yes.

Another difference (and I need to fix the patch) is that per-cpu mmaps
require PERF_SAMPLE_TIME so that the events do not appear out of order.
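
The ordering problem can be illustrated with a small sketch. With one buffer
per CPU, each buffer is in order on its own, but a thread that migrates
between CPUs leaves its events spread over several buffers, and only a
timestamp lets a reader put them back in sequence. The struct and function
below are hypothetical and are not how perf itself reorders events.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: only the fields needed to show the ordering issue. */
struct sample {
	uint64_t time;	/* timestamp, as PERF_SAMPLE_TIME would provide */
	int	 cpu;	/* which per-cpu buffer the record came from */
};

/* Merge two buffers, each already in time order, into out[] (na + nb slots). */
static void merge_by_time(const struct sample *a, size_t na,
			  const struct sample *b, size_t nb,
			  struct sample *out)
{
	size_t i = 0, j = 0, k = 0;

	/* Always take the older of the two head records. */
	while (i < na && j < nb)
		out[k++] = (a[i].time <= b[j].time) ? a[i++] : b[j++];
	/* One input is exhausted; copy the rest of the other. */
	while (i < na)
		out[k++] = a[i++];
	while (j < nb)
		out[k++] = b[j++];
}
```

Without the time field there is nothing to compare, so the reader could only
concatenate the buffers and events would appear out of order.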



* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 12:03             ` Peter Zijlstra
  2013-11-15 12:05               ` Ingo Molnar
@ 2013-11-15 12:52               ` Adrian Hunter
  2013-11-15 12:53                 ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Adrian Hunter @ 2013-11-15 12:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On 15/11/13 14:03, Peter Zijlstra wrote:
> On Fri, Nov 15, 2013 at 12:56:29PM +0100, Ingo Molnar wrote:
>> So, here's the current status quo, there's 4 basic types of profiling 
>> that 99% of the people are using, in order of popularity:
>>
>> 	perf record <cmd>
>> 	perf record -a sleep N
>> 	perf record -p <PID>
>> 	perf record -t <TID>
>>
>> The first two (which I'd guess comprise about 95% of real-world usage) 
>> have inheritance enabled.
>>
>> The last two (-p/-t) have inheritance disabled by default.
> 
> Yes, and I would expect it to be disabled for the TID option as you
> explicitly select a single thread.

So you want -t to imply -i ?

That means if you want inheritance you have to do

	-t <TID> --no-no-inherit

Or do you want another option --inherit

> 
> For the process wide thing it would make sense to enable inheritance by
> default though.
> 
> So the big trade-off is that for single threaded processes which do not
> fork you now have a single buffer, whereas with the inheritance option
> you'll end up with nr_cpus buffers by default.
> 
> I suppose for most normal people that's not really an issue; and I
> suppose all people with silly large machines already pay extra attention
> -- but at least make it explicit and very clear that this is so.
> 
> 
> 
> 



* Re: [PATCH 10/10] perf record: Add an option to force per-cpu mmaps
  2013-11-15 12:52               ` Adrian Hunter
@ 2013-11-15 12:53                 ` Peter Zijlstra
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2013-11-15 12:53 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Stephane Eranian,
	Arnaldo Carvalho de Melo, Namhyung Kim

On Fri, Nov 15, 2013 at 02:52:44PM +0200, Adrian Hunter wrote:
> On 15/11/13 14:03, Peter Zijlstra wrote:
> > On Fri, Nov 15, 2013 at 12:56:29PM +0100, Ingo Molnar wrote:
> >> So, here's the current status quo, there's 4 basic types of profiling 
> >> that 99% of the people are using, in order of popularity:
> >>
> >> 	perf record <cmd>
> >> 	perf record -a sleep N
> >> 	perf record -p <PID>
> >> 	perf record -t <TID>
> >>
> >> The first two (which I'd guess comprise about 95% of real-world usage) 
> >> have inheritance enabled.
> >>
> >> The last two (-p/-t) have inheritance disabled by default.
> > 
> > Yes, and I would expect it to be disabled for the TID option as you
> > explicitly select a single thread.
> 
> So you want -t to imply -i ?
> 
> That means if you want inheritance you have to do
> 
> 	-t <TID> --no-no-inherit
> 
> Or do you want another option --inherit

/me boggles, they're not the same? ;-)

Maybe we should extend the option parser to know that a double negative
is a nop :-)
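
A parser that treats a double negative as a nop could look roughly like the
sketch below. It is purely illustrative: the helper name is made up and this
is not perf's parse-options code.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: strip every leading "no-" from a long option name
 * and report whether an odd number of them was removed.
 */
static const char *strip_negations(const char *name, bool *negated)
{
	*negated = false;
	while (strncmp(name, "no-", 3) == 0) {
		name += 3;		/* skip one "no-" prefix */
		*negated = !*negated;	/* each prefix flips the sense */
	}
	return name;
}
```

With this, "no-no-inherit" reduces to "inherit" with negated == false, and
"no-inherit" reduces to "inherit" with negated == true, so the two spellings
Adrian lists would mean the same thing.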


* [PATCH V2] perf record: Make per-cpu mmaps the default.
  2013-11-15 11:00     ` Adrian Hunter
  2013-11-15 11:10       ` Ingo Molnar
@ 2013-11-15 13:52       ` Adrian Hunter
  2013-11-30 12:50         ` [tip:perf/core] " tip-bot for Adrian Hunter
  1 sibling, 1 reply; 25+ messages in thread
From: Adrian Hunter @ 2013-11-15 13:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Ingo Molnar, linux-kernel, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jiri Olsa, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Namhyung Kim

This affects the -p, -t and -u options that
previously defaulted to per-thread mmaps.

Consequently add an option to select
per-thread mmaps to support the old
behaviour.

Note that --per-thread can be used with a
workload only (i.e. when none of -p, -t, -u,
-a or -C is selected) to get a per-thread
mmap with no inheritance.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---

Changes in V2:

	Ensure PERF_SAMPLE_TIME is set for per-cpu mmaps.
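
The effect of this change is visible in the attr test updated by this patch,
where the expected sample_type goes from 259 to 263. The sketch below decodes
the two values. The enum is a local copy of the bit values from the
perf_event.h UAPI header, and the two helper names are hypothetical.

```c
#include <stdint.h>

/* Local copies of the PERF_SAMPLE_* bit values used in this example. */
enum {
	SAMPLE_IP     = 1U << 0,	/*   1 */
	SAMPLE_TID    = 1U << 1,	/*   2 */
	SAMPLE_TIME   = 1U << 2,	/*   4 */
	SAMPLE_PERIOD = 1U << 8,	/* 256 */
};

/* The value the test expected before: IP | TID | PERIOD. */
static uint64_t base_sample_type(void)
{
	return SAMPLE_IP | SAMPLE_TID | SAMPLE_PERIOD;
}

/* Per-cpu mmaps additionally request a timestamp on every sample. */
static uint64_t with_time(uint64_t sample_type)
{
	return sample_type | SAMPLE_TIME;
}
```

Since 1 + 2 + 256 = 259 and adding the TIME bit (4) gives 263, the test
change reflects nothing more than PERF_SAMPLE_TIME now being set, even with
-i, because per-cpu mmaps are the default.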


 tools/perf/Documentation/perf-record.txt     | 10 +++++-----
 tools/perf/builtin-record.c                  |  5 +++--
 tools/perf/tests/attr/test-record-no-inherit |  2 +-
 tools/perf/util/evlist.c                     |  6 ++++--
 tools/perf/util/evsel.c                      |  5 +++--
 tools/perf/util/target.c                     | 11 ++++++++++-
 tools/perf/util/target.h                     |  4 +++-
 7 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 43b42c4..6ac867e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -201,11 +201,11 @@ abort events and some memory events in precise mode on modern Intel CPUs.
 --transaction::
 Record transaction flags for transaction related events.
 
---force-per-cpu::
-Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
--t or -u options) per-thread mmaps are created.  This option overrides that and
-forces per-cpu mmaps.  A side-effect of that is that inheritance is
-automatically enabled.  Add the -i option also to disable inheritance.
+--per-thread::
+Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
+overrides that and uses per-thread mmaps.  A side-effect of that is that
+inheritance is automatically disabled.  --per-thread is ignored with a warning
+if combined with -a or -C options.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7c8020a..f5b18b8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -800,6 +800,7 @@ static struct perf_record record = {
 		.freq		     = 4000,
 		.target		     = {
 			.uses_mmap   = true,
+			.default_per_cpu = true,
 		},
 	},
 };
@@ -888,8 +889,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
-	OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
-		    "force the use of per-cpu mmaps"),
+	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
+		    "use per-thread mmaps"),
 	OPT_END()
 };
 
diff --git a/tools/perf/tests/attr/test-record-no-inherit b/tools/perf/tests/attr/test-record-no-inherit
index 9079a25..44edcb2 100644
--- a/tools/perf/tests/attr/test-record-no-inherit
+++ b/tools/perf/tests/attr/test-record-no-inherit
@@ -3,5 +3,5 @@ command = record
 args    = -i kill >/dev/null 2>&1
 
 [event:base-record]
-sample_type=259
+sample_type=263
 inherit=0
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bbc746a..76fa764 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -819,8 +819,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->threads == NULL)
 		return -1;
 
-	if (target->force_per_cpu)
-		evlist->cpus = cpu_map__new(target->cpu_list);
+	if (target->default_per_cpu)
+		evlist->cpus = target->per_thread ?
+					cpu_map__dummy_new() :
+					cpu_map__new(target->cpu_list);
 	else if (target__has_task(target))
 		evlist->cpus = cpu_map__dummy_new();
 	else if (!target__has_cpu(target) && !target->uses_mmap)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 46dd4c2..77e38ff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -572,6 +572,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 	struct perf_evsel *leader = evsel->leader;
 	struct perf_event_attr *attr = &evsel->attr;
 	int track = !evsel->idx; /* only the first counter needs these */
+	bool per_cpu = opts->target.default_per_cpu && !opts->target.per_thread;
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
@@ -645,7 +646,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 		}
 	}
 
-	if (target__has_cpu(&opts->target) || opts->target.force_per_cpu)
+	if (target__has_cpu(&opts->target))
 		perf_evsel__set_sample_bit(evsel, CPU);
 
 	if (opts->period)
@@ -653,7 +654,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 
 	if (!perf_missing_features.sample_id_all &&
 	    (opts->sample_time || !opts->no_inherit ||
-	     target__has_cpu(&opts->target) || opts->target.force_per_cpu))
+	     target__has_cpu(&opts->target) || per_cpu))
 		perf_evsel__set_sample_bit(evsel, TIME);
 
 	if (opts->raw_samples) {
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index 3c778a0..e74c596 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -55,6 +55,13 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	/* THREAD and SYSTEM/CPU are mutually exclusive */
+	if (target->per_thread && (target->system_wide || target->cpu_list)) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD;
+	}
+
 	return ret;
 }
 
@@ -100,6 +107,7 @@ static const char *target__error_str[] = {
 	"UID switch overriding CPU",
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
+	"SYSTEM/CPU switch overriding PER-THREAD",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -131,7 +139,8 @@ int target__strerror(struct target *target, int errnum,
 	msg = target__error_str[idx];
 
 	switch (errnum) {
-	case TARGET_ERRNO__PID_OVERRIDE_CPU ... TARGET_ERRNO__UID_OVERRIDE_SYSTEM:
+	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
+	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 2d0c506..31dd2e9 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -12,7 +12,8 @@ struct target {
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
-	bool	     force_per_cpu;
+	bool	     default_per_cpu;
+	bool	     per_thread;
 };
 
 enum target_errno {
@@ -33,6 +34,7 @@ enum target_errno {
 	TARGET_ERRNO__UID_OVERRIDE_CPU,
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
+	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,
-- 
1.7.11.7
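
For readers who want the option logic without the surrounding diff context,
here is a standalone sketch modelled on the target.c and evsel.c hunks above.
The struct is a simplified stand-in for struct target, and both function
names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields of struct target used here. */
struct target_sketch {
	const char *cpu_list;		/* set by -C */
	bool	    system_wide;	/* set by -a */
	bool	    default_per_cpu;	/* true for perf record */
	bool	    per_thread;		/* set by --per-thread */
};

/*
 * Mirrors the new check in target__validate(): -a or -C wins over
 * --per-thread. Returns true if --per-thread had to be dropped, in which
 * case the caller would print a warning.
 */
static bool validate_per_thread(struct target_sketch *t)
{
	if (t->per_thread && (t->system_wide || t->cpu_list)) {
		t->per_thread = false;
		return true;
	}
	return false;
}

/* Mirrors the per_cpu local added to perf_evsel__config(). */
static bool uses_per_cpu_mmaps(const struct target_sketch *t)
{
	return t->default_per_cpu && !t->per_thread;
}
```

In this model, combining -a with --per-thread yields a warning and per-cpu
mmaps, while --per-thread on its own selects per-thread mmaps.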



* [tip:perf/core] perf record: Make per-cpu mmaps the default.
  2013-11-15 13:52       ` [PATCH V2] perf record: Make per-cpu mmaps the default Adrian Hunter
@ 2013-11-30 12:50         ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot for Adrian Hunter @ 2013-11-30 12:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, eranian, paulus, hpa, mingo, a.p.zijlstra,
	efault, namhyung, namhyung, jolsa, fweisbec, adrian.hunter,
	dsahern, tglx

Commit-ID:  3aa5939d71fa22a947808ba9c798b8537c35097a
Gitweb:     http://git.kernel.org/tip/3aa5939d71fa22a947808ba9c798b8537c35097a
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Fri, 15 Nov 2013 15:52:29 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Nov 2013 14:58:36 -0300

perf record: Make per-cpu mmaps the default.

This affects the -p, -t and -u options that previously defaulted to
per-thread mmaps.

Consequently add an option to select per-thread mmaps to support the old
behaviour.

Note that --per-thread can be used with a workload only (i.e. when none of
-p, -t, -u, -a or -C is selected) to get a per-thread mmap with no
inheritance.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/5286271D.3020808@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt     | 10 +++++-----
 tools/perf/builtin-record.c                  |  5 +++--
 tools/perf/tests/attr/test-record-no-inherit |  2 +-
 tools/perf/util/evlist.c                     |  6 ++++--
 tools/perf/util/evsel.c                      |  5 +++--
 tools/perf/util/target.c                     | 11 ++++++++++-
 tools/perf/util/target.h                     |  4 +++-
 7 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 43b42c4..6ac867e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -201,11 +201,11 @@ abort events and some memory events in precise mode on modern Intel CPUs.
 --transaction::
 Record transaction flags for transaction related events.
 
---force-per-cpu::
-Force the use of per-cpu mmaps.  By default, when tasks are specified (i.e. -p,
--t or -u options) per-thread mmaps are created.  This option overrides that and
-forces per-cpu mmaps.  A side-effect of that is that inheritance is
-automatically enabled.  Add the -i option also to disable inheritance.
+--per-thread::
+Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
+overrides that and uses per-thread mmaps.  A side-effect of that is that
+inheritance is automatically disabled.  --per-thread is ignored with a warning
+if combined with -a or -C options.
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7c8020a..f5b18b8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -800,6 +800,7 @@ static struct perf_record record = {
 		.freq		     = 4000,
 		.target		     = {
 			.uses_mmap   = true,
+			.default_per_cpu = true,
 		},
 	},
 };
@@ -888,8 +889,8 @@ const struct option record_options[] = {
 		    "sample by weight (on special events only)"),
 	OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
 		    "sample transaction flags (special events only)"),
-	OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
-		    "force the use of per-cpu mmaps"),
+	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
+		    "use per-thread mmaps"),
 	OPT_END()
 };
 
diff --git a/tools/perf/tests/attr/test-record-no-inherit b/tools/perf/tests/attr/test-record-no-inherit
index 9079a25..44edcb2 100644
--- a/tools/perf/tests/attr/test-record-no-inherit
+++ b/tools/perf/tests/attr/test-record-no-inherit
@@ -3,5 +3,5 @@ command = record
 args    = -i kill >/dev/null 2>&1
 
 [event:base-record]
-sample_type=259
+sample_type=263
 inherit=0
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index bbc746a..76fa764 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -819,8 +819,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->threads == NULL)
 		return -1;
 
-	if (target->force_per_cpu)
-		evlist->cpus = cpu_map__new(target->cpu_list);
+	if (target->default_per_cpu)
+		evlist->cpus = target->per_thread ?
+					cpu_map__dummy_new() :
+					cpu_map__new(target->cpu_list);
 	else if (target__has_task(target))
 		evlist->cpus = cpu_map__dummy_new();
 	else if (!target__has_cpu(target) && !target->uses_mmap)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dad6492..b5fe7f9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -574,6 +574,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 	struct perf_evsel *leader = evsel->leader;
 	struct perf_event_attr *attr = &evsel->attr;
 	int track = !evsel->idx; /* only the first counter needs these */
+	bool per_cpu = opts->target.default_per_cpu && !opts->target.per_thread;
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
@@ -647,7 +648,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 		}
 	}
 
-	if (target__has_cpu(&opts->target) || opts->target.force_per_cpu)
+	if (target__has_cpu(&opts->target))
 		perf_evsel__set_sample_bit(evsel, CPU);
 
 	if (opts->period)
@@ -655,7 +656,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
 
 	if (!perf_missing_features.sample_id_all &&
 	    (opts->sample_time || !opts->no_inherit ||
-	     target__has_cpu(&opts->target) || opts->target.force_per_cpu))
+	     target__has_cpu(&opts->target) || per_cpu))
 		perf_evsel__set_sample_bit(evsel, TIME);
 
 	if (opts->raw_samples) {
diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
index 3c778a0..e74c596 100644
--- a/tools/perf/util/target.c
+++ b/tools/perf/util/target.c
@@ -55,6 +55,13 @@ enum target_errno target__validate(struct target *target)
 			ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
 	}
 
+	/* THREAD and SYSTEM/CPU are mutually exclusive */
+	if (target->per_thread && (target->system_wide || target->cpu_list)) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD;
+	}
+
 	return ret;
 }
 
@@ -100,6 +107,7 @@ static const char *target__error_str[] = {
 	"UID switch overriding CPU",
 	"PID/TID switch overriding SYSTEM",
 	"UID switch overriding SYSTEM",
+	"SYSTEM/CPU switch overriding PER-THREAD",
 	"Invalid User: %s",
 	"Problems obtaining information for user %s",
 };
@@ -131,7 +139,8 @@ int target__strerror(struct target *target, int errnum,
 	msg = target__error_str[idx];
 
 	switch (errnum) {
-	case TARGET_ERRNO__PID_OVERRIDE_CPU ... TARGET_ERRNO__UID_OVERRIDE_SYSTEM:
+	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
+	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
 		snprintf(buf, buflen, "%s", msg);
 		break;
 
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 2d0c506..31dd2e9 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -12,7 +12,8 @@ struct target {
 	uid_t	     uid;
 	bool	     system_wide;
 	bool	     uses_mmap;
-	bool	     force_per_cpu;
+	bool	     default_per_cpu;
+	bool	     per_thread;
 };
 
 enum target_errno {
@@ -33,6 +34,7 @@ enum target_errno {
 	TARGET_ERRNO__UID_OVERRIDE_CPU,
 	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
 	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
+	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
 
 	/* for target__parse_uid() */
 	TARGET_ERRNO__INVALID_UID,


