All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL 00/10] perf/core improvements and fixes
@ 2016-05-25 21:34 Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 01/10] perf symbols: Check kptr_restrict for root Arnaldo Carvalho de Melo
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Andi Kleen,
	Brendan Gregg, David Ahern, Frederic Weisbecker, He Kuang,
	Jiri Olsa, Linus Torvalds, Masami Hiramatsu, Milian Wolff,
	Namhyung Kim, Peter Zijlstra, pi3orama, Stephane Eranian,
	Thomas Gleixner, Vince Weaver, Wang Nan, Zefan Li,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 275ae411e56f8f900fa364da29c4706f9af4e1f3:

  perf/x86/intel/rapl: Fix pmus free during cleanup (2016-05-25 10:56:43 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160525

for you to fetch changes up to 83e1e314baf9a1424bf2f50953ed7d50612763c4:

  tools: Pass arg to fdarray__filter's call back function (2016-05-25 17:27:25 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible/kernel ABI:

- Per event callchain limit: Recently we introduced a sysctl to tune the
  max-stack for all events for which callchains were requested:

  $ sysctl kernel.perf_event_max_stack
  kernel.perf_event_max_stack = 127

  Now this patch introduces a way to configure this per event, i.e. this
  becomes possible:

  $ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a

  allowing finer tuning of how much buffer space callchains use.

  This uses an u16 from the reserved space at the end, leaving another
  u16 for future use.

  There has been interest in even finer tuning, namely to control the
  max stack for kernel and userspace callchains separately. Further
  discussion is needed, we may for instance use the remaining u16 for
  that and when it is present, assume that the sample_max_stack introduced
  in this patch applies for the kernel, and the u16 left is used for
  limiting the userspace callchain. (Arnaldo Carvalho de Melo)

- Fix kptr_restrict=2 related 'perf record' segfault (Wang Nan)

Infrastructure;

- Adopt get_main_thread from db-export.c (Andi Kleen)

- More prep work for backward ring buffer support (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (1):
      perf thread: Adopt get_main_thread from db-export.c

Arnaldo Carvalho de Melo (2):
      perf core: Per event callchain limit
      perf tools: Per event max-stack settings

Wang Nan (7):
      perf symbols: Check kptr_restrict for root
      perf record: Fix crash when kptr is restricted
      perf record: Robustify perf_event__synth_time_conv()
      perf evlist: Don't poll and mmap overwritable events
      perf evlist: Check 'base' pointer before checking refcnt when put a mmap
      perf evlist: Choose correct reading direction according to evlist->backward
      tools: Pass arg to fdarray__filter's call back function

 include/linux/perf_event.h      |  2 +-
 include/uapi/linux/perf_event.h |  6 +++++-
 kernel/bpf/stackmap.c           |  2 +-
 kernel/events/callchain.c       | 14 ++++++++++++--
 kernel/events/core.c            |  5 ++++-
 tools/lib/api/fd/array.c        |  5 +++--
 tools/lib/api/fd/array.h        |  3 ++-
 tools/perf/arch/x86/util/tsc.c  |  2 ++
 tools/perf/builtin-record.c     |  9 ++++++++-
 tools/perf/tests/fdarray.c      |  8 ++++----
 tools/perf/util/callchain.h     |  1 +
 tools/perf/util/db-export.c     | 13 +------------
 tools/perf/util/event.c         |  2 ++
 tools/perf/util/evlist.c        | 43 ++++++++++++++++++++++++++++++++---------
 tools/perf/util/evlist.h        |  2 ++
 tools/perf/util/evsel.c         | 16 +++++++++++++--
 tools/perf/util/evsel.h         |  2 ++
 tools/perf/util/parse-events.c  |  8 ++++++++
 tools/perf/util/parse-events.h  |  1 +
 tools/perf/util/parse-events.l  |  1 +
 tools/perf/util/session.c       |  2 ++
 tools/perf/util/symbol.c        | 16 +++++++--------
 tools/perf/util/thread.c        | 11 +++++++++++
 tools/perf/util/thread.h        |  2 ++
 24 files changed, 131 insertions(+), 45 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 01/10] perf symbols: Check kptr_restrict for root
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 02/10] perf record: Fix crash when kptr is restricted Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

If kptr_restrict is set to 2, even root is not allowed to see pointers.
This patch checks kptr_restrict even if euid == 0. For root, report
error if kptr_restrict is 2.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/symbol.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 20f9cb32b703..54c4ff2b1cee 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1933,17 +1933,17 @@ int setup_intlist(struct intlist **list, const char *list_str,
 static bool symbol__read_kptr_restrict(void)
 {
 	bool value = false;
+	FILE *fp = fopen("/proc/sys/kernel/kptr_restrict", "r");
 
-	if (geteuid() != 0) {
-		FILE *fp = fopen("/proc/sys/kernel/kptr_restrict", "r");
-		if (fp != NULL) {
-			char line[8];
+	if (fp != NULL) {
+		char line[8];
 
-			if (fgets(line, sizeof(line), fp) != NULL)
-				value = atoi(line) != 0;
+		if (fgets(line, sizeof(line), fp) != NULL)
+			value = (geteuid() != 0) ?
+					(atoi(line) != 0) :
+					(atoi(line) == 2);
 
-			fclose(fp);
-		}
+		fclose(fp);
 	}
 
 	return value;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 02/10] perf record: Fix crash when kptr is restricted
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 01/10] perf symbols: Check kptr_restrict for root Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 03/10] perf thread: Adopt get_main_thread from db-export.c Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

Before this patch, a simple 'perf record' could fail if kptr_restrict is
set to 1 (for normal user) or 2 (for root):

  # perf record ls
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
  check /proc/sys/kernel/kptr_restrict.

  Samples in kernel functions may not be resolved if a suitable vmlinux
  file is not found in the buildid cache or in the vmlinux path.

  Samples in kernel modules won't be resolved at all.

  If some relocation was applied (e.g. kexec) symbols may be misresolved
  even with a suitable vmlinux or kallsyms file.

  Segmentation fault (core dumped)

This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
available.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e90056904b ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index f6fcc6832949..9b141f12329e 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -673,6 +673,8 @@ int perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
 	int err;
 	union perf_event *event;
 
+	if (symbol_conf.kptr_restrict)
+		return -1;
 	if (map == NULL)
 		return -1;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 03/10] perf thread: Adopt get_main_thread from db-export.c
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 01/10] perf symbols: Check kptr_restrict for root Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 02/10] perf record: Fix crash when kptr is restricted Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 04/10] perf core: Per event callchain limit Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andi Kleen, Jiri Olsa, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Move the get_main_thread function from db-export.c to thread.c so that
it can be used elsewhere.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464051145-19968-2-git-send-email-andi@firstfloor.org
[ Removed leftover bits from db-export.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/db-export.c | 13 +------------
 tools/perf/util/thread.c    | 11 +++++++++++
 tools/perf/util/thread.h    |  2 ++
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index c9a6dc173e74..b0c2b5c5d337 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -233,17 +233,6 @@ int db_export__symbol(struct db_export *dbe, struct symbol *sym,
 	return 0;
 }
 
-static struct thread *get_main_thread(struct machine *machine, struct thread *thread)
-{
-	if (thread->pid_ == thread->tid)
-		return thread__get(thread);
-
-	if (thread->pid_ == -1)
-		return NULL;
-
-	return machine__find_thread(machine, thread->pid_, thread->pid_);
-}
-
 static int db_ids_from_al(struct db_export *dbe, struct addr_location *al,
 			  u64 *dso_db_id, u64 *sym_db_id, u64 *offset)
 {
@@ -382,7 +371,7 @@ int db_export__sample(struct db_export *dbe, union perf_event *event,
 	if (err)
 		return err;
 
-	main_thread = get_main_thread(al->machine, thread);
+	main_thread = thread__main_thread(al->machine, thread);
 	if (main_thread)
 		comm = machine__thread_exec_comm(al->machine, main_thread);
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 45fcb715a36b..ada58e6070bf 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -265,3 +265,14 @@ void thread__find_cpumode_addr_location(struct thread *thread,
 			break;
 	}
 }
+
+struct thread *thread__main_thread(struct machine *machine, struct thread *thread)
+{
+	if (thread->pid_ == thread->tid)
+		return thread__get(thread);
+
+	if (thread->pid_ == -1)
+		return NULL;
+
+	return machine__find_thread(machine, thread->pid_, thread->pid_);
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 45fba13c800b..08fcb14cf637 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -81,6 +81,8 @@ void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
 
+struct thread *thread__main_thread(struct machine *machine, struct thread *thread);
+
 void thread__find_addr_map(struct thread *thread,
 			   u8 cpumode, enum map_type type, u64 addr,
 			   struct addr_location *al);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 04/10] perf core: Per event callchain limit
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 03/10] perf thread: Adopt get_main_thread from db-export.c Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 05/10] perf tools: Per event max-stack settings Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Brendan Gregg,
	David Ahern, Frederic Weisbecker, He Kuang, Jiri Olsa,
	Linus Torvalds, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
	Peter Zijlstra, Stephane Eranian, Thomas Gleixner, Vince Weaver,
	Wang Nan, Zefan Li

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Additionally to being able to control the system wide maximum depth via
/proc/sys/kernel/perf_event_max_stack, now we are able to ask for
different depths per event, using perf_event_attr.sample_max_stack for
that.

This uses an u16 hole at the end of perf_event_attr, that, when
perf_event_attr.sample_type has the PERF_SAMPLE_CALLCHAIN, if
sample_max_stack is zero, means use perf_event_max_stack, otherwise
it'll be bounds checked under callchain_mutex.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 include/linux/perf_event.h      |  2 +-
 include/uapi/linux/perf_event.h |  6 +++++-
 kernel/bpf/stackmap.c           |  2 +-
 kernel/events/callchain.c       | 14 ++++++++++++--
 kernel/events/core.c            |  5 ++++-
 5 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6b87be908790..0e43355c7aad 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1076,7 +1076,7 @@ extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct
 extern struct perf_callchain_entry *
 get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 		   u32 max_stack, bool crosstask, bool add_mark);
-extern int get_callchain_buffers(void);
+extern int get_callchain_buffers(int max_stack);
 extern void put_callchain_buffers(void);
 
 extern int sysctl_perf_event_max_stack;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 36ce552cf6a9..c66a485a24ac 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -276,6 +276,9 @@ enum perf_event_read_format {
 
 /*
  * Hardware event_id to monitor via a performance monitoring event:
+ *
+ * @sample_max_stack: Max number of frame pointers in a callchain,
+ *		      should be < /proc/sys/kernel/perf_event_max_stack
  */
 struct perf_event_attr {
 
@@ -385,7 +388,8 @@ struct perf_event_attr {
 	 * Wakeup watermark for AUX area
 	 */
 	__u32	aux_watermark;
-	__u32	__reserved_2;	/* align to __u64 */
+	__u16	sample_max_stack;
+	__u16	__reserved_2;	/* align to __u64 */
 };
 
 #define perf_flags(attr)	(*(&(attr)->read_format + 1))
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index a82d7605db3f..f1de5c1a2af6 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -99,7 +99,7 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
 	if (err)
 		goto free_smap;
 
-	err = get_callchain_buffers();
+	err = get_callchain_buffers(sysctl_perf_event_max_stack);
 	if (err)
 		goto free_smap;
 
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 179ef4640964..e9fdb5203de5 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -104,7 +104,7 @@ fail:
 	return -ENOMEM;
 }
 
-int get_callchain_buffers(void)
+int get_callchain_buffers(int event_max_stack)
 {
 	int err = 0;
 	int count;
@@ -121,6 +121,15 @@ int get_callchain_buffers(void)
 		/* If the allocation failed, give up */
 		if (!callchain_cpus_entries)
 			err = -ENOMEM;
+		/*
+		 * If requesting per event more than the global cap,
+		 * return a different error to help userspace figure
+		 * this out.
+		 *
+		 * And also do it here so that we have &callchain_mutex held.
+		 */
+		if (event_max_stack > sysctl_perf_event_max_stack)
+			err = -EOVERFLOW;
 		goto exit;
 	}
 
@@ -174,11 +183,12 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
 	bool user   = !event->attr.exclude_callchain_user;
 	/* Disallow cross-task user callchains. */
 	bool crosstask = event->ctx->task && event->ctx->task != current;
+	const u32 max_stack = event->attr.sample_max_stack;
 
 	if (!kernel && !user)
 		return NULL;
 
-	return get_perf_callchain(regs, 0, kernel, user, sysctl_perf_event_max_stack, crosstask, true);
+	return get_perf_callchain(regs, 0, kernel, user, max_stack, crosstask, true);
 }
 
 struct perf_callchain_entry *
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 050a290c72c7..79363f298445 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8843,7 +8843,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	if (!event->parent) {
 		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
-			err = get_callchain_buffers();
+			err = get_callchain_buffers(attr->sample_max_stack);
 			if (err)
 				goto err_addr_filters;
 		}
@@ -9165,6 +9165,9 @@ SYSCALL_DEFINE5(perf_event_open,
 			return -EINVAL;
 	}
 
+	if (!attr.sample_max_stack)
+		attr.sample_max_stack = sysctl_perf_event_max_stack;
+
 	/*
 	 * In cgroup mode, the pid argument is used to pass the fd
 	 * opened to the cgroup directory in cgroupfs. The cpu argument
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 05/10] perf tools: Per event max-stack settings
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 04/10] perf core: Per event callchain limit Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 06/10] perf record: Robustify perf_event__synth_time_conv() Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexander Shishkin, Alexei Starovoitov, Brendan Gregg,
	David Ahern, Frederic Weisbecker, He Kuang, Jiri Olsa,
	Linus Torvalds, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
	Peter Zijlstra, Stephane Eranian, Thomas Gleixner, Vince Weaver,
	Wang Nan, Zefan Li

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The tooling counterpart, now it is possible to do:

  # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ]
  # perf evlist -v
  sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10
  cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4
  cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should
be further configurable by means of some .perfconfig knob.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/callchain.h    |  1 +
 tools/perf/util/evsel.c        | 16 ++++++++++++++--
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c |  8 ++++++++
 tools/perf/util/parse-events.h |  1 +
 tools/perf/util/parse-events.l |  1 +
 tools/perf/util/session.c      |  2 ++
 7 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 65e2a4f7cb4e..a70f6b54eb92 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -94,6 +94,7 @@ struct callchain_param {
 	enum perf_call_graph_mode record_mode;
 	u32			dump_size;
 	enum chain_mode 	mode;
+	u16			max_stack;
 	u32			print_limit;
 	double			min_percent;
 	sort_chain_func_t	sort;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 02c177d14c8d..245ac503f211 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -572,6 +572,8 @@ void perf_evsel__config_callchain(struct perf_evsel *evsel,
 
 	perf_evsel__set_sample_bit(evsel, CALLCHAIN);
 
+	attr->sample_max_stack = param->max_stack;
+
 	if (param->record_mode == CALLCHAIN_LBR) {
 		if (!opts->branch_stack) {
 			if (attr->exclude_user) {
@@ -635,7 +637,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	struct perf_event_attr *attr = &evsel->attr;
 	struct callchain_param param;
 	u32 dump_size = 0;
-	char *callgraph_buf = NULL;
+	int max_stack = 0;
+	const char *callgraph_buf = NULL;
 
 	/* callgraph default */
 	param.record_mode = callchain_param.record_mode;
@@ -662,6 +665,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		case PERF_EVSEL__CONFIG_TERM_STACK_USER:
 			dump_size = term->val.stack_user;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_MAX_STACK:
+			max_stack = term->val.max_stack;
+			break;
 		case PERF_EVSEL__CONFIG_TERM_INHERIT:
 			/*
 			 * attr->inherit should has already been set by
@@ -677,7 +683,12 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	}
 
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
-	if ((callgraph_buf != NULL) || (dump_size > 0)) {
+	if ((callgraph_buf != NULL) || (dump_size > 0) || max_stack) {
+		if (max_stack) {
+			param.max_stack = max_stack;
+			if (callgraph_buf == NULL)
+				callgraph_buf = "fp";
+		}
 
 		/* parse callgraph parameters */
 		if (callgraph_buf != NULL) {
@@ -1329,6 +1340,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(clockid, p_signed);
 	PRINT_ATTRf(sample_regs_intr, p_hex);
 	PRINT_ATTRf(aux_watermark, p_unsigned);
+	PRINT_ATTRf(sample_max_stack, p_unsigned);
 
 	return ret;
 }
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c1f10159804c..028412b32d5a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_MAX_STACK,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -56,6 +57,7 @@ struct perf_evsel_config_term {
 		bool	time;
 		char	*callgraph;
 		u64	stack_user;
+		int	max_stack;
 		bool	inherit;
 	} val;
 };
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index bcbc983d4b12..89d40bb425e1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -900,6 +900,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
 	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
+	[PARSE_EVENTS__TERM_TYPE_MAX_STACK]		= "max-stack",
 };
 
 static bool config_term_shrinked;
@@ -995,6 +996,9 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	default:
 		err->str = strdup("unknown term");
 		err->idx = term->err_term;
@@ -1040,6 +1044,7 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1109,6 +1114,9 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
+			ADD_CONFIG_TERM(MAX_STACK, max_stack, term->val.num);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index d740c3ca9a1d..46c05ccd5dfe 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,6 +68,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_MAX_STACK,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 1477fbc78993..01af1ee90a27 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -199,6 +199,7 @@ branch_type		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE
 time			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_TIME); }
 call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
+max-stack		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_MAX_STACK); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 ,			{ return ','; }
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 2335b2824d8a..43d30ea87b7e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -593,6 +593,7 @@ do { 						\
 	if (bswap_safe(f, 0))			\
 		attr->f = bswap_##sz(attr->f);	\
 } while(0)
+#define bswap_field_16(f) bswap_field(f, 16)
 #define bswap_field_32(f) bswap_field(f, 32)
 #define bswap_field_64(f) bswap_field(f, 64)
 
@@ -608,6 +609,7 @@ do { 						\
 	bswap_field_64(sample_regs_user);
 	bswap_field_32(sample_stack_user);
 	bswap_field_32(aux_watermark);
+	bswap_field_16(sample_max_stack);
 
 	/*
 	 * After read_format are bitfields. Check read_format because
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 06/10] perf record: Robustify perf_event__synth_time_conv()
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 05/10] perf tools: Per event max-stack settings Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 07/10] perf evlist: Don't poll and mmap overwritable events Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama, He Kuang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case.
record__pick_pc() is used to check avaliability.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/tsc.c | 2 ++
 tools/perf/builtin-record.c    | 9 ++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
index 357f1b13b5ae..2e5567c94e09 100644
--- a/tools/perf/arch/x86/util/tsc.c
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -62,6 +62,8 @@ int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
 	struct perf_tsc_conversion tc;
 	int err;
 
+	if (!pc)
+		return 0;
 	err = perf_read_tsc_conversion(pc, &tc);
 	if (err == -EOPNOTSUPP)
 		return 0;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc3fcb597e4c..d4cf1b0c88f9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -655,6 +655,13 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 	return 0;
 }
 
+static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
+{
+	if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
+		return rec->evlist->mmap[0].base;
+	return NULL;
+}
+
 static int record__synthesize(struct record *rec)
 {
 	struct perf_session *session = rec->session;
@@ -692,7 +699,7 @@ static int record__synthesize(struct record *rec)
 		}
 	}
 
-	err = perf_event__synth_time_conv(rec->evlist->mmap[0].base, tool,
+	err = perf_event__synth_time_conv(record__pick_pc(rec), tool,
 					  process_synthesized_event, machine);
 	if (err)
 		goto out;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 07/10] perf evlist: Don't poll and mmap overwritable events
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 06/10] perf record: Robustify perf_event__synth_time_conv() Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 08/10] perf evlist: Check 'base' pointer before checking refcnt when put a mmap Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama, He Kuang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

There's no need to receive events from overwritable ring buffer.
Instead, perf should make them run in background until some external
event of interest takes place.  This patch makes ignores normal events from
overwrite evlists.

Overwritable events must be mapped readonly and backward, so if evlist
and evsel doesn't match (evsel->overwrite is true but either evlist is
read/write or evlist is not backward, and vice versa), skip mapping it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e82ba90cc969..50d7b80987c0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -462,9 +462,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -480,7 +480,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -983,15 +983,28 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
+			 struct perf_evsel *evsel)
+{
+	if (evsel->overwrite)
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
 {
 	struct perf_evsel *evsel;
+	int revent;
 
 	evlist__for_each(evlist, evsel) {
 		int fd;
 
+		if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+			continue;
+
 		if (evsel->system_wide && thread)
 			continue;
 
@@ -1008,6 +1021,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
+
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1016,7 +1031,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 08/10] perf evlist: Check 'base' pointer before checking refcnt when put a mmap
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 07/10] perf evlist: Don't poll and mmap overwritable events Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 09/10] perf evlist: Choose correct reading direction according to evlist->backward Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 10/10] tools: Pass arg to fdarray__filter's call back function Arnaldo Carvalho de Melo
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

evlist->mmap[i]->refcnt could be 0 if an evlist has no evsel or if all
evsels don't match the evlist during mmap. For example, when all evsels
are overwritable but the evlist itself is normal. To avoid crashing,
perf should check 'base' pointer before checking refcnt, and raise bug
only when base is not NULL.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-2-git-send-email-wangnan0@huawei.com
[ Renamed 'mmap' variable, it is reserved in old distros such as Ubuntu 12.04, breaking the build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 50d7b80987c0..58ede3257c61 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -856,9 +856,11 @@ static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
-	BUG_ON(atomic_read(&evlist->mmap[idx].refcnt) == 0);
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	BUG_ON(md->base && atomic_read(&md->refcnt) == 0);
 
-	if (atomic_dec_and_test(&evlist->mmap[idx].refcnt))
+	if (atomic_dec_and_test(&md->refcnt))
 		__perf_evlist__munmap(evlist, idx);
 }
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 09/10] perf evlist: Choose correct reading direction according to evlist->backward
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 08/10] perf evlist: Check 'base' pointer before checking refcnt when put a mmap Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  2016-05-25 21:34 ` [PATCH 10/10] tools: Pass arg to fdarray__filter's call back function Arnaldo Carvalho de Melo
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

Now we have evlist->backward to indicate the mmap direction. Make
perf_evlist__mmap_read() choose right direction automatically.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 9 ++++++++-
 tools/perf/util/evlist.h | 2 ++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 58ede3257c61..719729ef9d7c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -777,7 +777,7 @@ broken_event:
 	return event;
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx)
 {
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
@@ -832,6 +832,13 @@ perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
 	return perf_mmap__read(md, false, start, end, &md->prev);
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	if (!evlist->backward)
+		return perf_evlist__mmap_read_forward(evlist, idx);
+	return perf_evlist__mmap_read_backward(evlist, idx);
+}
+
 void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 {
 	struct perf_mmap *md = &evlist->mmap[idx];
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index d740fb877ab6..68cb1361c97c 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -131,6 +131,8 @@ struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
+union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist,
+						 int idx);
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist,
 						  int idx);
 void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 10/10] tools: Pass arg to fdarray__filter's call back function
  2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2016-05-25 21:34 ` [PATCH 09/10] perf evlist: Choose correct reading direction according to evlist->backward Arnaldo Carvalho de Melo
@ 2016-05-25 21:34 ` Arnaldo Carvalho de Melo
  9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-25 21:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Wang Nan, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

Before this patch there's no way to pass arguments to fdarray__filter's
call back function.

This improvement will be used by 'perf record' to support unmapping ring
buffer for both main evlist and overwrite evlist. Without this patch
there's no way to track overwrite evlist from 'struct fdarray'.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464183898-174512-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/fd/array.c   | 5 +++--
 tools/lib/api/fd/array.h   | 3 ++-
 tools/perf/tests/fdarray.c | 8 ++++----
 tools/perf/util/evlist.c   | 5 +++--
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
index 0e636c4339b8..b0a035fc87b3 100644
--- a/tools/lib/api/fd/array.c
+++ b/tools/lib/api/fd/array.c
@@ -85,7 +85,8 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)
 }
 
 int fdarray__filter(struct fdarray *fda, short revents,
-		    void (*entry_destructor)(struct fdarray *fda, int fd))
+		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
+		    void *arg)
 {
 	int fd, nr = 0;
 
@@ -95,7 +96,7 @@ int fdarray__filter(struct fdarray *fda, short revents,
 	for (fd = 0; fd < fda->nr; ++fd) {
 		if (fda->entries[fd].revents & revents) {
 			if (entry_destructor)
-				entry_destructor(fda, fd);
+				entry_destructor(fda, fd, arg);
 
 			continue;
 		}
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index 45db01818f45..e87fd800fa8d 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -34,7 +34,8 @@ void fdarray__delete(struct fdarray *fda);
 int fdarray__add(struct fdarray *fda, int fd, short revents);
 int fdarray__poll(struct fdarray *fda, int timeout);
 int fdarray__filter(struct fdarray *fda, short revents,
-		    void (*entry_destructor)(struct fdarray *fda, int fd));
+		    void (*entry_destructor)(struct fdarray *fda, int fd, void *arg),
+		    void *arg);
 int fdarray__grow(struct fdarray *fda, int extra);
 int fdarray__fprintf(struct fdarray *fda, FILE *fp);
 
diff --git a/tools/perf/tests/fdarray.c b/tools/perf/tests/fdarray.c
index c809463edbe5..59dbd0550c51 100644
--- a/tools/perf/tests/fdarray.c
+++ b/tools/perf/tests/fdarray.c
@@ -36,7 +36,7 @@ int test__fdarray__filter(int subtest __maybe_unused)
 	}
 
 	fdarray__init_revents(fda, POLLIN);
-	nr_fds = fdarray__filter(fda, POLLHUP, NULL);
+	nr_fds = fdarray__filter(fda, POLLHUP, NULL, NULL);
 	if (nr_fds != fda->nr_alloc) {
 		pr_debug("\nfdarray__filter()=%d != %d shouldn't have filtered anything",
 			 nr_fds, fda->nr_alloc);
@@ -44,7 +44,7 @@ int test__fdarray__filter(int subtest __maybe_unused)
 	}
 
 	fdarray__init_revents(fda, POLLHUP);
-	nr_fds = fdarray__filter(fda, POLLHUP, NULL);
+	nr_fds = fdarray__filter(fda, POLLHUP, NULL, NULL);
 	if (nr_fds != 0) {
 		pr_debug("\nfdarray__filter()=%d != %d, should have filtered all fds",
 			 nr_fds, fda->nr_alloc);
@@ -57,7 +57,7 @@ int test__fdarray__filter(int subtest __maybe_unused)
 
 	pr_debug("\nfiltering all but fda->entries[2]:");
 	fdarray__fprintf_prefix(fda, "before", stderr);
-	nr_fds = fdarray__filter(fda, POLLHUP, NULL);
+	nr_fds = fdarray__filter(fda, POLLHUP, NULL, NULL);
 	fdarray__fprintf_prefix(fda, " after", stderr);
 	if (nr_fds != 1) {
 		pr_debug("\nfdarray__filter()=%d != 1, should have left just one event", nr_fds);
@@ -78,7 +78,7 @@ int test__fdarray__filter(int subtest __maybe_unused)
 
 	pr_debug("\nfiltering all but (fda->entries[0], fda->entries[3]):");
 	fdarray__fprintf_prefix(fda, "before", stderr);
-	nr_fds = fdarray__filter(fda, POLLHUP, NULL);
+	nr_fds = fdarray__filter(fda, POLLHUP, NULL, NULL);
 	fdarray__fprintf_prefix(fda, " after", stderr);
 	if (nr_fds != 2) {
 		pr_debug("\nfdarray__filter()=%d != 2, should have left just two events",
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 719729ef9d7c..e0f30946ed1a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -483,7 +483,8 @@ int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
-static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
+static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
+					 void *arg __maybe_unused)
 {
 	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
 
@@ -493,7 +494,7 @@ static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
 {
 	return fdarray__filter(&evlist->pollfd, revents_and_mask,
-			       perf_evlist__munmap_filtered);
+			       perf_evlist__munmap_filtered, NULL);
 }
 
 int perf_evlist__poll(struct perf_evlist *evlist, int timeout)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-05-25 21:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-25 21:34 [GIT PULL 00/10] perf/core improvements and fixes Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 01/10] perf symbols: Check kptr_restrict for root Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 02/10] perf record: Fix crash when kptr is restricted Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 03/10] perf thread: Adopt get_main_thread from db-export.c Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 04/10] perf core: Per event callchain limit Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 05/10] perf tools: Per event max-stack settings Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 06/10] perf record: Robustify perf_event__synth_time_conv() Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 07/10] perf evlist: Don't poll and mmap overwritable events Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 08/10] perf evlist: Check 'base' pointer before checking refcnt when put a mmap Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 09/10] perf evlist: Choose correct reading direction according to evlist->backward Arnaldo Carvalho de Melo
2016-05-25 21:34 ` [PATCH 10/10] tools: Pass arg to fdarray__filter's call back function Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.