linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/7] perf tools: Support overwritable ring buffer
@ 2016-05-24  2:28 Wang Nan
  2016-05-24  2:28 ` [PATCH v4 1/7] perf evlist: Introduce aux perf evlist Wang Nan
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:28 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, Arnaldo Carvalho de Melo,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, Zefan Li

This patch set enables daemonized perf recording by using an
overwritable backward ring buffer. With this feature one can put perf
in the background and dump the ring buffer records by sending SIGUSR2
when something unusual is noticed. For example, the following command
continuously records system calls, scheduling events and CPU cycle
samples:

 # perf record -g -e cycles -e raw_syscalls:*/call-graph=no/ \
                  -e sched:sched_switch/call-graph=no/ \
                  --switch-output --overwrite -a

Then, by sending SIGUSR2 to perf whenever lagging happens, we get multiple
perf.data outputs, each corresponding to one abnormal event, and the data
size is reasonable:

 # ls -l ./perf.data*
 -rw------- 1 root root 5122165 May 13 23:51 ./perf.data.2016051323511683
 -rw------- 1 root root 5135093 May 13 23:51 ./perf.data.2016051323512107
 -rw------- 1 root root 5135213 May 13 23:51 ./perf.data.2016051323512215
 -rw------- 1 root root 5135157 May 13 23:51 ./perf.data.2016051323512387

v1 -> v2: Total redesign: drop the concept of 'channel', use an
          auxiliary evlist instead. Fix missing documentation.

v2 -> v3: Rename perf_evlist__toggle_paused() to perf_evlist__pause/resume.

v3 -> v4: Update commit message to describe auxiliary evlist more clearly.

Wang Nan (7):
  perf evlist: Introduce aux perf evlist
  perf tools: Don't poll and mmap overwritable events
  perf tools: Enable overwrite settings
  perf record: Introduce rec->overwrite_evlist for overwritable events
  perf record: Toggle overwrite ring buffer for reading
  perf tools: Don't warn about out of order event if write_backward is
    used
  perf tools: Check write_backward during evlist config

 tools/perf/Documentation/perf-record.txt |  14 ++
 tools/perf/arch/x86/util/tsc.c           |   2 +
 tools/perf/builtin-record.c              | 283 +++++++++++++++++++++++++++----
 tools/perf/perf.h                        |   1 +
 tools/perf/util/evlist.c                 |  52 ++++--
 tools/perf/util/evlist.h                 |   3 +
 tools/perf/util/evsel.c                  |  27 +--
 tools/perf/util/evsel.h                  |  15 ++
 tools/perf/util/parse-events.c           |  20 ++-
 tools/perf/util/parse-events.h           |   2 +
 tools/perf/util/parse-events.l           |   2 +
 tools/perf/util/record.c                 |  17 ++
 tools/perf/util/session.c                |  22 ++-
 13 files changed, 399 insertions(+), 61 deletions(-)

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com

-- 
1.8.3.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/7] perf evlist: Introduce aux perf evlist
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
@ 2016-05-24  2:28 ` Wang Nan
  2016-05-24  2:28 ` [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events Wang Nan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:28 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

An auxiliary evlist is created by perf_evlist__new_aux() using an existing
evlist as its parent. An auxiliary evlist can have its own 'struct perf_mmap',
but no other data of its own. Users should access all other data through its
parent.

An auxiliary evlist is a container of 'struct perf_mmap'. It is introduced to
allow its parent evlist to map different events to separate mmaps. A following
commit creates an auxiliary evlist for overwritable events, because
overwritable events need a read-only, backward ring buffer, which is
different from what normal events use.

To achieve this goal, this patch carefully changes 'evlist' to
'evlist->parent' in all functions on the path of 'perf_evlist__mmap_ex',
except for 'evlist->mmap' related operations, to make sure all evlist
modifications (like pollfd and event id hash tables) go to the original
evlist.

An 'evlist->parent' pointer is added to 'struct perf_evlist'; for normal
evlists it points to the evlist itself.
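
As an illustration only, the parent-pointer idea can be sketched outside
of perf as below. The struct and function names are hypothetical
stand-ins, not the real perf API; 'nr_pollfd' stands for any shared
bookkeeping that must live in the original evlist. The sketch returns
early when the allocation fails, so 'parent' is never assigned through
a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for 'struct perf_evlist'. */
struct evlist_sketch {
	int nr_pollfd;			/* shared state, kept in the root only */
	void *mmap;			/* per-evlist state, owned by each evlist */
	struct evlist_sketch *parent;	/* points to itself for a normal evlist */
};

static void evlist_sketch__init(struct evlist_sketch *evlist)
{
	evlist->nr_pollfd = 0;
	evlist->mmap = NULL;
	evlist->parent = evlist;
}

static struct evlist_sketch *evlist_sketch__new_aux(struct evlist_sketch *parent)
{
	struct evlist_sketch *evlist = calloc(1, sizeof(*evlist));

	if (evlist == NULL)
		return NULL;
	evlist_sketch__init(evlist);
	/* parent->parent keeps an aux of an aux attached to the root. */
	evlist->parent = parent->parent;
	return evlist;
}

static void evlist_sketch__add_pollfd(struct evlist_sketch *evlist)
{
	assert(evlist->parent != NULL);
	/* Shared bookkeeping always goes to the root evlist. */
	evlist->parent->nr_pollfd++;
}
```

Because a normal evlist is its own parent, callers can always write
'evlist->parent' without checking which kind of evlist they hold.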

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 31 +++++++++++++++++++++----------
 tools/perf/util/evlist.h |  3 +++
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e82ba90..55cea69 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -45,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
 	evlist->backward = false;
+	evlist->parent = evlist;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -989,7 +990,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 {
 	struct perf_evsel *evsel;
 
-	evlist__for_each(evlist, evsel) {
+	evlist__for_each(evlist->parent, evsel) {
 		int fd;
 
 		if (evsel->system_wide && thread)
@@ -1016,16 +1017,16 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist->parent, fd, idx) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
 
 		if (evsel->attr.read_format & PERF_FORMAT_ID) {
-			if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
+			if (perf_evlist__id_add_fd(evlist->parent, evsel, cpu, thread,
 						   fd) < 0)
 				return -1;
-			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
+			perf_evlist__set_sid_idx(evlist->parent, evsel, idx, cpu,
 						 thread);
 		}
 	}
@@ -1066,13 +1067,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 					struct mmap_params *mp)
 {
 	int thread;
-	int nr_threads = thread_map__nr(evlist->threads);
+	int nr_threads = thread_map__nr(evlist->parent->threads);
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
 
-		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
+		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist->parent, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
@@ -1211,8 +1212,8 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool auxtrace_overwrite)
 {
 	struct perf_evsel *evsel;
-	const struct cpu_map *cpus = evlist->cpus;
-	const struct thread_map *threads = evlist->threads;
+	const struct cpu_map *cpus = evlist->parent->cpus;
+	const struct thread_map *threads = evlist->parent->threads;
 	struct mmap_params mp = {
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
@@ -1220,7 +1221,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
-	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
+	if (evlist->parent->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist->parent) < 0)
 		return -ENOMEM;
 
 	evlist->overwrite = overwrite;
@@ -1231,7 +1232,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
 				   auxtrace_pages, auxtrace_overwrite);
 
-	evlist__for_each(evlist, evsel) {
+	evlist__for_each(evlist->parent, evsel) {
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
 		    evsel->sample_id == NULL &&
 		    perf_evsel__alloc_id(evsel, cpu_map__nr(cpus), threads->nr) < 0)
@@ -1888,3 +1889,13 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+struct perf_evlist *perf_evlist__new_aux(struct perf_evlist *parent)
+{
+	struct perf_evlist *evlist = zalloc(sizeof(*evlist));
+
+	if (evlist != NULL)
+		perf_evlist__init(evlist, parent->cpus, parent->threads);
+	evlist->parent = parent->parent;
+	return evlist;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index d740fb8..0505012 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -60,6 +60,7 @@ struct perf_evlist {
 	struct perf_evsel *selected;
 	struct events_stats stats;
 	struct perf_env	*env;
+	struct perf_evlist *parent;
 };
 
 struct perf_evsel_str_handler {
@@ -70,6 +71,8 @@ struct perf_evsel_str_handler {
 struct perf_evlist *perf_evlist__new(void);
 struct perf_evlist *perf_evlist__new_default(void);
 struct perf_evlist *perf_evlist__new_dummy(void);
+struct perf_evlist *perf_evlist__new_aux(struct perf_evlist *);
+
 void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 		       struct thread_map *threads);
 void perf_evlist__exit(struct perf_evlist *evlist);
-- 
1.8.3.4

* [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
  2016-05-24  2:28 ` [PATCH v4 1/7] perf evlist: Introduce aux perf evlist Wang Nan
@ 2016-05-24  2:28 ` Wang Nan
  2016-06-02  6:32   ` [tip:perf/core] perf record: Robustify perf_event__synth_time_conv() tip-bot for Wang Nan
  2016-06-02  6:32   ` [tip:perf/core] perf evlist: Don't poll and mmap overwritable events tip-bot for Wang Nan
  2016-05-24  2:29 ` [PATCH v4 3/7] perf tools: Enable overwrite settings Wang Nan
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:28 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

There's no need to receive events from an overwritable ring buffer. Instead,
perf should let them run in the background until something happens. This
patch stops perf from polling for input (POLLIN) on overwritable events.

Overwritable events must be mapped read-only and backward, so if evlist
and evsel do not match (evsel->overwrite is true but the evlist is either
read/write or not backward, or vice versa), skip mapping the evsel.

It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case. record__pick_pc()
is used to check availability. Later commits will extend it when we
introduce auxiliary evlists and have multiple mmaps.
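
The matching and polling rules above can be sketched in isolation as
follows. This is a simplified illustration, not the perf code: the
struct and function names are hypothetical, and only the flags relevant
to the decision are modelled.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical reductions of the evsel/evlist fields used by the check. */
struct evsel_flags {
	bool overwrite;
};

struct evlist_flags {
	bool overwrite;
	bool backward;
};

/* An evsel is mapped only by an evlist that agrees on overwrite mode. */
static bool sketch_should_mmap(const struct evsel_flags *evsel,
			       const struct evlist_flags *evlist)
{
	return evsel->overwrite == (evlist->overwrite && evlist->backward);
}

/* Overwritable events never ask for POLLIN; errors and hangups still count. */
static short sketch_poll_events(const struct evsel_flags *evsel)
{
	short revent = evsel->overwrite ? 0 : POLLIN;

	return (short)(revent | POLLERR | POLLHUP);
}
```

With these rules a normal evlist skips every overwritable evsel, and an
overwrite+backward evlist skips every normal one.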

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/arch/x86/util/tsc.c |  2 ++
 tools/perf/builtin-record.c    |  9 ++++++++-
 tools/perf/util/evlist.c       | 23 +++++++++++++++++++----
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
index 357f1b1..2e5567c 100644
--- a/tools/perf/arch/x86/util/tsc.c
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -62,6 +62,8 @@ int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
 	struct perf_tsc_conversion tc;
 	int err;
 
+	if (!pc)
+		return 0;
 	err = perf_read_tsc_conversion(pc, &tc);
 	if (err == -EOPNOTSUPP)
 		return 0;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc3fcb5..d4cf1b0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -655,6 +655,13 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 	return 0;
 }
 
+static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
+{
+	if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
+		return rec->evlist->mmap[0].base;
+	return NULL;
+}
+
 static int record__synthesize(struct record *rec)
 {
 	struct perf_session *session = rec->session;
@@ -692,7 +699,7 @@ static int record__synthesize(struct record *rec)
 		}
 	}
 
-	err = perf_event__synth_time_conv(rec->evlist->mmap[0].base, tool,
+	err = perf_event__synth_time_conv(record__pick_pc(rec), tool,
 					  process_synthesized_event, machine);
 	if (err)
 		goto out;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 55cea69..fbd0d47 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -463,9 +463,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -481,7 +481,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -984,15 +984,28 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
+			 struct perf_evsel *evsel)
+{
+	if (evsel->overwrite)
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
 {
 	struct perf_evsel *evsel;
+	int revent;
 
 	evlist__for_each(evlist->parent, evsel) {
 		int fd;
 
+		if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+			continue;
+
 		if (evsel->system_wide && thread)
 			continue;
 
@@ -1009,6 +1022,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
+
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1017,7 +1032,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist->parent, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist->parent, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4

* [PATCH v4 3/7] perf tools: Enable overwrite settings
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
  2016-05-24  2:28 ` [PATCH v4 1/7] perf evlist: Introduce aux perf evlist Wang Nan
  2016-05-24  2:28 ` [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events Wang Nan
@ 2016-05-24  2:29 ` Wang Nan
  2016-05-24 18:40   ` Arnaldo Carvalho de Melo
  2016-05-24  2:29 ` [PATCH v4 4/7] perf record: Introduce rec->overwrite_evlist for overwritable events Wang Nan
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:29 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

This patch adds the following option and config terms:

Globally set events to overwrite:

 # perf record --overwrite ...

Set specific events to overwrite or no-overwrite:

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

Add the missing config terms and update the config term array size because
the longest string length has changed.

For overwritable events, automatically select attr.write_backward since
perf requires the ring buffer to be backward for reading.
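
The intended precedence (global option first, per-event term second,
write_backward derived last) can be sketched as below. The enum, struct
and function names are hypothetical and exist only for this
illustration; they are not part of perf.

```c
#include <stdbool.h>

/* Hypothetical per-event term: absent, 'overwrite' or 'no-overwrite'. */
enum overwrite_term_sketch {
	SKETCH_TERM_NONE,
	SKETCH_TERM_OVERWRITE,
	SKETCH_TERM_NO_OVERWRITE,
};

struct overwrite_resolved_sketch {
	bool overwrite;
	bool write_backward;
};

static struct overwrite_resolved_sketch
sketch_resolve_overwrite(bool global_overwrite, enum overwrite_term_sketch term)
{
	struct overwrite_resolved_sketch r;

	/* 1. Start from the global --overwrite setting. */
	r.overwrite = global_overwrite;

	/* 2. A per-event config term overrides the global default. */
	if (term == SKETCH_TERM_OVERWRITE)
		r.overwrite = true;
	else if (term == SKETCH_TERM_NO_OVERWRITE)
		r.overwrite = false;

	/* 3. Only after that, derive write_backward from the final value. */
	r.write_backward = r.overwrite;
	return r;
}
```

Deriving write_backward last is what lets a global '--overwrite' take
effect even for events that carry no config term at all.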

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/Documentation/perf-record.txt | 14 ++++++++++++++
 tools/perf/builtin-record.c              |  1 +
 tools/perf/perf.h                        |  1 +
 tools/perf/util/evsel.c                  | 12 ++++++++++++
 tools/perf/util/evsel.h                  |  2 ++
 tools/perf/util/parse-events.c           | 20 ++++++++++++++++++--
 tools/perf/util/parse-events.h           |  2 ++
 tools/perf/util/parse-events.l           |  2 ++
 8 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 8dbee83..f5cb932 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -360,6 +360,20 @@ particular perf.data snapshot should be kept or not.
 
 Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
 
+--overwrite::
+Makes all events use an overwritable ring buffer. An event with an
+overwritable ring buffer works like a flight recorder: when the buffer gets
+full, instead of dumping records into the output file, the kernel silently
+overwrites old records. Perf dumps data from the overwritable ring buffer
+when switching output (see --switch-output) and before terminating.
+
+Perf behaves like a daemon when '--overwrite' and '--switch-output' are
+provided. It records and drops events in the background, and dumps data when
+something unusual is detected.
+
+The 'overwrite' attribute can also be set or cleared for a specific event
+using config terms like 'cycles/overwrite/' and 'instructions/no-overwrite/'.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d4cf1b0..9611380 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1310,6 +1310,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cd8f1b1..608b42b 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 02c177d..6330a4f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -671,11 +671,22 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
 	}
 
+	/*
+	 * Set backward after config term processing because it is
+	 * possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->overwrite)
+		attr->write_backward = 1;
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
@@ -747,6 +758,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c1f1015..bce99fa 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index bcbc983..85f813d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -900,6 +900,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
 	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
+	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
+	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 };
 
 static bool config_term_shrinked;
@@ -992,6 +994,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1040,6 +1048,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1109,6 +1119,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
@@ -2322,9 +2338,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	/* "branch_type" is the longest name */
+	/* "no-overwrite" is the longest name */
 	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
-			  (sizeof("branch_type") - 1)];
+			  (sizeof("no-overwrite") - 1)];
 
 	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index d740c3c..f341d9d 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,6 +68,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 1477fbc..cc4c426 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -201,6 +201,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4

* [PATCH v4 4/7] perf record: Introduce rec->overwrite_evlist for overwritable events
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
                   ` (2 preceding siblings ...)
  2016-05-24  2:29 ` [PATCH v4 3/7] perf tools: Enable overwrite settings Wang Nan
@ 2016-05-24  2:29 ` Wang Nan
  2016-05-24  2:29 ` [PATCH v4 5/7] perf record: Toggle overwrite ring buffer for reading Wang Nan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:29 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

Create an auxiliary evlist for overwritable events.

Before mmap, build this evlist and set its 'overwrite' and 'backward'
attributes. Since perf_evlist__mmap_ex() only maps events when
evsel->overwrite matches the evlist's corresponding attributes, with
these two evlists an event goes to either rec->evlist or
rec->overwrite_evlist.
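
A rough sketch of the resulting split is shown below. It only counts
where events would end up and whether an auxiliary evlist is needed at
all; the names are hypothetical and no real mmap is performed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of how events are split between the two evlists. */
struct routing_sketch {
	size_t nr_normal;	/* events mapped through the main evlist */
	size_t nr_overwrite;	/* events mapped through the overwrite evlist */
	bool need_aux;		/* create the auxiliary evlist only on demand */
};

static struct routing_sketch sketch_route_events(const bool *evsel_overwrite,
						 size_t nr)
{
	struct routing_sketch r = { 0, 0, false };
	size_t i;

	for (i = 0; i < nr; i++) {
		if (evsel_overwrite[i]) {
			r.nr_overwrite++;
			r.need_aux = true;
		} else {
			r.nr_normal++;
		}
	}
	return r;
}
```

Each event lands in exactly one of the two groups, and when no event is
overwritable the auxiliary evlist is never created.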

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 131 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 108 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9611380..9f6e42c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -50,6 +50,7 @@ struct record {
 	struct perf_data_file	file;
 	struct auxtrace_record	*itr;
 	struct perf_evlist	*evlist;
+	struct perf_evlist	*overwrite_evlist;
 	struct perf_session	*session;
 	const char		*progname;
 	int			realtime_prio;
@@ -341,6 +342,84 @@ int auxtrace_record__snapshot_start(struct auxtrace_record *itr __maybe_unused)
 
 #endif
 
+static int record__create_overwrite_evlist(struct record *rec)
+{
+	struct perf_evlist *evlist = rec->evlist;
+	struct perf_evsel *pos;
+
+	evlist__for_each(evlist, pos) {
+		if (!pos->overwrite)
+			continue;
+
+		if (!rec->overwrite_evlist) {
+			rec->overwrite_evlist = perf_evlist__new_aux(evlist);
+			if (rec->overwrite_evlist) {
+				rec->overwrite_evlist->backward = true;
+				rec->overwrite_evlist->overwrite = true;
+				return 0;
+			} else
+				return -ENOMEM;
+		}
+	}
+	return 0;
+}
+
+static int record__mmap_evlist(struct record *rec,
+			       struct perf_evlist *evlist,
+			       bool overwrite)
+{
+	struct record_opts *opts = &rec->opts;
+	char msg[512];
+
+	/*
+	 * Don't use evlist->overwrite because it is logically an
+	 * internal attribute and is set by perf_evlist__mmap_ex().
+	 * Avoid circular dependency.
+	 */
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, overwrite,
+				 opts->auxtrace_mmap_pages,
+				 opts->auxtrace_snapshot_mode) < 0) {
+		if (errno == EPERM) {
+			pr_err("Permission error mapping pages.\n"
+			       "Consider increasing "
+			       "/proc/sys/kernel/perf_event_mlock_kb,\n"
+			       "or try again with a smaller value of -m/--mmap_pages.\n"
+			       "(current value: %u,%u)\n",
+			       opts->mmap_pages, opts->auxtrace_mmap_pages);
+			return -errno;
+		} else {
+			pr_err("failed to mmap with %d (%s)\n", errno,
+				strerror_r(errno, msg, sizeof(msg)));
+			if (errno)
+				return -errno;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int record__mmap(struct record *rec)
+{
+	int err;
+
+	err = record__create_overwrite_evlist(rec);
+	if (err)
+		return err;
+
+	err = record__mmap_evlist(rec, rec->evlist, false);
+	if (err)
+		return err;
+
+	if (!rec->overwrite_evlist)
+		return 0;
+
+	err = record__mmap_evlist(rec, rec->overwrite_evlist, true);
+	if (err)
+		return err;
+	return 0;
+}
+
 static int record__open(struct record *rec)
 {
 	char msg[512];
@@ -353,6 +432,13 @@ static int record__open(struct record *rec)
 	perf_evlist__config(evlist, opts, &callchain_param);
 
 	evlist__for_each(evlist, pos) {
+		if (pos->overwrite) {
+			if (!pos->attr.write_backward) {
+				ui__warning("Unable to read from overwrite ring buffer\n\n");
+				rc = -ENOSYS;
+				goto out;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
@@ -377,28 +463,9 @@ try_again:
 		goto out;
 	}
 
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
-				 opts->auxtrace_mmap_pages,
-				 opts->auxtrace_snapshot_mode) < 0) {
-		if (errno == EPERM) {
-			pr_err("Permission error mapping pages.\n"
-			       "Consider increasing "
-			       "/proc/sys/kernel/perf_event_mlock_kb,\n"
-			       "or try again with a smaller value of -m/--mmap_pages.\n"
-			       "(current value: %u,%u)\n",
-			       opts->mmap_pages, opts->auxtrace_mmap_pages);
-			rc = -errno;
-		} else {
-			pr_err("failed to mmap with %d (%s)\n", errno,
-				strerror_r(errno, msg, sizeof(msg)));
-			if (errno)
-				rc = -errno;
-			else
-				rc = -EINVAL;
-		}
+	rc = record__mmap(rec);
+	if (rc)
 		goto out;
-	}
-
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
@@ -655,10 +722,26 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 	return 0;
 }
 
+static const struct perf_event_mmap_page *
+perf_evlist__pick_pc(struct perf_evlist *evlist)
+{
+	if (evlist && evlist->mmap && evlist->mmap[0].base)
+		return evlist->mmap[0].base;
+	return NULL;
+}
+
 static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
 {
-	if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
-		return rec->evlist->mmap[0].base;
+	const struct perf_event_mmap_page *pc;
+
+	/* Change it to a loop if a new aux evlist is added */
+	pc = perf_evlist__pick_pc(rec->evlist);
+	if (pc)
+		return pc;
+	pc = perf_evlist__pick_pc(rec->overwrite_evlist);
+	if (pc)
+		return pc;
+
 	return NULL;
 }
 
@@ -1566,6 +1649,8 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 	err = __cmd_record(&record, argc, argv);
 out_symbol_exit:
 	perf_evlist__delete(rec->evlist);
+	if (rec->overwrite_evlist)
+		perf_evlist__delete(rec->overwrite_evlist);
 	symbol__exit();
 	auxtrace_record__free(rec->itr);
 	return err;
-- 
1.8.3.4

* [PATCH v4 5/7] perf record: Toggle overwrite ring buffer for reading
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
                   ` (3 preceding siblings ...)
  2016-05-24  2:29 ` [PATCH v4 4/7] perf record: Introduce rec->overwrite_evlist for overwritable events Wang Nan
@ 2016-05-24  2:29 ` Wang Nan
  2016-05-24  2:29 ` [PATCH v4 6/7] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  2016-05-24  2:29 ` [PATCH v4 7/7] perf tools: Check write_backward during evlist config Wang Nan
  6 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:29 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

overwrite_evt_state is introduced to reflect the state of overwritable
ring buffers. It is a state machine with 3 states:

 RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
    ^                  ^                 |
    |                  |___(disallow)___/|
    |                                    |
     \_________________(3)______________/

 RUNNING      : Overwritable ring buffers are recording
 DATA_PENDING : We are required to collect overwritable ring buffers
 EMPTY        : We have collected data from those ring buffers.

 (1): Pause ring buffers for reading
 (2): Read from ring buffers
 (3): Resume ring buffers for recording

This complexity is unavoidable. Because records are deliberately dropped
from an overwritable ring buffer, we can't detect remaining data by comparing
the head and old pointers. Therefore, the DATA_PENDING state is mandatory.
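
As an illustration only, the transitions in the diagram above can be modelled
in a few lines of standalone C. The names ring_state, ring_action, ring_ctl
and ring_request() are hypothetical and are not part of perf; the sketch only
assumes the three states and the disallowed EMPTY -> DATA_PENDING edge
described here.

```c
/* Hypothetical stand-ins for the three states in the diagram. */
enum ring_state { RING_RUNNING, RING_DATA_PENDING, RING_EMPTY };

/* What the caller would have to do to the ring buffers. */
enum ring_action { ACT_NONE, ACT_PAUSE, ACT_RESUME };

struct ring_ctl {
	enum ring_state state;
};

/*
 * Request a move to 'next' and return the action that the move implies.
 * A request to go from EMPTY back to DATA_PENDING is refused: the state
 * stays EMPTY and no action is taken.
 */
static enum ring_action ring_request(struct ring_ctl *ctl, enum ring_state next)
{
	enum ring_action action = ACT_NONE;

	switch (ctl->state) {
	case RING_RUNNING:
		/* (1): leaving RUNNING means the buffers must be paused. */
		if (next != RING_RUNNING)
			action = ACT_PAUSE;
		break;
	case RING_DATA_PENDING:
		/* (2) needs no action; going back to RUNNING resumes. */
		if (next == RING_RUNNING)
			action = ACT_RESUME;
		break;
	case RING_EMPTY:
		/* (3): resume recording. */
		if (next == RING_RUNNING)
			action = ACT_RESUME;
		/* Disallowed edge: data was already collected, stay EMPTY. */
		if (next == RING_DATA_PENDING)
			next = RING_EMPTY;
		break;
	}

	ctl->state = next;
	return action;
}
```

Keeping the "what to do" decision separate from the state update makes the
refused edge easy to see: the request is accepted silently, but the stored
state does not change.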

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 146 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 136 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9f6e42c..0d89bf4 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -42,6 +42,28 @@
 #include <sys/mman.h>
 #include <asm/bug.h>
 
+/*
+ * State machine of overwrite_evt_state:
+ *
+ * RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
+ *    ^                  ^                 |
+ *    |                  |___(disallow)___/|
+ *    |                                    |
+ *     \_________________(3)______________/
+ *
+ * RUNNING      : Overwritable ring buffers are recording
+ * DATA_PENDING : We are required to collect overwritable ring buffers
+ * EMPTY        : We have collected data from those ring buffers.
+ *
+ * (1): Pause ring buffers for reading
+ * (2): Read from ring buffers
+ * (3): Resume ring buffers for recording
+ */
+enum overwrite_evt_state {
+	OVERWRITE_EVT_RUNNING,
+	OVERWRITE_EVT_DATA_PENDING,
+	OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
 	struct perf_tool	tool;
@@ -61,6 +83,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			switch_output;
+	enum overwrite_evt_state overwrite_evt_state;
 	unsigned long long	samples;
 };
 
@@ -132,9 +155,9 @@ rb_find_range(struct perf_evlist *evlist,
 	return backward_rb_find_range(data, mask, head, start, end);
 }
 
-static int record__mmap_read(struct record *rec, int idx)
+static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &rec->evlist->mmap[idx];
+	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
 	u64 end = head, start = old;
@@ -143,7 +166,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	void *buf;
 	int rc = 0;
 
-	if (rb_find_range(rec->evlist, data, md->mask, head,
+	if (rb_find_range(evlist, data, md->mask, head,
 			  old, &start, &end))
 		return -1;
 
@@ -157,7 +180,7 @@ static int record__mmap_read(struct record *rec, int idx)
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
 		md->prev = head;
-		perf_evlist__mmap_consume(rec->evlist, idx);
+		perf_evlist__mmap_consume(evlist, idx);
 		return 0;
 	}
 
@@ -182,7 +205,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	}
 
 	md->prev = head;
-	perf_evlist__mmap_consume(rec->evlist, idx);
+	perf_evlist__mmap_consume(evlist, idx);
 out:
 	return rc;
 }
@@ -468,6 +491,7 @@ try_again:
 		goto out;
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
+	rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
 	return rc;
 }
@@ -548,17 +572,72 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
-static int record__mmap_read_all(struct record *rec)
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+				enum overwrite_evt_state state)
+{
+	struct perf_evlist *evlist = rec->overwrite_evlist;
+	enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+
+	switch (old_state) {
+	case OVERWRITE_EVT_RUNNING:
+		if (state != OVERWRITE_EVT_RUNNING)
+			action = PAUSE;
+		break;
+	case OVERWRITE_EVT_DATA_PENDING:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		break;
+	case OVERWRITE_EVT_EMPTY:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		if (state == OVERWRITE_EVT_DATA_PENDING)
+			state = OVERWRITE_EVT_EMPTY;
+		break;
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	rec->overwrite_evt_state = state;
+
+	if (action == NONE)
+		return;
+
+	if (!evlist)
+		return;
+
+	switch (action) {
+	case PAUSE:
+		perf_evlist__pause(evlist);
+		break;
+	case RESUME:
+		perf_evlist__resume(evlist);
+		break;
+	case NONE:
+	default:
+		break;
+	}
+}
+
+static int __record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
+	if (!evlist)
+		return 0;
+
+	for (i = 0; i < evlist->nr_mmaps; i++) {
+		struct auxtrace_mmap *mm = &evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
-			if (record__mmap_read(rec, i) != 0) {
+		if (evlist->mmap[i].base) {
+			if (record__mmap_read(rec, evlist, i) != 0) {
 				rc = -1;
 				goto out;
 			}
@@ -582,6 +661,23 @@ out:
 	return rc;
 }
 
+static int record__mmap_read_all(struct record *rec)
+{
+	int err;
+
+	err = __record__mmap_read_evlist(rec, rec->evlist);
+	if (err)
+		return err;
+
+	if (rec->overwrite_evt_state == OVERWRITE_EVT_DATA_PENDING) {
+		err = __record__mmap_read_evlist(rec, rec->overwrite_evlist);
+		if (err)
+			return err;
+		record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_EMPTY);
+	}
+	return 0;
+}
+
 static void record__init_features(struct record *rec)
 {
 	struct perf_session *session = rec->session;
@@ -978,6 +1074,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->overwrite_evt_state may already be
+		 * OVERWRITE_EVT_EMPTY here: when done == true and
+		 * hits != rec->samples after the previous read.
+		 *
+		 * record__toggle_overwrite_evsels() ensures we never
+		 * convert OVERWRITE_EVT_EMPTY to OVERWRITE_EVT_DATA_PENDING.
+		 */
+		if (trigger_is_hit(&switch_output_trigger) || done || draining)
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
@@ -997,8 +1104,27 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (trigger_is_hit(&switch_output_trigger)) {
+			/*
+			 * If switch_output_trigger is hit, the data in
+			 * overwritable ring buffer should have been collected,
+			 * so overwrite_evt_state should be set to
+			 * OVERWRITE_EVT_EMPTY.
+			 *
+			 * If SIGUSR2 is raised during or after record__mmap_read_all(),
+			 * record__mmap_read_all() didn't collect data from the
+			 * overwritable ring buffer. Read again.
+			 */
+			if (rec->overwrite_evt_state == OVERWRITE_EVT_RUNNING)
+				continue;
 			trigger_ready(&switch_output_trigger);
 
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_RUNNING);
+
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
 					waking);
-- 
1.8.3.4


* [PATCH v4 6/7] perf tools: Don't warn about out of order event if write_backward is used
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
                   ` (4 preceding siblings ...)
  2016-05-24  2:29 ` [PATCH v4 5/7] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-05-24  2:29 ` Wang Nan
  2016-05-24  2:29 ` [PATCH v4 7/7] perf tools: Check write_backward during evlist config Wang Nan
  6 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:29 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

If the write_backward attribute is set, records are written into the kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid the 'XX out of order events recorded' warning message (timestamps
of records are in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at least one event.
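
The rule above can be sketched in standalone C as follows. This is only an
illustration: struct demo_event and should_warn_out_of_order() are
hypothetical names, and the real code walks the session's evlist rather than
a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event; only the relevant bit is kept. */
struct demo_event {
	bool write_backward;
};

/*
 * Return true if the "out of order events" warning should be printed:
 * some unordered events were counted and no event uses write_backward.
 */
static bool should_warn_out_of_order(const struct demo_event *events,
				     size_t nr_events,
				     unsigned int nr_unordered)
{
	size_t i;

	/* A single backward-writing event is enough to suppress the warning. */
	for (i = 0; i < nr_events; i++) {
		if (events[i].write_backward)
			return false;
	}

	return nr_unordered != 0;
}
```

Note that the suppression is all-or-nothing: once any event writes backward,
out-of-order counts from forward-writing events are not reported either.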

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 2335b28..8e3d9d4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1495,10 +1495,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1555,8 +1572,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4


* [PATCH v4 7/7] perf tools: Check write_backward during evlist config
  2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
                   ` (5 preceding siblings ...)
  2016-05-24  2:29 ` [PATCH v4 6/7] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
@ 2016-05-24  2:29 ` Wang Nan
  6 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2016-05-24  2:29 UTC (permalink / raw)
  To: acme
  Cc: pi3orama, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li

Before this patch, when using an overwritable ring buffer on an old
kernel, the error message is misleading:

 # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
 Error:
 The raw_syscalls:sys_enter event is not supported.

This patch outputs a clear error message telling the user that the kernel
is too old:

 # ~/perf record -m 1 -e raw_syscalls:*/overwrite/ -a
 Reading from overwrite event is not supported by this kernel
 Error:
 The raw_syscalls:sys_enter event is not supported.
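
The check behind this message is a probe that runs once and caches its
result. A minimal standalone sketch of that pattern is shown here; the names
feature_probe, feature_missing(), probe_old_kernel() and probe_new_kernel()
are hypothetical, and the two stub probes merely stand in for asking the
kernel whether it accepts write_backward.

```c
#include <stdbool.h>

/* A probe returns true if the running kernel supports the feature. */
typedef bool (*probe_fn)(void);

/* Cached probe result; 'calls' only exists to show the probe runs once. */
struct feature_probe {
	bool checked;
	bool missing;
	unsigned int calls;
};

/* Stub probes standing in for a real kernel query (assumption). */
static bool probe_old_kernel(void) { return false; }
static bool probe_new_kernel(void) { return true; }

/* Run the probe on first use only; later calls return the cached answer. */
static bool feature_missing(struct feature_probe *p, probe_fn probe)
{
	if (!p->checked) {
		p->missing = !probe();
		p->checked = true;
		p->calls++;
	}
	return p->missing;
}
```

Caching matters because a real probe typically has to open a test event,
which is too costly to repeat for every event being configured.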

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evsel.c  | 17 +++++------------
 tools/perf/util/evsel.h  | 13 +++++++++++++
 tools/perf/util/record.c | 17 +++++++++++++++++
 3 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6330a4f..994310f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -29,17 +29,7 @@
 #include "trace-event.h"
 #include "stat.h"
 
-static struct {
-	bool sample_id_all;
-	bool exclude_guest;
-	bool mmap2;
-	bool cloexec;
-	bool clockid;
-	bool clockid_wrong;
-	bool lbr_flags;
-	bool write_backward;
-} perf_missing_features;
-
+struct perf_missing_features perf_missing_features;
 static clockid_t clockid;
 
 static int perf_evsel__no_extra_init(struct perf_evsel *evsel __maybe_unused)
@@ -684,8 +674,11 @@ static void apply_config_terms(struct perf_evsel *evsel,
 	 * possible to set overwrite globally, without config
 	 * terms.
 	 */
-	if (evsel->overwrite)
+	if (evsel->overwrite) {
+		WARN_ONCE(perf_missing_features.write_backward,
+			  "Reading from overwrite event is not supported by this kernel\n");
 		attr->write_backward = 1;
+	}
 
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index bce99fa..c9b6716 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -11,6 +11,19 @@
 #include "cpumap.h"
 #include "counts.h"
 
+struct perf_missing_features {
+	bool sample_id_all;
+	bool exclude_guest;
+	bool mmap2;
+	bool cloexec;
+	bool clockid;
+	bool clockid_wrong;
+	bool lbr_flags;
+	bool write_backward;
+};
+
+extern struct perf_missing_features perf_missing_features;
+
 struct perf_evsel;
 
 /*
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 481792c..e3ab812 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -90,6 +90,11 @@ static void perf_probe_context_switch(struct perf_evsel *evsel)
 	evsel->attr.context_switch = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+	evsel->attr.write_backward = 1;
+}
+
 bool perf_can_sample_identifier(void)
 {
 	return perf_probe_api(perf_probe_sample_identifier);
@@ -129,6 +134,17 @@ bool perf_can_record_cpu_wide(void)
 	return true;
 }
 
+static void perf_check_write_backward(void)
+{
+	static bool checked = false;
+
+	if (!checked) {
+		perf_missing_features.write_backward =
+			!perf_probe_api(perf_probe_write_backward);
+		checked = true;
+	}
+}
+
 void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts,
 			 struct callchain_param *callchain)
 {
@@ -136,6 +152,7 @@ void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts,
 	bool use_sample_identifier = false;
 	bool use_comm_exec;
 
+	perf_check_write_backward();
 	/*
 	 * Set the evsel leader links before we configure attributes,
 	 * since some might depend on this info.
-- 
1.8.3.4


* Re: [PATCH v4 3/7] perf tools: Enable overwrite settings
  2016-05-24  2:29 ` [PATCH v4 3/7] perf tools: Enable overwrite settings Wang Nan
@ 2016-05-24 18:40   ` Arnaldo Carvalho de Melo
  2016-05-24 18:41     ` Arnaldo Carvalho de Melo
  2016-05-25  2:14     ` Wangnan (F)
  0 siblings, 2 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-24 18:40 UTC (permalink / raw)
  To: Wang Nan
  Cc: pi3orama, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li

Em Tue, May 24, 2016 at 02:29:00AM +0000, Wang Nan escreveu:
> This patch allows following config terms and option:
> 
> Globally setting events to overwrite;
> 
>  # perf record --overwrite ...
> 
> Set specific events to be overwrite or no-overwrite.
> 
>  # perf record --event cycles/overwrite/ ...
>  # perf record --event cycles/no-overwrite/ ...

So, based on this chunk of documentation in this patch:

<quote>
Perf dumps data from overwritable ring buffer when switching output (see
--switch-output) and before terminate.
</>

I tried:

No --overwrite:

  # perf record -e syscalls:*enter_nanosleep* usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data (1 samples) ]
  # perf evlist -v
  syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x132, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  # perf script
          usleep 29416 [002] 220099.782982: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc21f73cc0, rmtp: 0x00000000

Now I went on to try this new --overwrite thing:

  # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data ]
  # perf evlist -v
syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x132, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  # perf script
  # 

So it hasn't recorded anything at anytime, i.e. I expected, based on the
documentation provided, that it would get what was in its buffer, to be written, 
i.e. the single "syscalls:sys_enter_nanosleep" event that took place in that
workload.

So I'm now trying it together with --switch-output, but I just get one
timestamp suffixed perf.data file, empty, without that event that I know took
place.

Care to elaborate here?

- Arnaldo
 
> Add missing config terms and update config term array size because the
> longest string length is changed.
> 
> For overwritable events, automatically select attr.write_backward since
> perf requires it to be backward for reading.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/Documentation/perf-record.txt | 14 ++++++++++++++
>  tools/perf/builtin-record.c              |  1 +
>  tools/perf/perf.h                        |  1 +
>  tools/perf/util/evsel.c                  | 12 ++++++++++++
>  tools/perf/util/evsel.h                  |  2 ++
>  tools/perf/util/parse-events.c           | 20 ++++++++++++++++++--
>  tools/perf/util/parse-events.h           |  2 ++
>  tools/perf/util/parse-events.l           |  2 ++
>  8 files changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 8dbee83..f5cb932 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -360,6 +360,20 @@ particular perf.data snapshot should be kept or not.
>  
>  Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
>  
> +--overwrite::
> +Makes all events use overwritable ring buffer. Event with overwritable ring
> +buffer works like a flight recorder: when buffer gets full, instead of dumping
> +records into output file, kernel overwrites old records silently. Perf dumps
> +data from overwritable ring buffer when switching output (see --switch-output)
> +and before terminate.
> +
> +Perf behaves like a daemon when '--overwrite' and '--switch-output' are
> +provided. It record and drop events in background, and dumps data when
> +something unusual is detected.
> +
> +'overwrite' attribute can also be set or canceled for specific event using
> +config terms like 'cycles/overwrite/' and 'instructions/no-overwrite/'.
> +
>  SEE ALSO
>  --------
>  linkperf:perf-stat[1], linkperf:perf-list[1]
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index d4cf1b0..9611380 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1310,6 +1310,7 @@ struct option __record_options[] = {
>  	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
>  			&record.opts.no_inherit_set,
>  			"child tasks do not inherit counters"),
> +	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
>  	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
>  	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
>  		     "number of mmap data pages and AUX area tracing mmap pages",
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index cd8f1b1..608b42b 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -59,6 +59,7 @@ struct record_opts {
>  	bool	     record_switch_events;
>  	bool	     all_kernel;
>  	bool	     all_user;
> +	bool	     overwrite;
>  	unsigned int freq;
>  	unsigned int mmap_pages;
>  	unsigned int auxtrace_mmap_pages;
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 02c177d..6330a4f 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -671,11 +671,22 @@ static void apply_config_terms(struct perf_evsel *evsel,
>  			 */
>  			attr->inherit = term->val.inherit ? 1 : 0;
>  			break;
> +		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
> +			evsel->overwrite = term->val.overwrite ? 1 : 0;
> +			break;
>  		default:
>  			break;
>  		}
>  	}
>  
> +	/*
> +	 * Set backward after config term processing because it is
> +	 * possible to set overwrite globally, without config
> +	 * terms.
> +	 */
> +	if (evsel->overwrite)
> +		attr->write_backward = 1;
> +
>  	/* User explicitly set per-event callgraph, clear the old setting and reset. */
>  	if ((callgraph_buf != NULL) || (dump_size > 0)) {
>  
> @@ -747,6 +758,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
>  
>  	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
>  	attr->inherit	    = !opts->no_inherit;
> +	evsel->overwrite    = opts->overwrite;
>  
>  	perf_evsel__set_sample_bit(evsel, IP);
>  	perf_evsel__set_sample_bit(evsel, TID);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index c1f1015..bce99fa 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -44,6 +44,7 @@ enum {
>  	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
>  	PERF_EVSEL__CONFIG_TERM_STACK_USER,
>  	PERF_EVSEL__CONFIG_TERM_INHERIT,
> +	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
>  	PERF_EVSEL__CONFIG_TERM_MAX,
>  };
>  
> @@ -57,6 +58,7 @@ struct perf_evsel_config_term {
>  		char	*callgraph;
>  		u64	stack_user;
>  		bool	inherit;
> +		bool	overwrite;
>  	} val;
>  };
>  
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index bcbc983..85f813d 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -900,6 +900,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
>  	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
>  	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
>  	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
> +	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
> +	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
>  };
>  
>  static bool config_term_shrinked;
> @@ -992,6 +994,12 @@ do {									   \
>  	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
>  		CHECK_TYPE_VAL(NUM);
>  		break;
> +	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +		CHECK_TYPE_VAL(NUM);
> +		break;
> +	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
> +		CHECK_TYPE_VAL(NUM);
> +		break;
>  	case PARSE_EVENTS__TERM_TYPE_NAME:
>  		CHECK_TYPE_VAL(STR);
>  		break;
> @@ -1040,6 +1048,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
>  	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
>  	case PARSE_EVENTS__TERM_TYPE_INHERIT:
>  	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
> +	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
>  		return config_term_common(attr, term, err);
>  	default:
>  		if (err) {
> @@ -1109,6 +1119,12 @@ do {								\
>  		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
>  			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
>  			break;
> +		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
> +			break;
> +		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
> +			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
> +			break;
>  		default:
>  			break;
>  		}
> @@ -2322,9 +2338,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
>  char *parse_events_formats_error_string(char *additional_terms)
>  {
>  	char *str;
> -	/* "branch_type" is the longest name */
> +	/* "no-overwrite" is the longest name */
>  	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
> -			  (sizeof("branch_type") - 1)];
> +			  (sizeof("no-overwrite") - 1)];
>  
>  	config_terms_list(static_terms, sizeof(static_terms));
>  	/* valid terms */
> diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> index d740c3c..f341d9d 100644
> --- a/tools/perf/util/parse-events.h
> +++ b/tools/perf/util/parse-events.h
> @@ -68,6 +68,8 @@ enum {
>  	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
>  	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
>  	PARSE_EVENTS__TERM_TYPE_INHERIT,
> +	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
> +	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
>  	__PARSE_EVENTS__TERM_TYPE_NR,
>  };
>  
> diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
> index 1477fbc..cc4c426 100644
> --- a/tools/perf/util/parse-events.l
> +++ b/tools/perf/util/parse-events.l
> @@ -201,6 +201,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
>  stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
>  inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
>  no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
> +overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
> +no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
>  ,			{ return ','; }
>  "/"			{ BEGIN(INITIAL); return '/'; }
>  {name_minus}		{ return str(yyscanner, PE_NAME); }
> -- 
> 1.8.3.4


* Re: [PATCH v4 3/7] perf tools: Enable overwrite settings
  2016-05-24 18:40   ` Arnaldo Carvalho de Melo
@ 2016-05-24 18:41     ` Arnaldo Carvalho de Melo
  2016-05-25  2:14     ` Wangnan (F)
  1 sibling, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-24 18:41 UTC (permalink / raw)
  To: Wang Nan
  Cc: pi3orama, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li

Em Tue, May 24, 2016 at 03:40:30PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, May 24, 2016 at 02:29:00AM +0000, Wang Nan escreveu:
> > This patch allows following config terms and option:
> > 
> > Globally setting events to overwrite;
> > 
> >  # perf record --overwrite ...
> > 
> > Set specific events to be overwrite or no-overwrite.
> > 
> >  # perf record --event cycles/overwrite/ ...
> >  # perf record --event cycles/no-overwrite/ ...
> 
> So, based on this chunk of documentation in this patch:

BTW, I applied the first 2 patches.

- Arnaldo


* Re: [PATCH v4 3/7] perf tools: Enable overwrite settings
  2016-05-24 18:40   ` Arnaldo Carvalho de Melo
  2016-05-24 18:41     ` Arnaldo Carvalho de Melo
@ 2016-05-25  2:14     ` Wangnan (F)
  1 sibling, 0 replies; 13+ messages in thread
From: Wangnan (F) @ 2016-05-25  2:14 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: pi3orama, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li



On 2016/5/25 2:40, Arnaldo Carvalho de Melo wrote:
> Em Tue, May 24, 2016 at 02:29:00AM +0000, Wang Nan escreveu:
>> This patch allows following config terms and option:
>>
>> Globally setting events to overwrite;
>>
>>   # perf record --overwrite ...
>>
>> Set specific events to be overwrite or no-overwrite.
>>
>>   # perf record --event cycles/overwrite/ ...
>>   # perf record --event cycles/no-overwrite/ ...
> So, based on this chunk of documentation in this patch:
>
> <quote>
> Perf dumps data from overwritable ring buffer when switching output (see
> --switch-output) and before terminate.
> </>
>
> I tried:
>
> No --overwrite:
>
>    # perf record -e syscalls:*enter_nanosleep* usleep 1
>    [ perf record: Woken up 1 times to write data ]
>    [ perf record: Captured and wrote 0.019 MB perf.data (1 samples) ]
>    # perf evlist -v
>    syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x132, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
>    # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
>    # perf script
>            usleep 29416 [002] 220099.782982: syscalls:sys_enter_nanosleep: rqtp: 0x7ffc21f73cc0, rmtp: 0x00000000
>
> Now I went on to try this new --overwrite thing:
>
>    # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
>    [ perf record: Woken up 1 times to write data ]
>    [ perf record: Captured and wrote 0.019 MB perf.data ]
>    # perf evlist -v
> syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x132, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
>    # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
>    # perf script
>    #
>
> So it hasn't recorded anything at anytime, i.e. I expected, based on the
> documentation provided, that it would get what was in its buffer, to be written,
> i.e. the single "syscalls:sys_enter_nanosleep" event that took place in that
> workload.
>
> So I'm now trying it together with --switch-output, but I just get one
> timestamp suffixed perf.data file, empty, without that event that I know took
> place.
>
> Care to elaborate here?

Sorry, you need to apply patches 3/7 - 5/7 to enable the operations
described here.

I'll reorder these patches and send them again, but some important
patches would become untestable until patch 3/7 gets applied. I'll try
to add a test case for them.

Thank you.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:perf/core] perf record: Robustify perf_event__synth_time_conv()
  2016-05-24  2:28 ` [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events Wang Nan
@ 2016-06-02  6:32   ` tip-bot for Wang Nan
  2016-06-02  6:32   ` [tip:perf/core] perf evlist: Don't poll and mmap overwritable events tip-bot for Wang Nan
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Wang Nan @ 2016-06-02  6:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, mhiramat, acme, linux-kernel, mingo, jolsa, tglx, wangnan0,
	lizefan, hekuang, namhyung

Commit-ID:  c45628b0a3f90c4ffeca5f72f227008ceedc21c5
Gitweb:     http://git.kernel.org/tip/c45628b0a3f90c4ffeca5f72f227008ceedc21c5
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Tue, 24 May 2016 02:28:59 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 30 May 2016 12:41:44 -0300

perf record: Robustify perf_event__synth_time_conv()

It is possible that all events in an evlist are overwritable.
perf_event__synth_time_conv() should not crash in this case.
record__pick_pc() is used to check availability.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/util/tsc.c | 2 ++
 tools/perf/builtin-record.c    | 9 ++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
index 357f1b1..2e5567c 100644
--- a/tools/perf/arch/x86/util/tsc.c
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -62,6 +62,8 @@ int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
 	struct perf_tsc_conversion tc;
 	int err;
 
+	if (!pc)
+		return 0;
 	err = perf_read_tsc_conversion(pc, &tc);
 	if (err == -EOPNOTSUPP)
 		return 0;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dc3fcb5..d4cf1b0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -655,6 +655,13 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 	return 0;
 }
 
+static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
+{
+	if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
+		return rec->evlist->mmap[0].base;
+	return NULL;
+}
+
 static int record__synthesize(struct record *rec)
 {
 	struct perf_session *session = rec->session;
@@ -692,7 +699,7 @@ static int record__synthesize(struct record *rec)
 		}
 	}
 
-	err = perf_event__synth_time_conv(rec->evlist->mmap[0].base, tool,
+	err = perf_event__synth_time_conv(record__pick_pc(rec), tool,
 					  process_synthesized_event, machine);
 	if (err)
 		goto out;
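
The guard in the diff above can be sketched in isolation as follows. This is a minimal illustration rather than the perf code itself: the demo_* structures and functions are hypothetical stand-ins that keep only the fields the check touches, and the "emitted" flag is an assumption added so the effect of the early return can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the perf structures (not the real definitions). */
struct demo_mmap { void *base; };
struct demo_evlist { struct demo_mmap *mmap; };
struct demo_record { struct demo_evlist *evlist; };

/*
 * Same shape as record__pick_pc(): return the first mapped control page,
 * or NULL when the evlist has no usable mapping.
 */
const void *demo_pick_pc(const struct demo_record *rec)
{
	if (rec->evlist && rec->evlist->mmap && rec->evlist->mmap[0].base)
		return rec->evlist->mmap[0].base;
	return NULL;
}

/*
 * Same shape as the early return added to perf_event__synth_time_conv():
 * a NULL page means "nothing to synthesize", which is reported as success.
 * *emitted is set to 1 only when a page was available.
 */
int demo_synth_time_conv(const void *pc, int *emitted)
{
	assert(emitted != NULL);
	*emitted = 0;
	if (!pc)
		return 0;
	*emitted = 1;
	return 0;
}
```

With an evlist whose mmap array is missing or whose first entry has no base, demo_pick_pc() returns NULL and demo_synth_time_conv() returns 0 without emitting anything, so the caller's "if (err) goto out;" path is not taken.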

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip:perf/core] perf evlist: Don't poll and mmap overwritable events
  2016-05-24  2:28 ` [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events Wang Nan
  2016-06-02  6:32   ` [tip:perf/core] perf record: Robustify perf_event__synth_time_conv() tip-bot for Wang Nan
@ 2016-06-02  6:32   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Wang Nan @ 2016-06-02  6:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, tglx, acme, hpa, namhyung, hekuang, linux-kernel, mingo,
	wangnan0, lizefan, mhiramat

Commit-ID:  f3058a1c1932aa1b027856945163144bda6366df
Gitweb:     http://git.kernel.org/tip/f3058a1c1932aa1b027856945163144bda6366df
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Tue, 24 May 2016 02:28:59 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 30 May 2016 12:41:45 -0300

perf evlist: Don't poll and mmap overwritable events

There's no need to receive events from an overwritable ring buffer.
Instead, perf should let them run in the background until some external
event of interest takes place.  This patch makes perf ignore normal
events from overwrite evlists.

Overwritable events must be mapped read-only and backward, so if evlist
and evsel don't match (evsel->overwrite is true but either evlist is
read/write or evlist is not backward, and vice versa), skip mapping it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e82ba90..50d7b80 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -462,9 +462,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -480,7 +480,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -983,15 +983,28 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
+			 struct perf_evsel *evsel)
+{
+	if (evsel->overwrite)
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
 {
 	struct perf_evsel *evsel;
+	int revent;
 
 	evlist__for_each(evlist, evsel) {
 		int fd;
 
+		if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+			continue;
+
 		if (evsel->system_wide && thread)
 			continue;
 
@@ -1008,6 +1021,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
+
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1016,7 +1031,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
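
The effect of passing 0 instead of POLLIN as the requested events can be sketched with plain poll(2), outside of perf. This is an illustration under assumptions: it uses a pipe rather than a perf event fd, the demo_* helpers are hypothetical, and the behaviour described is what Linux does for pipes. poll() reports POLLERR and POLLHUP whether or not they are requested, so an fd added with an empty mask never wakes the poller for readable data but is still noticed on hang-up.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Poll one fd with the given requested events and a zero timeout. */
int demo_poll_revents(int fd, short events)
{
	struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };

	assert(fd >= 0);
	if (poll(&pfd, 1, 0) < 0)
		return -1;
	return pfd.revents;
}

/*
 * Create a pipe with one byte pending, optionally close the write end,
 * and return what poll() reports for the read end under the given mask.
 * Returns -1 if the pipe could not be set up.
 */
int demo_pipe_revents(short events, int hang_up)
{
	int fds[2];
	int revents;

	if (pipe(fds) < 0)
		return -1;
	if (write(fds[1], "x", 1) != 1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (hang_up)
		close(fds[1]);

	revents = demo_poll_revents(fds[0], events);

	close(fds[0]);
	if (!hang_up)
		close(fds[1]);
	return revents;
}
```

Here demo_pipe_revents(POLLIN, 0) reports the pending byte, demo_pipe_revents(0, 0) reports nothing even though data is readable, and demo_pipe_revents(0, 1) still reports the hang-up. That matches the intent of the patch: overwritable events stay in the pollfd array for POLLHUP filtering without waking perf for ordinary data.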

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-06-02  6:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-24  2:28 [PATCH v4 0/7] perf tools: Support overwritable ring buffer Wang Nan
2016-05-24  2:28 ` [PATCH v4 1/7] perf evlist: Introduce aux perf evlist Wang Nan
2016-05-24  2:28 ` [PATCH v4 2/7] perf tools: Don't poll and mmap overwritable events Wang Nan
2016-06-02  6:32   ` [tip:perf/core] perf record: Robustify perf_event__synth_time_conv() tip-bot for Wang Nan
2016-06-02  6:32   ` [tip:perf/core] perf evlist: Don't poll and mmap overwritable events tip-bot for Wang Nan
2016-05-24  2:29 ` [PATCH v4 3/7] perf tools: Enable overwrite settings Wang Nan
2016-05-24 18:40   ` Arnaldo Carvalho de Melo
2016-05-24 18:41     ` Arnaldo Carvalho de Melo
2016-05-25  2:14     ` Wangnan (F)
2016-05-24  2:29 ` [PATCH v4 4/7] perf record: Introduce rec->overwrite_evlist for overwritable events Wang Nan
2016-05-24  2:29 ` [PATCH v4 5/7] perf record: Toggle overwrite ring buffer for reading Wang Nan
2016-05-24  2:29 ` [PATCH v4 6/7] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
2016-05-24  2:29 ` [PATCH v4 7/7] perf tools: Check write_backward during evlist config Wang Nan
