linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v16 00/15] perf tools: Support overwritable ring buffer
@ 2016-07-14  8:34 Wang Nan
  2016-07-14  8:34 ` [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator Wang Nan
                   ` (15 more replies)
  0 siblings, 16 replies; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

This patch set enables daemonized perf recording by utilizing
overwritable backward ring buffer. With this feature one can
put perf background, and dump ring buffer records by a SIGUSR2
when he/she find something unusual. For example, following
command record system calls, schedule events and samples on cpu cycles
continously:

 # perf record -g -e cycles -e raw_syscalls:*/call-graph=no/ \
                  -e sched:sched_switch/call-graph=no/ \
                  --switch-output --overwrite -a

Then by sending SIGUSR2 to perf when lagging is happen, we get multiple
perf.data output, each of them correspond a abnormal event, and the data
size is reasonable:

 # ls -l ./perf.data*
 -rw------- 1 root root 5122165 May 13 23:51 ./perf.data.2016051323511683
 -rw------- 1 root root 5135093 May 13 23:51 ./perf.data.2016051323512107
 -rw------- 1 root root 5135213 May 13 23:51 ./perf.data.2016051323512215
 -rw------- 1 root root 5135157 May 13 23:51 ./perf.data.2016051323512387

v1 -> v2: Totally redesign: drop the principle of 'channal', use
          auxiliary evlist instead. Fix missing documentation.

v2 -> v3: Rename perf_evlist__toggle_paused() to perf_evlist__pause/resume.

v3 -> v4: Update commit message to describe auxiliary evlist more clearly.

v4 -> v5: Reorder commits, ensure '--overwrite' works right after perf
          support the option.
          Add test cases for auxiliary evlist.
          Avoid bug if main evlist is empty.

v5 -> v6: Improve filter pollfd related code.

v6 -> v7: Rebase to newest perf/core.

v7 -> v8: Unmap mmaps from parent and children in
          perf_evlist__munmap_filtered(), hide more detail of aux evlist.
          Add --tail-synthesize, do synthesize at the end of perf.data.

v8 -> v9: Beautify code of test case, make patch set more granular,
          improve documentation.

v9 -> v10: Make patch set more granular: extract preparation code to
           patch 1-3.

v10 -> v11: Rebase to newest perf/core: solve conflicts caused by commit
            e5cadb93d08 ("perf evlist: Rename for_each() macros to
            for_each_entry()").

v11 -> v12: Improve 'perf test backward': skip this test on old kernel,
            resolve conflicts.

v12 -> v13: Drop evsel->overwrite, use evsel->attr.write_backward instead.

v13 -> v14: Follow Jiri Olsa's suggestion: Improve commit message,
            add OVERWRITE_EVT_NOTREADY state, stop the state machine if
	    overwrite_evlist is not generated.

v14 -> v15: Totally redesign: drop auxiliary evlist, use one evlist to
            contain normal and backward ring buffers (add evlist->backward_mmap).
            Build new API and helpers for it.

v15 -> v16: Follow Jiri Olsa's suggestion:
            1. Move backward ring buffer state machine to evlist.
	    2. Dynamically allocate backward mmap, so backward_mmap == NULL
	       means there's no backward ring buffer.
	       Stop the state machine when there's no backward ring buffer.
	    3. Rename: _output2 to _output_backward.
	    4. Patch rearrangement.
	    5. Update record__pick_pc(): read from backward_mmap if normal
               mmap is empty.

Arnaldo Carvalho de Melo (1):
  perf tools: Drop redundant evsel->overwrite indicator

Wang Nan (14):
  tools lib fd array: Allow associating a pointer cookie with each entry
  perf tools: Update perf evlist mmap related APIs and helpers
  perf tools: Decouple record__mmap_read() and evlist.
  perf tools: Record mmap cookie into fdarray private field
  perf tools: Extract common code in mmap failure processing
  perf tools: Introduce backward_mmap array for evlist
  perf tools: Map backward events to backward_mmap
  perf tools: Drop evlist->backward
  perf tools: Setup backward mmap state machine
  perf record: Read from overwritable ring buffer
  perf tools: Make perf_evlist__{pause,resume} internal helpers
  perf tools: Enable overwrite settings
  perf tools: Don't warn about out of order event if write_backward is
    used
  perf tools: Add --tail-synthesize option

 tools/lib/api/fd/array.h                 |   1 +
 tools/perf/Documentation/perf-record.txt |  22 +++
 tools/perf/builtin-record.c              | 113 ++++++++++---
 tools/perf/perf.h                        |   2 +
 tools/perf/tests/backward-ring-buffer.c  |  14 +-
 tools/perf/util/evlist.c                 | 270 ++++++++++++++++++++++---------
 tools/perf/util/evlist.h                 |  48 +++++-
 tools/perf/util/evsel.c                  |  16 +-
 tools/perf/util/evsel.h                  |   3 +-
 tools/perf/util/parse-events.c           |  20 ++-
 tools/perf/util/parse-events.h           |   2 +
 tools/perf/util/parse-events.l           |   2 +
 tools/perf/util/session.c                |  22 ++-
 13 files changed, 411 insertions(+), 124 deletions(-)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
-- 
1.8.3.4

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:45   ` [tip:perf/core] perf evlist: " tip-bot for Arnaldo Carvalho de Melo
  2016-07-14  8:34 ` [PATCH v16 02/15] tools lib fd array: Allow associating a pointer cookie with each entry Wang Nan
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Arnaldo Carvalho de Melo,
	Wang Nan, He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Nilay Vaish

From: Arnaldo Carvalho de Melo <acme@redhat.com>

evsel->overwrite indicator means an event should be put into
overwritable ring buffer. In current implementation, it equals to
evsel->attr.write_backward. To reduce compliexity, remove
evsel->overwrite, use evsel->attr.write_backward instead.

In addition, in __perf_evsel__open(), if kernel doesn't support
write_backward and user explicitly set it in evsel, don't fallback
like other missing feature, since it is meaningless to fall back to
a forward ring buffer in this case: we are unable to stably read
from an forward overwritable ring buffer.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/backward-ring-buffer.c |  1 +
 tools/perf/util/evlist.c                |  4 ++--
 tools/perf/util/evsel.c                 | 12 +++++-------
 tools/perf/util/evsel.h                 |  1 -
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index f20ea4c..5cee387 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -101,6 +101,7 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
+	evlist->backward = true;
 	err = perf_evlist__create_maps(evlist, &opts.target);
 	if (err < 0) {
 		pr_debug("Not enough memory to create thread/cpu maps\n");
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 862e69c..6803f5c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1003,7 +1003,7 @@ static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
 {
-	if (evsel->overwrite)
+	if (evsel->attr.write_backward)
 		return false;
 	return true;
 }
@@ -1018,7 +1018,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 	evlist__for_each_entry(evlist, evsel) {
 		int fd;
 
-		if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+		if (!!evsel->attr.write_backward != (evlist->overwrite && evlist->backward))
 			continue;
 
 		if (evsel->system_wide && thread)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ba0f59f..9ac2f92 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1377,6 +1377,9 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
+	if (perf_missing_features.write_backward && evsel->attr.write_backward)
+		return -EINVAL;
+
 	if (evsel->system_wide)
 		nthreads = 1;
 	else
@@ -1407,11 +1410,6 @@ fallback_missing_features:
 	if (perf_missing_features.lbr_flags)
 		evsel->attr.branch_sample_type &= ~(PERF_SAMPLE_BRANCH_NO_FLAGS |
 				     PERF_SAMPLE_BRANCH_NO_CYCLES);
-	if (perf_missing_features.write_backward) {
-		if (evsel->overwrite)
-			return -EINVAL;
-		evsel->attr.write_backward = false;
-	}
 retry_sample_id:
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
@@ -1513,7 +1511,7 @@ try_fallback:
 	 */
 	if (!perf_missing_features.write_backward && evsel->attr.write_backward) {
 		perf_missing_features.write_backward = true;
-		goto fallback_missing_features;
+		goto out_close;
 	} else if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
 		perf_missing_features.clockid_wrong = true;
 		goto fallback_missing_features;
@@ -2422,7 +2420,7 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 	"We found oprofile daemon running, please stop it and try again.");
 		break;
 	case EINVAL:
-		if (evsel->overwrite && perf_missing_features.write_backward)
+		if (evsel->attr.write_backward && perf_missing_features.write_backward)
 			return scnprintf(msg, size, "Reading from overwrite event is not supported by this kernel.");
 		if (perf_missing_features.clockid)
 			return scnprintf(msg, size, "clockid feature not supported.");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d73391e..e60cbfc 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -114,7 +114,6 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
-	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 02/15] tools lib fd array: Allow associating a pointer cookie with each entry
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
  2016-07-14  8:34 ` [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:46   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 03/15] perf tools: Update perf evlist mmap related APIs and helpers Wang Nan
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish, Adrian Hunter, David Ahern,
	Peter Zijlstra

Add a 'ptr' field to fdarray->priv array.

This feature will be used by following commits, which introduce muiltiple
'struct perf_mmap' array for different types of mapping. Because of this,
during fdarray__filter(), a simple 'idx' is not enough. Add a pointer cookie
allows directly associate a 'struct perf_mmap' pointer to an fdarray entry.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: pi3orama@163.com
---
 tools/lib/api/fd/array.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index e87fd80..71287dd 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -22,6 +22,7 @@ struct fdarray {
 	struct pollfd *entries;
 	union {
 		int    idx;
+		void   *ptr;
 	} *priv;
 };
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 03/15] perf tools: Update perf evlist mmap related APIs and helpers
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
  2016-07-14  8:34 ` [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator Wang Nan
  2016-07-14  8:34 ` [PATCH v16 02/15] tools lib fd array: Allow associating a pointer cookie with each entry Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:46   ` [tip:perf/core] perf evlist: Update " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 04/15] perf tools: Decouple record__mmap_read() and evlist Wang Nan
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Currently, evlist mmap related helpers and APIs accept evlist and idx,
and dereference 'struct perf_mmap' by evlist->mmap[idx]. This is
unnecessary, and force each evlist contains only one mmap array.

Following commits are going to introduce multiple mmap arrays to a evlist.
This patch refators these APIs and helpers, introduces functions accept
perf_mmap pointer directly. New helpers and APIs are decoupled with
perf_evlist, and become perf_mmap functions (so they have perf_mmap prefix).

Old functions are reimplemented with new functions. Some of them will be
removed in following commits.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 139 ++++++++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h |  12 ++++
 2 files changed, 108 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6803f5c..a4137e0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -29,6 +29,7 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static void perf_mmap__munmap(struct perf_mmap *map);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -781,9 +782,8 @@ broken_event:
 	return event;
 }
 
-union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx)
+union perf_event *perf_mmap__read_forward(struct perf_mmap *md, bool check_messup)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
 	u64 old = md->prev;
 
@@ -795,13 +795,12 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 
 	head = perf_mmap__read_head(md);
 
-	return perf_mmap__read(md, evlist->overwrite, old, head, &md->prev);
+	return perf_mmap__read(md, check_messup, old, head, &md->prev);
 }
 
 union perf_event *
-perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
+perf_mmap__read_backward(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head, end;
 	u64 start = md->prev;
 
@@ -836,6 +835,31 @@ perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
 	return perf_mmap__read(md, false, start, end, &md->prev);
 }
 
+union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx)
+{
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	/*
+	 * Check messup is required for forward overwritable ring buffer:
+	 * memory pointed by md->prev can be overwritten in this case.
+	 * No need for read-write ring buffer: kernel stop outputting when
+	 * it hit md->prev (perf_mmap__consume()).
+	 */
+	return perf_mmap__read_forward(md, evlist->overwrite);
+}
+
+union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
+{
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	/*
+	 * No need to check messup for backward ring buffer:
+	 * We can always read arbitrary long data from a backward
+	 * ring buffer unless we forget to pause it before reading.
+	 */
+	return perf_mmap__read_backward(md);
+}
+
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
 	if (!evlist->backward)
@@ -843,9 +867,8 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return perf_evlist__mmap_read_backward(evlist, idx);
 }
 
-void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
+void perf_mmap__read_catchup(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
 
 	if (!atomic_read(&md->refcnt))
@@ -855,38 +878,54 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 	md->prev = head;
 }
 
+void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__read_catchup(&evlist->mmap[idx]);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
 }
 
-static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
+static void perf_mmap__get(struct perf_mmap *map)
 {
-	atomic_inc(&evlist->mmap[idx].refcnt);
+	atomic_inc(&map->refcnt);
 }
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+static void perf_mmap__put(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
-
 	BUG_ON(md->base && atomic_read(&md->refcnt) == 0);
 
 	if (atomic_dec_and_test(&md->refcnt))
-		__perf_evlist__munmap(evlist, idx);
+		perf_mmap__munmap(md);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	perf_mmap__get(&evlist->mmap[idx]);
+}
+
+static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__put(&evlist->mmap[idx]);
+}
 
-	if (!evlist->overwrite) {
+void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
+{
+	if (!overwrite) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
 	}
 
 	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
-		perf_evlist__mmap_put(evlist, idx);
+		perf_mmap__put(md);
+}
+
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
 }
 
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
@@ -917,15 +956,20 @@ void __weak auxtrace_mmap_params__set_idx(
 {
 }
 
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
+static void perf_mmap__munmap(struct perf_mmap *map)
 {
-	if (evlist->mmap[idx].base != NULL) {
-		munmap(evlist->mmap[idx].base, evlist->mmap_len);
-		evlist->mmap[idx].base = NULL;
-		evlist->mmap[idx].fd = -1;
-		atomic_set(&evlist->mmap[idx].refcnt, 0);
+	if (map->base != NULL) {
+		munmap(map->base, perf_mmap__mmap_len(map));
+		map->base = NULL;
+		map->fd = -1;
+		atomic_set(&map->refcnt, 0);
 	}
-	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
+	auxtrace_mmap__munmap(&map->auxtrace_mmap);
+}
+
+static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__munmap(&evlist->mmap[idx]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
@@ -941,20 +985,21 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	zfree(&evlist->mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
 	int i;
+	struct perf_mmap *map;
 
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
-	if (!evlist->mmap)
-		return -ENOMEM;
+	map = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+	if (!map)
+		return NULL;
 
 	for (i = 0; i < evlist->nr_mmaps; i++)
-		evlist->mmap[i].fd = -1;
-	return 0;
+		map[i].fd = -1;
+	return map;
 }
 
 struct mmap_params {
@@ -963,8 +1008,8 @@ struct mmap_params {
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
-			       struct mmap_params *mp, int fd)
+static int perf_mmap__mmap(struct perf_mmap *map,
+			   struct mmap_params *mp, int fd)
 {
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
@@ -979,26 +1024,32 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 * evlist layer can't just drop it when filtering events in
 	 * perf_evlist__filter_pollfd().
 	 */
-	atomic_set(&evlist->mmap[idx].refcnt, 2);
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
-				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
+	atomic_set(&map->refcnt, 2);
+	map->prev = 0;
+	map->mask = mp->mask;
+	map->base = mmap(NULL, perf_mmap__mmap_len(map), mp->prot,
+			 MAP_SHARED, fd, 0);
+	if (map->base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
-		evlist->mmap[idx].base = NULL;
+		map->base = NULL;
 		return -1;
 	}
-	evlist->mmap[idx].fd = fd;
+	map->fd = fd;
 
-	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
-				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
+	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
+				&mp->auxtrace_mp, map->base, fd))
 		return -1;
 
 	return 0;
 }
 
+static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
+			       struct mmap_params *mp, int fd)
+{
+	return perf_mmap__mmap(&evlist->mmap[idx], mp, fd);
+}
+
 static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
@@ -1248,7 +1299,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
+	if (!evlist->mmap)
+		evlist->mmap = perf_evlist__alloc_mmap(evlist);
+	if (!evlist->mmap)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index afd0877..9e680c6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -35,6 +35,12 @@ struct perf_mmap {
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
 };
 
+static inline size_t
+perf_mmap__mmap_len(struct perf_mmap *map)
+{
+	return map->mask + 1 + page_size;
+}
+
 struct perf_evlist {
 	struct list_head entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
@@ -129,6 +135,12 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
+union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
+
+void perf_mmap__read_catchup(struct perf_mmap *md);
+void perf_mmap__consume(struct perf_mmap *md, bool overwrite);
+
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
 union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist,
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 04/15] perf tools: Decouple record__mmap_read() and evlist.
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (2 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 03/15] perf tools: Update perf evlist mmap related APIs and helpers Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:47   ` [tip:perf/core] perf record: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field Wang Nan
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Perf evlist will have multiple mmap arrays. Update record__mmap_read():
it should read from 'struct perf_mmap' directly.

Also, make record__mmap_read() ready to read from backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d9f5cc3..d15517e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -119,11 +119,10 @@ backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
 }
 
 static int
-rb_find_range(struct perf_evlist *evlist,
-	      void *data, int mask, u64 head, u64 old,
-	      u64 *start, u64 *end)
+rb_find_range(void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end, bool backward)
 {
-	if (!evlist->backward) {
+	if (!backward) {
 		*start = old;
 		*end = head;
 		return 0;
@@ -132,9 +131,10 @@ rb_find_range(struct perf_evlist *evlist,
 	return backward_rb_find_range(data, mask, head, start, end);
 }
 
-static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int idx)
+static int
+record__mmap_read(struct record *rec, struct perf_mmap *md,
+		  bool overwrite, bool backward)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
 	u64 end = head, start = old;
@@ -143,8 +143,8 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 	void *buf;
 	int rc = 0;
 
-	if (rb_find_range(evlist, data, md->mask, head,
-			  old, &start, &end))
+	if (rb_find_range(data, md->mask, head,
+			  old, &start, &end, backward))
 		return -1;
 
 	if (start == end)
@@ -157,7 +157,7 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
 		md->prev = head;
-		perf_evlist__mmap_consume(evlist, idx);
+		perf_mmap__consume(md, overwrite || backward);
 		return 0;
 	}
 
@@ -182,7 +182,7 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 	}
 
 	md->prev = head;
-	perf_evlist__mmap_consume(evlist, idx);
+	perf_mmap__consume(md, overwrite || backward);
 out:
 	return rc;
 }
@@ -498,20 +498,27 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
-static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist)
+static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
+				    bool backward)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	struct perf_mmap *maps;
 
 	if (!evlist)
 		return 0;
 
+	maps = evlist->mmap;
+	if (!maps)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		struct auxtrace_mmap *mm = &evlist->mmap[i].auxtrace_mmap;
+		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
-		if (evlist->mmap[i].base) {
-			if (record__mmap_read(rec, evlist, i) != 0) {
+		if (maps[i].base) {
+			if (record__mmap_read(rec, &maps[i],
+					      evlist->overwrite, backward) != 0) {
 				rc = -1;
 				goto out;
 			}
@@ -539,7 +546,7 @@ static int record__mmap_read_all(struct record *rec)
 {
 	int err;
 
-	err = record__mmap_read_evlist(rec, rec->evlist);
+	err = record__mmap_read_evlist(rec, rec->evlist, false);
 	if (err)
 		return err;
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (3 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 04/15] perf tools: Decouple record__mmap_read() and evlist Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-15 14:59   ` Jiri Olsa
  2016-07-16 20:47   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 06/15] perf tools: Extract common code in mmap failure processing Wang Nan
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Insetad of saving a index into fdarray entries private field, save the
corresponding 'struct perf_mmap' pointer, and release them directly
using perf_mmap__put().

Following commits introduce multiple mmap arrays to evlist. Without this
patch, perf_evlist__munmap_filtered() is unable to retrive correct
'struct perf_mmap' pointer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a4137e0..1462085 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -30,6 +30,7 @@
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
+static void perf_mmap__put(struct perf_mmap *map);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -466,7 +467,8 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
+				     struct perf_mmap *map, short revent)
 {
 	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
@@ -474,7 +476,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 	 * close the associated evlist->mmap[] entry.
 	 */
 	if (pos >= 0) {
-		evlist->pollfd.priv[pos].idx = idx;
+		evlist->pollfd.priv[pos].ptr = map;
 
 		fcntl(fd, F_SETFL, O_NONBLOCK);
 	}
@@ -484,15 +486,16 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
+	return __perf_evlist__add_pollfd(evlist, fd, NULL, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
 					 void *arg __maybe_unused)
 {
-	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
+	struct perf_mmap *map = fda->priv[fd].ptr;
 
-	perf_evlist__mmap_put(evlist, fda->priv[fd].idx);
+	if (map)
+		perf_mmap__put(map);
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
@@ -1098,7 +1101,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, &evlist->mmap[idx], revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 06/15] perf tools: Extract common code in mmap failure processing
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (4 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 07/15] perf tools: Introduce backward_mmap array for evlist Wang Nan
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

In perf_evlist__mmap_per_cpu() and perf_evlist__mmap_per_thread(), when
mmap failure, successfully created maps should be cleared.

Current code uses two loops __perf_evlist__munmap() for each function.

This patch extracts common code to perf_evlist__munmap_nofree() and
use previous introduced decoupled API perf_mmap__munmap(). Now
__perf_evlist__munmap() can be removed because of no user.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1462085..54ae0a0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -28,7 +28,6 @@
 #include <linux/err.h>
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
 static void perf_mmap__put(struct perf_mmap *map);
 
@@ -970,12 +969,7 @@ static void perf_mmap__munmap(struct perf_mmap *map)
 	auxtrace_mmap__munmap(&map->auxtrace_mmap);
 }
 
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__munmap(&evlist->mmap[idx]);
-}
-
-void perf_evlist__munmap(struct perf_evlist *evlist)
+static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 {
 	int i;
 
@@ -983,8 +977,12 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 		return;
 
 	for (i = 0; i < evlist->nr_mmaps; i++)
-		__perf_evlist__munmap(evlist, i);
+		perf_mmap__munmap(&evlist->mmap[i]);
+}
 
+void perf_evlist__munmap(struct perf_evlist *evlist)
+{
+	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
 }
 
@@ -1142,8 +1140,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	return 0;
 
 out_unmap:
-	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+	perf_evlist__munmap_nofree(evlist);
 	return -1;
 }
 
@@ -1168,8 +1165,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	return 0;
 
 out_unmap:
-	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+	perf_evlist__munmap_nofree(evlist);
 	return -1;
 }
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 07/15] perf tools: Introduce backward_mmap array for evlist
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (5 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 06/15] perf tools: Extract common code in mmap failure processing Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 08/15] perf tools: Map backward events to backward_mmap Wang Nan
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Add backward_mmap to evlist, free it together with normal mmap.

Improve perf_evlist__pick_pc(), search backward_mmap if evlist->mmap is
not available.

This patch doesn't make alloc this array. It will be allocated
conditionally in following commits.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 10 +++++++---
 tools/perf/util/evlist.c    | 12 ++++++++----
 tools/perf/util/evlist.h    |  1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d15517e..dbcb223 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -509,7 +509,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!evlist)
 		return 0;
 
-	maps = evlist->mmap;
+	maps = backward ? evlist->backward_mmap : evlist->mmap;
 	if (!maps)
 		return 0;
 
@@ -696,8 +696,12 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 static const struct perf_event_mmap_page *
 perf_evlist__pick_pc(struct perf_evlist *evlist)
 {
-	if (evlist && evlist->mmap && evlist->mmap[0].base)
-		return evlist->mmap[0].base;
+	if (evlist) {
+		if (evlist->mmap && evlist->mmap[0].base)
+			return evlist->mmap[0].base;
+		if (evlist->backward_mmap && evlist->backward_mmap[0].base)
+			return evlist->backward_mmap[0].base;
+	}
 	return NULL;
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 54ae0a0..24927e1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -123,6 +123,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
+	zfree(&evlist->backward_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -973,17 +974,20 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 {
 	int i;
 
-	if (evlist->mmap == NULL)
-		return;
+	if (evlist->mmap)
+		for (i = 0; i < evlist->nr_mmaps; i++)
+			perf_mmap__munmap(&evlist->mmap[i]);
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
-		perf_mmap__munmap(&evlist->mmap[i]);
+	if (evlist->backward_mmap)
+		for (i = 0; i < evlist->nr_mmaps; i++)
+			perf_mmap__munmap(&evlist->backward_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
+	zfree(&evlist->backward_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9e680c6..07a1ad0 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -61,6 +61,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *backward_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 08/15] perf tools: Map backward events to backward_mmap
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (6 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 07/15] perf tools: Introduce backward_mmap array for evlist Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 09/15] perf tools: Drop evlist->backward Wang Nan
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

In perf_evlist__mmap_per_evsel(), select backward_mmap for backward events.
Utilize new perf_mmap APIs. Dynamically alloc backward_mmap.

Remove useless functions.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/backward-ring-buffer.c |  4 +--
 tools/perf/util/evlist.c                | 54 ++++++++++++++++-----------------
 2 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 5cee387..b2c6348 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -31,8 +31,8 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
 
-		perf_evlist__mmap_read_catchup(evlist, i);
-		while ((event = perf_evlist__mmap_read_backward(evlist, i)) != NULL) {
+		perf_mmap__read_catchup(&evlist->backward_mmap[i]);
+		while ((event = perf_mmap__read_backward(&evlist->backward_mmap[i])) != NULL) {
 			const u32 type = event->header.type;
 
 			switch (type) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 24927e1..7570f90 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -27,7 +27,6 @@
 #include <linux/log2.h>
 #include <linux/err.h>
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
 static void perf_mmap__put(struct perf_mmap *map);
 
@@ -692,8 +691,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
+	if (!evlist->backward_mmap)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->mmap[i].fd;
+		int fd = evlist->backward_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -904,16 +906,6 @@ static void perf_mmap__put(struct perf_mmap *md)
 		perf_mmap__munmap(md);
 }
 
-static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__get(&evlist->mmap[idx]);
-}
-
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__put(&evlist->mmap[idx]);
-}
-
 void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
 {
 	if (!overwrite) {
@@ -1049,12 +1041,6 @@ static int perf_mmap__mmap(struct perf_mmap *map,
 	return 0;
 }
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
-			       struct mmap_params *mp, int fd)
-{
-	return perf_mmap__mmap(&evlist->mmap[idx], mp, fd);
-}
-
 static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
@@ -1066,16 +1052,27 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
 
 	evlist__for_each_entry(evlist, evsel) {
+		struct perf_mmap *maps = evlist->mmap;
+		int *output = _output;
 		int fd;
 
-		if (!!evsel->attr.write_backward != (evlist->overwrite && evlist->backward))
-			continue;
+		if (evsel->attr.write_backward) {
+			output = _output_backward;
+			maps = evlist->backward_mmap;
+
+			if (!maps) {
+				maps = perf_evlist__alloc_mmap(evlist);
+				if (!maps)
+					return -1;
+				evlist->backward_mmap = maps;
+			}
+		}
 
 		if (evsel->system_wide && thread)
 			continue;
@@ -1084,13 +1081,14 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		if (*output == -1) {
 			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+
+			if (perf_mmap__mmap(&maps[idx], mp, *output)  < 0)
 				return -1;
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
 				return -1;
 
-			perf_evlist__mmap_get(evlist, idx);
+			perf_mmap__get(&maps[idx]);
 		}
 
 		revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
@@ -1103,8 +1101,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, &evlist->mmap[idx], revent) < 0) {
-			perf_evlist__mmap_put(evlist, idx);
+		    __perf_evlist__add_pollfd(evlist, fd, &maps[idx], revent) < 0) {
+			perf_mmap__put(&maps[idx]);
 			return -1;
 		}
 
@@ -1130,13 +1128,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
+		int output_backward = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, &output, &output_backward))
 				goto out_unmap;
 		}
 	}
@@ -1157,12 +1156,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
+		int output_backward = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						&output, &output_backward))
 			goto out_unmap;
 	}
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 09/15] perf tools: Drop evlist->backward
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (7 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 08/15] perf tools: Map backward events to backward_mmap Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:49   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 10/15] perf tools: Setup backward mmap state machine Wang Nan
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan,
	Arnaldo Carvalho de Melo, He Kuang, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Now there's no real user of evlist->backward. Drop it. We are going
to use evlist->backward_mmap as a container for backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/tests/backward-ring-buffer.c | 1 -
 tools/perf/util/evlist.c                | 5 +----
 tools/perf/util/evlist.h                | 1 -
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index b2c6348..db9cd30 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -101,7 +101,6 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
-	evlist->backward = true;
 	err = perf_evlist__create_maps(evlist, &opts.target);
 	if (err < 0) {
 		pr_debug("Not enough memory to create thread/cpu maps\n");
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7570f90..5beb44f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -44,7 +44,6 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
-	evlist->backward = false;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -867,9 +866,7 @@ union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, in
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
-	if (!evlist->backward)
-		return perf_evlist__mmap_read_forward(evlist, idx);
-	return perf_evlist__mmap_read_backward(evlist, idx);
+	return perf_evlist__mmap_read_forward(evlist, idx);
 }
 
 void perf_mmap__read_catchup(struct perf_mmap *md)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 07a1ad0..6a3d9bd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,7 +50,6 @@ struct perf_evlist {
 	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
-	bool		 backward;
 	size_t		 mmap_len;
 	int		 id_pos;
 	int		 is_pos;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 10/15] perf tools: Setup backward mmap state machine
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (8 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 09/15] perf tools: Drop evlist->backward Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:49   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 11/15] perf record: Read from overwritable ring buffer Wang Nan
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Introduce a bkw_mmap_state state machine to evlist:

                     .________________(forbid)_____________.
                     |                                     V
 NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
                     ^  ^              |   ^               |
                     |  |__(forbid)____/   |___(forbid)___/|
                     |                                     |
                      \_________________(3)_______________/

 NOTREADY     : Backward ring buffers are not ready
 RUNNING      : Backward ring buffers are recording
 DATA_PENDING : We are required to collect data from backward ring buffers
 EMPTY        : We have collected data from backward ring buffers.

 (0): Setup backward ring buffer
 (1): Pause ring buffers for reading
 (2): Read from ring buffers
 (3): Resume ring buffers for recording

We can't avoid this complexity. Since we deliberately drop records from
overwritable ring buffer, there's no way for us to check remaining from
ring buffer itself (by checking head and old pointers). Therefore, we
need DATA_PENDING and EMPTY state to help us recording what we have done
to the ring buffer.

In record__mmap_read_evlist(), drive this state machine from DATA_PENDING
to EMPTY.

In perf_evlist__mmap_per_evsel(), drive this state machine from NOTREADY
to RUNNING when creating backward mmap.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 ++++
 tools/perf/util/evlist.c    | 63 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h    | 32 +++++++++++++++++++++++
 3 files changed, 100 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dbcb223..d4f15e7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -513,6 +513,9 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!maps)
 		return 0;
 
+	if (backward && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
@@ -538,6 +541,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (backward)
+		perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
 out:
 	return rc;
 }
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5beb44f..400cb22 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -15,6 +15,7 @@
 #include "evlist.h"
 #include "evsel.h"
 #include "debug.h"
+#include "asm/bug.h"
 #include <unistd.h>
 
 #include "parse-events.h"
@@ -44,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
+	evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -1068,6 +1070,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				if (!maps)
 					return -1;
 				evlist->backward_mmap = maps;
+				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
+					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
 		}
 
@@ -1972,3 +1976,62 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+void
+perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
+			     enum bkw_mmap_state state)
+{
+	enum bkw_mmap_state old_state = evlist->bkw_mmap_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+
+	if (!evlist->backward_mmap)
+		return;
+
+	switch (old_state) {
+	case BKW_MMAP_NOTREADY: {
+		if (state != BKW_MMAP_RUNNING)
+			goto state_err;;
+		break;
+	}
+	case BKW_MMAP_RUNNING: {
+		if (state != BKW_MMAP_DATA_PENDING)
+			goto state_err;
+		action = PAUSE;
+		break;
+	}
+	case BKW_MMAP_DATA_PENDING: {
+		if (state != BKW_MMAP_EMPTY)
+			goto state_err;
+		break;
+	}
+	case BKW_MMAP_EMPTY: {
+		if (state != BKW_MMAP_RUNNING)
+			goto state_err;
+		action = RESUME;
+		break;
+	}
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	evlist->bkw_mmap_state = state;
+
+	switch (action) {
+	case PAUSE:
+		perf_evlist__pause(evlist);
+		break;
+	case RESUME:
+		perf_evlist__resume(evlist);
+		break;
+	case NONE:
+	default:
+		break;
+	}
+
+state_err:
+	return;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 6a3d9bd..a2b84dd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -41,6 +41,34 @@ perf_mmap__mmap_len(struct perf_mmap *map)
 	return map->mask + 1 + page_size;
 }
 
+/*
+ * State machine of bkw_mmap_state:
+ *
+ *                     .________________(forbid)_____________.
+ *                     |                                     V
+ * NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
+ *                     ^  ^              |   ^               |
+ *                     |  |__(forbid)____/   |___(forbid)___/|
+ *                     |                                     |
+ *                      \_________________(3)_______________/
+ *
+ * NOTREADY     : Backward ring buffers are not ready
+ * RUNNING      : Backward ring buffers are recording
+ * DATA_PENDING : We are required to collect data from backward ring buffers
+ * EMPTY        : We have collected data from backward ring buffers.
+ *
+ * (0): Setup backward ring buffer
+ * (1): Pause ring buffers for reading
+ * (2): Read from ring buffers
+ * (3): Resume ring buffers for recording
+ */
+enum bkw_mmap_state {
+	BKW_MMAP_NOTREADY,
+	BKW_MMAP_RUNNING,
+	BKW_MMAP_DATA_PENDING,
+	BKW_MMAP_EMPTY,
+};
+
 struct perf_evlist {
 	struct list_head entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
@@ -54,6 +82,7 @@ struct perf_evlist {
 	int		 id_pos;
 	int		 is_pos;
 	u64		 combined_sample_type;
+	enum bkw_mmap_state bkw_mmap_state;
 	struct {
 		int	cork_fd;
 		pid_t	pid;
@@ -135,6 +164,9 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+void
+perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist, enum bkw_mmap_state state);
+
 union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 11/15] perf record: Read from overwritable ring buffer
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (9 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 10/15] perf tools: Setup backward mmap state machine Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:50   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 12/15] perf tools: Make perf_evlist__{pause,resume} internal helpers Wang Nan
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

Drive the evlist->bkw_mmap_state state machine during draining and when
SIGUSR2 is received. Read backward ring buffer in record__mmap_read_all.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d4f15e7..b87070b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -555,7 +555,7 @@ static int record__mmap_read_all(struct record *rec)
 	if (err)
 		return err;
 
-	return err;
+	return record__mmap_read_evlist(rec, rec->evlist, true);
 }
 
 static void record__init_features(struct record *rec)
@@ -953,6 +953,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->evlist->bkw_mmap_state is possible to be
+		 * BKW_MMAP_EMPTY here: when done == true and
+		 * hits != rec->samples in previous round.
+		 *
+		 * perf_evlist__toggle_bkw_mmap ensure we never
+		 * convert BKW_MMAP_EMPTY to BKW_MMAP_DATA_PENDING.
+		 */
+		if (trigger_is_hit(&switch_output_trigger) || done || draining)
+			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
@@ -972,8 +983,26 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (trigger_is_hit(&switch_output_trigger)) {
+			/*
+			 * If switch_output_trigger is hit, the data in
+			 * overwritable ring buffer should have been collected,
+			 * so bkw_mmap_state should be set to BKW_MMAP_EMPTY.
+			 *
+			 * If SIGUSR2 raise after or during record__mmap_read_all(),
+			 * record__mmap_read_all() didn't collect data from
+			 * overwritable ring buffer. Read again.
+			 */
+			if (rec->evlist->bkw_mmap_state == BKW_MMAP_RUNNING)
+				continue;
 			trigger_ready(&switch_output_trigger);
 
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_RUNNING);
+
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
 					waking);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 12/15] perf tools: Make perf_evlist__{pause,resume} internal helpers
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (10 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 11/15] perf record: Read from overwritable ring buffer Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:50   ` [tip:perf/core] perf evlist: Make {pause,resume} " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 13/15] perf tools: Enable overwrite settings Wang Nan
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

There's no user of these two function outside evlist.c. Remove them
from public namespace.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 4 ++--
 tools/perf/util/evlist.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 400cb22..9822de3 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -708,12 +708,12 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 	return 0;
 }
 
-int perf_evlist__pause(struct perf_evlist *evlist)
+static int perf_evlist__pause(struct perf_evlist *evlist)
 {
 	return perf_evlist__set_paused(evlist, true);
 }
 
-int perf_evlist__resume(struct perf_evlist *evlist)
+static int perf_evlist__resume(struct perf_evlist *evlist)
 {
 	return perf_evlist__set_paused(evlist, false);
 }
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a2b84dd..9dde51d 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -183,8 +183,6 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
 
-int perf_evlist__pause(struct perf_evlist *evlist);
-int perf_evlist__resume(struct perf_evlist *evlist);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 13/15] perf tools: Enable overwrite settings
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (11 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 12/15] perf tools: Make perf_evlist__{pause,resume} internal helpers Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:50   ` [tip:perf/core] " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

This patch allows following config terms and option:

Globally setting events to overwrite;

 # perf record --overwrite ...

Set specific events to be overwrite or no-overwrite.

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

Add missing config terms and update config term array size because the
longest string length is changed.

For overwritable events, automatically select attr.write_backward since
perf requires it to be backward for reading.

Test result:
 # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
 [ perf record: Woken up 2 times to write data ]
 [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
 # perf evlist -v
 syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/Documentation/perf-record.txt | 14 ++++++++++++++
 tools/perf/builtin-record.c              |  1 +
 tools/perf/perf.h                        |  1 +
 tools/perf/tests/backward-ring-buffer.c  | 10 +++++-----
 tools/perf/util/evsel.c                  |  4 ++++
 tools/perf/util/evsel.h                  |  2 ++
 tools/perf/util/parse-events.c           | 20 ++++++++++++++++++--
 tools/perf/util/parse-events.h           |  2 ++
 tools/perf/util/parse-events.l           |  2 ++
 9 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 5b46b1d..384c630 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -367,6 +367,20 @@ options.
 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
 in config file is set to true.
 
+--overwrite::
+Makes all events use an overwritable ring buffer. An overwritable ring
+buffer works like a flight recorder: when it gets full, the kernel will
+overwrite the oldest records, that thus will never make it to the
+perf.data file.
+
+When '--overwrite' and '--switch-output' are used perf records and drops
+events until it receives a signal, meaning that something unusual was
+detected that warrants taking a snapshot of the most current events,
+those fitting in the ring buffer at that moment.
+
+'overwrite' attribute can also be set or canceled for an event using
+config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b87070b..39c7486 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1399,6 +1399,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cd8f1b1..608b42b 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index db9cd30..615780c 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -108,7 +108,11 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 	}
 
 	bzero(&parse_error, sizeof(parse_error));
-	err = parse_events(evlist, "syscalls:sys_enter_prctl", &parse_error);
+	/*
+	 * Set backward bit, ring buffer should be writing from end. Record
+	 * it in aux evlist
+	 */
+	err = parse_events(evlist, "syscalls:sys_enter_prctl/overwrite/", &parse_error);
 	if (err) {
 		pr_debug("Failed to parse tracepoint event, try use root\n");
 		ret = TEST_SKIP;
@@ -117,10 +121,6 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 
 	perf_evlist__config(evlist, &opts, NULL);
 
-	/* Set backward bit, ring buffer should be writing from end */
-	evlist__for_each_entry(evlist, evsel)
-		evsel->attr.write_backward = 1;
-
 	err = perf_evlist__open(evlist);
 	if (err < 0) {
 		pr_debug("perf_evlist__open: %s\n",
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9ac2f92..8c54df6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -695,6 +695,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			attr->write_backward = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -776,6 +779,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	attr->write_backward = opts->overwrite ? 1 : 0;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e60cbfc..8a4a6c9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -45,6 +45,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
 	PERF_EVSEL__CONFIG_TERM_MAX_STACK,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -59,6 +60,7 @@ struct perf_evsel_config_term {
 		u64	stack_user;
 		int	max_stack;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 375af0e..6c913c3 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -902,6 +902,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
 	[PARSE_EVENTS__TERM_TYPE_MAX_STACK]		= "max-stack",
+	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
+	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 };
 
 static bool config_term_shrinked;
@@ -994,6 +996,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1046,6 +1054,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 	case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1118,6 +1128,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
 			ADD_CONFIG_TERM(MAX_STACK, max_stack, term->val.num);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
@@ -2412,9 +2428,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	/* "branch_type" is the longest name */
+	/* "no-overwrite" is the longest name */
 	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
-			  (sizeof("branch_type") - 1)];
+			  (sizeof("no-overwrite") - 1)];
 
 	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b4aa7eb..d1edbf8 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -69,6 +69,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
 	PARSE_EVENTS__TERM_TYPE_MAX_STACK,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 3c15b33..7a25194 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -202,6 +202,8 @@ stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 max-stack		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_MAX_STACK); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (12 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 13/15] perf tools: Enable overwrite settings Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-15 14:59   ` Jiri Olsa
  2016-07-16 20:51   ` [tip:perf/core] perf session: " tip-bot for Wang Nan
  2016-07-14  8:34 ` [PATCH v16 15/15] perf tools: Add --tail-synthesize option Wang Nan
  2016-07-15 15:00 ` [PATCH v16 00/15] perf tools: Support overwritable ring buffer Jiri Olsa
  15 siblings, 2 replies; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

If write_backward attribute is set, records are written into kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid 'XX out of order events recorded' warning message (timestamps
of records is in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at lease one event.

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 078d496..5d61242 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1499,10 +1499,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each_entry(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1559,8 +1576,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v16 15/15] perf tools: Add --tail-synthesize option
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (13 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
@ 2016-07-14  8:34 ` Wang Nan
  2016-07-16 20:51   ` [tip:perf/core] perf record: " tip-bot for Wang Nan
  2016-07-15 15:00 ` [PATCH v16 00/15] perf tools: Support overwritable ring buffer Jiri Olsa
  15 siblings, 1 reply; 34+ messages in thread
From: Wang Nan @ 2016-07-14  8:34 UTC (permalink / raw)
  To: acme
  Cc: lizefan, linux-kernel, pi3orama, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

When working with overwritable ring buffer there's a inconvenience
problem: if perf dumps data after a long period after it starts, non-sample
events may lost, which makes following 'perf report' unable to identify
proc name and mmap layout. For example:

 # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \
        dd if=/dev/zero of=/dev/null

send SIGUSR2 after dd runs long enough. The resuling perf.data lost
correct comm and mmap events:

 # perf script -i perf.data.2016061522374354
 perf 24478 [004] 2581325.601789:  raw_syscalls:sys_exit: NR 0 = 512
 ^^^^
 Should be 'dd'
                   27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                   203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                   b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux)
             7f47c417edf0 [unknown] ([unknown])
             ^^^^^^^^^^^^
             Fail to unwind

This patch provides a '--tail-synthesize' option, allows perf to collect
system status when finalizing output file. In resuling output file, the
non-sample events reflect system status when dumping data.

After this patch:
 # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \
        dd if=/dev/zero of=/dev/null

 # perf script -i perf.data.2016061600544998
 dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ...
 ^^
 Correct comm
                   203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms])
                   203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms])
                   203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms])
                   b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                    d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
                    ^^^^^
                    Correct unwind

This option doesn't aim to solve this problem completely. If a process
terminates before SIGUSR2, we still lost its COMM and MMAP events. For
example, we can't unwind correctly from the final perf.data we get from
the previous example, because when perf collects the final output file
(when we press C-c), 'dd' has been terminated so its
'/proc/<pid>/mmap' becomes empty. However, this is a cheaper choice. To
completely solve this problem we need to continously output non-sample
events. To satisify the requirement of daemonization, we need to merge
them periodically. It is possible but requires much more code and cycles.

Automatically select --tail-synthesize when --overwrite is provided.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: pi3orama@163.com
---
 tools/perf/Documentation/perf-record.txt |  8 ++++++++
 tools/perf/builtin-record.c              | 31 +++++++++++++++++++++++++------
 tools/perf/perf.h                        |  1 +
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 384c630..69966ab 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -367,6 +367,12 @@ options.
 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
 in config file is set to true.
 
+--tail-synthesize::
+Instead of collecting non-sample events (for example, fork, comm, mmap) at
+the beginning of record, collect them during finalizing an output file.
+The collected non-sample events reflects the status of the system when
+record is finished.
+
 --overwrite::
 Makes all events use an overwritable ring buffer. An overwritable ring
 buffer works like a flight recorder: when it gets full, the kernel will
@@ -381,6 +387,8 @@ those fitting in the ring buffer at that moment.
 'overwrite' attribute can also be set or canceled for an event using
 config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
 
+Implies --tail-synthesize.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 39c7486..8f2c16d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -604,13 +604,16 @@ record__finish_output(struct record *rec)
 	return;
 }
 
-static int record__synthesize_workload(struct record *rec)
+static int record__synthesize_workload(struct record *rec, bool tail)
 {
 	struct {
 		struct thread_map map;
 		struct thread_map_data map_data;
 	} thread_map;
 
+	if (rec->opts.tail_synthesize != tail)
+		return 0;
+
 	thread_map.map.nr = 1;
 	thread_map.map.map[0].pid = rec->evlist->workload.pid;
 	thread_map.map.map[0].comm = NULL;
@@ -621,7 +624,7 @@ static int record__synthesize_workload(struct record *rec)
 						 rec->opts.proc_map_timeout);
 }
 
-static int record__synthesize(struct record *rec);
+static int record__synthesize(struct record *rec, bool tail);
 
 static int
 record__switch_output(struct record *rec, bool at_exit)
@@ -632,6 +635,10 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	record__synthesize(rec, true);
+	if (target__none(&rec->opts.target))
+		record__synthesize_workload(rec, true);
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -654,7 +661,7 @@ record__switch_output(struct record *rec, bool at_exit)
 
 	/* Output tracking events */
 	if (!at_exit) {
-		record__synthesize(rec);
+		record__synthesize(rec, false);
 
 		/*
 		 * In 'perf record --switch-output' without -a,
@@ -666,7 +673,7 @@ record__switch_output(struct record *rec, bool at_exit)
 		 * perf_event__synthesize_thread_map() for those events.
 		 */
 		if (target__none(&rec->opts.target))
-			record__synthesize_workload(rec);
+			record__synthesize_workload(rec, false);
 	}
 	return fd;
 }
@@ -720,7 +727,7 @@ static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
 	return NULL;
 }
 
-static int record__synthesize(struct record *rec)
+static int record__synthesize(struct record *rec, bool tail)
 {
 	struct perf_session *session = rec->session;
 	struct machine *machine = &session->machines.host;
@@ -730,6 +737,9 @@ static int record__synthesize(struct record *rec)
 	int fd = perf_data_file__fd(file);
 	int err = 0;
 
+	if (rec->opts.tail_synthesize != tail)
+		return 0;
+
 	if (file->is_pipe) {
 		err = perf_event__synthesize_attrs(tool, session,
 						   process_synthesized_event);
@@ -893,7 +903,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
+	err = record__synthesize(rec, false);
 	if (err < 0)
 		goto out_child;
 
@@ -1057,6 +1067,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
 
+	if (target__none(&rec->opts.target))
+		record__synthesize_workload(rec, true);
+
 out_child:
 	if (forks) {
 		int exit_status;
@@ -1075,6 +1088,7 @@ out_child:
 	} else
 		status = err;
 
+	record__synthesize(rec, true);
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
@@ -1399,6 +1413,8 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
+		    "synthesize non-sample events at the end of output"),
 	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
@@ -1610,6 +1626,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		}
 	}
 
+	if (record.opts.overwrite)
+		record.opts.tail_synthesize = true;
+
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
 		pr_err("Not enough memory for event selector list\n");
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 608b42b..a7e0f14 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     tail_synthesize;
 	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used
  2016-07-14  8:34 ` [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
@ 2016-07-15 14:59   ` Jiri Olsa
  2016-07-16 20:51   ` [tip:perf/core] perf session: " tip-bot for Wang Nan
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Olsa @ 2016-07-15 14:59 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, lizefan, linux-kernel, pi3orama, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Nilay Vaish

On Thu, Jul 14, 2016 at 08:34:46AM +0000, Wang Nan wrote:
> If write_backward attribute is set, records are written into kernel
> ring buffer from end to beginning, but read from beginning to end.

so IIUIC they are stored from the newest to the oldest one right?

we might want to sort them back when reading the perf.data,
or mark the area in the data section and trat it specially

or just make ordered_events__queue to recognize backward
data and buffer long enough before pushing the sample
to higher level

or any other way that would make sense ;-)

however this might come as follow up update, no reason
to hold this patchset on this

jirka

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field
  2016-07-14  8:34 ` [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field Wang Nan
@ 2016-07-15 14:59   ` Jiri Olsa
  2016-07-16 20:47   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
  1 sibling, 0 replies; 34+ messages in thread
From: Jiri Olsa @ 2016-07-15 14:59 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, lizefan, linux-kernel, pi3orama, Arnaldo Carvalho de Melo,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Nilay Vaish

On Thu, Jul 14, 2016 at 08:34:37AM +0000, Wang Nan wrote:
> Insetad of saving a index into fdarray entries private field, save the
> corresponding 'struct perf_mmap' pointer, and release them directly
> using perf_mmap__put().
> 
> Following commits introduce multiple mmap arrays to evlist. Without this
> patch, perf_evlist__munmap_filtered() is unable to retrive correct
> 'struct perf_mmap' pointer.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: Nilay Vaish <nilayvaish@gmail.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/evlist.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index a4137e0..1462085 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -30,6 +30,7 @@
>  static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
>  static void perf_mmap__munmap(struct perf_mmap *map);
> +static void perf_mmap__put(struct perf_mmap *map);
>  
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
>  #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
> @@ -466,7 +467,8 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  	return 0;
>  }
>  
> -static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
> +static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
> +				     struct perf_mmap *map, short revent)
>  {
>  	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
>  	/*
> @@ -474,7 +476,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
>  	 * close the associated evlist->mmap[] entry.
>  	 */
>  	if (pos >= 0) {
> -		evlist->pollfd.priv[pos].idx = idx;
> +		evlist->pollfd.priv[pos].ptr = map;

looks like there's no user for idx now, patch below compiles for me
I guess we should remove it..

jirka


---
diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index 71287dddc05f..54041a4ded39 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -21,7 +21,6 @@ struct fdarray {
 	int	       nr_autogrow;
 	struct pollfd *entries;
 	union {
-		int    idx;
 		void   *ptr;
 	} *priv;
 };

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v16 00/15] perf tools: Support overwritable ring buffer
  2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
                   ` (14 preceding siblings ...)
  2016-07-14  8:34 ` [PATCH v16 15/15] perf tools: Add --tail-synthesize option Wang Nan
@ 2016-07-15 15:00 ` Jiri Olsa
  15 siblings, 0 replies; 34+ messages in thread
From: Jiri Olsa @ 2016-07-15 15:00 UTC (permalink / raw)
  To: Wang Nan
  Cc: acme, lizefan, linux-kernel, pi3orama, Arnaldo Carvalho de Melo,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Nilay Vaish

On Thu, Jul 14, 2016 at 08:34:32AM +0000, Wang Nan wrote:

SNIP

> 
> v15 -> v16: Follow Jiri Olsa's suggestion:
>             1. Move backward ring buffer state machine to evlist.
> 	    2. Dynamically allocate backward mmap, so backward_mmap == NULL
> 	       means there's no backward ring buffer.
> 	       Stop the state machine when there's no backward ring buffer.
> 	    3. Rename: _output2 to _output_backward.
> 	    4. Patch rearrangement.
> 	    5. Update record__pick_pc(): read from backward_mmap if normal
>                mmap is empty.
> 
> Arnaldo Carvalho de Melo (1):
>   perf tools: Drop redundant evsel->overwrite indicator
> 
> Wang Nan (14):
>   tools lib fd array: Allow associating a pointer cookie with each entry
>   perf tools: Update perf evlist mmap related APIs and helpers
>   perf tools: Decouple record__mmap_read() and evlist.
>   perf tools: Record mmap cookie into fdarray private field
>   perf tools: Extract common code in mmap failure processing
>   perf tools: Introduce backward_mmap array for evlist
>   perf tools: Map backward events to backward_mmap
>   perf tools: Drop evlist->backward
>   perf tools: Setup backward mmap state machine
>   perf record: Read from overwritable ring buffer
>   perf tools: Make perf_evlist__{pause,resume} internal helpers
>   perf tools: Enable overwrite settings
>   perf tools: Don't warn about out of order event if write_backward is
>     used
>   perf tools: Add --tail-synthesize option

looks good to me now ;-)

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Drop redundant evsel->overwrite indicator
  2016-07-14  8:34 ` [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator Wang Nan
@ 2016-07-16 20:45   ` tip-bot for Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Arnaldo Carvalho de Melo @ 2016-07-16 20:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, jolsa, wangnan0, lizefan, hekuang,
	nilayvaish, hpa, namhyung, tglx, acme, mhiramat

Commit-ID:  32a951b4fd6bbe60ef5d65930b1712321e241b27
Gitweb:     http://git.kernel.org/tip/32a951b4fd6bbe60ef5d65930b1712321e241b27
Author:     Arnaldo Carvalho de Melo <acme@redhat.com>
AuthorDate: Thu, 14 Jul 2016 08:34:33 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 13:38:06 -0300

perf evlist: Drop redundant evsel->overwrite indicator

evsel->overwrite indicator means an event should be put into
overwritable ring buffer. In current implementation, it equals to
evsel->attr.write_backward. To reduce compliexity, remove
evsel->overwrite, use evsel->attr.write_backward instead.

In addition, in __perf_evsel__open(), if kernel doesn't support
write_backward and user explicitly set it in evsel, don't fallback
like other missing feature, since it is meaningless to fall back to
a forward ring buffer in this case: we are unable to stably read
from an forward overwritable ring buffer.

Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/backward-ring-buffer.c |  1 +
 tools/perf/util/evlist.c                |  4 ++--
 tools/perf/util/evsel.c                 | 12 +++++-------
 tools/perf/util/evsel.h                 |  1 -
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index f20ea4c..5cee387 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -101,6 +101,7 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
+	evlist->backward = true;
 	err = perf_evlist__create_maps(evlist, &opts.target);
 	if (err < 0) {
 		pr_debug("Not enough memory to create thread/cpu maps\n");
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 862e69c..6803f5c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1003,7 +1003,7 @@ static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
 {
-	if (evsel->overwrite)
+	if (evsel->attr.write_backward)
 		return false;
 	return true;
 }
@@ -1018,7 +1018,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 	evlist__for_each_entry(evlist, evsel) {
 		int fd;
 
-		if (evsel->overwrite != (evlist->overwrite && evlist->backward))
+		if (!!evsel->attr.write_backward != (evlist->overwrite && evlist->backward))
 			continue;
 
 		if (evsel->system_wide && thread)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ba0f59f..9ac2f92 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1377,6 +1377,9 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	int pid = -1, err;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
+	if (perf_missing_features.write_backward && evsel->attr.write_backward)
+		return -EINVAL;
+
 	if (evsel->system_wide)
 		nthreads = 1;
 	else
@@ -1407,11 +1410,6 @@ fallback_missing_features:
 	if (perf_missing_features.lbr_flags)
 		evsel->attr.branch_sample_type &= ~(PERF_SAMPLE_BRANCH_NO_FLAGS |
 				     PERF_SAMPLE_BRANCH_NO_CYCLES);
-	if (perf_missing_features.write_backward) {
-		if (evsel->overwrite)
-			return -EINVAL;
-		evsel->attr.write_backward = false;
-	}
 retry_sample_id:
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
@@ -1513,7 +1511,7 @@ try_fallback:
 	 */
 	if (!perf_missing_features.write_backward && evsel->attr.write_backward) {
 		perf_missing_features.write_backward = true;
-		goto fallback_missing_features;
+		goto out_close;
 	} else if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
 		perf_missing_features.clockid_wrong = true;
 		goto fallback_missing_features;
@@ -2422,7 +2420,7 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 	"We found oprofile daemon running, please stop it and try again.");
 		break;
 	case EINVAL:
-		if (evsel->overwrite && perf_missing_features.write_backward)
+		if (evsel->attr.write_backward && perf_missing_features.write_backward)
 			return scnprintf(msg, size, "Reading from overwrite event is not supported by this kernel.");
 		if (perf_missing_features.clockid)
 			return scnprintf(msg, size, "clockid feature not supported.");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d73391e..e60cbfc 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -114,7 +114,6 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
-	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] tools lib fd array: Allow associating a pointer cookie with each entry
  2016-07-14  8:34 ` [PATCH v16 02/15] tools lib fd array: Allow associating a pointer cookie with each entry Wang Nan
@ 2016-07-16 20:46   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wangnan0, tglx, mingo, adrian.hunter, mhiramat, namhyung,
	lizefan, nilayvaish, hpa, hekuang, linux-kernel, acme,
	a.p.zijlstra, dsahern, jolsa

Commit-ID:  2b4383470675c85add72725cc08840d36b409276
Gitweb:     http://git.kernel.org/tip/2b4383470675c85add72725cc08840d36b409276
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:34 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:46 -0300

tools lib fd array: Allow associating a pointer cookie with each entry

Add a 'ptr' field to fdarray->priv array.

This feature will be used by following commits, which introduce
muiltiple 'struct perf_mmap' arrays for different types of mapping.

Because of this, during fdarray__filter(), a simple 'idx' is not enough.

Add a pointer cookie that allows to directly associate a 'struct
perf_mmap' pointer to an fdarray entry.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/api/fd/array.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/api/fd/array.h b/tools/lib/api/fd/array.h
index e87fd80..71287dd 100644
--- a/tools/lib/api/fd/array.h
+++ b/tools/lib/api/fd/array.h
@@ -22,6 +22,7 @@ struct fdarray {
 	struct pollfd *entries;
 	union {
 		int    idx;
+		void   *ptr;
 	} *priv;
 };
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Update mmap related APIs and helpers
  2016-07-14  8:34 ` [PATCH v16 03/15] perf tools: Update perf evlist mmap related APIs and helpers Wang Nan
@ 2016-07-16 20:46   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: nilayvaish, lizefan, namhyung, hekuang, hpa, acme, mingo, jolsa,
	mhiramat, wangnan0, tglx, linux-kernel

Commit-ID:  8db6d6b19e486eef3db41bbd74a1f4c2b82d7706
Gitweb:     http://git.kernel.org/tip/8db6d6b19e486eef3db41bbd74a1f4c2b82d7706
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:35 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:46 -0300

perf evlist: Update mmap related APIs and helpers

Currently, the evlist mmap related helpers and APIs accept evlist and
idx, and dereference 'struct perf_mmap' by evlist->mmap[idx]. This is
unnecessary, and force each evlist contains only one mmap array.

Following commits are going to introduce multiple mmap arrays to a
evlist.  This patch refators these APIs and helpers, introduces
functions accept perf_mmap pointer directly. New helpers and APIs are
decoupled with perf_evlist, and become perf_mmap functions (so they have
perf_mmap prefix).

Old functions are reimplemented with new functions. Some of them will be
removed in following commits.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 139 ++++++++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h |  12 ++++
 2 files changed, 108 insertions(+), 43 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6803f5c..a4137e0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -29,6 +29,7 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static void perf_mmap__munmap(struct perf_mmap *map);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -781,9 +782,8 @@ broken_event:
 	return event;
 }
 
-union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx)
+union perf_event *perf_mmap__read_forward(struct perf_mmap *md, bool check_messup)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
 	u64 old = md->prev;
 
@@ -795,13 +795,12 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 
 	head = perf_mmap__read_head(md);
 
-	return perf_mmap__read(md, evlist->overwrite, old, head, &md->prev);
+	return perf_mmap__read(md, check_messup, old, head, &md->prev);
 }
 
 union perf_event *
-perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
+perf_mmap__read_backward(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head, end;
 	u64 start = md->prev;
 
@@ -836,6 +835,31 @@ perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
 	return perf_mmap__read(md, false, start, end, &md->prev);
 }
 
+union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int idx)
+{
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	/*
+	 * Check messup is required for forward overwritable ring buffer:
+	 * memory pointed by md->prev can be overwritten in this case.
+	 * No need for read-write ring buffer: kernel stop outputting when
+	 * it hit md->prev (perf_mmap__consume()).
+	 */
+	return perf_mmap__read_forward(md, evlist->overwrite);
+}
+
+union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
+{
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	/*
+	 * No need to check messup for backward ring buffer:
+	 * We can always read arbitrary long data from a backward
+	 * ring buffer unless we forget to pause it before reading.
+	 */
+	return perf_mmap__read_backward(md);
+}
+
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
 	if (!evlist->backward)
@@ -843,9 +867,8 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return perf_evlist__mmap_read_backward(evlist, idx);
 }
 
-void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
+void perf_mmap__read_catchup(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head;
 
 	if (!atomic_read(&md->refcnt))
@@ -855,38 +878,54 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 	md->prev = head;
 }
 
+void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__read_catchup(&evlist->mmap[idx]);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
 }
 
-static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
+static void perf_mmap__get(struct perf_mmap *map)
 {
-	atomic_inc(&evlist->mmap[idx].refcnt);
+	atomic_inc(&map->refcnt);
 }
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+static void perf_mmap__put(struct perf_mmap *md)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
-
 	BUG_ON(md->base && atomic_read(&md->refcnt) == 0);
 
 	if (atomic_dec_and_test(&md->refcnt))
-		__perf_evlist__munmap(evlist, idx);
+		perf_mmap__munmap(md);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	perf_mmap__get(&evlist->mmap[idx]);
+}
+
+static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__put(&evlist->mmap[idx]);
+}
 
-	if (!evlist->overwrite) {
+void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
+{
+	if (!overwrite) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
 	}
 
 	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
-		perf_evlist__mmap_put(evlist, idx);
+		perf_mmap__put(md);
+}
+
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
 }
 
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
@@ -917,15 +956,20 @@ void __weak auxtrace_mmap_params__set_idx(
 {
 }
 
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
+static void perf_mmap__munmap(struct perf_mmap *map)
 {
-	if (evlist->mmap[idx].base != NULL) {
-		munmap(evlist->mmap[idx].base, evlist->mmap_len);
-		evlist->mmap[idx].base = NULL;
-		evlist->mmap[idx].fd = -1;
-		atomic_set(&evlist->mmap[idx].refcnt, 0);
+	if (map->base != NULL) {
+		munmap(map->base, perf_mmap__mmap_len(map));
+		map->base = NULL;
+		map->fd = -1;
+		atomic_set(&map->refcnt, 0);
 	}
-	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
+	auxtrace_mmap__munmap(&map->auxtrace_mmap);
+}
+
+static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
+{
+	perf_mmap__munmap(&evlist->mmap[idx]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
@@ -941,20 +985,21 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	zfree(&evlist->mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
 	int i;
+	struct perf_mmap *map;
 
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
-	if (!evlist->mmap)
-		return -ENOMEM;
+	map = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+	if (!map)
+		return NULL;
 
 	for (i = 0; i < evlist->nr_mmaps; i++)
-		evlist->mmap[i].fd = -1;
-	return 0;
+		map[i].fd = -1;
+	return map;
 }
 
 struct mmap_params {
@@ -963,8 +1008,8 @@ struct mmap_params {
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
-			       struct mmap_params *mp, int fd)
+static int perf_mmap__mmap(struct perf_mmap *map,
+			   struct mmap_params *mp, int fd)
 {
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
@@ -979,26 +1024,32 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 * evlist layer can't just drop it when filtering events in
 	 * perf_evlist__filter_pollfd().
 	 */
-	atomic_set(&evlist->mmap[idx].refcnt, 2);
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
-				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
+	atomic_set(&map->refcnt, 2);
+	map->prev = 0;
+	map->mask = mp->mask;
+	map->base = mmap(NULL, perf_mmap__mmap_len(map), mp->prot,
+			 MAP_SHARED, fd, 0);
+	if (map->base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
-		evlist->mmap[idx].base = NULL;
+		map->base = NULL;
 		return -1;
 	}
-	evlist->mmap[idx].fd = fd;
+	map->fd = fd;
 
-	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
-				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
+	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
+				&mp->auxtrace_mp, map->base, fd))
 		return -1;
 
 	return 0;
 }
 
+static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
+			       struct mmap_params *mp, int fd)
+{
+	return perf_mmap__mmap(&evlist->mmap[idx], mp, fd);
+}
+
 static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
@@ -1248,7 +1299,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
+	if (!evlist->mmap)
+		evlist->mmap = perf_evlist__alloc_mmap(evlist);
+	if (!evlist->mmap)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index afd0877..9e680c6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -35,6 +35,12 @@ struct perf_mmap {
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
 };
 
+static inline size_t
+perf_mmap__mmap_len(struct perf_mmap *map)
+{
+	return map->mask + 1 + page_size;
+}
+
 struct perf_evlist {
 	struct list_head entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
@@ -129,6 +135,12 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
+union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
+
+void perf_mmap__read_catchup(struct perf_mmap *md);
+void perf_mmap__consume(struct perf_mmap *md, bool overwrite);
+
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
 union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist,

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf record: Decouple record__mmap_read() and evlist.
  2016-07-14  8:34 ` [PATCH v16 04/15] perf tools: Decouple record__mmap_read() and evlist Wang Nan
@ 2016-07-16 20:47   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, lizefan, acme, nilayvaish, wangnan0, jolsa, hpa,
	namhyung, mingo, hekuang, mhiramat, tglx

Commit-ID:  a4ea0ec4f24a721bea5447a27ad5fbcb89275bae
Gitweb:     http://git.kernel.org/tip/a4ea0ec4f24a721bea5447a27ad5fbcb89275bae
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:36 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:46 -0300

perf record: Decouple record__mmap_read() and evlist.

Perf evlist will have multiple mmap arrays. Update record__mmap_read():
it should read from 'struct perf_mmap' directly.

Also, make record__mmap_read() ready to read from backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d9f5cc3..d15517e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -119,11 +119,10 @@ backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
 }
 
 static int
-rb_find_range(struct perf_evlist *evlist,
-	      void *data, int mask, u64 head, u64 old,
-	      u64 *start, u64 *end)
+rb_find_range(void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end, bool backward)
 {
-	if (!evlist->backward) {
+	if (!backward) {
 		*start = old;
 		*end = head;
 		return 0;
@@ -132,9 +131,10 @@ rb_find_range(struct perf_evlist *evlist,
 	return backward_rb_find_range(data, mask, head, start, end);
 }
 
-static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int idx)
+static int
+record__mmap_read(struct record *rec, struct perf_mmap *md,
+		  bool overwrite, bool backward)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
 	u64 end = head, start = old;
@@ -143,8 +143,8 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 	void *buf;
 	int rc = 0;
 
-	if (rb_find_range(evlist, data, md->mask, head,
-			  old, &start, &end))
+	if (rb_find_range(data, md->mask, head,
+			  old, &start, &end, backward))
 		return -1;
 
 	if (start == end)
@@ -157,7 +157,7 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
 		md->prev = head;
-		perf_evlist__mmap_consume(evlist, idx);
+		perf_mmap__consume(md, overwrite || backward);
 		return 0;
 	}
 
@@ -182,7 +182,7 @@ static int record__mmap_read(struct record *rec, struct perf_evlist *evlist, int
 	}
 
 	md->prev = head;
-	perf_evlist__mmap_consume(evlist, idx);
+	perf_mmap__consume(md, overwrite || backward);
 out:
 	return rc;
 }
@@ -498,20 +498,27 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
-static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist)
+static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
+				    bool backward)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	struct perf_mmap *maps;
 
 	if (!evlist)
 		return 0;
 
+	maps = evlist->mmap;
+	if (!maps)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		struct auxtrace_mmap *mm = &evlist->mmap[i].auxtrace_mmap;
+		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
-		if (evlist->mmap[i].base) {
-			if (record__mmap_read(rec, evlist, i) != 0) {
+		if (maps[i].base) {
+			if (record__mmap_read(rec, &maps[i],
+					      evlist->overwrite, backward) != 0) {
 				rc = -1;
 				goto out;
 			}
@@ -539,7 +546,7 @@ static int record__mmap_read_all(struct record *rec)
 {
 	int err;
 
-	err = record__mmap_read_evlist(rec, rec->evlist);
+	err = record__mmap_read_evlist(rec, rec->evlist, false);
 	if (err)
 		return err;
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Record mmap cookie into fdarray private field
  2016-07-14  8:34 ` [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field Wang Nan
  2016-07-15 14:59   ` Jiri Olsa
@ 2016-07-16 20:47   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wangnan0, hpa, mhiramat, mingo, acme, hekuang, lizefan, tglx,
	jolsa, nilayvaish, namhyung, linux-kernel

Commit-ID:  4876075b3205af992bf1012f6d6fbc03593d55b9
Gitweb:     http://git.kernel.org/tip/4876075b3205af992bf1012f6d6fbc03593d55b9
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:37 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:47 -0300

perf evlist: Record mmap cookie into fdarray private field

Insetad of saving a index into fdarray entries private field, save the
corresponding 'struct perf_mmap' pointer, and release them directly
using perf_mmap__put().

Following commits introduce multiple mmap arrays to evlist. Without this
patch, perf_evlist__munmap_filtered() is unable to retrive correct
'struct perf_mmap' pointer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a4137e0..1462085 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -30,6 +30,7 @@
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
+static void perf_mmap__put(struct perf_mmap *map);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -466,7 +467,8 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd,
+				     struct perf_mmap *map, short revent)
 {
 	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
@@ -474,7 +476,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 	 * close the associated evlist->mmap[] entry.
 	 */
 	if (pos >= 0) {
-		evlist->pollfd.priv[pos].idx = idx;
+		evlist->pollfd.priv[pos].ptr = map;
 
 		fcntl(fd, F_SETFL, O_NONBLOCK);
 	}
@@ -484,15 +486,16 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
+	return __perf_evlist__add_pollfd(evlist, fd, NULL, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd,
 					 void *arg __maybe_unused)
 {
-	struct perf_evlist *evlist = container_of(fda, struct perf_evlist, pollfd);
+	struct perf_mmap *map = fda->priv[fd].ptr;
 
-	perf_evlist__mmap_put(evlist, fda->priv[fd].idx);
+	if (map)
+		perf_mmap__put(map);
 }
 
 int perf_evlist__filter_pollfd(struct perf_evlist *evlist, short revents_and_mask)
@@ -1098,7 +1101,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, &evlist->mmap[idx], revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Extract common code in mmap failure processing
  2016-07-14  8:34 ` [PATCH v16 06/15] perf tools: Extract common code in mmap failure processing Wang Nan
@ 2016-07-16 20:48   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, hekuang, acme, namhyung, mhiramat,
	nilayvaish, hpa, wangnan0, lizefan, tglx, jolsa

Commit-ID:  a1f72618346a4e7ce9cff6aec1a62737d5d08763
Gitweb:     http://git.kernel.org/tip/a1f72618346a4e7ce9cff6aec1a62737d5d08763
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:38 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:47 -0300

perf evlist: Extract common code in mmap failure processing

In perf_evlist__mmap_per_cpu() and perf_evlist__mmap_per_thread(), in
case of mmap failure, successfully created maps should be cleared.

Current code uses two loops calling __perf_evlist__munmap() for each
function.

This patch extracts common code to perf_evlist__munmap_nofree() and use
previous introduced decoupled API perf_mmap__munmap(). Now
__perf_evlist__munmap() can be removed because of no user.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1462085..54ae0a0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -28,7 +28,6 @@
 #include <linux/err.h>
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
 static void perf_mmap__put(struct perf_mmap *map);
 
@@ -970,12 +969,7 @@ static void perf_mmap__munmap(struct perf_mmap *map)
 	auxtrace_mmap__munmap(&map->auxtrace_mmap);
 }
 
-static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__munmap(&evlist->mmap[idx]);
-}
-
-void perf_evlist__munmap(struct perf_evlist *evlist)
+static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 {
 	int i;
 
@@ -983,8 +977,12 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 		return;
 
 	for (i = 0; i < evlist->nr_mmaps; i++)
-		__perf_evlist__munmap(evlist, i);
+		perf_mmap__munmap(&evlist->mmap[i]);
+}
 
+void perf_evlist__munmap(struct perf_evlist *evlist)
+{
+	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
 }
 
@@ -1142,8 +1140,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	return 0;
 
 out_unmap:
-	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+	perf_evlist__munmap_nofree(evlist);
 	return -1;
 }
 
@@ -1168,8 +1165,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	return 0;
 
 out_unmap:
-	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+	perf_evlist__munmap_nofree(evlist);
 	return -1;
 }
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Introduce backward_mmap array for evlist
  2016-07-14  8:34 ` [PATCH v16 07/15] perf tools: Introduce backward_mmap array for evlist Wang Nan
@ 2016-07-16 20:48   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, mhiramat, jolsa, namhyung, lizefan, nilayvaish, wangnan0,
	linux-kernel, acme, hekuang, mingo, hpa

Commit-ID:  b2cb615d8aaba520fe351ff456f6c7730828b3fe
Gitweb:     http://git.kernel.org/tip/b2cb615d8aaba520fe351ff456f6c7730828b3fe
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:39 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:48 -0300

perf evlist: Introduce backward_mmap array for evlist

Add backward_mmap to evlist, free it together with normal mmap.

Improve perf_evlist__pick_pc(), search backward_mmap if evlist->mmap is
not available.

This patch doesn't alloc this array. It will be allocated conditionally
in the following commits.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 10 +++++++---
 tools/perf/util/evlist.c    | 12 ++++++++----
 tools/perf/util/evlist.h    |  1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d15517e..dbcb223 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -509,7 +509,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!evlist)
 		return 0;
 
-	maps = evlist->mmap;
+	maps = backward ? evlist->backward_mmap : evlist->mmap;
 	if (!maps)
 		return 0;
 
@@ -696,8 +696,12 @@ perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused
 static const struct perf_event_mmap_page *
 perf_evlist__pick_pc(struct perf_evlist *evlist)
 {
-	if (evlist && evlist->mmap && evlist->mmap[0].base)
-		return evlist->mmap[0].base;
+	if (evlist) {
+		if (evlist->mmap && evlist->mmap[0].base)
+			return evlist->mmap[0].base;
+		if (evlist->backward_mmap && evlist->backward_mmap[0].base)
+			return evlist->backward_mmap[0].base;
+	}
 	return NULL;
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 54ae0a0..24927e1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -123,6 +123,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
+	zfree(&evlist->backward_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -973,17 +974,20 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 {
 	int i;
 
-	if (evlist->mmap == NULL)
-		return;
+	if (evlist->mmap)
+		for (i = 0; i < evlist->nr_mmaps; i++)
+			perf_mmap__munmap(&evlist->mmap[i]);
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
-		perf_mmap__munmap(&evlist->mmap[i]);
+	if (evlist->backward_mmap)
+		for (i = 0; i < evlist->nr_mmaps; i++)
+			perf_mmap__munmap(&evlist->backward_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
+	zfree(&evlist->backward_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9e680c6..07a1ad0 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -61,6 +61,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *backward_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Map backward events to backward_mmap
  2016-07-14  8:34 ` [PATCH v16 08/15] perf tools: Map backward events to backward_mmap Wang Nan
@ 2016-07-16 20:48   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lizefan, tglx, nilayvaish, namhyung, mhiramat, acme, wangnan0,
	hekuang, jolsa, hpa, linux-kernel, mingo

Commit-ID:  078c33862e042b3778dce3bcc8eaef84ab40715c
Gitweb:     http://git.kernel.org/tip/078c33862e042b3778dce3bcc8eaef84ab40715c
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:40 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:48 -0300

perf evlist: Map backward events to backward_mmap

In perf_evlist__mmap_per_evsel(), select backward_mmap for backward
events.  Utilize new perf_mmap APIs. Dynamically alloc backward_mmap.

Remove useless functions.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/backward-ring-buffer.c |  4 +--
 tools/perf/util/evlist.c                | 54 ++++++++++++++++-----------------
 2 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 5cee387..b2c6348 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -31,8 +31,8 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
 
-		perf_evlist__mmap_read_catchup(evlist, i);
-		while ((event = perf_evlist__mmap_read_backward(evlist, i)) != NULL) {
+		perf_mmap__read_catchup(&evlist->backward_mmap[i]);
+		while ((event = perf_mmap__read_backward(&evlist->backward_mmap[i])) != NULL) {
 			const u32 type = event->header.type;
 
 			switch (type) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 24927e1..7570f90 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -27,7 +27,6 @@
 #include <linux/log2.h>
 #include <linux/err.h>
 
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void perf_mmap__munmap(struct perf_mmap *map);
 static void perf_mmap__put(struct perf_mmap *map);
 
@@ -692,8 +691,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
+	if (!evlist->backward_mmap)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->mmap[i].fd;
+		int fd = evlist->backward_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -904,16 +906,6 @@ static void perf_mmap__put(struct perf_mmap *md)
 		perf_mmap__munmap(md);
 }
 
-static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__get(&evlist->mmap[idx]);
-}
-
-static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
-{
-	perf_mmap__put(&evlist->mmap[idx]);
-}
-
 void perf_mmap__consume(struct perf_mmap *md, bool overwrite)
 {
 	if (!overwrite) {
@@ -1049,12 +1041,6 @@ static int perf_mmap__mmap(struct perf_mmap *map,
 	return 0;
 }
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
-			       struct mmap_params *mp, int fd)
-{
-	return perf_mmap__mmap(&evlist->mmap[idx], mp, fd);
-}
-
 static bool
 perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 			 struct perf_evsel *evsel)
@@ -1066,16 +1052,27 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
 
 	evlist__for_each_entry(evlist, evsel) {
+		struct perf_mmap *maps = evlist->mmap;
+		int *output = _output;
 		int fd;
 
-		if (!!evsel->attr.write_backward != (evlist->overwrite && evlist->backward))
-			continue;
+		if (evsel->attr.write_backward) {
+			output = _output_backward;
+			maps = evlist->backward_mmap;
+
+			if (!maps) {
+				maps = perf_evlist__alloc_mmap(evlist);
+				if (!maps)
+					return -1;
+				evlist->backward_mmap = maps;
+			}
+		}
 
 		if (evsel->system_wide && thread)
 			continue;
@@ -1084,13 +1081,14 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		if (*output == -1) {
 			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+
+			if (perf_mmap__mmap(&maps[idx], mp, *output)  < 0)
 				return -1;
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
 				return -1;
 
-			perf_evlist__mmap_get(evlist, idx);
+			perf_mmap__get(&maps[idx]);
 		}
 
 		revent = perf_evlist__should_poll(evlist, evsel) ? POLLIN : 0;
@@ -1103,8 +1101,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, &evlist->mmap[idx], revent) < 0) {
-			perf_evlist__mmap_put(evlist, idx);
+		    __perf_evlist__add_pollfd(evlist, fd, &maps[idx], revent) < 0) {
+			perf_mmap__put(&maps[idx]);
 			return -1;
 		}
 
@@ -1130,13 +1128,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
+		int output_backward = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, &output, &output_backward))
 				goto out_unmap;
 		}
 	}
@@ -1157,12 +1156,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
+		int output_backward = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						&output, &output_backward))
 			goto out_unmap;
 	}
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Drop evlist->backward
  2016-07-14  8:34 ` [PATCH v16 09/15] perf tools: Drop evlist->backward Wang Nan
@ 2016-07-16 20:49   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hekuang, wangnan0, mhiramat, mingo, linux-kernel, jolsa, tglx,
	acme, namhyung, lizefan, hpa, nilayvaish

Commit-ID:  a0c6f451f90204847ce5f91c3268d83a76bde1b6
Gitweb:     http://git.kernel.org/tip/a0c6f451f90204847ce5f91c3268d83a76bde1b6
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:41 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:49 -0300

perf evlist: Drop evlist->backward

Now there's no real user of evlist->backward. Drop it. We are going to
use evlist->backward_mmap as a container for backward ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/backward-ring-buffer.c | 1 -
 tools/perf/util/evlist.c                | 5 +----
 tools/perf/util/evlist.h                | 1 -
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index b2c6348..db9cd30 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -101,7 +101,6 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 		return TEST_FAIL;
 	}
 
-	evlist->backward = true;
 	err = perf_evlist__create_maps(evlist, &opts.target);
 	if (err < 0) {
 		pr_debug("Not enough memory to create thread/cpu maps\n");
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7570f90..5beb44f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -44,7 +44,6 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
-	evlist->backward = false;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -867,9 +866,7 @@ union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, in
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
-	if (!evlist->backward)
-		return perf_evlist__mmap_read_forward(evlist, idx);
-	return perf_evlist__mmap_read_backward(evlist, idx);
+	return perf_evlist__mmap_read_forward(evlist, idx);
 }
 
 void perf_mmap__read_catchup(struct perf_mmap *md)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 07a1ad0..6a3d9bd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,7 +50,6 @@ struct perf_evlist {
 	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
-	bool		 backward;
 	size_t		 mmap_len;
 	int		 id_pos;
 	int		 is_pos;

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Setup backward mmap state machine
  2016-07-14  8:34 ` [PATCH v16 10/15] perf tools: Setup backward mmap state machine Wang Nan
@ 2016-07-16 20:49   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, lizefan, jolsa, hekuang, acme, linux-kernel, wangnan0, hpa,
	namhyung, nilayvaish, mhiramat, mingo

Commit-ID:  54cc54decd79b2c92f4db06001b4b245f38181b7
Gitweb:     http://git.kernel.org/tip/54cc54decd79b2c92f4db06001b4b245f38181b7
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:42 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:49 -0300

perf evlist: Setup backward mmap state machine

Introduce a bkw_mmap_state state machine to evlist:

                     .________________(forbid)_____________.
                     |                                     V
 NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
                     ^  ^              |   ^               |
                     |  |__(forbid)____/   |___(forbid)___/|
                     |                                     |
                      \_________________(3)_______________/

 NOTREADY     : Backward ring buffers are not ready
 RUNNING      : Backward ring buffers are recording
 DATA_PENDING : We are required to collect data from backward ring buffers
 EMPTY        : We have collected data from backward ring buffers.

 (0): Setup backward ring buffer
 (1): Pause ring buffers for reading
 (2): Read from ring buffers
 (3): Resume ring buffers for recording

We can't avoid this complexity. Since we deliberately drop records from
overwritable ring buffer, there's no way for us to check remaining from
ring buffer itself (by checking head and old pointers). Therefore, we
need DATA_PENDING and EMPTY state to help us recording what we have done
to the ring buffer.

In record__mmap_read_evlist(), drive this state machine from DATA_PENDING
to EMPTY.

In perf_evlist__mmap_per_evsel(), drive this state machine from NOTREADY
to RUNNING when creating backward mmap.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c |  5 ++++
 tools/perf/util/evlist.c    | 62 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h    | 31 +++++++++++++++++++++++
 3 files changed, 98 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index dbcb223..d4f15e7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -513,6 +513,9 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!maps)
 		return 0;
 
+	if (backward && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
+		return 0;
+
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
@@ -538,6 +541,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (backward)
+		perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
 out:
 	return rc;
 }
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5beb44f..93ab664 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -15,6 +15,7 @@
 #include "evlist.h"
 #include "evsel.h"
 #include "debug.h"
+#include "asm/bug.h"
 #include <unistd.h>
 
 #include "parse-events.h"
@@ -44,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
+	evlist->bkw_mmap_state = BKW_MMAP_NOTREADY;
 }
 
 struct perf_evlist *perf_evlist__new(void)
@@ -1068,6 +1070,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				if (!maps)
 					return -1;
 				evlist->backward_mmap = maps;
+				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
+					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
 		}
 
@@ -1972,3 +1976,61 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
+				  enum bkw_mmap_state state)
+{
+	enum bkw_mmap_state old_state = evlist->bkw_mmap_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+
+	if (!evlist->backward_mmap)
+		return;
+
+	switch (old_state) {
+	case BKW_MMAP_NOTREADY: {
+		if (state != BKW_MMAP_RUNNING)
+			goto state_err;;
+		break;
+	}
+	case BKW_MMAP_RUNNING: {
+		if (state != BKW_MMAP_DATA_PENDING)
+			goto state_err;
+		action = PAUSE;
+		break;
+	}
+	case BKW_MMAP_DATA_PENDING: {
+		if (state != BKW_MMAP_EMPTY)
+			goto state_err;
+		break;
+	}
+	case BKW_MMAP_EMPTY: {
+		if (state != BKW_MMAP_RUNNING)
+			goto state_err;
+		action = RESUME;
+		break;
+	}
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	evlist->bkw_mmap_state = state;
+
+	switch (action) {
+	case PAUSE:
+		perf_evlist__pause(evlist);
+		break;
+	case RESUME:
+		perf_evlist__resume(evlist);
+		break;
+	case NONE:
+	default:
+		break;
+	}
+
+state_err:
+	return;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 6a3d9bd..20faaab 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -41,6 +41,34 @@ perf_mmap__mmap_len(struct perf_mmap *map)
 	return map->mask + 1 + page_size;
 }
 
+/*
+ * State machine of bkw_mmap_state:
+ *
+ *                     .________________(forbid)_____________.
+ *                     |                                     V
+ * NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
+ *                     ^  ^              |   ^               |
+ *                     |  |__(forbid)____/   |___(forbid)___/|
+ *                     |                                     |
+ *                      \_________________(3)_______________/
+ *
+ * NOTREADY     : Backward ring buffers are not ready
+ * RUNNING      : Backward ring buffers are recording
+ * DATA_PENDING : We are required to collect data from backward ring buffers
+ * EMPTY        : We have collected data from backward ring buffers.
+ *
+ * (0): Setup backward ring buffer
+ * (1): Pause ring buffers for reading
+ * (2): Read from ring buffers
+ * (3): Resume ring buffers for recording
+ */
+enum bkw_mmap_state {
+	BKW_MMAP_NOTREADY,
+	BKW_MMAP_RUNNING,
+	BKW_MMAP_DATA_PENDING,
+	BKW_MMAP_EMPTY,
+};
+
 struct perf_evlist {
 	struct list_head entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
@@ -54,6 +82,7 @@ struct perf_evlist {
 	int		 id_pos;
 	int		 is_pos;
 	u64		 combined_sample_type;
+	enum bkw_mmap_state bkw_mmap_state;
 	struct {
 		int	cork_fd;
 		pid_t	pid;
@@ -135,6 +164,8 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist, enum bkw_mmap_state state);
+
 union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf record: Read from overwritable ring buffer
  2016-07-14  8:34 ` [PATCH v16 11/15] perf record: Read from overwritable ring buffer Wang Nan
@ 2016-07-16 20:50   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, mhiramat, linux-kernel, lizefan, hekuang, tglx, jolsa,
	nilayvaish, wangnan0, mingo, acme, namhyung

Commit-ID:  057374645bd42e8bcf22aa4529f99cf7c920a1c6
Gitweb:     http://git.kernel.org/tip/057374645bd42e8bcf22aa4529f99cf7c920a1c6
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:43 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:50 -0300

perf record: Read from overwritable ring buffer

Drive the evlist->bkw_mmap_state state machine during draining and when
SIGUSR2 is received. Read the backward ring buffer in record__mmap_read_all.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-12-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d4f15e7..b87070b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -555,7 +555,7 @@ static int record__mmap_read_all(struct record *rec)
 	if (err)
 		return err;
 
-	return err;
+	return record__mmap_read_evlist(rec, rec->evlist, true);
 }
 
 static void record__init_features(struct record *rec)
@@ -953,6 +953,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->evlist->bkw_mmap_state is possible to be
+		 * BKW_MMAP_EMPTY here: when done == true and
+		 * hits != rec->samples in previous round.
+		 *
+		 * perf_evlist__toggle_bkw_mmap ensure we never
+		 * convert BKW_MMAP_EMPTY to BKW_MMAP_DATA_PENDING.
+		 */
+		if (trigger_is_hit(&switch_output_trigger) || done || draining)
+			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
@@ -972,8 +983,26 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (trigger_is_hit(&switch_output_trigger)) {
+			/*
+			 * If switch_output_trigger is hit, the data in
+			 * overwritable ring buffer should have been collected,
+			 * so bkw_mmap_state should be set to BKW_MMAP_EMPTY.
+			 *
+			 * If SIGUSR2 raise after or during record__mmap_read_all(),
+			 * record__mmap_read_all() didn't collect data from
+			 * overwritable ring buffer. Read again.
+			 */
+			if (rec->evlist->bkw_mmap_state == BKW_MMAP_RUNNING)
+				continue;
 			trigger_ready(&switch_output_trigger);
 
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_RUNNING);
+
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
 					waking);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf evlist: Make {pause,resume} internal helpers
  2016-07-14  8:34 ` [PATCH v16 12/15] perf tools: Make perf_evlist__{pause,resume} internal helpers Wang Nan
@ 2016-07-16 20:50   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lizefan, wangnan0, linux-kernel, mhiramat, jolsa, hpa, mingo,
	namhyung, nilayvaish, acme, tglx, hekuang

Commit-ID:  f6cdff8329e04b08cbc195194223e9dbadeeaa1e
Gitweb:     http://git.kernel.org/tip/f6cdff8329e04b08cbc195194223e9dbadeeaa1e
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:44 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:50 -0300

perf evlist: Make {pause,resume} internal helpers

There's no user of these two function outside evlist.c. Remove them from
public namespace.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 4 ++--
 tools/perf/util/evlist.h | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 93ab664..2a40b8e 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -708,12 +708,12 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 	return 0;
 }
 
-int perf_evlist__pause(struct perf_evlist *evlist)
+static int perf_evlist__pause(struct perf_evlist *evlist)
 {
 	return perf_evlist__set_paused(evlist, true);
 }
 
-int perf_evlist__resume(struct perf_evlist *evlist)
+static int perf_evlist__resume(struct perf_evlist *evlist)
 {
 	return perf_evlist__set_paused(evlist, false);
 }
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 20faaab..4fd034f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -182,8 +182,6 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
 
-int perf_evlist__pause(struct perf_evlist *evlist);
-int perf_evlist__resume(struct perf_evlist *evlist);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf tools: Enable overwrite settings
  2016-07-14  8:34 ` [PATCH v16 13/15] perf tools: Enable overwrite settings Wang Nan
@ 2016-07-16 20:50   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hekuang, jolsa, acme, mhiramat, wangnan0, linux-kernel, lizefan,
	mingo, namhyung, nilayvaish, tglx, hpa

Commit-ID:  626a6b784e91bc61ca9fe0f9dd5bb60cb91ccb6b
Gitweb:     http://git.kernel.org/tip/626a6b784e91bc61ca9fe0f9dd5bb60cb91ccb6b
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:45 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:51 -0300

perf tools: Enable overwrite settings

This patch allows following config terms and option:

Globally setting events to overwrite;

  # perf record --overwrite ...

Set specific events to be overwrite or no-overwrite.

  # perf record --event cycles/overwrite/ ...
  # perf record --event cycles/no-overwrite/ ...

Add missing config terms and update the config term array size because
the longest string length has changed.

For overwritable events, it automatically selects attr.write_backward
since perf requires it to be backward for reading.

Test result:

  # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
  # perf evlist -v
  syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt | 14 ++++++++++++++
 tools/perf/builtin-record.c              |  1 +
 tools/perf/perf.h                        |  1 +
 tools/perf/tests/backward-ring-buffer.c  | 10 +++++-----
 tools/perf/util/evsel.c                  |  4 ++++
 tools/perf/util/evsel.h                  |  2 ++
 tools/perf/util/parse-events.c           | 20 ++++++++++++++++++--
 tools/perf/util/parse-events.h           |  2 ++
 tools/perf/util/parse-events.l           |  2 ++
 9 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 5b46b1d..384c630 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -367,6 +367,20 @@ options.
 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
 in config file is set to true.
 
+--overwrite::
+Makes all events use an overwritable ring buffer. An overwritable ring
+buffer works like a flight recorder: when it gets full, the kernel will
+overwrite the oldest records, that thus will never make it to the
+perf.data file.
+
+When '--overwrite' and '--switch-output' are used perf records and drops
+events until it receives a signal, meaning that something unusual was
+detected that warrants taking a snapshot of the most current events,
+those fitting in the ring buffer at that moment.
+
+'overwrite' attribute can also be set or canceled for an event using
+config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b87070b..39c7486 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1399,6 +1399,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cd8f1b1..608b42b 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index db9cd30..615780c 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -108,7 +108,11 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 	}
 
 	bzero(&parse_error, sizeof(parse_error));
-	err = parse_events(evlist, "syscalls:sys_enter_prctl", &parse_error);
+	/*
+	 * Set backward bit, ring buffer should be writing from end. Record
+	 * it in aux evlist
+	 */
+	err = parse_events(evlist, "syscalls:sys_enter_prctl/overwrite/", &parse_error);
 	if (err) {
 		pr_debug("Failed to parse tracepoint event, try use root\n");
 		ret = TEST_SKIP;
@@ -117,10 +121,6 @@ int test__backward_ring_buffer(int subtest __maybe_unused)
 
 	perf_evlist__config(evlist, &opts, NULL);
 
-	/* Set backward bit, ring buffer should be writing from end */
-	evlist__for_each_entry(evlist, evsel)
-		evsel->attr.write_backward = 1;
-
 	err = perf_evlist__open(evlist);
 	if (err < 0) {
 		pr_debug("perf_evlist__open: %s\n",
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9ac2f92..8c54df6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -695,6 +695,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			attr->write_backward = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -776,6 +779,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	attr->write_backward = opts->overwrite ? 1 : 0;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e60cbfc..8a4a6c9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -45,6 +45,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
 	PERF_EVSEL__CONFIG_TERM_MAX_STACK,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -59,6 +60,7 @@ struct perf_evsel_config_term {
 		u64	stack_user;
 		int	max_stack;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 375af0e..6c913c3 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -902,6 +902,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
 	[PARSE_EVENTS__TERM_TYPE_MAX_STACK]		= "max-stack",
+	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
+	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 };
 
 static bool config_term_shrinked;
@@ -994,6 +996,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1046,6 +1054,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 	case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1118,6 +1128,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_MAX_STACK:
 			ADD_CONFIG_TERM(MAX_STACK, max_stack, term->val.num);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
@@ -2412,9 +2428,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	/* "branch_type" is the longest name */
+	/* "no-overwrite" is the longest name */
 	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
-			  (sizeof("branch_type") - 1)];
+			  (sizeof("no-overwrite") - 1)];
 
 	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b4aa7eb..d1edbf8 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -69,6 +69,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
 	PARSE_EVENTS__TERM_TYPE_MAX_STACK,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 3c15b33..7a25194 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -202,6 +202,8 @@ stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 max-stack		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_MAX_STACK); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf session: Don't warn about out of order event if write_backward is used
  2016-07-14  8:34 ` [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  2016-07-15 14:59   ` Jiri Olsa
@ 2016-07-16 20:51   ` tip-bot for Wang Nan
  1 sibling, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mhiramat, jolsa, lizefan, wangnan0, tglx, nilayvaish,
	linux-kernel, namhyung, mingo, acme, hpa, hekuang

Commit-ID:  f06149c0db430d3694d601df126b0944cc0156a6
Gitweb:     http://git.kernel.org/tip/f06149c0db430d3694d601df126b0944cc0156a6
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:46 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:51 -0300

perf session: Don't warn about out of order event if write_backward is used

If write_backward attribute is set, records are written into kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid 'XX out of order events recorded' warning message (timestamps
of records is in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at lease one event.

Result:

Before this patch:
  # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                     -e raw_syscalls:sys_enter \
                     dd if=/dev/zero of=/dev/null count=300
  300+0 records in
  300+0 records out
  153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
  [ perf record: Woken up 5 times to write data ]
  Warning:
  40 out of order events recorded.
  [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
  # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                     -e raw_syscalls:sys_enter \
                     dd if=/dev/zero of=/dev/null count=300
  300+0 records in
  300+0 records out
  153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-15-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 078d496..5d61242 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1499,10 +1499,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each_entry(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1559,8 +1576,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip:perf/core] perf record: Add --tail-synthesize option
  2016-07-14  8:34 ` [PATCH v16 15/15] perf tools: Add --tail-synthesize option Wang Nan
@ 2016-07-16 20:51   ` tip-bot for Wang Nan
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot for Wang Nan @ 2016-07-16 20:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, namhyung, mhiramat, nilayvaish, linux-kernel, mingo,
	lizefan, wangnan0, hekuang, acme, tglx, hpa

Commit-ID:  4ea648aec01982d5a57816a95c4665d6081e78f9
Gitweb:     http://git.kernel.org/tip/4ea648aec01982d5a57816a95c4665d6081e78f9
Author:     Wang Nan <wangnan0@huawei.com>
AuthorDate: Thu, 14 Jul 2016 08:34:47 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 15 Jul 2016 17:27:52 -0300

perf record: Add --tail-synthesize option

When working with overwritable ring buffer there's a inconvenience
problem: if perf dumps data after a long period after it starts,
non-sample events may lost, which makes following 'perf report' unable
to identify proc name and mmap layout. For example:

 # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \
        dd if=/dev/zero of=/dev/null

send SIGUSR2 after dd runs long enough. The resuling perf.data lost
correct comm and mmap events:

 # perf script -i perf.data.2016061522374354
 perf 24478 [004] 2581325.601789:  raw_syscalls:sys_exit: NR 0 = 512
 ^^^^
 Should be 'dd'
                   27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                   203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux)
                   b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux)
             7f47c417edf0 [unknown] ([unknown])
             ^^^^^^^^^^^^
             Fail to unwind

This patch provides a '--tail-synthesize' option, allows perf to collect
system status when finalizing output file. In resuling output file, the
non-sample events reflect system status when dumping data.

After this patch:
 # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \
        dd if=/dev/zero of=/dev/null

 # perf script -i perf.data.2016061600544998
 dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ...
 ^^
 Correct comm
                   203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms])
                   203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms])
                   203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms])
                   b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
                    d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
                    ^^^^^
                    Correct unwind

This option doesn't aim to solve this problem completely. If a process
terminates before SIGUSR2, we still lost its COMM and MMAP events. For
example, we can't unwind correctly from the final perf.data we get from
the previous example, because when perf collects the final output file
(when we press C-c), 'dd' has been terminated so its '/proc/<pid>/mmap'
becomes empty.

However, this is a cheaper choice. To completely solve this problem we
need to continously output non-sample events. To satisify the
requirement of daemonization, we need to merge them periodically. It is
possible but requires much more code and cycles.

Automatically select --tail-synthesize when --overwrite is provided.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1468485287-33422-16-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-record.txt |  8 ++++++++
 tools/perf/builtin-record.c              | 31 +++++++++++++++++++++++++------
 tools/perf/perf.h                        |  1 +
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 384c630..69966ab 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -367,6 +367,12 @@ options.
 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
 in config file is set to true.
 
+--tail-synthesize::
+Instead of collecting non-sample events (for example, fork, comm, mmap) at
+the beginning of record, collect them during finalizing an output file.
+The collected non-sample events reflects the status of the system when
+record is finished.
+
 --overwrite::
 Makes all events use an overwritable ring buffer. An overwritable ring
 buffer works like a flight recorder: when it gets full, the kernel will
@@ -381,6 +387,8 @@ those fitting in the ring buffer at that moment.
 'overwrite' attribute can also be set or canceled for an event using
 config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
 
+Implies --tail-synthesize.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 39c7486..8f2c16d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -604,13 +604,16 @@ record__finish_output(struct record *rec)
 	return;
 }
 
-static int record__synthesize_workload(struct record *rec)
+static int record__synthesize_workload(struct record *rec, bool tail)
 {
 	struct {
 		struct thread_map map;
 		struct thread_map_data map_data;
 	} thread_map;
 
+	if (rec->opts.tail_synthesize != tail)
+		return 0;
+
 	thread_map.map.nr = 1;
 	thread_map.map.map[0].pid = rec->evlist->workload.pid;
 	thread_map.map.map[0].comm = NULL;
@@ -621,7 +624,7 @@ static int record__synthesize_workload(struct record *rec)
 						 rec->opts.proc_map_timeout);
 }
 
-static int record__synthesize(struct record *rec);
+static int record__synthesize(struct record *rec, bool tail);
 
 static int
 record__switch_output(struct record *rec, bool at_exit)
@@ -632,6 +635,10 @@ record__switch_output(struct record *rec, bool at_exit)
 	/* Same Size:      "2015122520103046"*/
 	char timestamp[] = "InvalidTimestamp";
 
+	record__synthesize(rec, true);
+	if (target__none(&rec->opts.target))
+		record__synthesize_workload(rec, true);
+
 	rec->samples = 0;
 	record__finish_output(rec);
 	err = fetch_current_timestamp(timestamp, sizeof(timestamp));
@@ -654,7 +661,7 @@ record__switch_output(struct record *rec, bool at_exit)
 
 	/* Output tracking events */
 	if (!at_exit) {
-		record__synthesize(rec);
+		record__synthesize(rec, false);
 
 		/*
 		 * In 'perf record --switch-output' without -a,
@@ -666,7 +673,7 @@ record__switch_output(struct record *rec, bool at_exit)
 		 * perf_event__synthesize_thread_map() for those events.
 		 */
 		if (target__none(&rec->opts.target))
-			record__synthesize_workload(rec);
+			record__synthesize_workload(rec, false);
 	}
 	return fd;
 }
@@ -720,7 +727,7 @@ static const struct perf_event_mmap_page *record__pick_pc(struct record *rec)
 	return NULL;
 }
 
-static int record__synthesize(struct record *rec)
+static int record__synthesize(struct record *rec, bool tail)
 {
 	struct perf_session *session = rec->session;
 	struct machine *machine = &session->machines.host;
@@ -730,6 +737,9 @@ static int record__synthesize(struct record *rec)
 	int fd = perf_data_file__fd(file);
 	int err = 0;
 
+	if (rec->opts.tail_synthesize != tail)
+		return 0;
+
 	if (file->is_pipe) {
 		err = perf_event__synthesize_attrs(tool, session,
 						   process_synthesized_event);
@@ -893,7 +903,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 	machine = &session->machines.host;
 
-	err = record__synthesize(rec);
+	err = record__synthesize(rec, false);
 	if (err < 0)
 		goto out_child;
 
@@ -1057,6 +1067,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	if (!quiet)
 		fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
 
+	if (target__none(&rec->opts.target))
+		record__synthesize_workload(rec, true);
+
 out_child:
 	if (forks) {
 		int exit_status;
@@ -1075,6 +1088,7 @@ out_child:
 	} else
 		status = err;
 
+	record__synthesize(rec, true);
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
 
@@ -1399,6 +1413,8 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
+		    "synthesize non-sample events at the end of output"),
 	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
@@ -1610,6 +1626,9 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		}
 	}
 
+	if (record.opts.overwrite)
+		record.opts.tail_synthesize = true;
+
 	if (rec->evlist->nr_entries == 0 &&
 	    perf_evlist__add_default(rec->evlist) < 0) {
 		pr_err("Not enough memory for event selector list\n");
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 608b42b..a7e0f14 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     tail_synthesize;
 	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2016-07-16 20:51 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-14  8:34 [PATCH v16 00/15] perf tools: Support overwritable ring buffer Wang Nan
2016-07-14  8:34 ` [PATCH v16 01/15] perf tools: Drop redundant evsel->overwrite indicator Wang Nan
2016-07-16 20:45   ` [tip:perf/core] perf evlist: " tip-bot for Arnaldo Carvalho de Melo
2016-07-14  8:34 ` [PATCH v16 02/15] tools lib fd array: Allow associating a pointer cookie with each entry Wang Nan
2016-07-16 20:46   ` [tip:perf/core] " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 03/15] perf tools: Update perf evlist mmap related APIs and helpers Wang Nan
2016-07-16 20:46   ` [tip:perf/core] perf evlist: Update " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 04/15] perf tools: Decouple record__mmap_read() and evlist Wang Nan
2016-07-16 20:47   ` [tip:perf/core] perf record: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 05/15] perf tools: Record mmap cookie into fdarray private field Wang Nan
2016-07-15 14:59   ` Jiri Olsa
2016-07-16 20:47   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 06/15] perf tools: Extract common code in mmap failure processing Wang Nan
2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 07/15] perf tools: Introduce backward_mmap array for evlist Wang Nan
2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 08/15] perf tools: Map backward events to backward_mmap Wang Nan
2016-07-16 20:48   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 09/15] perf tools: Drop evlist->backward Wang Nan
2016-07-16 20:49   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 10/15] perf tools: Setup backward mmap state machine Wang Nan
2016-07-16 20:49   ` [tip:perf/core] perf evlist: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 11/15] perf record: Read from overwritable ring buffer Wang Nan
2016-07-16 20:50   ` [tip:perf/core] " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 12/15] perf tools: Make perf_evlist__{pause,resume} internal helpers Wang Nan
2016-07-16 20:50   ` [tip:perf/core] perf evlist: Make {pause,resume} " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 13/15] perf tools: Enable overwrite settings Wang Nan
2016-07-16 20:50   ` [tip:perf/core] " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 14/15] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
2016-07-15 14:59   ` Jiri Olsa
2016-07-16 20:51   ` [tip:perf/core] perf session: " tip-bot for Wang Nan
2016-07-14  8:34 ` [PATCH v16 15/15] perf tools: Add --tail-synthesize option Wang Nan
2016-07-16 20:51   ` [tip:perf/core] perf record: " tip-bot for Wang Nan
2016-07-15 15:00 ` [PATCH v16 00/15] perf tools: Support overwritable ring buffer Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).