* [PATCH 00/17] perf tools: Support overwritable ring buffer
@ 2016-05-13  7:55 Wang Nan
  2016-05-13  7:55 ` [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read() Wang Nan
                   ` (16 more replies)
  0 siblings, 17 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:55 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, Arnaldo Carvalho de Melo,
	He Kuang, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
	Peter Zijlstra, Zefan Li, pi3orama

This patch set enables daemonized perf recording by using an
overwritable backward ring buffer. With this feature one can
put perf in the background and dump the ring buffer records by
sending SIGUSR2 when something unusual is noticed. For example, the
following command continuously records system calls, scheduling events
and CPU cycle samples:

 # perf record -g -e cycles -e raw_syscalls:*/call-graph=no/ \
                  -e sched:sched_switch/call-graph=no/ \
                  --switch-output --overwrite -a

Then, by sending SIGUSR2 to perf whenever lag occurs, we get multiple
perf.data outputs, each corresponding to one abnormal event, and the
data size is reasonable:

 # ls -l ./perf.data*
 -rw------- 1 root root 5122165 May 13 23:51 ./perf.data.2016051323511683
 -rw------- 1 root root 5135093 May 13 23:51 ./perf.data.2016051323512107
 -rw------- 1 root root 5135213 May 13 23:51 ./perf.data.2016051323512215
 -rw------- 1 root root 5135157 May 13 23:51 ./perf.data.2016051323512387
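
The trigger side of this workflow can be illustrated with a minimal
sketch. It is not perf's own code: the names on_sigusr2(),
install_dump_trigger() and take_dump_request() are hypothetical, and the
sketch only shows the general pattern of a signal handler raising a flag
that a recording loop later consumes to dump its buffers.

```c
#include <signal.h>

/* Set by the signal handler, consumed by the recording loop. */
static volatile sig_atomic_t dump_requested;

/* Hypothetical handler: only records that a dump was requested. */
static void on_sigusr2(int sig)
{
	(void)sig;
	dump_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_dump_trigger(void)
{
	return signal(SIGUSR2, on_sigusr2) == SIG_ERR ? -1 : 0;
}

/* Returns 1 once per pending request and clears it, 0 otherwise. */
static int take_dump_request(void)
{
	if (!dump_requested)
		return 0;
	dump_requested = 0;
	return 1;
}
```

A recording loop would call take_dump_request() on each iteration and
write out the overwritable buffers when it returns 1.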

Wang Nan (17):
  perf tools: Extract __perf_evlist__mmap_read()
  perf tools: Add evlist channel helpers
  perf tools: Automatically add new channel according to evlist
  perf tools: Operate multiple channels
  perf record: Prevent reading invalid data in record__mmap_read
  perf tools: Squash overwrite setting into channel
  perf record: Don't read from and poll overwrite channel
  perf record: Don't poll on overwrite channel
  perf tools: Detect availability of write_backward
  perf tools: Enable overwrite settings
  perf tools: Set write_backward attribute bit for overwrite events
  perf tools: Record fd into perf_mmap
  perf tools: Add API to pause a channel
  perf record: Rename variable to make code clear
  perf record: Read from backward ring buffer
  perf record: Toggle overwrite ring buffer for reading
  perf tools: Don't warn about out of order event if write_backward is
    used

 tools/perf/builtin-record.c    | 207 +++++++++++++++++++++++--
 tools/perf/perf.h              |   2 +
 tools/perf/util/evlist.c       | 332 ++++++++++++++++++++++++++++++++++++-----
 tools/perf/util/evlist.h       |  67 ++++++++-
 tools/perf/util/evsel.c        |  17 +++
 tools/perf/util/evsel.h        |   3 +
 tools/perf/util/parse-events.c |  20 ++-
 tools/perf/util/parse-events.h |   2 +
 tools/perf/util/parse-events.l |   2 +
 tools/perf/util/record.c       |  11 ++
 tools/perf/util/session.c      |  22 ++-
 11 files changed, 625 insertions(+), 60 deletions(-)

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com

-- 
1.8.3.4

* [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read()
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
@ 2016-05-13  7:55 ` Wang Nan
  2016-05-13 13:03   ` Arnaldo Carvalho de Melo
  2016-05-13  7:55 ` [PATCH 02/17] perf tools: Add evlist channel helpers Wang Nan
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:55 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Zefan Li, pi3orama

Extract the event reader into __perf_evlist__mmap_read(). A future
commit will feed it manually computed 'head' and 'old' pointers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c4bfe11..5e86972 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -749,6 +749,13 @@ broken_event:
 	return event;
 }
 
+static union perf_event *
+__perf_evlist__mmap_read(struct perf_mmap *md, bool overwrite, u64 head,
+			 u64 old, u64 *prev)
+{
+	return perf_mmap__read(md, overwrite, old, head, prev);
+}
+
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
 	struct perf_mmap *md = &evlist->mmap[idx];
@@ -763,7 +770,8 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 
 	head = perf_mmap__read_head(md);
 
-	return perf_mmap__read(md, evlist->overwrite, old, head, &md->prev);
+	return __perf_evlist__mmap_read(md, evlist->overwrite, head,
+					old, &md->prev);
 }
 
 union perf_event *
-- 
1.8.3.4

* [PATCH 02/17] perf tools: Add evlist channel helpers
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
  2016-05-13  7:55 ` [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read() Wang Nan
@ 2016-05-13  7:55 ` Wang Nan
  2016-05-13 13:05   ` Arnaldo Carvalho de Melo
  2016-05-13  7:56 ` [PATCH 03/17] perf tools: Automatically add new channel according to evlist Wang Nan
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:55 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

This commit introduces several helpers to support the concept of
channels. A channel holds a group of evsels that share the same
configuration. Channels will be used for overwritable evsels, allowing
perf to record some events continuously while capturing snapshots of
other events when something happens. Tracking events (mmap, mmap2,
fork, exit ...) are another kind of event that may be worth putting
into a separate channel.

Channels are represented by an array of channel flags. Each channel
contains evlist->nr_mmaps mmaps. Channels are configured before
perf_evlist__mmap_ex() is called. Inside that function, the nr_mmaps
mmaps of every channel are allocated together as one big array.
perf_evlist__channel_idx() converts between an index in the big array
and a (channel, index) pair. For API functions which accept idx, _ex()
versions are introduced that select an mmap from a given channel.
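
The index arithmetic described above can be sketched in isolation. This
is a simplified illustration rather than the patch's code: channel_idx()
and NR_CHANNELS are stand-in names, nr_mmaps is passed directly instead
of being read from an evlist, and the capacity of 2 is an assumption.

```c
#include <errno.h>

#define NR_CHANNELS 2	/* assumed capacity, stands in for PERF_EVLIST__NR_CHANNELS */

/*
 * Convert (channel, idx) into an index in the flat mmap array.
 * A negative channel means idx is already a flat index, so the
 * channel is derived from it. On success returns 0 and stores the
 * resolved channel and the flat index; otherwise returns a negative
 * errno and leaves both arguments untouched.
 */
static int channel_idx(int nr_mmaps, int *p_channel, int *p_idx)
{
	int channel = *p_channel;
	int idx = *p_idx;

	if (idx < 0 || nr_mmaps <= 0)
		return -EINVAL;
	if (channel < 0) {
		channel = idx / nr_mmaps;
		idx %= nr_mmaps;
	}
	if (channel >= NR_CHANNELS || idx >= nr_mmaps)
		return -E2BIG;

	*p_channel = channel;
	*p_idx = nr_mmaps * channel + idx;
	return 0;
}
```

With nr_mmaps = 4, the pair (channel 1, idx 2) and the flat index 6
with channel -1 both resolve to channel 1 and flat index 6.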

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |   6 ++
 tools/perf/util/evlist.c    | 130 ++++++++++++++++++++++++++++++++++++++++++--
 tools/perf/util/evlist.h    |  58 ++++++++++++++++++++
 3 files changed, 188 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f3679c4..6e44834 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -316,6 +316,12 @@ try_again:
 		goto out;
 	}
 
+	perf_evlist__channel_reset(evlist);
+	rc = perf_evlist__channel_add(evlist, 0, true);
+	if (rc < 0)
+		goto out;
+	rc = 0;
+
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5e86972..6c11b9e 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -679,6 +679,33 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 	return NULL;
 }
 
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx)
+{
+	int channel = *p_channel;
+	int _idx = *p_idx;
+
+	if (_idx < 0)
+		return -EINVAL;
+	/*
+	 * A negative channel means the caller explicitly uses the real index.
+	 */
+	if (channel < 0) {
+		channel = perf_evlist__idx_channel(evlist, _idx);
+		_idx = _idx % evlist->nr_mmaps;
+	}
+	if (channel < 0)
+		return channel;
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	if (_idx >= evlist->nr_mmaps)
+		return -E2BIG;
+
+	*p_channel = channel;
+	*p_idx = evlist->nr_mmaps * channel + _idx;
+	return 0;
+}
+
 /* When check_messup is true, 'end' must points to a good entry */
 static union perf_event *
 perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start,
@@ -756,11 +783,19 @@ __perf_evlist__mmap_read(struct perf_mmap *md, bool overwrite, u64 head,
 	return perf_mmap__read(md, overwrite, old, head, prev);
 }
 
-union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx)
 {
 	struct perf_mmap *md = &evlist->mmap[idx];
-	u64 head;
-	u64 old = md->prev;
+	u64 head, old;
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
+
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return NULL;
+	}
+	old = md->prev;
 
 	/*
 	 * Check if event was unmapped due to a POLLHUP/POLLERR.
@@ -824,6 +859,11 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 	md->prev = head;
 }
 
+union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
+{
+	return perf_evlist__mmap_read_ex(evlist, -1, idx);
+}
+
 static bool perf_mmap__empty(struct perf_mmap *md)
 {
 	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
@@ -842,10 +882,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 		__perf_evlist__munmap(evlist, idx);
 }
 
-void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx)
 {
+	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
 	struct perf_mmap *md = &evlist->mmap[idx];
 
+	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
+		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
+		       channel, idx);
+		return;
+	}
+
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
 
@@ -856,6 +904,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_evlist__mmap_put(evlist, idx);
 }
 
+void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
+{
+	perf_evlist__mmap_consume_ex(evlist, -1, idx);
+}
+
 int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
 			       struct auxtrace_mmap_params *mp __maybe_unused,
 			       void *userpg __maybe_unused,
@@ -901,7 +954,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
@@ -909,10 +962,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
+	int total_mmaps;
+
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
-	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
+
+	total_mmaps = perf_evlist__mmap_nr(evlist);
+	if (!total_mmaps)
+		return -EINVAL;
+
+	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
 	return evlist->mmap != NULL ? 0 : -ENOMEM;
 }
 
@@ -1221,6 +1281,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	int err;
+
+	perf_evlist__channel_reset(evlist);
+	err = perf_evlist__channel_add(evlist, 0, true);
+	if (err < 0)
+		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
@@ -1862,3 +1928,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
 
 	return NULL;
 }
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist)
+{
+	int i;
+
+	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
+		unsigned long flags = evlist->channel_flags[i];
+
+		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
+			return i + 1;
+	}
+	return 0;
+}
+
+int perf_evlist__mmap_nr(struct perf_evlist *evlist)
+{
+	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
+}
+
+void perf_evlist__channel_reset(struct perf_evlist *evlist)
+{
+	int i;
+
+	BUG_ON(evlist->mmap);
+
+	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
+		evlist->channel_flags[i] = 0;
+}
+
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default)
+{
+	int n = perf_evlist__channel_nr(evlist);
+	unsigned long *flags = evlist->channel_flags;
+
+	BUG_ON(evlist->mmap);
+
+	if (n >= PERF_EVLIST__NR_CHANNELS) {
+		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
+		return -ENOSPC;
+	}
+
+	if (is_default) {
+		memmove(&flags[1], &flags[0],
+			sizeof(evlist->channel_flags) -
+			sizeof(evlist->channel_flags[0]));
+		n = 0;
+	}
+	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
+	return n;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 85d1b59..4cb5d3a 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,6 +20,11 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
+#define PERF_EVLIST__NR_CHANNELS	1
+enum perf_evlist_mmap_flag {
+	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+};
+
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
@@ -52,6 +57,7 @@ struct perf_evlist {
 		pid_t	pid;
 	} workload;
 	struct fdarray	 pollfd;
+	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
 	struct perf_mmap *mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -127,13 +133,65 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
 
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
+union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
+					    int channel, int idx);
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist,
 						  int idx);
 void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
 
+void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
+				  int channel, int idx);
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+int perf_evlist__mmap_nr(struct perf_evlist *evlist);
+
+int perf_evlist__channel_nr(struct perf_evlist *evlist);
+void perf_evlist__channel_reset(struct perf_evlist *evlist);
+int perf_evlist__channel_add(struct perf_evlist *evlist,
+			     unsigned long flag,
+			     bool is_default);
+
+static inline bool
+__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
+			     enum perf_evlist_mmap_flag bits)
+{
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return false;
+
+	return (evlist->channel_flags[channel] & bits) ? true : false;
+}
+#define perf_evlist__channel_check(e, c, b) \
+		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
+
+static inline bool
+perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
+{
+	return perf_evlist__channel_check(evlist, channel, ENABLED);
+}
+
+static inline int
+perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
+{
+	int channel = idx / evlist->nr_mmaps;
+
+	if (channel >= PERF_EVLIST__NR_CHANNELS)
+		return -E2BIG;
+	return channel;
+}
+
+int perf_evlist__channel_idx(struct perf_evlist *evlist,
+			     int *p_channel, int *p_idx);
+
+static inline struct perf_mmap *
+perf_evlist__get_mmap(struct perf_evlist *evlist,
+		      int channel, int idx)
+{
+	if (perf_evlist__channel_idx(evlist, &channel, &idx))
+		return NULL;
+
+	return &evlist->mmap[idx];
+}
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
-- 
1.8.3.4

* [PATCH 03/17] perf tools: Automatically add new channel according to evlist
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
  2016-05-13  7:55 ` [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read() Wang Nan
  2016-05-13  7:55 ` [PATCH 02/17] perf tools: Add evlist channel helpers Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 04/17] perf tools: Operate multiple channels Wang Nan
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

perf_evlist__channel_find() finds the proper channel based on the
properties of an evsel. If no such channel exists, it can create a
new one. After this patch there is no need to create the default
channel explicitly.
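
The find-or-create behaviour can be sketched on a bare flags array.
This is an illustration under assumptions, not the patch itself:
struct channels, channels_nr() and channels_find() are hypothetical
names, the capacity of 2 is assumed, and the "default channel" shifting
done by perf_evlist__channel_add() is left out.

```c
#include <errno.h>

#define NR_CHANNELS	2	/* assumed capacity */
#define CHANNEL_ENABLED	1UL	/* mirrors PERF_EVLIST__CHANNEL_ENABLED */

struct channels {
	unsigned long flags[NR_CHANNELS];
};

/* Number of channels in use: one past the highest enabled slot. */
static int channels_nr(const struct channels *c)
{
	int i;

	for (i = NR_CHANNELS - 1; i >= 0; i--)
		if (c->flags[i] & CHANNEL_ENABLED)
			return i + 1;
	return 0;
}

/*
 * Return the channel whose flags equal 'want' (plus ENABLED).
 * If none matches, append a new channel when add_new is set,
 * otherwise return -ENOENT. Returns -ENOSPC when the array is full.
 */
static int channels_find(struct channels *c, unsigned long want, int add_new)
{
	int i, n = channels_nr(c);

	want |= CHANNEL_ENABLED;
	for (i = 0; i < n; i++)
		if (c->flags[i] == want)
			return i;
	if (!add_new)
		return -ENOENT;
	if (n >= NR_CHANNELS)
		return -ENOSPC;
	c->flags[n] = want;
	return n;
}
```

Calling channels_find() once per evsel with add_new set, as
perf_evlist__channel_complete() does in the patch, leaves exactly one
channel per distinct flag combination.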

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  5 -----
 tools/perf/util/evlist.c    | 47 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6e44834..3140378 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -317,11 +317,6 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	rc = perf_evlist__channel_add(evlist, 0, true);
-	if (rc < 0)
-		goto out;
-	rc = 0;
-
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6c11b9e..47a8f1f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1017,6 +1017,43 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
+static unsigned long
+perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static int
+perf_evlist__channel_find(struct perf_evlist *evlist,
+			  struct perf_evsel *evsel,
+			  bool add_new)
+{
+	unsigned long flag = perf_evlist__channel_for_evsel(evsel);
+	int i;
+
+	flag |= PERF_EVLIST__CHANNEL_ENABLED;
+	for (i = 0; i < perf_evlist__channel_nr(evlist); i++)
+		if (evlist->channel_flags[i] == flag)
+			return i;
+	if (add_new)
+		return perf_evlist__channel_add(evlist, flag, false);
+	return -ENOENT;
+}
+
+static int
+perf_evlist__channel_complete(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	int err;
+
+	evlist__for_each(evlist, evsel) {
+		err = perf_evlist__channel_find(evlist, evsel, true);
+		if (err < 0)
+			return err;
+	}
+	return 0;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -1244,6 +1281,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
+	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
@@ -1251,6 +1289,10 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
+	err = perf_evlist__channel_complete(evlist);
+	if (err)
+		return err;
+
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
 		return -ENOMEM;
 
@@ -1281,12 +1323,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	int err;
-
 	perf_evlist__channel_reset(evlist);
-	err = perf_evlist__channel_add(evlist, 0, true);
-	if (err < 0)
-		return err;
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
-- 
1.8.3.4

* [PATCH 04/17] perf tools: Operate multiple channels
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (2 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 03/17] perf tools: Automatically add new channel according to evlist Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 05/17] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Before this patch perf operates on only the first channel. Make perf
mmap and read from multiple channels.
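
The key bookkeeping change is that each cpu (or thread) now tracks one
output fd per channel instead of a single one. A minimal sketch of that
idea follows. The helper names outputs_reset() and channel_output() are
hypothetical, and the real code performs mmap() and
PERF_EVENT_IOC_SET_OUTPUT where this sketch only returns the chosen fd.

```c
#include <string.h>

#define NR_CHANNELS 2	/* assumed capacity */

/* Mark every channel as having no output fd yet. */
static void outputs_reset(int outputs[NR_CHANNELS])
{
	/* Filling with 0xff bytes yields -1 in each int (two's complement). */
	memset(outputs, -1, sizeof(int) * NR_CHANNELS);
}

/*
 * Decide which fd owns the ring buffer for an event in 'channel'.
 * The first fd seen for a channel becomes its output (the real code
 * would mmap it); later fds in the same channel are redirected to it.
 */
static int channel_output(int outputs[NR_CHANNELS], int channel, int fd)
{
	if (outputs[channel] == -1)
		outputs[channel] = fd;
	return outputs[channel];
}
```

Events in different channels therefore never share a ring buffer, even
when they are opened on the same cpu.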

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 52 ++++++++++++++++++++++++++++++++-------------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3140378..21ef8a0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -426,8 +426,9 @@ static int record__mmap_read_all(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	int i;
 	int rc = 0;
+	int total_mmaps = perf_evlist__mmap_nr(rec->evlist);
 
-	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
 		if (rec->evlist->mmap[i].base) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 47a8f1f..eefa33b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -947,6 +947,16 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void
+__perf_evlist__munmap_all(struct perf_evlist *evlist)
+{
+	int ch, i, idx = 0;
+
+	for (ch = 0; ch < perf_evlist__channel_nr(evlist); ch++)
+		for (i = 0; i < evlist->nr_mmaps; i++)
+			__perf_evlist__munmap(evlist, idx++);
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -1054,26 +1064,38 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *outputs)
 {
 	struct perf_evsel *evsel;
 
 	evlist__for_each(evlist, evsel) {
-		int fd;
+		int fd, channel, idx, err;
+
+		channel = perf_evlist__channel_find(evlist, evsel, false);
+		if (channel < 0) {
+			pr_err("ERROR: unable to find suitable channel for %s\n",
+			       evsel->name);
+			return -1;
+		}
+
+		idx = _idx;
+		err = perf_evlist__channel_idx(evlist, &channel, &idx);
+		if (err < 0)
+			return err;
 
 		if (evsel->system_wide && thread)
 			continue;
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+		if (outputs[channel] == -1) {
+			outputs[channel] = fd;
+			if (__perf_evlist__mmap(evlist, idx, mp, outputs[channel]) < 0)
 				return -1;
 		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, outputs[channel]) != 0)
 				return -1;
 
 			perf_evlist__mmap_get(evlist, idx);
@@ -1113,14 +1135,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, outputs))
 				goto out_unmap;
 		}
 	}
@@ -1128,8 +1151,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	return 0;
 
 out_unmap:
-	for (cpu = 0; cpu < nr_cpus; cpu++)
-		__perf_evlist__munmap(evlist, cpu);
+	__perf_evlist__munmap_all(evlist);
 	return -1;
 }
 
@@ -1141,21 +1163,21 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
-		int output = -1;
+		int outputs[PERF_EVLIST__NR_CHANNELS];
 
+		memset(outputs, -1, sizeof(outputs));
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						outputs))
 			goto out_unmap;
 	}
 
 	return 0;
 
 out_unmap:
-	for (thread = 0; thread < nr_threads; thread++)
-		__perf_evlist__munmap(evlist, thread);
+	__perf_evlist__munmap_all(evlist);
 	return -1;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 4cb5d3a..188f0c7 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,7 +20,7 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	1
+#define PERF_EVLIST__NR_CHANNELS	2
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 };
-- 
1.8.3.4

* [PATCH 05/17] perf record: Prevent reading invalid data in record__mmap_read
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (3 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 04/17] perf tools: Operate multiple channels Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 06/17] perf tools: Squash overwrite setting into channel Wang Nan
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

When record__mmap_read() is asked to read more data than the ring
buffer can hold, drop that data to avoid accessing invalid memory.

This can happen when reading from an overwritable ring buffer, which
should be avoided. Check for it anyway for robustness.
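
The check itself is a single unsigned comparison. A standalone sketch,
assuming free-running 64-bit 'head' and 'old' byte counters and a
power-of-two buffer whose size is mask + 1 (ring_overrun() is a
hypothetical name, not a perf function):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * 'head' and 'old' are free-running byte counters, and 'mask' is the
 * buffer size minus one. If more than one buffer's worth of data was
 * produced since 'old', part of it has already been overwritten and
 * must not be read.
 */
static bool ring_overrun(uint64_t head, uint64_t old, uint64_t mask)
{
	/* Unsigned subtraction stays correct when the counters wrap. */
	return head - old > mask + 1;
}
```

A reader that sees ring_overrun() return true would skip to 'head', as
the hunk below does by setting md->prev = head.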

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 21ef8a0..81c700d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -40,6 +40,7 @@
 #include <unistd.h>
 #include <sched.h>
 #include <sys/mman.h>
+#include <asm/bug.h>
 
 
 struct record {
@@ -98,6 +99,13 @@ static int record__mmap_read(struct record *rec, int idx)
 	rec->samples++;
 
 	size = head - old;
+	if (size > (unsigned long)(md->mask) + 1) {
+		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+
+		md->prev = head;
+		perf_evlist__mmap_consume(rec->evlist, idx);
+		return 0;
+	}
 
 	if ((old & md->mask) + size != (head & md->mask)) {
 		buf = &data[old & md->mask];
-- 
1.8.3.4

* [PATCH 06/17] perf tools: Squash overwrite setting into channel
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (4 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 05/17] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 07/17] perf record: Don't read from and poll overwrite channel Wang Nan
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Make 'overwrite' a per-channel configuration rather than an
evlist-wide option. With this change an evlist can have two channels:
a normal channel and an overwritable channel.
perf_evlist__channel_for_evsel() ensures that events with the
'overwrite' configuration are placed in the overwritable channel.
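
The mapping from an evsel's overwrite setting to the mmap protection
can be sketched as two small functions. This is an illustration only:
channel_flags_for() and channel_prot() are hypothetical names, and the
numeric value of the RDONLY flag is an assumption because its
definition is not part of the hunks shown here.

```c
#include <sys/mman.h>

#define CHANNEL_ENABLED	1UL	/* mirrors PERF_EVLIST__CHANNEL_ENABLED */
#define CHANNEL_RDONLY	2UL	/* assumed value of PERF_EVLIST__CHANNEL_RDONLY */

/* Channel flags requested by an event with the given overwrite setting. */
static unsigned long channel_flags_for(int overwrite)
{
	return overwrite ? CHANNEL_RDONLY : 0;
}

/*
 * Protection bits for a channel's ring buffer. A normal channel needs
 * PROT_WRITE so the reader can advance the tail pointer; a read-only
 * (overwritable) channel is mapped with PROT_READ alone.
 */
static int channel_prot(unsigned long flags)
{
	int prot = PROT_READ;

	if (!(flags & CHANNEL_RDONLY))
		prot |= PROT_WRITE;
	return prot;
}
```

Because the protection is derived from the channel, mmap_params no
longer needs to carry a 'prot' field.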

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c    | 43 ++++++++++++++++++++++++++++---------------
 tools/perf/util/evlist.h    |  7 +++----
 tools/perf/util/evsel.h     |  1 +
 4 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 81c700d..5e87602 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -325,7 +325,7 @@ try_again:
 	}
 
 	perf_evlist__channel_reset(evlist);
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index eefa33b..abce588 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -789,6 +789,7 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 	struct perf_mmap *md = &evlist->mmap[idx];
 	u64 head, old;
 	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
+	bool rdonly;
 
 	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
 		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
@@ -805,8 +806,8 @@ union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
 
 	head = perf_mmap__read_head(md);
 
-	return __perf_evlist__mmap_read(md, evlist->overwrite, head,
-					old, &md->prev);
+	rdonly = perf_evlist__channel_check(evlist, channel, RDONLY);
+	return __perf_evlist__mmap_read(md, rdonly, head, old, &md->prev);
 }
 
 union perf_event *
@@ -894,7 +895,7 @@ void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
 		return;
 	}
 
-	if (!evlist->overwrite) {
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
 		u64 old = md->prev;
 
 		perf_mmap__write_tail(md, old);
@@ -987,7 +988,6 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
 	int mask;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -995,6 +995,15 @@ struct mmap_params {
 static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 			       struct mmap_params *mp, int fd)
 {
+	int channel = perf_evlist__idx_channel(evlist, idx);
+	int prot = PROT_READ;
+
+	if (channel < 0)
+		return -1;
+
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
+		prot |= PROT_WRITE;
+
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -1011,7 +1020,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
 	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -1030,7 +1039,11 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 static unsigned long
 perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
 {
-	return 0;
+	unsigned long flag = 0;
+
+	if (evsel->overwrite)
+		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	return flag;
 }
 
 static int
@@ -1286,11 +1299,10 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * perf_evlist__mmap_ex - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
- * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
  *
- * If @overwrite is %false the user needs to signal event consumption using
+ * For writable channel, the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
  * automatically.
  *
@@ -1300,16 +1312,13 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite)
 {
 	int err;
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	struct mmap_params mp;
 
 	err = perf_evlist__channel_complete(evlist);
 	if (err)
@@ -1321,7 +1330,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1345,8 +1353,13 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
+	struct perf_evsel *evsel;
+
 	perf_evlist__channel_reset(evlist);
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	evlist__for_each(evlist, evsel)
+		evsel->overwrite = overwrite;
+
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 188f0c7..c53bdbd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -20,9 +20,10 @@ struct record_opts;
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
-#define PERF_EVLIST__NR_CHANNELS	2
+#define PERF_EVLIST__NR_CHANNELS	3
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
+	PERF_EVLIST__CHANNEL_RDONLY	= 2,
 };
 
 /**
@@ -45,7 +46,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -223,8 +223,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 unsigned long perf_event_mlock_kb_in_pages(void);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 unsigned int auxtrace_pages, bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 8a644fe..c1f1015 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -112,6 +112,7 @@ struct perf_evsel {
 	bool			tracking;
 	bool			per_pkg;
 	bool			precise_max;
+	bool			overwrite;
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
-- 
1.8.3.4


* [PATCH 07/17] perf record: Don't read from and poll overwrite channel
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (5 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 06/17] perf tools: Squash overwrite setting into channel Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 08/17] perf record: Don't poll on " Wang Nan
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Reading from an overwritable ring buffer is unreliable. Introduce
record__mmap_should_read() and prevent 'perf record' from reading such
ring buffers. The rule in record__mmap_should_read() will be changed
once perf supports reading from a backward-writing ring buffer.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5e87602..d9a92e0 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -429,6 +429,19 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static bool record__mmap_should_read(struct record *rec, int idx)
+{
+	int channel = -1;
+
+	if (!rec->evlist->mmap[idx].base)
+		return false;
+	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
+		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	u64 bytes_written = rec->bytes_written;
@@ -439,7 +452,7 @@ static int record__mmap_read_all(struct record *rec)
 	for (i = 0; i < total_mmaps; i++) {
 		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
 
-		if (rec->evlist->mmap[i].base) {
+		if (record__mmap_should_read(rec, i)) {
 			if (record__mmap_read(rec, i) != 0) {
 				rc = -1;
 				goto out;
-- 
1.8.3.4


* [PATCH 08/17] perf record: Don't poll on overwrite channel
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (6 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 07/17] perf record: Don't read from and poll overwrite channel Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13 13:12   ` Arnaldo Carvalho de Melo
  2016-05-13  7:56 ` [PATCH 09/17] perf tools: Detect avalibility of write_backward Wang Nan
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

There's no need to receive events from an overwritable ring buffer.
Instead, perf should let such events run in the background until
something happens. This patch stops polling for normal events (POLLIN)
on overwrite ring buffers; POLLERR and POLLHUP are still requested.
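
The effect of registering a descriptor with an empty event mask can be
seen with plain poll(2), outside of perf: POLLERR and POLLHUP are always
reported in revents, so a hung-up descriptor can still be noticed, but
pending data alone does not wake the poller. The sketch below only
illustrates that behaviour; hup_reported_without_pollin() is a
hypothetical helper, and a pipe stands in for a perf event fd.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Put one byte into a pipe, close the write end, then poll the read
 * end with an empty event mask. Returns true if poll() reports POLLHUP
 * but not POLLIN, i.e. the readable data was not reported.
 */
static bool hup_reported_without_pollin(void)
{
	int pipefd[2];
	struct pollfd pfd;
	char byte = 'x';
	int n;

	if (pipe(pipefd) < 0)
		return false;
	if (write(pipefd[1], &byte, 1) != 1) {
		close(pipefd[0]);
		close(pipefd[1]);
		return false;
	}
	close(pipefd[1]);

	pfd.fd = pipefd[0];
	pfd.events = 0;		/* like revent = 0 for an overwrite channel */
	pfd.revents = 0;
	n = poll(&pfd, 1, 0);
	close(pipefd[0]);

	return n == 1 && (pfd.revents & POLLHUP) && !(pfd.revents & POLLIN);
}
```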

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index abce588..f0b0457 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
 	return 0;
 }
 
-static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
+static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
 {
-	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
+	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
 	/*
 	 * Save the idx so that when we filter out fds POLLHUP'ed we can
 	 * close the associated evlist->mmap[] entry.
@@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
 
 int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
 {
-	return __perf_evlist__add_pollfd(evlist, fd, -1);
+	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
 }
 
 static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
@@ -1077,6 +1077,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
 	return 0;
 }
 
+static bool
+perf_evlist__should_poll(struct perf_evlist *evlist,
+			 struct perf_evsel *evsel,
+			 int channel)
+{
+	if (evsel->system_wide)
+		return false;
+	if (perf_evlist__channel_check(evlist, channel, RDONLY))
+		return false;
+	return true;
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *outputs)
@@ -1085,6 +1097,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd, channel, idx, err;
+		short revent = POLLIN;
 
 		channel = perf_evlist__channel_find(evlist, evsel, false);
 		if (channel < 0) {
@@ -1114,6 +1127,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 			perf_evlist__mmap_get(evlist, idx);
 		}
 
+		if (!perf_evlist__should_poll(evlist, evsel, channel))
+			revent = 0;
 		/*
 		 * The system_wide flag causes a selected event to be opened
 		 * always without a pid.  Consequently it will never get a
@@ -1122,7 +1137,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
 			perf_evlist__mmap_put(evlist, idx);
 			return -1;
 		}
-- 
1.8.3.4


* [PATCH 09/17] perf tools: Detect avalibility of write_backward
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (7 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 08/17] perf record: Don't poll on " Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13 13:08   ` Arnaldo Carvalho de Melo
  2016-05-13  7:56 ` [PATCH 10/17] perf tools: Enable overwrite settings Wang Nan
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Detect the availability of write_backward and save the result in
record_opts. With write_backward, the start pointer of a ring buffer
that is mapped read-only can be found reliably.
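
For readers unfamiliar with this kind of probing, the idea can be
sketched outside of perf as: open a throwaway event with the attribute
bit set and see whether the kernel accepts it. This is only an
approximation of what perf_probe_api() does, not a copy of it.
probe_write_backward() is a hypothetical name, the choice of a software
cpu-clock event is an assumption, and the code needs kernel headers
recent enough to declare the write_backward field. Note that a false
result here may also mean the open was refused for lack of permission,
not that the feature is missing.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to open a cheap software event with write_backward set. */
static bool probe_write_backward(void)
{
	struct perf_event_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.exclude_kernel = 1;
	attr.write_backward = 1;

	/* pid = 0, cpu = -1: this thread on any CPU; no group, no flags. */
	fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0)
		return false;

	close(fd);
	return true;
}
```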

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/perf.h        |  1 +
 tools/perf/util/record.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index cd8f1b1..c35bcfd 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -72,6 +72,7 @@ struct record_opts {
 	bool	     sample_transaction;
 	unsigned     initial_delay;
 	bool         use_clockid;
+	bool	     has_write_backward;
 	clockid_t    clockid;
 	unsigned int proc_map_timeout;
 };
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 481792c..bb871d8 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -85,6 +85,11 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
 	evsel->attr.comm_exec = 1;
 }
 
+static void perf_probe_write_backward(struct perf_evsel *evsel)
+{
+	evsel->attr.write_backward = 1;
+}
+
 static void perf_probe_context_switch(struct perf_evsel *evsel)
 {
 	evsel->attr.context_switch = 1;
@@ -105,6 +110,11 @@ bool perf_can_record_switch_events(void)
 	return perf_probe_api(perf_probe_context_switch);
 }
 
+static bool perf_can_write_backward(void)
+{
+	return perf_probe_api(perf_probe_write_backward);
+}
+
 bool perf_can_record_cpu_wide(void)
 {
 	struct perf_event_attr attr = {
@@ -236,6 +246,7 @@ static int record_opts__config_freq(struct record_opts *opts)
 
 int record_opts__config(struct record_opts *opts)
 {
+	opts->has_write_backward = perf_can_write_backward();
 	return record_opts__config_freq(opts);
 }
 
-- 
1.8.3.4


* [PATCH 10/17] perf tools: Enable overwrite settings
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (8 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 09/17] perf tools: Detect avalibility of write_backward Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-16 13:38   ` Arnaldo Carvalho de Melo
  2016-05-13  7:56 ` [PATCH 11/17] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

This patch adds the following option and config terms:

Set all events to overwrite globally:

 # perf record --overwrite ...

Set specific events to overwrite or no-overwrite:

 # perf record --event cycles/overwrite/ ...
 # perf record --event cycles/no-overwrite/ ...

Add the missing config terms and update the config term array size,
because the length of the longest term name has changed.
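
How the global option and the per-event terms combine can be sketched
as below. This assumes that per-event config terms are applied after
the global --overwrite setting and therefore take precedence; the enum
and effective_overwrite() are hypothetical names used for illustration
only, not part of this patch.

```c
#include <stdbool.h>

/* Hypothetical summary of what the user wrote for one event. */
enum overwrite_term {
	TERM_NONE,		/* e.g. -e cycles */
	TERM_OVERWRITE,		/* e.g. -e cycles/overwrite/ */
	TERM_NO_OVERWRITE,	/* e.g. -e cycles/no-overwrite/ */
};

/* A per-event term, if present, overrides the global option. */
static bool effective_overwrite(bool global_overwrite,
				enum overwrite_term term)
{
	switch (term) {
	case TERM_OVERWRITE:
		return true;
	case TERM_NO_OVERWRITE:
		return false;
	default:
		return global_overwrite;
	}
}
```

Under that assumption, 'perf record --overwrite -e cycles/no-overwrite/'
would leave cycles in a normal, non-overwrite ring buffer.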

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c    |  1 +
 tools/perf/perf.h              |  1 +
 tools/perf/util/evsel.c        |  4 ++++
 tools/perf/util/evsel.h        |  2 ++
 tools/perf/util/parse-events.c | 20 ++++++++++++++++++--
 tools/perf/util/parse-events.h |  2 ++
 tools/perf/util/parse-events.l |  2 ++
 7 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d9a92e0..939aa68 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1265,6 +1265,7 @@ struct option __record_options[] = {
 	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
 			&record.opts.no_inherit_set,
 			"child tasks do not inherit counters"),
+	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
 	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index c35bcfd..386d030 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -59,6 +59,7 @@ struct record_opts {
 	bool	     record_switch_events;
 	bool	     all_kernel;
 	bool	     all_user;
+	bool	     overwrite;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a23f547..be4fc25 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			 */
 			attr->inherit = term->val.inherit ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
+			evsel->overwrite = term->val.overwrite ? 1 : 0;
+			break;
 		default:
 			break;
 		}
@@ -746,6 +749,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 
 	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
 	attr->inherit	    = !opts->no_inherit;
+	evsel->overwrite    = opts->overwrite;
 
 	perf_evsel__set_sample_bit(evsel, IP);
 	perf_evsel__set_sample_bit(evsel, TID);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c1f1015..bce99fa 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -44,6 +44,7 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
 	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_INHERIT,
+	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -57,6 +58,7 @@ struct perf_evsel_config_term {
 		char	*callgraph;
 		u64	stack_user;
 		bool	inherit;
+		bool	overwrite;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index bcbc983..85f813d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -900,6 +900,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
 	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
 	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
+	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
+	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 };
 
 static bool config_term_shrinked;
@@ -992,6 +994,12 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -1040,6 +1048,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
 	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
 	case PARSE_EVENTS__TERM_TYPE_INHERIT:
 	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
+	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
 		return config_term_common(attr, term, err);
 	default:
 		if (err) {
@@ -1109,6 +1119,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
 			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
+			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
+			break;
 		default:
 			break;
 		}
@@ -2322,9 +2338,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
 char *parse_events_formats_error_string(char *additional_terms)
 {
 	char *str;
-	/* "branch_type" is the longest name */
+	/* "no-overwrite" is the longest name */
 	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
-			  (sizeof("branch_type") - 1)];
+			  (sizeof("no-overwrite") - 1)];
 
 	config_terms_list(static_terms, sizeof(static_terms));
 	/* valid terms */
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index d740c3c..f341d9d 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -68,6 +68,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
 	PARSE_EVENTS__TERM_TYPE_INHERIT,
+	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
+	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 1477fbc..cc4c426 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -201,6 +201,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
 stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
+overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
+no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
1.8.3.4


* [PATCH 11/17] perf tools: Set write_backward attribut bit for overwrite events
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (9 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 10/17] perf tools: Enable overwrite settings Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 12/17] perf tools: Record fd into perf_mmap Wang Nan
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

The write_backward attribute makes the kernel fill the ring buffer from
its end, which makes reading from an overwrite ring buffer possible.

This patch sets this attribute if evsel->overwrite is selected
explicitly by the user.

Overwrite and write_backward are still controlled separately for legacy
read-only mmap users (most of them are in perf/tests).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c |  7 +++++++
 tools/perf/util/evlist.c    |  2 ++
 tools/perf/util/evlist.h    |  1 +
 tools/perf/util/evsel.c     | 13 +++++++++++++
 4 files changed, 23 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 939aa68..49c41c3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -300,6 +300,13 @@ static int record__open(struct record *rec)
 	perf_evlist__config(evlist, opts, &callchain_param);
 
 	evlist__for_each(evlist, pos) {
+		if (pos->overwrite) {
+			if (!pos->attr.write_backward) {
+				ui__warning("Unable to read from overwrite ring buffer\n\n");
+				rc = -ENOSYS;
+				goto out;
+			}
+		}
 try_again:
 		if (perf_evsel__open(pos, pos->cpus, pos->threads) < 0) {
 			if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0b0457..dc2e509 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1043,6 +1043,8 @@ perf_evlist__channel_for_evsel(struct perf_evsel *evsel __maybe_unused)
 
 	if (evsel->overwrite)
 		flag |= PERF_EVLIST__CHANNEL_RDONLY;
+	if (evsel->attr.write_backward)
+		flag |= PERF_EVLIST__CHANNEL_BACKWARD;
 	return flag;
 }
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index c53bdbd..bdd8e98 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -24,6 +24,7 @@ struct record_opts;
 enum perf_evlist_mmap_flag {
 	PERF_EVLIST__CHANNEL_ENABLED	= 1,
 	PERF_EVLIST__CHANNEL_RDONLY	= 2,
+	PERF_EVLIST__CHANNEL_BACKWARD	= 4,
 };
 
 /**
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index be4fc25..f1b060b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -678,6 +678,19 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		}
 	}
 
+	/*
+	 * Set backward after config term processing because it is
+	 * possible to set overwrite globally, without config
+	 * terms.
+	 */
+	if (evsel->overwrite) {
+		if (opts->has_write_backward)
+			attr->write_backward = 1;
+		else
+			pr_err("Reading from overwrite event %s is not supported\n",
+			       evsel->name);
+	}
+
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0)) {
 
-- 
1.8.3.4


* [PATCH 12/17] perf tools: Record fd into perf_mmap
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (10 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 11/17] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 13/17] perf tools: Add API to pause a channel Wang Nan
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Add an fd field to perf_mmap so perf can find the fd behind each mmap.
This will be used for toggling overwrite ring buffers.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 15 +++++++++++++--
 tools/perf/util/evlist.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index dc2e509..4295d7e 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -943,6 +943,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	if (evlist->mmap[idx].base != NULL) {
 		munmap(evlist->mmap[idx].base, evlist->mmap_len);
 		evlist->mmap[idx].base = NULL;
+		evlist->mmap[idx].fd = -1;
 		atomic_set(&evlist->mmap[idx].refcnt, 0);
 	}
 	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
@@ -973,7 +974,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 
 static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 {
-	int total_mmaps;
+	int total_mmaps, i;
 
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
@@ -984,7 +985,12 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 		return -EINVAL;
 
 	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (!evlist->mmap)
+		return -ENOMEM;
+
+	for (i = 0; i < total_mmaps; i++)
+		evlist->mmap[i].fd = -1;
+	return 0;
 }
 
 struct mmap_params {
@@ -1004,6 +1010,10 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	if (!perf_evlist__channel_check(evlist, channel, RDONLY))
 		prot |= PROT_WRITE;
 
+	if (evlist->mmap[idx].fd >= 0) {
+		pr_err("idx %d already mapped\n", idx);
+		return -1;
+	}
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
 	 * make sure we don't prevent tools from consuming every last event in
@@ -1028,6 +1038,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		evlist->mmap[idx].base = NULL;
 		return -1;
 	}
+	evlist->mmap[idx].fd = fd;
 
 	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
 				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bdd8e98..ee17449 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -35,6 +35,7 @@ enum perf_evlist_mmap_flag {
 struct perf_mmap {
 	void		 *base;
 	int		 mask;
+	int		 fd;
 	atomic_t	 refcnt;
 	u64		 prev;
 	struct auxtrace_mmap auxtrace_mmap;
-- 
1.8.3.4


* [PATCH 13/17] perf tools: Add API to pause a channel
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (11 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 12/17] perf tools: Record fd into perf_mmap Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 14/17] perf record: Rename variable to make code clear Wang Nan
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Introduce perf_evlist__channel_toggle_paused() to pause/resume a
channel in an evlist, using the PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
Following commits use perf_evlist__channel_toggle_paused() to ensure
that an overwrite ring buffer is paused before it is read.
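
The intended calling pattern (pause, read, resume) can be sketched for
a single fd as follows. This is a standalone illustration, not the code
the following commits add: pause_output() and read_while_paused() are
hypothetical helpers, the reader callback is an assumption, and the
code needs kernel headers that define PERF_EVENT_IOC_PAUSE_OUTPUT.

```c
#include <errno.h>
#include <linux/perf_event.h>
#include <stdbool.h>
#include <sys/ioctl.h>

/* Pause or resume ring buffer output; returns 0 or a negative errno. */
static int pause_output(int fd, bool pause)
{
	if (ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT, pause ? 1 : 0) < 0)
		return errno ? -errno : -EINVAL;
	return 0;
}

/*
 * Stop the kernel from overwriting records, let the caller-supplied
 * reader consume the buffer, then resume output even if reading
 * failed. The reader's error, if any, takes precedence.
 */
static int read_while_paused(int fd, int (*reader)(int fd))
{
	int err, ret;

	err = pause_output(fd, true);
	if (err)
		return err;

	ret = reader(fd);
	err = pause_output(fd, false);
	return ret ? ret : err;
}
```

Resuming unconditionally keeps a failed read from leaving the channel
silently paused.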

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/evlist.c | 28 ++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 4295d7e..cfcd39a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -706,6 +706,34 @@ int perf_evlist__channel_idx(struct perf_evlist *evlist,
 	return 0;
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause)
+{
+	int i;
+
+	if (channel >= perf_evlist__channel_nr(evlist))
+		return -E2BIG;
+	if (!evlist->mmap)
+		return -EFAULT;
+	for (i = 0; i < evlist->nr_mmaps; i++) {
+		int n = channel * evlist->nr_mmaps + i;
+		int fd = evlist->mmap[n].fd;
+		int err;
+
+		if (fd < 0)
+			continue;
+		err = ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT,
+			    pause ? 1 : 0);
+		if (err) {
+			err = (errno == 0 ? -EINVAL : -errno);
+			pr_err("Unable to pause output on %d: %s\n",
+			       fd, strerror(-err));
+			return err;
+		}
+	}
+	return 0;
+}
+
 /* When check_messup is true, 'end' must points to a good entry */
 static union perf_event *
 perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index ee17449..2bb42fd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -195,6 +195,8 @@ perf_evlist__get_mmap(struct perf_evlist *evlist,
 	return &evlist->mmap[idx];
 }
 
+int perf_evlist__channel_toggle_paused(struct perf_evlist *evlist,
+				       int channel, bool pause);
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 14/17] perf record: Rename variable to make code clear
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (12 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 13/17] perf tools: Add API to pause a channel Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 15/17] perf record: Read from backward ring buffer Wang Nan
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

record__mmap_read() writes data from the ring buffer into perf.data.
'head' is maintained by the kernel and points to the last written record.
'old' is maintained by perf and points to the record read in the previous
round. record__mmap_read() saves the data between 'old' and 'head' to
perf.data.

The names of these variables are not easy to read. In addition,
when dealing with a backward-writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.

Add 'start' and 'end' pointers to make the code clear, and set md->prev
to 'head' instead of the moved 'old' pointer. This patch doesn't change
behavior since:

    buf = &data[old & md->mask];
    size = head - old;
    old += size;     <--- Here, old == head
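
For readers unfamiliar with the two-step copy in record__mmap_read(),
here is a minimal standalone sketch of reading the range [start, end)
from a power-of-two sized ring buffer. ring_copy() is a hypothetical
name. Unlike the real function it copies into a caller-supplied buffer
instead of calling record__write(), and it clamps the first chunk
rather than using the wrap test from the code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) out of a ring buffer whose size is
 * mask + 1. Returns the number of bytes copied, or 0 if the range is
 * larger than the buffer (the reader failed to keep up).
 */
static size_t ring_copy(const unsigned char *data, uint64_t mask,
			uint64_t start, uint64_t end, unsigned char *out)
{
	uint64_t size = end - start;
	uint64_t off = start & mask;
	uint64_t first;

	if (size > mask + 1)
		return 0;

	/* First chunk: from 'start' up to the end of the buffer. */
	first = mask + 1 - off;
	if (first > size)
		first = size;
	memcpy(out, data + off, first);

	/* Second chunk: whatever wrapped around to the beginning. */
	memcpy(out + first, data, size - first);
	return (size_t)size;
}
```

After either path the reader has consumed everything up to 'end',
which is why storing 'head' in md->prev is equivalent to storing the
advanced 'old'.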

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 49c41c3..9f4d3ad 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -88,17 +88,18 @@ static int record__mmap_read(struct record *rec, int idx)
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
+	u64 end = head, start = old;
 	unsigned char *data = md->base + page_size;
 	unsigned long size;
 	void *buf;
 	int rc = 0;
 
-	if (old == head)
+	if (start == end)
 		return 0;
 
 	rec->samples++;
 
-	size = head - old;
+	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
@@ -107,10 +108,10 @@ static int record__mmap_read(struct record *rec, int idx)
 		return 0;
 	}
 
-	if ((old & md->mask) + size != (head & md->mask)) {
-		buf = &data[old & md->mask];
-		size = md->mask + 1 - (old & md->mask);
-		old += size;
+	if ((start & md->mask) + size != (end & md->mask)) {
+		buf = &data[start & md->mask];
+		size = md->mask + 1 - (start & md->mask);
+		start += size;
 
 		if (record__write(rec, buf, size) < 0) {
 			rc = -1;
@@ -118,16 +119,16 @@ static int record__mmap_read(struct record *rec, int idx)
 		}
 	}
 
-	buf = &data[old & md->mask];
-	size = head - old;
-	old += size;
+	buf = &data[start & md->mask];
+	size = end - start;
+	start += size;
 
 	if (record__write(rec, buf, size) < 0) {
 		rc = -1;
 		goto out;
 	}
 
-	md->prev = old;
+	md->prev = head;
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 15/17] perf record: Read from backward ring buffer
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (13 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 14/17] perf record: Rename variable to make code clear Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 16/17] perf record: Toggle overwrite ring buffer for reading Wang Nan
  2016-05-13  7:56 ` [PATCH 17/17] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Introduce rb_find_range() to find the start and end positions of the
readable data in a backward ring buffer.
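
The walk can be tried outside perf with a small simulation. The sketch
below is only an illustration under stated assumptions: struct
rec_header mirrors the 8-byte layout of struct perf_event_header,
write_backward() is a hypothetical stand-in for the kernel writer, and
record sizes are assumed to be multiples of 8 so that a header never
straddles the end of the buffer.

```c
#include <stdint.h>
#include <string.h>

/* Same field layout as struct perf_event_header (8 bytes). */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/* Simulated writer (hypothetical): move 'head' down and store a record. */
static void write_backward(unsigned char *buf, uint64_t mask,
			   uint64_t *head, uint16_t size)
{
	struct rec_header hdr = { .type = 1, .misc = 0, .size = size };
	uint64_t i;

	*head -= size;
	for (i = 0; i < size; i++)
		buf[(*head + i) & mask] = 0xaa;	/* payload filler */
	memcpy(buf + (*head & mask), &hdr, sizeof(hdr));
}

/* Walk from 'head' toward older records and report the readable range. */
static void find_range(const unsigned char *buf, uint64_t mask,
		       uint64_t head, uint64_t *start, uint64_t *end)
{
	uint64_t size = mask + 1;
	uint64_t pos = head;
	struct rec_header hdr = { 0 };

	*start = head;
	while (pos - head < size) {
		memcpy(&hdr, buf + (pos & mask), sizeof(hdr));
		if (hdr.size == 0)	/* reached never-written space */
			break;
		pos += hdr.size;
	}
	/* The oldest record was partly overwritten: drop it. */
	if (pos - head > size)
		pos -= hdr.size;
	*end = pos;
}
```

With a 64-byte buffer, three 16-byte records give a 48-byte range.
After five such records the range is the full 64 bytes. Four 24-byte
records give 48 bytes, because the third-newest record has lost its
tail to the newest one and is dropped.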

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 59 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9f4d3ad..e637ea2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -83,6 +83,61 @@ static int process_synthesized_event(struct perf_tool *tool,
 	return record__write(rec, event, event->header.size);
 }
 
+static int
+backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+{
+	struct perf_event_header *pheader;
+	u64 evt_head = head;
+	int size = mask + 1;
+
+	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pheader = (struct perf_event_header *)(buf + (head & mask));
+	*start = head;
+	while (true) {
+		if (evt_head - head >= (unsigned int)size) {
+			pr_debug("Finished reading backward ring buffer: rewind\n");
+			if (evt_head - head > (unsigned int)size)
+				evt_head -= pheader->size;
+			*end = evt_head;
+			return 0;
+		}
+
+		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
+
+		if (pheader->size == 0) {
+			pr_debug("Finished reading backward ring buffer: get start\n");
+			*end = evt_head;
+			return 0;
+		}
+
+		evt_head += pheader->size;
+		pr_debug3("move evt_head: %"PRIx64"\n", evt_head);
+	}
+	WARN_ONCE(1, "Shouldn't get here\n");
+	return -1;
+}
+
+static int
+rb_find_range(struct perf_evlist *evlist, int idx,
+	      void *data, int mask, u64 head, u64 old,
+	      u64 *start, u64 *end)
+{
+	int channel;
+
+	channel = perf_evlist__idx_channel(evlist, idx);
+	if (!perf_evlist__channel_check(evlist, channel, RDONLY)) {
+		*start = old;
+		*end = head;
+		return 0;
+	}
+
+	if (perf_evlist__channel_check(evlist, channel, BACKWARD))
+		return backward_rb_find_range(data, mask, head, start, end);
+
+	WARN_ONCE(1, "Unable to find start position from a read-only ring buffer\n");
+	return -1;
+}
+
 static int record__mmap_read(struct record *rec, int idx)
 {
 	struct perf_mmap *md = &rec->evlist->mmap[idx];
@@ -94,6 +149,10 @@ static int record__mmap_read(struct record *rec, int idx)
 	void *buf;
 	int rc = 0;
 
+	if (rb_find_range(rec->evlist, idx, data, md->mask, head,
+			  old, &start, &end))
+		return -1;
+
 	if (start == end)
 		return 0;
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 16/17] perf record: Toggle overwrite ring buffer for reading
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (14 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 15/17] perf record: Read from backward ring buffer Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  2016-05-13  7:56 ` [PATCH 17/17] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

Reading from an overwrite ring buffer is unreliable while the kernel is
still writing to it. perf_evlist__channel_toggle_paused() should be
called to pause such buffers before reading from them.

Toggle overwrite_evt_state directly after receiving 'done' or a switch
output request.
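
The transitions implemented by record__toggle_overwrite_evsels() can be
summarized by a small standalone function. This is a sketch that
mirrors the switch statement in the patch; the shortened names
(evt_state, evt_action, evt_transition) are hypothetical and chosen
only to keep the example self-contained.

```c
enum evt_state { EVT_RUNNING, EVT_DATA_PENDING, EVT_EMPTY };
enum evt_action { ACT_NONE, ACT_PAUSE, ACT_RESUME };

/*
 * Move '*cur' toward the requested state and report whether the
 * overwrite channels must be paused or resumed as a result.
 */
static enum evt_action evt_transition(enum evt_state *cur, enum evt_state req)
{
	enum evt_action action = ACT_NONE;

	switch (*cur) {
	case EVT_RUNNING:
		if (req != EVT_RUNNING)
			action = ACT_PAUSE;
		break;
	case EVT_DATA_PENDING:
		if (req == EVT_RUNNING)
			action = ACT_RESUME;
		break;
	case EVT_EMPTY:
		if (req == EVT_RUNNING)
			action = ACT_RESUME;
		/* Already drained: never go back to DATA_PENDING. */
		if (req == EVT_DATA_PENDING)
			req = EVT_EMPTY;
		break;
	}

	*cur = req;
	return action;
}
```

A normal dump therefore goes RUNNING -> DATA_PENDING (pause), then
DATA_PENDING -> EMPTY (no ioctl) once the buffers are read, then
EMPTY -> RUNNING (resume).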

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/builtin-record.c | 94 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 92 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e637ea2..606fcd05 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -42,6 +42,11 @@
 #include <sys/mman.h>
 #include <asm/bug.h>
 
+enum overwrite_evt_state {
+	OVERWRITE_EVT_RUNNING,
+	OVERWRITE_EVT_DATA_PENDING,
+	OVERWRITE_EVT_EMPTY,
+};
 
 struct record {
 	struct perf_tool	tool;
@@ -60,6 +65,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			switch_output;
+	enum overwrite_evt_state overwrite_evt_state;
 	unsigned long long	samples;
 };
 
@@ -416,6 +422,7 @@ try_again:
 
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
+	rec->overwrite_evt_state = OVERWRITE_EVT_RUNNING;
 out:
 	return rc;
 }
@@ -496,6 +503,52 @@ static struct perf_event_header finished_round_event = {
 	.type = PERF_RECORD_FINISHED_ROUND,
 };
 
+static void
+record__toggle_overwrite_evsels(struct record *rec,
+				enum overwrite_evt_state state)
+{
+	struct perf_evlist *evlist = rec->evlist;
+	enum overwrite_evt_state old_state = rec->overwrite_evt_state;
+	enum action {
+		NONE,
+		PAUSE,
+		RESUME,
+	} action = NONE;
+	int ch, nr_channels;
+
+	switch (old_state) {
+	case OVERWRITE_EVT_RUNNING:
+		if (state != OVERWRITE_EVT_RUNNING)
+			action = PAUSE;
+		break;
+	case OVERWRITE_EVT_DATA_PENDING:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		break;
+	case OVERWRITE_EVT_EMPTY:
+		if (state == OVERWRITE_EVT_RUNNING)
+			action = RESUME;
+		if (state == OVERWRITE_EVT_DATA_PENDING)
+			state = OVERWRITE_EVT_EMPTY;
+		break;
+	default:
+		WARN_ONCE(1, "Shouldn't get there\n");
+	}
+
+	rec->overwrite_evt_state = state;
+
+	if (action == NONE)
+		return;
+
+	nr_channels = perf_evlist__channel_nr(evlist);
+	for (ch = 0; ch < nr_channels; ch++) {
+		if (!perf_evlist__channel_check(evlist, ch, RDONLY))
+			continue;
+		perf_evlist__channel_toggle_paused(evlist, ch,
+						   action == PAUSE);
+	}
+}
+
 static bool record__mmap_should_read(struct record *rec, int idx)
 {
 	int channel = -1;
@@ -504,8 +557,13 @@ static bool record__mmap_should_read(struct record *rec, int idx)
 		return false;
 	if (perf_evlist__channel_idx(rec->evlist, &channel, &idx))
 		return false;
-	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY))
-		return false;
+	if (perf_evlist__channel_check(rec->evlist, channel, RDONLY)) {
+		if (rec->overwrite_evt_state != OVERWRITE_EVT_DATA_PENDING)
+			return false;
+		if (!perf_evlist__channel_check(rec->evlist, channel, BACKWARD))
+			return false;
+		return true;
+	}
 	return true;
 }
 
@@ -540,6 +598,8 @@ static int record__mmap_read_all(struct record *rec)
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
+	if (rec->overwrite_evt_state == OVERWRITE_EVT_DATA_PENDING)
+		record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_EMPTY);
 out:
 	return rc;
 }
@@ -917,6 +977,17 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	for (;;) {
 		unsigned long long hits = rec->samples;
 
+		/*
+		 * rec->overwrite_evt_state may be
+		 * OVERWRITE_EVT_EMPTY here: when done == true and
+		 * hits != rec->samples after the previous read.
+		 *
+		 * record__toggle_overwrite_evsels() ensures we never
+		 * convert OVERWRITE_EVT_EMPTY to OVERWRITE_EVT_DATA_PENDING.
+		 */
+		if (trigger_is_hit(&switch_output_trigger) || done || draining)
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_DATA_PENDING);
+
 		if (record__mmap_read_all(rec) < 0) {
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
@@ -936,8 +1007,27 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 
 		if (trigger_is_hit(&switch_output_trigger)) {
+			/*
+			 * If switch_output_trigger is hit, the data in
+			 * overwritable ring buffer should have been collected,
+			 * so overwrite_evt_state should be set to
+			 * OVERWRITE_EVT_EMPTY.
+			 *
+			 * If SIGUSR2 is raised during or after record__mmap_read_all(),
+			 * record__mmap_read_all() didn't collect data from the
+			 * overwritable ring buffer. Read again.
+			 */
+			if (rec->overwrite_evt_state == OVERWRITE_EVT_RUNNING)
+				continue;
 			trigger_ready(&switch_output_trigger);
 
+			/*
+			 * Reenable events in overwrite ring buffer after
+			 * record__mmap_read_all(): we should have collected
+			 * data from it.
+			 */
+			record__toggle_overwrite_evsels(rec, OVERWRITE_EVT_RUNNING);
+
 			if (!quiet)
 				fprintf(stderr, "[ perf record: dump data: Woken up %ld times ]\n",
 					waking);
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 17/17] perf tools: Don't warn about out of order event if write_backward is used
  2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
                   ` (15 preceding siblings ...)
  2016-05-13  7:56 ` [PATCH 16/17] perf record: Toggle overwrite ring buffer for reading Wang Nan
@ 2016-05-13  7:56 ` Wang Nan
  16 siblings, 0 replies; 28+ messages in thread
From: Wang Nan @ 2016-05-13  7:56 UTC (permalink / raw)
  To: acme
  Cc: arnaldo.melo, linux-kernel, Wang Nan, He Kuang,
	Arnaldo Carvalho de Melo, Jiri Olsa, Masami Hiramatsu,
	Namhyung Kim, Zefan Li, pi3orama

If the write_backward attribute is set, records are written into the
kernel ring buffer from end to beginning, but read from beginning to end.
To avoid the 'XX out of order events recorded' warning message (timestamps
of records are in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at least one event.
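
The rule can be stated as a small predicate. The sketch below is a
standalone illustration, not the perf code: struct evsel_attr and
should_warn_order() are hypothetical, and a plain array stands in for
the evlist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one attribute bit that matters here. */
struct evsel_attr {
	bool write_backward;
};

/*
 * Warn about unordered events only if no event uses write_backward
 * and at least one unordered event was actually seen.
 */
static bool should_warn_order(const struct evsel_attr *attrs, size_t n,
			      unsigned int nr_unordered)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (attrs[i].write_backward)
			return false;
	}
	return nr_unordered != 0;
}
```

Note that a single write_backward event suppresses the warning for
the whole session, including unordered events that came from forward
ring buffers.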

Result:

Before this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
 [ perf record: Woken up 5 times to write data ]
 Warning:
 40 out of order events recorded.
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

After this patch:
 # perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
                    -e raw_syscalls:sys_enter \
                    dd if=/dev/zero of=/dev/null count=300
 300+0 records in
 300+0 records out
 153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
 [ perf record: Woken up 5 times to write data ]
 [ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 tools/perf/util/session.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 2335b28..8e3d9d4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1495,10 +1495,27 @@ int perf_session__register_idle_thread(struct perf_session *session)
 	return err;
 }
 
+static void
+perf_session__warn_order(const struct perf_session *session)
+{
+	const struct ordered_events *oe = &session->ordered_events;
+	struct perf_evsel *evsel;
+	bool should_warn = true;
+
+	evlist__for_each(session->evlist, evsel) {
+		if (evsel->attr.write_backward)
+			should_warn = false;
+	}
+
+	if (!should_warn)
+		return;
+	if (oe->nr_unordered_events != 0)
+		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+}
+
 static void perf_session__warn_about_errors(const struct perf_session *session)
 {
 	const struct events_stats *stats = &session->evlist->stats;
-	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
 	    stats->nr_events[PERF_RECORD_LOST] != 0) {
@@ -1555,8 +1572,7 @@ static void perf_session__warn_about_errors(const struct perf_session *session)
 			    stats->nr_unprocessable_samples);
 	}
 
-	if (oe->nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", oe->nr_unordered_events);
+	perf_session__warn_order(session);
 
 	events_stats__auxtrace_error_warn(stats);
 
-- 
1.8.3.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read()
  2016-05-13  7:55 ` [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read() Wang Nan
@ 2016-05-13 13:03   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-13 13:03 UTC (permalink / raw)
  To: Wang Nan
  Cc: arnaldo.melo, linux-kernel, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Zefan Li, pi3orama

Em Fri, May 13, 2016 at 07:55:58AM +0000, Wang Nan escreveu:
> Extract event reader to __perf_evlist__mmap_read(). Future commit will
> feed it with manually computed 'head' and 'old' pointers.

why not use the perf_mmap__read() directly then?

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/evlist.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index c4bfe11..5e86972 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -749,6 +749,13 @@ broken_event:
>  	return event;
>  }
>  
> +static union perf_event *
> +__perf_evlist__mmap_read(struct perf_mmap *md, bool overwrite, u64 head,
> +			 u64 old, u64 *prev)
> +{
> +	return perf_mmap__read(md, overwrite, old, head, prev);
> +}
> +
>  union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>  {
>  	struct perf_mmap *md = &evlist->mmap[idx];
> @@ -763,7 +770,8 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>  
>  	head = perf_mmap__read_head(md);
>  
> -	return perf_mmap__read(md, evlist->overwrite, old, head, &md->prev);
> +	return __perf_evlist__mmap_read(md, evlist->overwrite, head,
> +					old, &md->prev);
>  }
>  
>  union perf_event *
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 02/17] perf tools: Add evlist channel helpers
  2016-05-13  7:55 ` [PATCH 02/17] perf tools: Add evlist channel helpers Wang Nan
@ 2016-05-13 13:05   ` Arnaldo Carvalho de Melo
  2016-05-18  3:27     ` Wangnan (F)
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-13 13:05 UTC (permalink / raw)
  To: Wang Nan
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Fri, May 13, 2016 at 07:55:59AM +0000, Wang Nan escreveu:
> In this commit sereval helpers are introduced to support the principle

                 several

> of channel. Channels hold different groups of evsels which configured
> differently. It will be used for overwritable evsels, which allows perf

why not use multiple evlists? An "evlist" is a "list of evsels", why do
we need yet another way of grouping evlists?

- Arnaldo

> record some events continuously while capture snapshot for other events
> when something happen. Tracking events (mmap, mmap2, fork, exit ...)
> are another possible events worth to be put into a separated channel.
> 
> Channels are represented by an array with channel flags. Each channel
> contains evlist->nr_mmaps mmaps. Channels are configured before
> perf_evlist__mmap_ex(). During that function nr_mmaps mmaps for each
> channel are allocated together as a big array.
> perf_evlist__channel_idx() converts index in the big array and the
> channel number. For API functions which accept idx, _ex() versions are
> introduced to accept selecting an mmap from a channel.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/builtin-record.c |   6 ++
>  tools/perf/util/evlist.c    | 130 ++++++++++++++++++++++++++++++++++++++++++--
>  tools/perf/util/evlist.h    |  58 ++++++++++++++++++++
>  3 files changed, 188 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index f3679c4..6e44834 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -316,6 +316,12 @@ try_again:
>  		goto out;
>  	}
>  
> +	perf_evlist__channel_reset(evlist);
> +	rc = perf_evlist__channel_add(evlist, 0, true);
> +	if (rc < 0)
> +		goto out;
> +	rc = 0;
> +
>  	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
>  				 opts->auxtrace_mmap_pages,
>  				 opts->auxtrace_snapshot_mode) < 0) {
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 5e86972..6c11b9e 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -679,6 +679,33 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
>  	return NULL;
>  }
>  
> +int perf_evlist__channel_idx(struct perf_evlist *evlist,
> +			     int *p_channel, int *p_idx)
> +{
> +	int channel = *p_channel;
> +	int _idx = *p_idx;
> +
> +	if (_idx < 0)
> +		return -EINVAL;
> +	/*
> +	 * Negative channel means caller explicitly use real index.
> +	 */
> +	if (channel < 0) {
> +		channel = perf_evlist__idx_channel(evlist, _idx);
> +		_idx = _idx % evlist->nr_mmaps;
> +	}
> +	if (channel < 0)
> +		return channel;
> +	if (channel >= PERF_EVLIST__NR_CHANNELS)
> +		return -E2BIG;
> +	if (_idx >= evlist->nr_mmaps)
> +		return -E2BIG;
> +
> +	*p_channel = channel;
> +	*p_idx = evlist->nr_mmaps * channel + _idx;
> +	return 0;
> +}
> +
>  /* When check_messup is true, 'end' must points to a good entry */
>  static union perf_event *
>  perf_mmap__read(struct perf_mmap *md, bool check_messup, u64 start,
> @@ -756,11 +783,19 @@ __perf_evlist__mmap_read(struct perf_mmap *md, bool overwrite, u64 head,
>  	return perf_mmap__read(md, overwrite, old, head, prev);
>  }
>  
> -union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
> +union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
> +					    int channel, int idx)
>  {
>  	struct perf_mmap *md = &evlist->mmap[idx];
> -	u64 head;
> -	u64 old = md->prev;
> +	u64 head, old;
> +	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
> +
> +	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
> +		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
> +		       channel, idx);
> +		return NULL;
> +	}
> +	old = md->prev;
>  
>  	/*
>  	 * Check if event was unmapped due to a POLLHUP/POLLERR.
> @@ -824,6 +859,11 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
>  	md->prev = head;
>  }
>  
> +union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
> +{
> +	return perf_evlist__mmap_read_ex(evlist, -1, idx);
> +}
> +
>  static bool perf_mmap__empty(struct perf_mmap *md)
>  {
>  	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
> @@ -842,10 +882,18 @@ static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
>  		__perf_evlist__munmap(evlist, idx);
>  }
>  
> -void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
> +void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
> +				  int channel, int idx)
>  {
> +	int err = perf_evlist__channel_idx(evlist, &channel, &idx);
>  	struct perf_mmap *md = &evlist->mmap[idx];
>  
> +	if (err || !perf_evlist__channel_is_enabled(evlist, channel)) {
> +		pr_err("ERROR: invalid mmap index: channel %d, idx: %d\n",
> +		       channel, idx);
> +		return;
> +	}
> +
>  	if (!evlist->overwrite) {
>  		u64 old = md->prev;
>  
> @@ -856,6 +904,11 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
>  		perf_evlist__mmap_put(evlist, idx);
>  }
>  
> +void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
> +{
> +	perf_evlist__mmap_consume_ex(evlist, -1, idx);
> +}
> +
>  int __weak auxtrace_mmap__mmap(struct auxtrace_mmap *mm __maybe_unused,
>  			       struct auxtrace_mmap_params *mp __maybe_unused,
>  			       void *userpg __maybe_unused,
> @@ -901,7 +954,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
>  	if (evlist->mmap == NULL)
>  		return;
>  
> -	for (i = 0; i < evlist->nr_mmaps; i++)
> +	for (i = 0; i < perf_evlist__mmap_nr(evlist); i++)
>  		__perf_evlist__munmap(evlist, i);
>  
>  	zfree(&evlist->mmap);
> @@ -909,10 +962,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
>  
>  static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>  {
> +	int total_mmaps;
> +
>  	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
>  	if (cpu_map__empty(evlist->cpus))
>  		evlist->nr_mmaps = thread_map__nr(evlist->threads);
> -	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
> +
> +	total_mmaps = perf_evlist__mmap_nr(evlist);
> +	if (!total_mmaps)
> +		return -EINVAL;
> +
> +	evlist->mmap = zalloc(total_mmaps * sizeof(struct perf_mmap));
>  	return evlist->mmap != NULL ? 0 : -ENOMEM;
>  }
>  
> @@ -1221,6 +1281,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
>  		      bool overwrite)
>  {
> +	int err;
> +
> +	perf_evlist__channel_reset(evlist);
> +	err = perf_evlist__channel_add(evlist, 0, true);
> +	if (err < 0)
> +		return err;
>  	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
>  }
>  
> @@ -1862,3 +1928,55 @@ perf_evlist__find_evsel_by_str(struct perf_evlist *evlist,
>  
>  	return NULL;
>  }
> +
> +int perf_evlist__channel_nr(struct perf_evlist *evlist)
> +{
> +	int i;
> +
> +	for (i = PERF_EVLIST__NR_CHANNELS - 1; i >= 0; i--) {
> +		unsigned long flags = evlist->channel_flags[i];
> +
> +		if (flags & PERF_EVLIST__CHANNEL_ENABLED)
> +			return i + 1;
> +	}
> +	return 0;
> +}
> +
> +int perf_evlist__mmap_nr(struct perf_evlist *evlist)
> +{
> +	return evlist->nr_mmaps * perf_evlist__channel_nr(evlist);
> +}
> +
> +void perf_evlist__channel_reset(struct perf_evlist *evlist)
> +{
> +	int i;
> +
> +	BUG_ON(evlist->mmap);
> +
> +	for (i = 0; i < PERF_EVLIST__NR_CHANNELS; i++)
> +		evlist->channel_flags[i] = 0;
> +}
> +
> +int perf_evlist__channel_add(struct perf_evlist *evlist,
> +			     unsigned long flag,
> +			     bool is_default)
> +{
> +	int n = perf_evlist__channel_nr(evlist);
> +	unsigned long *flags = evlist->channel_flags;
> +
> +	BUG_ON(evlist->mmap);
> +
> +	if (n >= PERF_EVLIST__NR_CHANNELS) {
> +		pr_debug("ERROR: too many channels. Increase PERF_EVLIST__NR_CHANNELS\n");
> +		return -ENOSPC;
> +	}
> +
> +	if (is_default) {
> +		memmove(&flags[1], &flags[0],
> +			sizeof(evlist->channel_flags) -
> +			sizeof(evlist->channel_flags[0]));
> +		n = 0;
> +	}
> +	flags[n] = flag | PERF_EVLIST__CHANNEL_ENABLED;
> +	return n;
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 85d1b59..4cb5d3a 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -20,6 +20,11 @@ struct record_opts;
>  #define PERF_EVLIST__HLIST_BITS 8
>  #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
>  
> +#define PERF_EVLIST__NR_CHANNELS	1
> +enum perf_evlist_mmap_flag {
> +	PERF_EVLIST__CHANNEL_ENABLED	= 1,
> +};
> +
>  /**
>   * struct perf_mmap - perf's ring buffer mmap details
>   *
> @@ -52,6 +57,7 @@ struct perf_evlist {
>  		pid_t	pid;
>  	} workload;
>  	struct fdarray	 pollfd;
> +	unsigned long channel_flags[PERF_EVLIST__NR_CHANNELS];
>  	struct perf_mmap *mmap;
>  	struct thread_map *threads;
>  	struct cpu_map	  *cpus;
> @@ -127,13 +133,65 @@ struct perf_evsel *perf_evlist__id2evsel_strict(struct perf_evlist *evlist,
>  
>  struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
>  
> +union perf_event *perf_evlist__mmap_read_ex(struct perf_evlist *evlist,
> +					    int channel, int idx);
>  union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
>  
>  union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist,
>  						  int idx);
>  void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx);
>  
> +void perf_evlist__mmap_consume_ex(struct perf_evlist *evlist,
> +				  int channel, int idx);
>  void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
> +int perf_evlist__mmap_nr(struct perf_evlist *evlist);
> +
> +int perf_evlist__channel_nr(struct perf_evlist *evlist);
> +void perf_evlist__channel_reset(struct perf_evlist *evlist);
> +int perf_evlist__channel_add(struct perf_evlist *evlist,
> +			     unsigned long flag,
> +			     bool is_default);
> +
> +static inline bool
> +__perf_evlist__channel_check(struct perf_evlist *evlist, int channel,
> +			     enum perf_evlist_mmap_flag bits)
> +{
> +	if (channel >= PERF_EVLIST__NR_CHANNELS)
> +		return false;
> +
> +	return (evlist->channel_flags[channel] & bits) ? true : false;
> +}
> +#define perf_evlist__channel_check(e, c, b) \
> +		__perf_evlist__channel_check(e, c, PERF_EVLIST__CHANNEL_##b)
> +
> +static inline bool
> +perf_evlist__channel_is_enabled(struct perf_evlist *evlist, int channel)
> +{
> +	return perf_evlist__channel_check(evlist, channel, ENABLED);
> +}
> +
> +static inline int
> +perf_evlist__idx_channel(struct perf_evlist *evlist, int idx)
> +{
> +	int channel = idx / evlist->nr_mmaps;
> +
> +	if (channel >= PERF_EVLIST__NR_CHANNELS)
> +		return -E2BIG;
> +	return channel;
> +}
> +
> +int perf_evlist__channel_idx(struct perf_evlist *evlist,
> +			     int *p_channel, int *p_idx);
> +
> +static inline struct perf_mmap *
> +perf_evlist__get_mmap(struct perf_evlist *evlist,
> +		      int channel, int idx)
> +{
> +	if (perf_evlist__channel_idx(evlist, &channel, &idx))
> +		return NULL;
> +
> +	return &evlist->mmap[idx];
> +}
>  
>  int perf_evlist__open(struct perf_evlist *evlist);
>  void perf_evlist__close(struct perf_evlist *evlist);
> -- 
> 1.8.3.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 09/17] perf tools: Detect avalibility of write_backward
  2016-05-13  7:56 ` [PATCH 09/17] perf tools: Detect avalibility of write_backward Wang Nan
@ 2016-05-13 13:08   ` Arnaldo Carvalho de Melo
  2016-05-20 15:31     ` Wangnan (F)
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-13 13:08 UTC (permalink / raw)
  To: Wang Nan
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Fri, May 13, 2016 at 07:56:06AM +0000, Wang Nan escreveu:
> Detect avalibility of write_backward and save the result into
> record_opts. With write_backward the start pointer of a ring
> buffer mapped read only can be found reliably.

We have perf_missing_features for that, please try to use it.

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/perf.h        |  1 +
>  tools/perf/util/record.c | 11 +++++++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index cd8f1b1..c35bcfd 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -72,6 +72,7 @@ struct record_opts {
>  	bool	     sample_transaction;
>  	unsigned     initial_delay;
>  	bool         use_clockid;
> +	bool	     has_write_backward;
>  	clockid_t    clockid;
>  	unsigned int proc_map_timeout;
>  };
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index 481792c..bb871d8 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -85,6 +85,11 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
>  	evsel->attr.comm_exec = 1;
>  }
>  
> +static void perf_probe_write_backward(struct perf_evsel *evsel)
> +{
> +	evsel->attr.write_backward = 1;
> +}
> +
>  static void perf_probe_context_switch(struct perf_evsel *evsel)
>  {
>  	evsel->attr.context_switch = 1;
> @@ -105,6 +110,11 @@ bool perf_can_record_switch_events(void)
>  	return perf_probe_api(perf_probe_context_switch);
>  }
>  
> +static bool perf_can_write_backward(void)
> +{
> +	return perf_probe_api(perf_probe_write_backward);
> +}
> +
>  bool perf_can_record_cpu_wide(void)
>  {
>  	struct perf_event_attr attr = {
> @@ -236,6 +246,7 @@ static int record_opts__config_freq(struct record_opts *opts)
>  
>  int record_opts__config(struct record_opts *opts)
>  {
> +	opts->has_write_backward = perf_can_write_backward();
>  	return record_opts__config_freq(opts);
>  }
>  
> -- 
> 1.8.3.4


* Re: [PATCH 08/17] perf record: Don't poll on overwrite channel
  2016-05-13  7:56 ` [PATCH 08/17] perf record: Don't poll on " Wang Nan
@ 2016-05-13 13:12   ` Arnaldo Carvalho de Melo
  2016-05-16  3:18     ` Wangnan (F)
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-13 13:12 UTC (permalink / raw)
  To: Wang Nan
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Fri, May 13, 2016 at 07:56:05AM +0000, Wang Nan escreveu:
> There's no need to receive events from overwritable ring buffer. Instead,
> perf should make them run background until something happen. This patch
> makes normal events from overwrite ring buffer ignored.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/util/evlist.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index abce588..f0b0457 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>  	return 0;
>  }
>  
> -static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
> +static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
>  {
> -	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
> +	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
>  	/*
>  	 * Save the idx so that when we filter out fds POLLHUP'ed we can
>  	 * close the associated evlist->mmap[] entry.
> @@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
>  
>  int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
>  {
> -	return __perf_evlist__add_pollfd(evlist, fd, -1);
> +	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
>  }
>  
>  static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
> @@ -1077,6 +1077,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
>  	return 0;
>  }
>  
> +static bool
> +perf_evlist__should_poll(struct perf_evlist *evlist,
> +			 struct perf_evsel *evsel,
> +			 int channel)
> +{
> +	if (evsel->system_wide)
> +		return false;

So, what is the above doing in this patch? If we should not poll when in
syswide mode, then this should be in a separate patch, unrelated to
'channels'.  No?

I.e. it would be an improvement that would be cherry pickable right now,
even before reviewing the channel concept.

- Arnaldo

> +	if (perf_evlist__channel_check(evlist, channel, RDONLY))
> +		return false;
> +	return true;
> +}
> +
>  static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
>  				       struct mmap_params *mp, int cpu,
>  				       int thread, int *outputs)
> @@ -1085,6 +1097,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
>  
>  	evlist__for_each(evlist, evsel) {
>  		int fd, channel, idx, err;
> +		short revent = POLLIN;
>  
>  		channel = perf_evlist__channel_find(evlist, evsel, false);
>  		if (channel < 0) {
> @@ -1114,6 +1127,8 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
>  			perf_evlist__mmap_get(evlist, idx);
>  		}
>  
> +		if (!perf_evlist__should_poll(evlist, evsel, channel))
> +			revent = 0;
>  		/*
>  		 * The system_wide flag causes a selected event to be opened
>  		 * always without a pid.  Consequently it will never get a
> @@ -1122,7 +1137,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int _idx,
>  		 * Therefore don't add it for polling.
>  		 */
>  		if (!evsel->system_wide &&
> -		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
> +		    __perf_evlist__add_pollfd(evlist, fd, idx, revent) < 0) {
>  			perf_evlist__mmap_put(evlist, idx);
>  			return -1;
>  		}
> -- 
> 1.8.3.4


* Re: [PATCH 08/17] perf record: Don't poll on overwrite channel
  2016-05-13 13:12   ` Arnaldo Carvalho de Melo
@ 2016-05-16  3:18     ` Wangnan (F)
  0 siblings, 0 replies; 28+ messages in thread
From: Wangnan (F) @ 2016-05-16  3:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama



On 2016/5/13 21:12, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 13, 2016 at 07:56:05AM +0000, Wang Nan escreveu:
>> There's no need to receive events from overwritable ring buffer. Instead,
>> perf should make them run background until something happen. This patch
>> makes normal events from overwrite ring buffer ignored.
>>
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Signed-off-by: He Kuang <hekuang@huawei.com>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Zefan Li <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> ---
>>   tools/perf/util/evlist.c | 23 +++++++++++++++++++----
>>   1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>> index abce588..f0b0457 100644
>> --- a/tools/perf/util/evlist.c
>> +++ b/tools/perf/util/evlist.c
>> @@ -461,9 +461,9 @@ int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
>>   	return 0;
>>   }
>>   
>> -static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx)
>> +static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx, short revent)
>>   {
>> -	int pos = fdarray__add(&evlist->pollfd, fd, POLLIN | POLLERR | POLLHUP);
>> +	int pos = fdarray__add(&evlist->pollfd, fd, revent | POLLERR | POLLHUP);
>>   	/*
>>   	 * Save the idx so that when we filter out fds POLLHUP'ed we can
>>   	 * close the associated evlist->mmap[] entry.
>> @@ -479,7 +479,7 @@ static int __perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd, int idx
>>   
>>   int perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd)
>>   {
>> -	return __perf_evlist__add_pollfd(evlist, fd, -1);
>> +	return __perf_evlist__add_pollfd(evlist, fd, -1, POLLIN);
>>   }
>>   
>>   static void perf_evlist__munmap_filtered(struct fdarray *fda, int fd)
>> @@ -1077,6 +1077,18 @@ perf_evlist__channel_complete(struct perf_evlist *evlist)
>>   	return 0;
>>   }
>>   
>> +static bool
>> +perf_evlist__should_poll(struct perf_evlist *evlist,
>> +			 struct perf_evsel *evsel,
>> +			 int channel)
>> +{
>> +	if (evsel->system_wide)
>> +		return false;
> So, what is the above doing in this patch? If we should not poll when in
> syswide mode, then this should be in a separate patch, unrelated to
> 'channels'.  No?

I think the name 'system_wide' is somewhat misleading. It does not mean
an event in 'perf record -a', but "a selected event to be opened always
without a pid when configured by perf_evsel__config()". See bf8e8f4b8.

Here we use the same logic as the existing perf_evlist__mmap_per_evsel,
which never polls a system_wide evsel:

                 /*
                  * The system_wide flag causes a selected event to be opened
                  * always without a pid.  Consequently it will never get a
                  * POLLHUP, but it is used for tracking in combination with
                  * other events, so it should not need to be polled anyway.
                  * Therefore don't add it for polling.
                  */
                 if (!evsel->system_wide &&
                     __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
                         perf_evlist__mmap_put(evlist, idx);
                         return -1;
                 }
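
As background for the revent = 0 case in the patch: poll(2) reports
POLLERR and POLLHUP in revents regardless of what was requested in
events, so an fd added with no POLLIN can still be noticed when it hangs
up. A small standalone sketch, using a pipe instead of a perf event fd
(the helper name is made up for illustration):

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll the read end of a pipe whose write end is already closed,
 * asking only for 'requested' events.  Returns revents, or -1 on error.
 */
static int revents_after_hangup(short requested)
{
	struct pollfd pfd;
	int fds[2];
	int n;

	if (pipe(fds) < 0)
		return -1;
	close(fds[1]);		/* no writer left: the read end is hung up */

	pfd.fd = fds[0];
	pfd.events = requested;
	pfd.revents = 0;
	n = poll(&pfd, 1, 0);	/* timeout 0: just sample the state */
	close(fds[0]);

	return n < 0 ? -1 : pfd.revents;
}
```

Calling revents_after_hangup(0) still yields POLLHUP, which is what lets
a non-polled fd be filtered out once it goes away.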

Thank you.


* Re: [PATCH 10/17] perf tools: Enable overwrite settings
  2016-05-13  7:56 ` [PATCH 10/17] perf tools: Enable overwrite settings Wang Nan
@ 2016-05-16 13:38   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-16 13:38 UTC (permalink / raw)
  To: Wang Nan
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Fri, May 13, 2016 at 07:56:07AM +0000, Wang Nan escreveu:
> This patch allows following config terms and option:
> 
> Globally setting events to overwrite;
> 
>  # perf record --overwrite ...
> 
> Set specific events to be overwrite or no-overwrite.
> 
>  # perf record --event cycles/overwrite/ ...
>  # perf record --event cycles/no-overwrite/ ...
> 
> Add missing config terms and update config term array size because the
> longest string length is changed.

You forgot to add this to the documentation, please add it when you
respin this patch. If you have done so in a separate patch, please yank
it from there and add it here.

- Arnaldo
 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Signed-off-by: He Kuang <hekuang@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Zefan Li <lizefan@huawei.com>
> Cc: pi3orama@163.com
> ---
>  tools/perf/builtin-record.c    |  1 +
>  tools/perf/perf.h              |  1 +
>  tools/perf/util/evsel.c        |  4 ++++
>  tools/perf/util/evsel.h        |  2 ++
>  tools/perf/util/parse-events.c | 20 ++++++++++++++++++--
>  tools/perf/util/parse-events.h |  2 ++
>  tools/perf/util/parse-events.l |  2 ++
>  7 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index d9a92e0..939aa68 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1265,6 +1265,7 @@ struct option __record_options[] = {
>  	OPT_BOOLEAN_SET('i', "no-inherit", &record.opts.no_inherit,
>  			&record.opts.no_inherit_set,
>  			"child tasks do not inherit counters"),
> +	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
>  	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
>  	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
>  		     "number of mmap data pages and AUX area tracing mmap pages",
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index c35bcfd..386d030 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -59,6 +59,7 @@ struct record_opts {
>  	bool	     record_switch_events;
>  	bool	     all_kernel;
>  	bool	     all_user;
> +	bool	     overwrite;
>  	unsigned int freq;
>  	unsigned int mmap_pages;
>  	unsigned int auxtrace_mmap_pages;
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index a23f547..be4fc25 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -670,6 +670,9 @@ static void apply_config_terms(struct perf_evsel *evsel,
>  			 */
>  			attr->inherit = term->val.inherit ? 1 : 0;
>  			break;
> +		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
> +			evsel->overwrite = term->val.overwrite ? 1 : 0;
> +			break;
>  		default:
>  			break;
>  		}
> @@ -746,6 +749,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
>  
>  	attr->sample_id_all = perf_missing_features.sample_id_all ? 0 : 1;
>  	attr->inherit	    = !opts->no_inherit;
> +	evsel->overwrite    = opts->overwrite;
>  
>  	perf_evsel__set_sample_bit(evsel, IP);
>  	perf_evsel__set_sample_bit(evsel, TID);
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index c1f1015..bce99fa 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -44,6 +44,7 @@ enum {
>  	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
>  	PERF_EVSEL__CONFIG_TERM_STACK_USER,
>  	PERF_EVSEL__CONFIG_TERM_INHERIT,
> +	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
>  	PERF_EVSEL__CONFIG_TERM_MAX,
>  };
>  
> @@ -57,6 +58,7 @@ struct perf_evsel_config_term {
>  		char	*callgraph;
>  		u64	stack_user;
>  		bool	inherit;
> +		bool	overwrite;
>  	} val;
>  };
>  
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index bcbc983..85f813d 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -900,6 +900,8 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
>  	[PARSE_EVENTS__TERM_TYPE_STACKSIZE]		= "stack-size",
>  	[PARSE_EVENTS__TERM_TYPE_NOINHERIT]		= "no-inherit",
>  	[PARSE_EVENTS__TERM_TYPE_INHERIT]		= "inherit",
> +	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
> +	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
>  };
>  
>  static bool config_term_shrinked;
> @@ -992,6 +994,12 @@ do {									   \
>  	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
>  		CHECK_TYPE_VAL(NUM);
>  		break;
> +	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +		CHECK_TYPE_VAL(NUM);
> +		break;
> +	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
> +		CHECK_TYPE_VAL(NUM);
> +		break;
>  	case PARSE_EVENTS__TERM_TYPE_NAME:
>  		CHECK_TYPE_VAL(STR);
>  		break;
> @@ -1040,6 +1048,8 @@ static int config_term_tracepoint(struct perf_event_attr *attr,
>  	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
>  	case PARSE_EVENTS__TERM_TYPE_INHERIT:
>  	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
> +	case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +	case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
>  		return config_term_common(attr, term, err);
>  	default:
>  		if (err) {
> @@ -1109,6 +1119,12 @@ do {								\
>  		case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
>  			ADD_CONFIG_TERM(INHERIT, inherit, term->val.num ? 0 : 1);
>  			break;
> +		case PARSE_EVENTS__TERM_TYPE_OVERWRITE:
> +			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 1 : 0);
> +			break;
> +		case PARSE_EVENTS__TERM_TYPE_NOOVERWRITE:
> +			ADD_CONFIG_TERM(OVERWRITE, overwrite, term->val.num ? 0 : 1);
> +			break;
>  		default:
>  			break;
>  		}
> @@ -2322,9 +2338,9 @@ static void config_terms_list(char *buf, size_t buf_sz)
>  char *parse_events_formats_error_string(char *additional_terms)
>  {
>  	char *str;
> -	/* "branch_type" is the longest name */
> +	/* "no-overwrite" is the longest name */
>  	char static_terms[__PARSE_EVENTS__TERM_TYPE_NR *
> -			  (sizeof("branch_type") - 1)];
> +			  (sizeof("no-overwrite") - 1)];
>  
>  	config_terms_list(static_terms, sizeof(static_terms));
>  	/* valid terms */
> diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> index d740c3c..f341d9d 100644
> --- a/tools/perf/util/parse-events.h
> +++ b/tools/perf/util/parse-events.h
> @@ -68,6 +68,8 @@ enum {
>  	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
>  	PARSE_EVENTS__TERM_TYPE_NOINHERIT,
>  	PARSE_EVENTS__TERM_TYPE_INHERIT,
> +	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
> +	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
>  	__PARSE_EVENTS__TERM_TYPE_NR,
>  };
>  
> diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
> index 1477fbc..cc4c426 100644
> --- a/tools/perf/util/parse-events.l
> +++ b/tools/perf/util/parse-events.l
> @@ -201,6 +201,8 @@ call-graph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
>  stack-size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
>  inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
>  no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
> +overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
> +no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
>  ,			{ return ','; }
>  "/"			{ BEGIN(INITIAL); return '/'; }
>  {name_minus}		{ return str(yyscanner, PE_NAME); }
> -- 
> 1.8.3.4


* Re: [PATCH 02/17] perf tools: Add evlist channel helpers
  2016-05-13 13:05   ` Arnaldo Carvalho de Melo
@ 2016-05-18  3:27     ` Wangnan (F)
  2016-05-18 13:23       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 28+ messages in thread
From: Wangnan (F) @ 2016-05-18  3:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama



On 2016/5/13 21:05, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 13, 2016 at 07:55:59AM +0000, Wang Nan escreveu:
>> In this commit sereval helpers are introduced to support the principle
>                   several
>
>> of channel. Channels hold different groups of evsels which configured
>> differently. It will be used for overwritable evsels, which allows perf
> why not use multiple evlists? An "evlist" is a "list of evsels", why do
> we need yet another way of grouping evlists?
>
> - Arnaldo
>

There's an assumption all over perf that there's only one evlist: in
'struct record' there's an 'evlist' pointer, and in 'struct session'
there's also an 'evlist' pointer. Trying to change them to an array
results in 181 errors, so I think fundamentally moving to multiple
evlists is nearly impossible.

Now I'm thinking of introducing auxiliary evlists to perf record. We
would still obey the one-evlist assumption and only create separate
evlists for mmap.

Thank you.


* Re: [PATCH 02/17] perf tools: Add evlist channel helpers
  2016-05-18  3:27     ` Wangnan (F)
@ 2016-05-18 13:23       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-18 13:23 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: Arnaldo Carvalho de Melo, linux-kernel, He Kuang, Jiri Olsa,
	Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Wed, May 18, 2016 at 11:27:43AM +0800, Wangnan (F) escreveu:
> On 2016/5/13 21:05, Arnaldo Carvalho de Melo wrote:
> >Em Fri, May 13, 2016 at 07:55:59AM +0000, Wang Nan escreveu:
> >> Channels hold different groups of evsels which configured
> >> differently. It will be used for overwritable evsels, which allows
> >> perf

> >why not use multiple evlists? An "evlist" is a "list of evsels", why do
> >we need yet another way of grouping evlists?
 
> There's an assumption all over perf that there's only one evlist: in
> 'struct record' there's an 'evlist' pointer, in 'struct session'
> there's also an 'evlist' pointer.

Well, at some point there were none, and multiple tools used multiple
ways to deal with lists of events :-)

> Trying to change them to an array results in 181 errors, so I think
> fundamentally moving to multiple evlists is nearly impossible.

Well, in the next paragraph you give it some hope :-)
 
> Now I'm thinking introducing auxiliary evlists to perf record. We
> still obey one evlist assumption, only creates separated evlists for
> mmap.

Ok, that may be the way to go, i.e. linking evlists somehow for some
specific use cases, e.g. consuming events from multiple evlists,
probably sorting them via the ordered_events class, etc.

I have to review this more deeply to try and come up with suggestions :-\

- Arnaldo


* Re: [PATCH 09/17] perf tools: Detect avalibility of write_backward
  2016-05-13 13:08   ` Arnaldo Carvalho de Melo
@ 2016-05-20 15:31     ` Wangnan (F)
  2016-05-20 15:39       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 28+ messages in thread
From: Wangnan (F) @ 2016-05-20 15:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama



On 2016/5/13 21:08, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 13, 2016 at 07:56:06AM +0000, Wang Nan escreveu:
>> Detect avalibility of write_backward and save the result into
>> record_opts. With write_backward the start pointer of a ring
>> buffer mapped read only can be found reliably.
> We have perf_missing_features for that, please try to use it.

I'll try it, but write_backward can't fall back: if the kernel doesn't
support it, I think we'd better throw an error early. With
perf_missing_features we only get the error when opening the event, so
if we want to fail earlier we still need API probing.
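
A rough standalone sketch of what such an early probe could look like
outside of perf's own perf_probe_api() helper: open a harmless software
event once without the bit to check that opening works at all, then
again with attr.write_backward set. The helper names and the choice of
PERF_COUNT_SW_CPU_CLOCK are assumptions for illustration, and it needs
kernel headers that already define the write_backward bit.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Try to open a disabled, user-only software event on the current task. */
static int try_open(int write_backward)
{
	struct perf_event_attr attr;
	int fd, err;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.exclude_kernel = 1;
	attr.disabled = 1;
	attr.write_backward = write_backward ? 1 : 0;

	/* pid = 0 (self), cpu = -1 (any), no group leader, no flags */
	fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0) {
		err = errno;
		return -err;
	}
	close(fd);
	return 0;
}

/* 1: bit accepted, 0: bit rejected, -1: could not probe at all. */
static int probe_write_backward(void)
{
	if (try_open(0) < 0)
		return -1;	/* e.g. no permission: result unknown */
	return try_open(1) == 0 ? 1 : 0;
}
```

A caller could run this once at option-parsing time and refuse
--overwrite right away when the result is 0.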

Thank you.


* Re: [PATCH 09/17] perf tools: Detect avalibility of write_backward
  2016-05-20 15:31     ` Wangnan (F)
@ 2016-05-20 15:39       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-20 15:39 UTC (permalink / raw)
  To: Wangnan (F)
  Cc: arnaldo.melo, linux-kernel, He Kuang, Arnaldo Carvalho de Melo,
	Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Zefan Li, pi3orama

Em Fri, May 20, 2016 at 11:31:44PM +0800, Wangnan (F) escreveu:
> 
> 
> On 2016/5/13 21:08, Arnaldo Carvalho de Melo wrote:
> > Em Fri, May 13, 2016 at 07:56:06AM +0000, Wang Nan escreveu:
> > > Detect avalibility of write_backward and save the result into
> > > record_opts. With write_backward the start pointer of a ring
> > > buffer mapped read only can be found reliably.
> > We have perf_missing_features for that, please try to use it.
> 
> I'll try it, but write_backward can't fallback, if kernel doesn't
> support it, I think we'd better throw an error earlier. Using
> perf_missing_features we get error during opening the event, so if we want
> to fail earlier we still need API probing.

Conceptually 'perf_missing_features' shouldn't be strictly tied to
falling back; it's just a way to mark which perf features are missing in
the current kernel. That info may be used for falling back, or for any
other purpose.
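
As an illustration of that point, a tiny sketch in which a
missing-feature record is consulted up front to reject a request rather
than to fall back. The struct and function here are hypothetical
stand-ins, not perf's actual perf_missing_features definition.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical record of features the running kernel lacks. */
struct missing_features_sketch {
	bool write_backward;
};

/*
 * Validate the user's request before any event is opened: overwrite
 * mode has no fallback, so a missing write_backward is a hard error.
 */
static int check_record_request(const struct missing_features_sketch *mf,
				bool want_overwrite)
{
	if (want_overwrite && mf->write_backward)
		return -ENOTSUP;
	return 0;
}
```

The same flag could equally drive a fallback for features that have one;
only the consumer decides what to do with it.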

- Arnaldo


Thread overview: 28+ messages
2016-05-13  7:55 [PATCH 00/17] perf tools: Support overwritable ring buffer Wang Nan
2016-05-13  7:55 ` [PATCH 01/17] perf tools: Extract __perf_evlist__mmap_read() Wang Nan
2016-05-13 13:03   ` Arnaldo Carvalho de Melo
2016-05-13  7:55 ` [PATCH 02/17] perf tools: Add evlist channel helpers Wang Nan
2016-05-13 13:05   ` Arnaldo Carvalho de Melo
2016-05-18  3:27     ` Wangnan (F)
2016-05-18 13:23       ` Arnaldo Carvalho de Melo
2016-05-13  7:56 ` [PATCH 03/17] perf tools: Automatically add new channel according to evlist Wang Nan
2016-05-13  7:56 ` [PATCH 04/17] perf tools: Operate multiple channels Wang Nan
2016-05-13  7:56 ` [PATCH 05/17] perf record: Prevent reading invalid data in record__mmap_read Wang Nan
2016-05-13  7:56 ` [PATCH 06/17] perf tools: Squash overwrite setting into channel Wang Nan
2016-05-13  7:56 ` [PATCH 07/17] perf record: Don't read from and poll overwrite channel Wang Nan
2016-05-13  7:56 ` [PATCH 08/17] perf record: Don't poll on " Wang Nan
2016-05-13 13:12   ` Arnaldo Carvalho de Melo
2016-05-16  3:18     ` Wangnan (F)
2016-05-13  7:56 ` [PATCH 09/17] perf tools: Detect avalibility of write_backward Wang Nan
2016-05-13 13:08   ` Arnaldo Carvalho de Melo
2016-05-20 15:31     ` Wangnan (F)
2016-05-20 15:39       ` Arnaldo Carvalho de Melo
2016-05-13  7:56 ` [PATCH 10/17] perf tools: Enable overwrite settings Wang Nan
2016-05-16 13:38   ` Arnaldo Carvalho de Melo
2016-05-13  7:56 ` [PATCH 11/17] perf tools: Set write_backward attribut bit for overwrite events Wang Nan
2016-05-13  7:56 ` [PATCH 12/17] perf tools: Record fd into perf_mmap Wang Nan
2016-05-13  7:56 ` [PATCH 13/17] perf tools: Add API to pause a channel Wang Nan
2016-05-13  7:56 ` [PATCH 14/17] perf record: Rename variable to make code clear Wang Nan
2016-05-13  7:56 ` [PATCH 15/17] perf record: Read from backward ring buffer Wang Nan
2016-05-13  7:56 ` [PATCH 16/17] perf record: Toggle overwrite ring buffer for reading Wang Nan
2016-05-13  7:56 ` [PATCH 17/17] perf tools: Don't warn about out of order event if write_backward is used Wang Nan
