* [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5)
@ 2015-10-02  5:18 Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
                   ` (29 more replies)
  0 siblings, 30 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Hello,

This patchset converts perf report to use multiple threads in order to
speed up the processing of large data files.  I see a speedup of at
least ~30% with this change.  The code is still experimental and
contains many rough edges, but I'd like to share it and get some
feedback.

 * changes in v5)
   - move struct machines from perf_session to perf_tool  (Arnaldo)
   - add --num-thread option to perf report  (David)
   - separate track_mmap code for review  (Jiri)
   - fix some minor bugs  (Jiri)

 * changes in v4)
   - rename to *_find(new)_by_time()
   - also sort struct map by time
   - use 'perf_has_index' global variable
   - fix an off-by-one bug in index generation
   - rebased onto the recent atomic thread refcounting
   - remove missing threads tree in favor of thread rwlock

 * changes in v3)
   - handle header (metadata) same as sample data (at index 0)
   - maintain libunwind address space in map_groups instead of thread
   - use *_time API only for indexed data file
   - resolve callchain with the *_time API
   - use dso__data_get/put_fd() to protect access to fd
   - synthesize COMM event for command line workload

 * changes in v2)
   - rework with single indexed data file rather than multiple files in
     a directory

The perf report processes (sample) events like below:

  1. preprocess sample to get matching thread/dso/symbol info
  2. insert it to hists rbtree (with callchain tree) based on the info
  3. optionally collapse hist entries that match given sort key(s)
  4. resort hist entries (by overhead) for output
  5. display the hist entries

Stage 1 is preprocessing and mostly acts like a read-only operation in
that it doesn't change the machine state during sample processing.
Meta events like fork, comm and mmap can change the machine/thread
state, though, and symbols can be loaded lazily during processing
(stage 2).

Stage 2 consumes most of the time, especially when callchains and the
 --children option are enabled.  This work can easily be partitioned as
each sample is independent of the others.  But the resulting hists must
be combined/collapsed into a single global hists before going on to the
later steps.

Stage 3 is optional and only needed for certain sort keys - but with
stage 2 parallelized, it always needs to be done.

Stages 4 and 5 work on the whole hists so they must be done serially.

So my approach is like this:

Partially do stage 1 first - but only for the meta events that change
the machine state.  To do this I add a dummy tracking event to perf
record and make it collect such meta events only.  They are saved as
normal data and processed before the sample events at perf report time.

This also requires handling multiple sample data regions concurrently
and finding the corresponding machine state when processing samples.
In a large profiling session, many tasks are created and exit, so a pid
might be recycled (even more than once!).  To deal with this, I keep
thread, map_groups, map and comm sorted by time.  The only remaining
thing is symbol loading, as it's done lazily when a sample requires it.
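
To illustrate the idea behind the time-sorted lookups, here is a
minimal sketch in plain C - the struct and function names are made up
and this is not the actual code of the *_by_time() helpers introduced
later in the series:

#include <stdint.h>
#include <stddef.h>

/* One instance of a (possibly recycled) pid, created at start_time. */
struct thread_slice {
        int32_t              pid;
        uint64_t             start_time;  /* timestamp of the FORK/COMM event */
        struct thread_slice *next;        /* newer instance of the same pid   */
};

/*
 * Given the instances of one pid sorted by start_time (oldest first),
 * pick the one that was alive when a sample with 'timestamp' was
 * recorded: the last instance whose start_time <= timestamp.
 */
struct thread_slice *
find_slice_by_time(struct thread_slice *head, uint64_t timestamp)
{
        struct thread_slice *found = NULL;

        for (; head && head->start_time <= timestamp; head = head->next)
                found = head;

        return found;
}

The real helpers do the equivalent lookup on thread, map_groups, map
and comm structures kept sorted by time instead of this linked list.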

With that done, stage 2 can be done by multiple threads.  I also save
the sample data (per-cpu or per-thread) in separate files during record
and then merge them into a single data file with an index table.  At
perf report time, each region of sample data is then processed by its
own thread, and symbol loading is protected by a mutex.
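
Conceptually the parallel part looks like the pthread sketch below; the
names (struct local_hists, process_region()) are invented, and the real
code of course builds real hist entries per region and collapses them
into the global hists:

#include <pthread.h>
#include <stdio.h>

#define NR_WORKERS 4

/* Stand-in for a per-worker hists tree (invented for this sketch). */
struct local_hists {
        unsigned long nr_samples;
};

struct worker {
        pthread_t tid;
        int region;                     /* index into the data file's index table */
        struct local_hists hists;       /* filled without any locking */
};

static void *process_region(void *arg)
{
        struct worker *w = arg;

        /*
         * Stage 2: samples in this region are independent of the other
         * regions, so no lock is needed here.  Lazy symbol loading would
         * be the only shared, mutex-protected path.
         */
        w->hists.nr_samples = 1000 + w->region;         /* pretend work */
        return NULL;
}

int main(void)
{
        struct worker workers[NR_WORKERS];
        struct local_hists global = { 0 };
        int i;

        for (i = 0; i < NR_WORKERS; i++) {
                workers[i].region = i;
                pthread_create(&workers[i].tid, NULL, process_region, &workers[i]);
        }

        /* Stages 3-5 need a single hists tree, so merge serially. */
        for (i = 0; i < NR_WORKERS; i++) {
                pthread_join(workers[i].tid, NULL);
                global.nr_samples += workers[i].hists.nr_samples;
        }

        printf("merged %lu samples\n", global.nr_samples);
        return 0;
}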

For DWARF post-unwinding, the dso cache data also needs to be protected
by a lock, and this caused huge contention.  I made it search the
rbtree speculatively first and then, if no entry was found, search
again under the dso lock.  Please take a look and see whether it's
acceptable.
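
The pattern is basically "lockless lookup first, locked lookup-or-insert
on a miss".  A generic sketch (cache_node/cache_get() are made-up names,
not the dso cache code, and it glosses over the memory-ordering care the
real rbtree code needs):

#include <pthread.h>
#include <stdlib.h>

struct cache_node {
        long key;
        struct cache_node *left, *right;
};

static struct cache_node *cache_root;
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

static struct cache_node *lookup(struct cache_node *node, long key)
{
        while (node && node->key != key)
                node = key < node->key ? node->left : node->right;
        return node;
}

/* Must be called with cache_lock held. */
static struct cache_node *insert_locked(long key)
{
        struct cache_node **p = &cache_root;
        struct cache_node *node;

        while (*p) {
                if (key == (*p)->key)
                        return *p;
                p = key < (*p)->key ? &(*p)->left : &(*p)->right;
        }

        node = calloc(1, sizeof(*node));
        if (node) {
                node->key = key;
                *p = node;      /* publish; real code needs a store barrier */
        }
        return node;
}

struct cache_node *cache_get(long key)
{
        /* Speculative, lock-free search first (the common, hit case). */
        struct cache_node *node = lookup(cache_root, key);

        if (node)
                return node;

        /* Miss: search again under the lock and insert if still absent. */
        pthread_mutex_lock(&cache_lock);
        node = insert_locked(key);
        pthread_mutex_unlock(&cache_lock);

        return node;
}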

Patches 1-9 add indexing support for the data file.  With the --index
option, perf record will create an intermediate directory and save the
meta events and sample data to separate files.  Finally it'll build an
index table and concatenate the data files (and also remove the
intermediate directory).
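
For illustration, building such an index boils down to recording an
(offset, size) pair per region while the intermediate files are
concatenated.  A toy sketch - the struct name and all sizes here are
invented, not the on-disk format:

#include <inttypes.h>
#include <stdio.h>

#define NR_REGIONS 4    /* region 0: meta events, 1..3: per-cpu samples */

/* Mirrors the offset/size pair stored per region in the index table. */
struct file_section {
        uint64_t offset;
        uint64_t size;
};

int main(void)
{
        /* Invented region sizes, just for the illustration. */
        uint64_t region_size[NR_REGIONS] = { 65536, 1048576, 917504, 1310720 };
        struct file_section index[NR_REGIONS];
        uint64_t offset = 4096;         /* pretend the file header ends here */
        int i;

        /* Append each intermediate file and remember where it landed. */
        for (i = 0; i < NR_REGIONS; i++) {
                index[i].offset = offset;
                index[i].size = region_size[i];
                offset += region_size[i];
        }

        for (i = 0; i < NR_REGIONS; i++)
                printf("index[%d]: offset=%" PRIu64 " size=%" PRIu64 "\n",
                       i, index[i].offset, index[i].size);
        return 0;
}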

Patches 10-24 manage the machine and thread state using timestamps so
that they can be searched when processing samples.  Patches 25-35
implement the parallel report.  In patch 36 I implemented the 'perf
data index' command to build an index table for a given data file.
The last two patches improve speed when using multiple threads.

This patchset doesn't change perf record to use multiple threads, but I
think that can easily be done later if needed.

Note that the output differs slightly from the original version when
compared using an indexed data file, but the differences are mostly
unresolved symbols for callchains.

Here is the result:

This is just the elapsed time measured by 'perf stat -r 5'.  Note that
overall performance in v5 is ~30% slower than v4, possibly due to the
more aggressive use of atomic thread refcounting and rwlocks.

The data file was recorded during a kernel build with fp callchains and
its size is 2.1GB.  The machine has 6 cores with hyper-threading
enabled, and I got similar results on my laptop too.

 perf report          --children  --no-children  + --call-graph none
                   -------------  -------------  -------------------
 current           497.568804423  131.312637209      58.972991789
 with index        443.527574644  110.671594167      40.755612227
 + --multi-thread  304.064133574   63.175497780      18.815425955


This result is with a 7.7GB data file using libunwind for callchain unwinding.

 perf report          --children  --no-children  + --call-graph none
                   -------------  -------------  -------------------
 current            20.600862660   13.723705468       6.865225263
 with index         17.926679276   11.492099330       5.254594583
 + --multi-thread   10.549591227    6.891150645       3.261140410


This result is with the same file but using libdw for callchain unwinding.

 perf report          --children  --no-children  + --call-graph none
                   -------------  -------------  -------------------
 current           259.553111664  229.953713435       7.083999430
 with index        239.071895690  222.996473823       4.967384924
 + --multi-thread  119.465870204  110.833195062       3.144185106

On my Arch Linux system, callchain unwinding using libdw is much slower
than with libunwind.  I'm using elfutils version 0.161.  Also I don't
know why --children takes less time than --no-children.  Anyway, we can
see that the --multi-thread performance is much better in every case.

This patchset is based on acme/perf/core - commit dbc67409fa91 ("perf
list: Do event name substring search as last resort when no events
found").

You can get it from the 'perf/threaded-v5' branch of my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Please take a look and play with it.  Any comments are welcome! :)

Thanks,
Namhyung


Namhyung Kim (38):
  perf tools: Use a software dummy event to track task/mmap events
  perf tools: Save mmap_param.len instead of mask
  perf tools: Move auxtrace_mmap field to struct perf_evlist
  perf tools: pass perf_mmap desc directly
  perf tools: Create separate mmap for dummy tracking event
  perf tools: Extend perf_evlist__mmap_ex() to use track mmap
  perf tools: Add HEADER_DATA_INDEX feature
  perf tools: Handle indexed data file properly
  perf record: Add --index option for building index table
  perf report: Skip dummy tracking event
  perf tools: Introduce thread__comm(_str)_by_time() helpers
  perf tools: Add a test case for thread comm handling
  perf tools: Use thread__comm_by_time() when adding hist entries
  perf tools: Convert dead thread list into rbtree
  perf tools: Introduce machine__find*_thread_by_time()
  perf tools: Add a test case for timed thread handling
  perf tools: Maintain map groups list in a leader thread
  perf tools: Introduce thread__find_addr_location_by_time() and friends
  perf callchain: Use thread__find_addr_location_by_time() and friends
  perf tools: Add a test case for timed map groups handling
  perf tools: Save timestamp of a map creation
  perf tools: Introduce map_groups__{insert,find}_by_time()
  perf tools: Use map_groups__find_addr_by_time()
  perf tools: Add testcase for managing maps with time
  perf callchain: Maintain libunwind's address space in map_groups
  perf session: Pass struct events stats to event processing functions
  perf hists: Pass hists struct to hist_entry_iter struct
  perf tools: Move BUILD_ID_SIZE definition to perf.h
  perf tools: Introduce machines__new/delete()
  perf session: Separate struct machines from session
  perf report: Parallelize perf report using multi-thread
  perf tools: Fix progress ui to support multi thread
  perf report: Add --multi-thread option and config item
  perf session: Handle index files generally
  perf report: Add --num-thread option to control number of thread
  perf data: Implement 'index' subcommand
  perf tools: Reduce lock contention in perf_event__preprocess_sample()
  perf tools: Skip dso front cache for indexed data file

 tools/perf/Documentation/perf-data.txt   |  25 +-
 tools/perf/Documentation/perf-record.txt |   4 +
 tools/perf/Documentation/perf-report.txt |  10 +
 tools/perf/builtin-annotate.c            |  12 +-
 tools/perf/builtin-buildid-cache.c       |  14 +-
 tools/perf/builtin-buildid-list.c        |  16 +-
 tools/perf/builtin-data.c                | 358 +++++++++++++++++++++++++++-
 tools/perf/builtin-diff.c                |  16 +-
 tools/perf/builtin-evlist.c              |  18 +-
 tools/perf/builtin-inject.c              |   6 +
 tools/perf/builtin-kmem.c                |  14 +-
 tools/perf/builtin-kvm.c                 |  14 +-
 tools/perf/builtin-lock.c                |   7 +-
 tools/perf/builtin-mem.c                 |  14 +-
 tools/perf/builtin-record.c              | 196 +++++++++++++--
 tools/perf/builtin-report.c              | 102 +++++++-
 tools/perf/builtin-sched.c               |   8 +-
 tools/perf/builtin-script.c              |  34 ++-
 tools/perf/builtin-timechart.c           |  12 +-
 tools/perf/builtin-top.c                 |  26 +-
 tools/perf/builtin-trace.c               |  10 +-
 tools/perf/perf.c                        |   1 +
 tools/perf/perf.h                        |   6 +
 tools/perf/tests/Build                   |   4 +
 tools/perf/tests/builtin-test.c          |  16 ++
 tools/perf/tests/dwarf-unwind.c          |  13 +-
 tools/perf/tests/hists_common.c          |   3 +-
 tools/perf/tests/hists_cumulate.c        |   1 +
 tools/perf/tests/hists_filter.c          |   1 +
 tools/perf/tests/hists_link.c            |   6 +-
 tools/perf/tests/hists_output.c          |   1 +
 tools/perf/tests/tests.h                 |   4 +
 tools/perf/tests/thread-comm.c           |  47 ++++
 tools/perf/tests/thread-lookup-time.c    | 179 ++++++++++++++
 tools/perf/tests/thread-map-time.c       |  90 +++++++
 tools/perf/tests/thread-mg-share.c       |   7 +-
 tools/perf/tests/thread-mg-time.c        |  93 ++++++++
 tools/perf/tests/topology.c              |  19 +-
 tools/perf/ui/browsers/hists.c           |  30 ++-
 tools/perf/ui/gtk/hists.c                |   3 +
 tools/perf/util/build-id.c               |  16 +-
 tools/perf/util/build-id.h               |   3 -
 tools/perf/util/data-convert-bt.c        |   8 +-
 tools/perf/util/dso.c                    |   2 +-
 tools/perf/util/dso.h                    |   1 +
 tools/perf/util/event.c                  | 144 +++++++++--
 tools/perf/util/event.h                  |   1 -
 tools/perf/util/evlist.c                 | 209 ++++++++++++----
 tools/perf/util/evlist.h                 |  15 +-
 tools/perf/util/evsel.h                  |  15 ++
 tools/perf/util/header.c                 |  68 +++++-
 tools/perf/util/header.h                 |   3 +
 tools/perf/util/hist.c                   | 117 ++++++---
 tools/perf/util/hist.h                   |   7 +-
 tools/perf/util/intel-bts.c              |   2 +-
 tools/perf/util/intel-pt.c               |   2 +-
 tools/perf/util/machine.c                | 320 +++++++++++++++++++++----
 tools/perf/util/machine.h                |  17 +-
 tools/perf/util/map.c                    |  84 ++++++-
 tools/perf/util/map.h                    |  37 ++-
 tools/perf/util/probe-event.c            |   2 +-
 tools/perf/util/session.c                | 394 +++++++++++++++++++++++++++----
 tools/perf/util/session.h                |   9 +-
 tools/perf/util/symbol-elf.c             |   2 +-
 tools/perf/util/symbol.c                 |   7 +-
 tools/perf/util/thread.c                 | 205 +++++++++++++++-
 tools/perf/util/thread.h                 |  28 ++-
 tools/perf/util/tool.h                   |  16 ++
 tools/perf/util/unwind-libdw.c           |  12 +-
 tools/perf/util/unwind-libunwind.c       |  55 +++--
 tools/perf/util/unwind.h                 |  15 +-
 71 files changed, 2868 insertions(+), 388 deletions(-)
 create mode 100644 tools/perf/tests/thread-comm.c
 create mode 100644 tools/perf/tests/thread-lookup-time.c
 create mode 100644 tools/perf/tests/thread-map-time.c
 create mode 100644 tools/perf/tests/thread-mg-time.c

-- 
2.6.0


* [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-05 12:51   ` Jiri Olsa
  2015-10-02  5:18 ` [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask Namhyung Kim
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

Add APIs for a software dummy event to track task/comm/mmap events
separately.  perf record will use them to save such events in a
separate mmap buffer to make them easy to index.  This is preparation
for the multi-thread support which will come later.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/evsel.h  | 15 +++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e7e195d867ea..c5180a29db1b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -248,6 +248,36 @@ error:
 	return -ENOMEM;
 }
 
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist)
+{
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_DUMMY,
+		.exclude_kernel = 1,
+	};
+	struct perf_evsel *evsel;
+
+	event_attr_init(&attr);
+
+	evsel = perf_evsel__new(&attr);
+	if (evsel == NULL)
+		goto error;
+
+	/* use strdup() because free(evsel) assumes name is allocated */
+	evsel->name = strdup("dummy");
+	if (!evsel->name)
+		goto error_free;
+
+	perf_evlist__add(evlist, evsel);
+	perf_evlist__set_tracking_event(evlist, evsel);
+
+	return 0;
+error_free:
+	perf_evsel__delete(evsel);
+error:
+	return -ENOMEM;
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 66bc9d4c0869..414e383885f5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -75,6 +75,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
 void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
 void perf_evlist__remove(struct perf_evlist *evlist, struct perf_evsel *evsel);
 int perf_evlist__add_default(struct perf_evlist *evlist);
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist);
 int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7906666580da..8d2445347b0f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -359,6 +359,21 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+/**
+ * perf_evsel__is_dummy_tracking - Return whether given evsel is a dummy
+ * event for tracking meta events only
+ *
+ * @evsel - evsel selector to be tested
+ *
+ * Return %true if event is a dummy tracking event
+ */
+static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
+{
+	return evsel->attr.type == PERF_TYPE_SOFTWARE &&
+		evsel->attr.config == PERF_COUNT_SW_DUMMY &&
+		evsel->attr.task == 1 && evsel->attr.mmap == 1;
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
-- 
2.6.0


* [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02 18:44   ` Arnaldo Carvalho de Melo
  2015-10-08 10:17   ` Jiri Olsa
  2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
                   ` (27 subsequent siblings)
  29 siblings, 2 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

It is more convenient to save the mmap length rather than the (bit)
mask.  With this patch, we can eliminate the dependency on perf_evlist,
other than getting the mmap desc, when dealing with mmaps.  The mask
and length can be converted using perf_evlist__mmap_mask/len().

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c5180a29db1b..e46adcd5b408 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -29,6 +29,8 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static size_t perf_evlist__mmap_mask(size_t len);
+static size_t perf_evlist__mmap_len(size_t mask);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -871,7 +873,9 @@ void __weak auxtrace_mmap_params__set_idx(
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 {
 	if (evlist->mmap[idx].base != NULL) {
-		munmap(evlist->mmap[idx].base, evlist->mmap_len);
+		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);
+
+		munmap(evlist->mmap[idx].base, mmap_len);
 		evlist->mmap[idx].base = NULL;
 		atomic_set(&evlist->mmap[idx].refcnt, 0);
 	}
@@ -901,8 +905,8 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 }
 
 struct mmap_params {
-	int prot;
-	int mask;
+	int	prot;
+	size_t	len;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
@@ -924,8 +928,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 */
 	atomic_set(&evlist->mmap[idx].refcnt, 2);
 	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
+	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
+	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
 				      MAP_SHARED, fd, 0);
 	if (evlist->mmap[idx].base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
@@ -1071,6 +1075,21 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
 	return (pages + 1) * page_size;
 }
 
+static size_t perf_evlist__mmap_mask(size_t len)
+{
+	BUG_ON(len <= page_size);
+	BUG_ON((len % page_size) != 0);
+
+	return len - page_size - 1;
+}
+
+static size_t perf_evlist__mmap_len(size_t mask)
+{
+	BUG_ON(((mask + 1) % page_size) != 0);
+
+	return mask + 1 + page_size;
+}
+
 static long parse_pages_arg(const char *str, unsigned long min,
 			    unsigned long max)
 {
@@ -1176,7 +1195,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
-	mp.mask = evlist->mmap_len - page_size - 1;
+	mp.len = evlist->mmap_len;
 
 	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
 				   auxtrace_pages, auxtrace_overwrite);
-- 
2.6.0


* [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02 18:45   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2015-10-02  5:18 ` [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly Namhyung Kim
                   ` (26 subsequent siblings)
  29 siblings, 3 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

Since struct perf_mmap is going to be shared with the dummy tracking
evsel, which tracks meta events only, let's move the auxtrace mmap out
of struct perf_mmap.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |  4 ++--
 tools/perf/util/evlist.c    | 30 +++++++++++++++++++++---------
 tools/perf/util/evlist.h    |  2 +-
 3 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5e01c070dbf2..0accac6e0812 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -220,7 +220,7 @@ static int record__auxtrace_read_snapshot_all(struct record *rec)
 
 	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
 		struct auxtrace_mmap *mm =
-				&rec->evlist->mmap[i].auxtrace_mmap;
+				&rec->evlist->auxtrace_mmap[i];
 
 		if (!mm->base)
 			continue;
@@ -405,7 +405,7 @@ static int record__mmap_read_all(struct record *rec)
 	int rc = 0;
 
 	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
-		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
+		struct auxtrace_mmap *mm = &rec->evlist->auxtrace_mmap[i];
 
 		if (rec->evlist->mmap[i].base) {
 			if (record__mmap_read(rec, i) != 0) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e46adcd5b408..042dffc67986 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -810,9 +810,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 	return event;
 }
 
-static bool perf_mmap__empty(struct perf_mmap *md)
+static bool perf_evlist__mmap_empty(struct perf_evlist *evlist, int idx)
 {
-	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
+	struct perf_mmap *md = &evlist->mmap[idx];
+
+	return perf_mmap__read_head(md) == md->prev &&
+		evlist->auxtrace_mmap[idx].base == NULL;
 }
 
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
@@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 		perf_mmap__write_tail(md, old);
 	}
 
-	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
+	if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
 		perf_evlist__mmap_put(evlist, idx);
 }
 
@@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 		evlist->mmap[idx].base = NULL;
 		atomic_set(&evlist->mmap[idx].refcnt, 0);
 	}
-	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
+	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
@@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
 	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (evlist->mmap == NULL)
+		return -ENOMEM;
+	evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
+				       sizeof(struct auxtrace_mmap));
+	if (evlist->auxtrace_mmap == NULL) {
+		zfree(&evlist->mmap);
+		return -ENOMEM;
+	}
+	return 0;
 }
 
 struct mmap_params {
@@ -938,10 +949,6 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 		return -1;
 	}
 
-	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
-				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
-		return -1;
-
 	return 0;
 }
 
@@ -963,6 +970,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			*output = fd;
 			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
 				return -1;
+
+			if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
+						&mp->auxtrace_mp,
+						evlist->mmap[idx].base, fd))
+				return -1;
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
 				return -1;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 414e383885f5..51574ce8ac69 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -30,7 +30,6 @@ struct perf_mmap {
 	int		 mask;
 	atomic_t	 refcnt;
 	u64		 prev;
-	struct auxtrace_mmap auxtrace_mmap;
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
 };
 
@@ -53,6 +52,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct auxtrace_mmap *auxtrace_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
-- 
2.6.0


* [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (2 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02 18:47   ` Arnaldo Carvalho de Melo
  2015-10-02  5:18 ` [RFC/PATCH 05/38] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Pass struct perf_mmap to the mmap handling functions directly.  This
will be used by both the normal mmap and the track mmap later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 24 +++++++++++++++---------
 tools/perf/util/evlist.h |  1 +
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 042dffc67986..8d31883cbeb8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -921,8 +921,8 @@ struct mmap_params {
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
-			       struct mmap_params *mp, int fd)
+static int perf_mmap__mmap(struct perf_mmap *desc,
+			   struct mmap_params *mp, int fd)
 {
 	/*
 	 * The last one will be done at perf_evlist__mmap_consume(), so that we
@@ -937,21 +937,26 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 * evlist layer can't just drop it when filtering events in
 	 * perf_evlist__filter_pollfd().
 	 */
-	atomic_set(&evlist->mmap[idx].refcnt, 2);
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
-	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
+	atomic_set(&desc->refcnt, 2);
+	desc->prev = 0;
+	desc->mask = perf_evlist__mmap_mask(mp->len);
+	desc->base = mmap(NULL, mp->len, mp->prot,
 				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
+	if (desc->base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
-		evlist->mmap[idx].base = NULL;
+		desc->base = NULL;
 		return -1;
 	}
 
 	return 0;
 }
 
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
+{
+	return &evlist->mmap[idx];
+}
+
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
 				       int thread, int *output)
@@ -960,6 +965,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 	evlist__for_each(evlist, evsel) {
 		int fd;
+		struct perf_mmap *desc = perf_evlist__mmap_desc(evlist, idx);
 
 		if (evsel->system_wide && thread)
 			continue;
@@ -968,7 +974,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		if (*output == -1) {
 			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+			if (perf_mmap__mmap(desc, mp, *output) < 0)
 				return -1;
 
 			if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 51574ce8ac69..79f8245300ad 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -145,6 +145,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx);
 
 void perf_evlist__disable(struct perf_evlist *evlist);
 void perf_evlist__enable(struct perf_evlist *evlist);
-- 
2.6.0


* [RFC/PATCH 05/38] perf tools: Create separate mmap for dummy tracking event
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (3 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 06/38] perf tools: Extend perf_evlist__mmap_ex() to use track mmap Namhyung Kim
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

When indexed data file support is enabled, a dummy tracking event will
be used to track metadata (like task, comm and mmap events) for a
session, and the actual samples will be recorded in separate
(intermediate) files and then merged (with an index table).

Provide a separate mmap for the dummy tracking event.  The size is
fixed at 128KiB (+ 1 page) as the event rate will be lower than for
samples.  I originally wanted to use a single mmap for this, but
cross-cpu sharing is prohibited, so it's per-cpu (or per-task) like the
normal mmaps.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |   2 +-
 tools/perf/util/evlist.c    | 106 ++++++++++++++++++++++++++++++++++----------
 tools/perf/util/evlist.h    |   9 ++++
 3 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0accac6e0812..33dc2eafe2b5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -74,7 +74,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 
 static int record__mmap_read(struct record *rec, int idx)
 {
-	struct perf_mmap *md = &rec->evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(rec->evlist, idx);
 	u64 head = perf_mmap__read_head(md);
 	u64 old = md->prev;
 	unsigned char *data = md->base + page_size;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8d31883cbeb8..25a9c3b5f473 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -743,7 +743,7 @@ static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 	u64 head;
 	u64 old = md->prev;
 	unsigned char *data = md->base + page_size;
@@ -812,28 +812,38 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
 
 static bool perf_evlist__mmap_empty(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
-	return perf_mmap__read_head(md) == md->prev &&
-		evlist->auxtrace_mmap[idx].base == NULL;
+	if (perf_mmap__read_head(md) != md->prev)
+		return false;
+
+	if (idx >= 0)
+		return !evlist->auxtrace_mmap[idx].base;
+	return true;
 }
 
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-	atomic_inc(&evlist->mmap[idx].refcnt);
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	atomic_inc(&md->refcnt);
 }
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
-	BUG_ON(atomic_read(&evlist->mmap[idx].refcnt) == 0);
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	BUG_ON(atomic_read(&md->refcnt) == 0);
 
-	if (atomic_dec_and_test(&evlist->mmap[idx].refcnt))
-		__perf_evlist__munmap(evlist, idx);
+	if (!atomic_dec_and_test(&md->refcnt))
+		return;
+
+	__perf_evlist__munmap(evlist, idx);
 }
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
 	if (!evlist->overwrite) {
 		u64 old = md->prev;
@@ -875,14 +885,15 @@ void __weak auxtrace_mmap_params__set_idx(
 
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 {
-	if (evlist->mmap[idx].base != NULL) {
-		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	if (md->base != NULL) {
+		size_t mmap_len = perf_evlist__mmap_len(md->mask);
 
-		munmap(evlist->mmap[idx].base, mmap_len);
-		evlist->mmap[idx].base = NULL;
-		atomic_set(&evlist->mmap[idx].refcnt, 0);
+		munmap(md->base, mmap_len);
+		md->base = NULL;
+		atomic_set(&md->refcnt, 0);
 	}
-	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
@@ -892,13 +903,17 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 	if (evlist->mmap == NULL)
 		return;
 
-	for (i = 0; i < evlist->nr_mmaps; i++)
+	for (i = 0; i < evlist->nr_mmaps; i++) {
 		__perf_evlist__munmap(evlist, i);
+		auxtrace_mmap__munmap(&evlist->auxtrace_mmap[i]);
+		if (evlist->track_mmap)
+			__perf_evlist__munmap(evlist, track_mmap_idx(i));
+	}
 
 	zfree(&evlist->mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static int perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool track_mmap)
 {
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
@@ -912,12 +927,22 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
 		zfree(&evlist->mmap);
 		return -ENOMEM;
 	}
+	if (track_mmap) {
+		evlist->track_mmap = calloc(evlist->nr_mmaps,
+					    sizeof(struct perf_mmap));
+		if (evlist->track_mmap == NULL) {
+			zfree(&evlist->mmap);
+			zfree(&evlist->auxtrace_mmap);
+			return -ENOMEM;
+		}
+	}
 	return 0;
 }
 
 struct mmap_params {
 	int	prot;
 	size_t	len;
+	bool	track_mmap;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
 
@@ -954,12 +979,16 @@ static int perf_mmap__mmap(struct perf_mmap *desc,
 
 struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
 {
-	return &evlist->mmap[idx];
+	if (idx >= 0)
+		return &evlist->mmap[idx];
+	else
+		return &evlist->track_mmap[track_mmap_idx(idx)];
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *output,
+				       int *track_output)
 {
 	struct perf_evsel *evsel;
 
@@ -972,7 +1001,30 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
+		if (mp->track_mmap && perf_evsel__is_dummy_tracking(evsel)) {
+			size_t old_len = mp->len;
+
+			/* mark idx as track mmap idx (negative) */
+			idx = track_mmap_idx(idx);
+
+			desc = perf_evlist__mmap_desc(evlist, idx);
+			mp->len = TRACK_MMAP_SIZE;
+
+			if (*track_output == -1) {
+				*track_output = fd;
+				if (perf_mmap__mmap(desc, mp, fd) < 0)
+					return -1;
+			} else {
+				if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT,
+					  *track_output) != 0)
+					return -1;
+
+				perf_evlist__mmap_get(evlist, idx);
+			}
+
+			mp->len = old_len;
+
+		} else if (*output == -1) {
 			*output = fd;
 			if (perf_mmap__mmap(desc, mp, *output) < 0)
 				return -1;
@@ -1008,6 +1060,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
 						 thread);
 		}
+
+		if (mp->track_mmap && perf_evsel__is_dummy_tracking(evsel)) {
+			/* restore idx as normal mmap idx (positive) */
+			idx = track_mmap_idx(idx);
+		}
 	}
 
 	return 0;
@@ -1023,13 +1080,15 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
+		int track_output = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, &output,
+							&track_output))
 				goto out_unmap;
 		}
 	}
@@ -1051,12 +1110,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
+		int track_output = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						&output, &track_output))
 			goto out_unmap;
 	}
 
@@ -1204,7 +1264,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
+	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, mp.track_mmap) < 0)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 79f8245300ad..fc53eb817c51 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -33,6 +33,8 @@ struct perf_mmap {
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
 };
 
+#define TRACK_MMAP_SIZE  (((128 * 1024 / page_size) + 1) * page_size)
+
 struct perf_evlist {
 	struct list_head entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
@@ -52,6 +54,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *track_mmap;
 	struct auxtrace_mmap *auxtrace_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -224,6 +227,12 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str);
 void perf_evlist__to_front(struct perf_evlist *evlist,
 			   struct perf_evsel *move_evsel);
 
+/* convert from/to negative idx for track mmap */
+static inline int track_mmap_idx(int idx)
+{
+	return -idx - 1;
+}
+
 /**
  * __evlist__for_each - iterate thru all the evsels
  * @list: list_head instance to iterate
-- 
2.6.0


* [RFC/PATCH 06/38] perf tools: Extend perf_evlist__mmap_ex() to use track mmap
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (4 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 05/38] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 07/38] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

The perf_evlist__mmap_ex() function now creates data and auxtrace mmaps
and, optionally, tracking mmaps for the events.  It'll be used by perf
record to save events in separate files and build an index table.
Checking for the dummy tracking event in perf_evlist__mmap() alone is
not enough, as users can specify a dummy event (like in the 'keep
tracking' testcase) without the index option.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/util/evlist.c    | 6 ++++--
 tools/perf/util/evlist.h    | 2 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 33dc2eafe2b5..90b1237d2525 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -306,7 +306,7 @@ try_again:
 
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
-				 opts->auxtrace_snapshot_mode) < 0) {
+				 opts->auxtrace_snapshot_mode, false) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 25a9c3b5f473..61e83d930e92 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1243,6 +1243,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
+ * @use_track_mmap: use another mmaps to track meta events
  *
  * If @overwrite is %false the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
@@ -1255,13 +1256,14 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 bool auxtrace_overwrite, bool use_track_mmap)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
 	struct mmap_params mp = {
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
+		.track_mmap = use_track_mmap,
 	};
 
 	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, mp.track_mmap) < 0)
@@ -1294,7 +1296,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite)
 {
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index fc53eb817c51..1ed7b7e8b968 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -144,7 +144,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 bool auxtrace_overwrite, bool use_track_mmap);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		      bool overwrite);
 void perf_evlist__munmap(struct perf_evlist *evlist);
-- 
2.6.0


* [RFC/PATCH 07/38] perf tools: Add HEADER_DATA_INDEX feature
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (5 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 06/38] perf tools: Extend perf_evlist__mmap_ex() to use track mmap Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 08/38] perf tools: Handle indexed data file properly Namhyung Kim
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The HEADER_DATA_INDEX feature records an index table for the sample
data so that it can be processed by multiple threads concurrently.
Each item is a struct perf_file_section, which consists of an offset
and a size.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |  2 ++
 tools/perf/util/header.c    | 64 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/header.h    |  3 +++
 3 files changed, 69 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 90b1237d2525..623984c81478 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -451,6 +451,8 @@ static void record__init_features(struct record *rec)
 
 	if (!rec->opts.full_auxtrace)
 		perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
+
+	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 43838003c1a1..c357f7f47d32 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -868,6 +868,24 @@ static int write_auxtrace(int fd, struct perf_header *h,
 	return err;
 }
 
+static int write_data_index(int fd, struct perf_header *h,
+			    struct perf_evlist *evlist __maybe_unused)
+{
+	int ret;
+	unsigned i;
+
+	ret = do_write(fd, &h->nr_index, sizeof(h->nr_index));
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < h->nr_index; i++) {
+		ret = do_write(fd, &h->index[i], sizeof(*h->index));
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
 static void print_hostname(struct perf_header *ph, int fd __maybe_unused,
 			   FILE *fp)
 {
@@ -1221,6 +1239,12 @@ static void print_group_desc(struct perf_header *ph, int fd __maybe_unused,
 	}
 }
 
+static void print_data_index(struct perf_header *ph __maybe_unused,
+			     int fd __maybe_unused, FILE *fp)
+{
+	fprintf(fp, "# contains data index for parallel processing\n");
+}
+
 static int __event_process_build_id(struct build_id_event *bev,
 				    char *filename,
 				    struct perf_session *session)
@@ -1891,6 +1915,7 @@ out_free:
 	return ret;
 }
 
+
 static int process_auxtrace(struct perf_file_section *section,
 			    struct perf_header *ph, int fd,
 			    void *data __maybe_unused)
@@ -1907,6 +1932,44 @@ static int process_auxtrace(struct perf_file_section *section,
 	return err;
 }
 
+static int process_data_index(struct perf_file_section *section __maybe_unused,
+			      struct perf_header *ph, int fd,
+			      void *data __maybe_unused)
+{
+	ssize_t ret;
+	u64 nr_idx;
+	unsigned i;
+	struct perf_file_section *idx;
+
+	ret = readn(fd, &nr_idx, sizeof(nr_idx));
+	if (ret != sizeof(nr_idx))
+		return -1;
+
+	if (ph->needs_swap)
+		nr_idx = bswap_64(nr_idx);
+
+	idx = calloc(nr_idx, sizeof(*idx));
+	if (idx == NULL)
+		return -1;
+
+	for (i = 0; i < nr_idx; i++) {
+		ret = readn(fd, &idx[i], sizeof(*idx));
+		if (ret != sizeof(*idx)) {
+			free(idx);
+			return -1;
+		}
+
+		if (ph->needs_swap) {
+			idx[i].offset = bswap_64(idx[i].offset);
+			idx[i].size   = bswap_64(idx[i].size);
+		}
+	}
+
+	ph->index = idx;
+	ph->nr_index = nr_idx;
+	return 0;
+}
+
 struct feature_ops {
 	int (*write)(int fd, struct perf_header *h, struct perf_evlist *evlist);
 	void (*print)(struct perf_header *h, int fd, FILE *fp);
@@ -1948,6 +2011,7 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
 	FEAT_OPP(HEADER_PMU_MAPPINGS,	pmu_mappings),
 	FEAT_OPP(HEADER_GROUP_DESC,	group_desc),
 	FEAT_OPP(HEADER_AUXTRACE,	auxtrace),
+	FEAT_OPP(HEADER_DATA_INDEX,	data_index),
 };
 
 struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 05f27cb6b7e3..add455a7abff 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -31,6 +31,7 @@ enum {
 	HEADER_PMU_MAPPINGS,
 	HEADER_GROUP_DESC,
 	HEADER_AUXTRACE,
+	HEADER_DATA_INDEX,
 	HEADER_LAST_FEATURE,
 	HEADER_FEAT_BITS	= 256,
 };
@@ -71,6 +72,8 @@ struct perf_header {
 	bool				needs_swap;
 	u64				data_offset;
 	u64				data_size;
+	struct perf_file_section	*index;
+	u64				nr_index;
 	u64				feat_offset;
 	DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
 	struct perf_env 	env;
-- 
2.6.0


* [RFC/PATCH 08/38] perf tools: Handle indexed data file properly
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (6 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 07/38] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 09/38] perf record: Add --index option for building index table Namhyung Kim
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

When perf detects that the data file has an index table, process the
header part first and then the rest of the data regions in a row.  Note
that since the indexed sample data is recorded for each cpu/thread
separately, it is already ordered within each region, so there is no
need to use the ordered events queue interface.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.c         |  1 +
 tools/perf/perf.h         |  2 ++
 tools/perf/util/session.c | 55 +++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 49 insertions(+), 9 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 1fded922bcc8..9664d84a9f8c 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -28,6 +28,7 @@ const char perf_more_info_string[] =
 int use_browser = -1;
 static int use_pager = -1;
 const char *input_name;
+bool perf_has_index;
 
 struct cmd_struct {
 	const char *cmd;
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 90129accffbe..f4b4d7d8752c 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -39,6 +39,8 @@ void pthread__unblock_sigwinch(void);
 
 #include "util/target.h"
 
+extern bool perf_has_index;
+
 struct record_opts {
 	struct target target;
 	bool	     group;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 428149bc64d2..91fa9647f565 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1586,7 +1586,9 @@ static int __perf_session__process_events(struct perf_session *session,
 	mmap_size = MMAP_SIZE;
 	if (mmap_size > file_size) {
 		mmap_size = file_size;
-		session->one_mmap = true;
+
+		if (!perf_has_index)
+			session->one_mmap = true;
 	}
 
 	memset(mmaps, 0, sizeof(mmaps));
@@ -1664,28 +1666,63 @@ out:
 	err = perf_session__flush_thread_stacks(session);
 out_err:
 	ui_progress__finish();
-	perf_session__warn_about_errors(session);
 	ordered_events__free(&session->ordered_events);
 	auxtrace__free_events(session);
 	session->one_mmap = false;
 	return err;
 }
 
+static int __perf_session__process_indexed_events(struct perf_session *session)
+{
+	struct perf_data_file *file = session->file;
+	struct perf_tool *tool = session->tool;
+	u64 size = perf_data_file__size(file);
+	int err = 0, i;
+
+	for (i = 0; i < (int)session->header.nr_index; i++) {
+		struct perf_file_section *idx = &session->header.index[i];
+
+		if (!idx->size)
+			continue;
+
+		/*
+		 * For indexed data file, samples are processed for
+		 * each cpu/thread so it's already ordered.  However
+		 * meta-events at index 0 should be processed in order.
+		 */
+		if (i > 0)
+			tool->ordered_events = false;
+
+		err = __perf_session__process_events(session, idx->offset,
+						     idx->size, size);
+		if (err < 0)
+			break;
+	}
+
+	perf_session__warn_about_errors(session);
+	return err;
+}
+
 int perf_session__process_events(struct perf_session *session)
 {
-	u64 size = perf_data_file__size(session->file);
+	struct perf_data_file *file = session->file;
+	u64 size = perf_data_file__size(file);
 	int err;
 
 	if (perf_session__register_idle_thread(session) == NULL)
 		return -ENOMEM;
 
-	if (!perf_data_file__is_pipe(session->file))
-		err = __perf_session__process_events(session,
-						     session->header.data_offset,
-						     session->header.data_size, size);
-	else
-		err = __perf_session__process_pipe_events(session);
+	if (perf_data_file__is_pipe(file))
+		return __perf_session__process_pipe_events(session);
+	if (perf_has_index)
+		return __perf_session__process_indexed_events(session);
+
+	err = __perf_session__process_events(session,
+					     session->header.data_offset,
+					     session->header.data_size,
+					     size);
 
+	perf_session__warn_about_errors(session);
 	return err;
 }
 
-- 
2.6.0


* [RFC/PATCH 09/38] perf record: Add --index option for building index table
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (7 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 08/38] perf tools: Handle indexed data file properly Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02 18:58   ` Arnaldo Carvalho de Melo
  2015-10-05 13:46   ` Jiri Olsa
  2015-10-02  5:18 ` [RFC/PATCH 10/38] perf report: Skip dummy tracking event Namhyung Kim
                   ` (20 subsequent siblings)
  29 siblings, 2 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The new --index option will create an indexed data file which can be
processed by multiple threads in parallel.  It saves the meta events
and sample data in separate files and merges them with an index table.

If there's an index table in the data file, the HEADER_DATA_INDEX
feature bit is set, session->header.index[0] points to the meta event
area, and the rest are sample data.  It'd look like below:

        +---------------------+
        |     file header     |
        |---------------------|
        |                     |
        |    meta events[0] <-+--+
        |                     |  |
        |---------------------|  |
        |                     |  |
        |    sample data[1] <-+--+
        |                     |  |
        |---------------------|  |
        |                     |  |
        |    sample data[2] <-|--+
        |                     |  |
        |---------------------|  |
        |         ...         | ...
        |---------------------|  |
        |     feature data    |  |
        |   (contains index) -+--+
        +---------------------+
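
For reference, a consumer walks this table roughly as in the sketch
below.  process_region() is a made-up placeholder for a per-region
handler; the real loop is __perf_session__process_indexed_events()
from the previous patch.

	/*
	 * Sketch only (not part of this patch): index 0 holds the meta
	 * events and must stay in order, the remaining entries hold
	 * per-cpu/thread samples and can be consumed independently.
	 */
	static int walk_index(struct perf_session *session)
	{
		u64 i;

		for (i = 0; i < session->header.nr_index; i++) {
			struct perf_file_section *idx = &session->header.index[i];

			if (!idx->size)
				continue;

			/* each entry describes one contiguous file region */
			process_region(session, idx->offset, idx->size, i == 0);
		}
		return 0;
	}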

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-record.txt |   4 +
 tools/perf/builtin-record.c              | 178 ++++++++++++++++++++++++++++---
 tools/perf/perf.h                        |   1 +
 tools/perf/util/header.c                 |   2 +
 tools/perf/util/session.c                |   1 +
 5 files changed, 173 insertions(+), 13 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 2e9ce77b5e14..71a9520b10b0 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -308,6 +308,10 @@ This option sets the time out limit. The default value is 500 ms.
 Record context switch events i.e. events of type PERF_RECORD_SWITCH or
 PERF_RECORD_SWITCH_CPU_WIDE.
 
+--index::
+Build an index table for sample data.  This will speed up perf report by
+parallel processing.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 623984c81478..096634c4c5ea 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -43,6 +43,7 @@ struct record {
 	u64			bytes_written;
 	struct perf_data_file	file;
 	struct auxtrace_record	*itr;
+	int			*fds;
 	struct perf_evlist	*evlist;
 	struct perf_session	*session;
 	const char		*progname;
@@ -52,9 +53,16 @@ struct record {
 	long			samples;
 };
 
-static int record__write(struct record *rec, void *bf, size_t size)
+static int record__write(struct record *rec, void *bf, size_t size, int idx)
 {
-	if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+	int fd;
+
+	if (rec->fds && idx >= 0)
+		fd = rec->fds[idx];
+	else
+		fd = perf_data_file__fd(rec->session->file);
+
+	if (writen(fd, bf, size) < 0) {
 		pr_err("failed to write perf data, error: %m\n");
 		return -1;
 	}
@@ -69,7 +77,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 				     struct machine *machine __maybe_unused)
 {
 	struct record *rec = container_of(tool, struct record, tool);
-	return record__write(rec, event, event->header.size);
+	return record__write(rec, event, event->header.size, -1);
 }
 
 static int record__mmap_read(struct record *rec, int idx)
@@ -94,7 +102,7 @@ static int record__mmap_read(struct record *rec, int idx)
 		size = md->mask + 1 - (old & md->mask);
 		old += size;
 
-		if (record__write(rec, buf, size) < 0) {
+		if (record__write(rec, buf, size, idx) < 0) {
 			rc = -1;
 			goto out;
 		}
@@ -104,7 +112,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	size = head - old;
 	old += size;
 
-	if (record__write(rec, buf, size) < 0) {
+	if (record__write(rec, buf, size, idx) < 0) {
 		rc = -1;
 		goto out;
 	}
@@ -151,6 +159,7 @@ static int record__process_auxtrace(struct perf_tool *tool,
 	struct perf_data_file *file = &rec->file;
 	size_t padding;
 	u8 pad[8] = {0};
+	int idx = event->auxtrace.idx;
 
 	if (!perf_data_file__is_pipe(file)) {
 		off_t file_offset;
@@ -171,11 +180,11 @@ static int record__process_auxtrace(struct perf_tool *tool,
 	if (padding)
 		padding = 8 - padding;
 
-	record__write(rec, event, event->header.size);
-	record__write(rec, data1, len1);
+	record__write(rec, event, event->header.size, idx);
+	record__write(rec, data1, len1, idx);
 	if (len2)
-		record__write(rec, data2, len2);
-	record__write(rec, &pad, padding);
+		record__write(rec, data2, len2, idx);
+	record__write(rec, &pad, padding, idx);
 
 	return 0;
 }
@@ -268,6 +277,110 @@ int auxtrace_record__snapshot_start(struct auxtrace_record *itr __maybe_unused)
 
 #endif
 
+#define INDEX_FILE_FMT  "%s.dir/perf.data.%d"
+
+static int record__create_index_files(struct record *rec, int nr_index)
+{
+	int i = 0;
+	int ret = -1;
+	char path[PATH_MAX];
+	struct perf_data_file *file = &rec->file;
+
+	rec->fds = malloc(nr_index * sizeof(int));
+	if (rec->fds == NULL)
+		return -ENOMEM;
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	if (rm_rf(path) < 0 || mkdir(path, S_IRWXU) < 0)
+		goto out_err;
+
+	for (i = 0; i < nr_index; i++) {
+		scnprintf(path, sizeof(path), INDEX_FILE_FMT, file->path, i);
+		ret = open(path, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR);
+		if (ret < 0)
+			goto out_err;
+
+		rec->fds[i] = ret;
+	}
+	return 0;
+
+out_err:
+	while (--i >= 0)
+		close(rec->fds[i]);
+	zfree(&rec->fds);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	return ret;
+}
+
+static int record__merge_index_files(struct record *rec, int nr_index)
+{
+	int i;
+	int ret = -ENOMEM;
+	u64 offset;
+	char path[PATH_MAX];
+	struct perf_file_section *idx;
+	struct perf_data_file *file = &rec->file;
+	struct perf_session *session = rec->session;
+	int output_fd = perf_data_file__fd(file);
+
+	/* +1 for header file itself */
+	nr_index++;
+
+	idx = calloc(nr_index, sizeof(*idx));
+	if (idx == NULL)
+		goto out_close;
+
+	offset = lseek(output_fd, 0, SEEK_END);
+
+	idx[0].offset = session->header.data_offset;
+	idx[0].size   = offset - idx[0].offset;
+
+	for (i = 1; i < nr_index; i++) {
+		struct stat stbuf;
+		int fd = rec->fds[i - 1];
+
+		ret = fstat(fd, &stbuf);
+		if (ret < 0)
+			goto out_close;
+
+		idx[i].offset = offset;
+		idx[i].size   = stbuf.st_size;
+
+		offset += stbuf.st_size;
+
+		if (idx[i].size == 0)
+			continue;
+
+		ret = copyfile_offset(fd, 0, output_fd, idx[i].offset,
+				      idx[i].size);
+		if (ret < 0)
+			goto out_close;
+	}
+
+	session->header.index = idx;
+	session->header.nr_index = nr_index;
+
+	perf_has_index = true;
+
+	ret = 0;
+
+out_close:
+	if (ret < 0)
+		pr_err("failed to merge index files: %d\n", ret);
+
+	for (i = 0; i < nr_index - 1; i++)
+		close(rec->fds[i]);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	zfree(&rec->fds);
+	return ret;
+}
+
 static int record__open(struct record *rec)
 {
 	char msg[512];
@@ -306,7 +419,8 @@ try_again:
 
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 				 opts->auxtrace_mmap_pages,
-				 opts->auxtrace_snapshot_mode, false) < 0) {
+				 opts->auxtrace_snapshot_mode,
+				 opts->index) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
@@ -323,6 +437,14 @@ try_again:
 		goto out;
 	}
 
+	if (opts->index) {
+		rc = record__create_index_files(rec, evlist->nr_mmaps);
+		if (rc < 0) {
+			pr_err("failed to create index file: %d\n", rc);
+			goto out;
+		}
+	}
+
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
@@ -347,7 +469,9 @@ static int process_buildids(struct record *rec)
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
 
-	if (file->size == 0)
+	/* update file size after merging sample files with index */
+	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_END);
+	if (size == 0)
 		return 0;
 
 	/*
@@ -414,6 +538,13 @@ static int record__mmap_read_all(struct record *rec)
 			}
 		}
 
+		if (rec->evlist->track_mmap && rec->evlist->track_mmap[i].base) {
+			if (record__mmap_read(rec, track_mmap_idx(i)) != 0) {
+				rc = -1;
+				goto out;
+			}
+		}
+
 		if (mm->base && !rec->opts.auxtrace_snapshot_mode &&
 		    record__auxtrace_mmap_read(rec, mm) != 0) {
 			rc = -1;
@@ -426,7 +557,8 @@ static int record__mmap_read_all(struct record *rec)
 	 * at least one event.
 	 */
 	if (bytes_written != rec->bytes_written)
-		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
+		rc = record__write(rec, &finished_round_event,
+				   sizeof(finished_round_event), -1);
 
 out:
 	return rc;
@@ -452,7 +584,8 @@ static void record__init_features(struct record *rec)
 	if (!rec->opts.full_auxtrace)
 		perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
 
-	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
+	if (!rec->opts.index)
+		perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
@@ -520,6 +653,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 	}
 
+	if (file->is_pipe && opts->index) {
+		pr_warning("Indexing is disabled for pipe output\n");
+		opts->index = false;
+	}
+
 	if (record__open(rec) != 0) {
 		err = -1;
 		goto out_child;
@@ -753,6 +891,9 @@ out_child:
 		rec->session->header.data_size += rec->bytes_written;
 		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
 
+		if (rec->opts.index)
+			record__merge_index_files(rec, rec->evlist->nr_mmaps);
+
 		if (!rec->no_buildid) {
 			process_buildids(rec);
 			/*
@@ -1119,6 +1260,8 @@ struct option __record_options[] = {
 			"per thread proc mmap processing timeout in ms"),
 	OPT_BOOLEAN(0, "switch-events", &record.opts.record_switch_events,
 		    "Record context switch events"),
+	OPT_BOOLEAN(0, "index", &record.opts.index,
+		    "make index for sample data to speed-up processing"),
 	OPT_END()
 };
 
@@ -1186,6 +1329,15 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out_symbol_exit;
 	}
 
+	if (rec->opts.index) {
+		if (!rec->opts.sample_time) {
+			pr_err("Sample timestamp is required for indexing\n");
+			goto out_symbol_exit;
+		}
+
+		perf_evlist__add_dummy_tracking(rec->evlist);
+	}
+
 	if (rec->opts.target.tid && !rec->opts.no_inherit_set)
 		rec->opts.no_inherit = true;
 
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index f4b4d7d8752c..df7c208abb74 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -60,6 +60,7 @@ struct record_opts {
 	bool	     full_auxtrace;
 	bool	     auxtrace_snapshot_mode;
 	bool	     record_switch_events;
+	bool	     index;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c357f7f47d32..13ba1402ec1b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2706,6 +2706,8 @@ int perf_session__read_header(struct perf_session *session)
 						   session->tevent.pevent))
 		goto out_delete_evlist;
 
+	perf_has_index = perf_header__has_feat(&session->header, HEADER_DATA_INDEX);
+
 	return 0;
 out_errno:
 	return -errno;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 91fa9647f565..7546c4d147b9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -182,6 +182,7 @@ void perf_session__delete(struct perf_session *session)
 	machines__exit(&session->machines);
 	if (session->file)
 		perf_data_file__close(session->file);
+	free(session->header.index);
 	free(session);
 }
 
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 10/38] perf report: Skip dummy tracking event
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (8 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 09/38] perf record: Add --index option for building index table Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 11/38] perf tools: Introduce thread__comm(_str)_by_time() helpers Namhyung Kim
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The dummy tracking event is only for tracking task/comm/mmap events
and has no sample data of its own.  So there's no need to report it,
just skip it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c    |  3 +++
 tools/perf/ui/browsers/hists.c | 30 ++++++++++++++++++++++++------
 tools/perf/ui/gtk/hists.c      |  3 +++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index b5623639f67d..aeced4fa27e8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -362,6 +362,9 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 		struct hists *hists = evsel__hists(pos);
 		const char *evname = perf_evsel__name(pos);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group &&
 		    !perf_evsel__is_group_leader(pos))
 			continue;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index a4e9b370c037..c8227dbb0fcc 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2219,14 +2219,17 @@ out:
 	return key;
 }
 
-static bool filter_group_entries(struct ui_browser *browser __maybe_unused,
-				 void *entry)
+static bool filter_entries(struct ui_browser *browser __maybe_unused,
+			   void *entry)
 {
 	struct perf_evsel *evsel = list_entry(entry, struct perf_evsel, node);
 
 	if (symbol_conf.event_group && !perf_evsel__is_group_leader(evsel))
 		return true;
 
+	if (perf_evsel__is_dummy_tracking(evsel))
+		return true;
+
 	return false;
 }
 
@@ -2243,7 +2246,7 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
 			.refresh    = ui_browser__list_head_refresh,
 			.seek	    = ui_browser__list_head_seek,
 			.write	    = perf_evsel_menu__write,
-			.filter	    = filter_group_entries,
+			.filter	    = filter_entries,
 			.nr_entries = nr_entries,
 			.priv	    = evlist,
 		},
@@ -2270,21 +2273,22 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 				  struct perf_env *env)
 {
 	int nr_entries = evlist->nr_entries;
+	struct perf_evsel *first = perf_evlist__first(evlist);
+	struct perf_evsel *pos;
 
 single_entry:
 	if (nr_entries == 1) {
-		struct perf_evsel *first = perf_evlist__first(evlist);
-
 		return perf_evsel__hists_browse(first, nr_entries, help,
 						false, hbt, min_pcnt,
 						env);
 	}
 
 	if (symbol_conf.event_group) {
-		struct perf_evsel *pos;
 
 		nr_entries = 0;
 		evlist__for_each(evlist, pos) {
+			if (perf_evsel__is_dummy_tracking(pos))
+				continue;
 			if (perf_evsel__is_group_leader(pos))
 				nr_entries++;
 		}
@@ -2293,6 +2297,20 @@ single_entry:
 			goto single_entry;
 	}
 
+	evlist__for_each(evlist, pos) {
+		if (perf_evsel__is_dummy_tracking(pos))
+			nr_entries--;
+	}
+
+	if (nr_entries == 1) {
+		evlist__for_each(evlist, pos) {
+			if (!perf_evsel__is_dummy_tracking(pos)) {
+				first = pos;
+				goto single_entry;
+			}
+		}
+	}
+
 	return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
 					       hbt, min_pcnt, env);
 }
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..83a7ecd5cda8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -317,6 +317,9 @@ int perf_evlist__gtk_browse_hists(struct perf_evlist *evlist,
 		char buf[512];
 		size_t size = sizeof(buf);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group) {
 			if (!perf_evsel__is_group_leader(pos))
 				continue;
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 11/38] perf tools: Introduce thread__comm(_str)_by_time() helpers
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (9 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 10/38] perf report: Skip dummy tracking event Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 12/38] perf tools: Add a test case for thread comm handling Namhyung Kim
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

When data file indexing is enabled, all task, comm and mmap events are
processed first and the sample events only afterwards.  So by the time
a sample is processed only the last comm of a thread is visible, even
though the comm at sample time is known.

Sort the thread's comm list by time so that the comm that was active
at the sample time can be found.  The new thread__comm_by_time() still
works even if the PERF_SAMPLE_TIME bit is off: in that case
sample->time will be -1 so it simply picks the most recent comm.
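
A minimal usage sketch (illustrative only - the real callers are wired
up later in this series):

	/* pick the comm that was active when the sample was taken */
	const char *name = thread__comm_str_by_time(thread, sample->time);

	if (name == NULL)		/* empty comm list */
		name = thread__comm_str(thread);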

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/thread.c | 33 ++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h |  2 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 0a9ae8014729..8244397753fd 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -121,6 +121,21 @@ struct comm *thread__exec_comm(const struct thread *thread)
 	return last;
 }
 
+struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp)
+{
+	struct comm *comm;
+
+	list_for_each_entry(comm, &thread->comm_list, list) {
+		if (timestamp >= comm->start)
+			return comm;
+	}
+
+	if (list_empty(&thread->comm_list))
+		return NULL;
+
+	return list_last_entry(&thread->comm_list, struct comm, list);
+}
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		       bool exec)
 {
@@ -136,7 +151,13 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		new = comm__new(str, timestamp, exec);
 		if (!new)
 			return -ENOMEM;
-		list_add(&new->list, &thread->comm_list);
+
+		/* sort by time */
+		list_for_each_entry(curr, &thread->comm_list, list) {
+			if (timestamp >= curr->start)
+				break;
+		}
+		list_add_tail(&new->list, &curr->list);
 
 		if (exec)
 			unwind__flush_access(thread);
@@ -157,6 +178,16 @@ const char *thread__comm_str(const struct thread *thread)
 	return comm__str(comm);
 }
 
+const char *thread__comm_str_by_time(const struct thread *thread, u64 timestamp)
+{
+	const struct comm *comm = thread__comm_by_time(thread, timestamp);
+
+	if (!comm)
+		return NULL;
+
+	return comm__str(comm);
+}
+
 /* CHECKME: it should probably better return the max comm len from its comm list */
 int thread__comm_len(struct thread *thread)
 {
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index a0ac0317affb..33418e6dc64a 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -68,7 +68,9 @@ static inline int thread__set_comm(struct thread *thread, const char *comm,
 int thread__comm_len(struct thread *thread);
 struct comm *thread__comm(const struct thread *thread);
 struct comm *thread__exec_comm(const struct thread *thread);
+struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
+const char *thread__comm_str_by_time(const struct thread *thread, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 12/38] perf tools: Add a test case for thread comm handling
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (10 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 11/38] perf tools: Introduce thread__comm(_str)_by_time() helpers Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 13/38] perf tools: Use thread__comm_by_time() when adding hist entries Namhyung Kim
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The new test case checks various thread comm handling APIs such as
comm overriding and time-sorted lookup.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/thread-comm.c  | 47 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)
 create mode 100644 tools/perf/tests/thread-comm.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index c6f198ae65fb..6bf705c4cd89 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -24,6 +24,7 @@ perf-y += bp_signal_overflow.o
 perf-y += task-exit.o
 perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
+perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index d9bf51dc8cf5..5e6f4e56113c 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -191,6 +191,10 @@ static struct test {
 		.func = test_session_topology,
 	},
 	{
+		.desc = "Test thread comm handling",
+		.func = test__thread_comm,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 0b3549672c16..23552471535a 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -65,6 +65,7 @@ int test__thread_map(void);
 int test__llvm(void);
 int test__insn_x86(void);
 int test_session_topology(void);
+int test__thread_comm(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-comm.c b/tools/perf/tests/thread-comm.c
new file mode 100644
index 000000000000..d146dedf63b4
--- /dev/null
+++ b/tools/perf/tests/thread-comm.c
@@ -0,0 +1,47 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "debug.h"
+
+int test__thread_comm(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * comm for a given time.  When multi-file data storage is
+	 * enabled, those task/comm events are processed first so the
+	 * later sample should find a matching comm properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 100, 100);
+	TEST_ASSERT_VAL("wrong init thread comm",
+			!strcmp(thread__comm_str(t), ":100"));
+
+	thread__set_comm(t, "perf-test1", 10000);
+	TEST_ASSERT_VAL("failed to override thread comm",
+			!strcmp(thread__comm_str(t), "perf-test1"));
+
+	thread__set_comm(t, "perf-test2", 20000);
+	thread__set_comm(t, "perf-test3", 30000);
+	thread__set_comm(t, "perf-test4", 40000);
+
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_by_time(t, 20000), "perf-test2"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_by_time(t, 35000), "perf-test3"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_by_time(t, 50000), "perf-test4"));
+
+	thread__set_comm(t, "perf-test1.5", 15000);
+	TEST_ASSERT_VAL("failed to sort timed comm",
+			!strcmp(thread__comm_str_by_time(t, 15000), "perf-test1.5"));
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 13/38] perf tools: Use thread__comm_by_time() when adding hist entries
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (11 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 12/38] perf tools: Add a test case for thread comm handling Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 14/38] perf tools: Convert dead thread list into rbtree Namhyung Kim
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Now that thread->comm can be handled properly with time, use it to
find the correct comm at sample time when adding hist entries.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c |  5 +++--
 tools/perf/builtin-diff.c     |  8 ++++----
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c        | 19 ++++++++++---------
 tools/perf/util/hist.h        |  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2bf9b3fd9e61..3afb858eac6e 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -47,7 +47,7 @@ struct perf_annotate {
 };
 
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
-				  struct perf_sample *sample __maybe_unused,
+				  struct perf_sample *sample,
 				  struct addr_location *al,
 				  struct perf_annotate *ann)
 {
@@ -72,7 +72,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return 0;
 	}
 
-	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0,
+				sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 0b180a885ba3..623ecc53c0c9 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -312,10 +312,10 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
 
 static int hists__add_entry(struct hists *hists,
 			    struct addr_location *al, u64 period,
-			    u64 weight, u64 transaction)
+			    u64 weight, u64 transaction, u64 timestamp)
 {
 	if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-			       transaction, true) != NULL)
+			       transaction, timestamp, true) != NULL)
 		return 0;
 	return -ENOMEM;
 }
@@ -336,8 +336,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
 		return -1;
 	}
 
-	if (hists__add_entry(hists, &al, sample->period,
-			     sample->weight, sample->transaction)) {
+	if (hists__add_entry(hists, &al, sample->period, sample->weight,
+			     sample->transaction, sample->time)) {
 		pr_warning("problem incrementing symbol period, skipping event\n");
 		goto out_put;
 	}
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 8c102b011424..27bae90c9a95 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL) {
 				addr_location__put(&al);
 				goto out;
@@ -116,7 +116,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL) {
 				addr_location__put(&al);
 				goto out;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c346b331b892..10454197a508 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -447,11 +447,11 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi,
 				      u64 period, u64 weight, u64 transaction,
-				      bool sample_self)
+				      u64 timestamp, bool sample_self)
 {
 	struct hist_entry entry = {
 		.thread	= al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_by_time(al->thread, timestamp),
 		.ms = {
 			.map	= al->map,
 			.sym	= al->sym,
@@ -510,13 +510,14 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 {
 	u64 cost;
 	struct mem_info *mi = iter->priv;
+	struct perf_sample *sample = iter->sample;
 	struct hists *hists = evsel__hists(iter->evsel);
 	struct hist_entry *he;
 
 	if (mi == NULL)
 		return -EINVAL;
 
-	cost = iter->sample->weight;
+	cost = sample->weight;
 	if (!cost)
 		cost = 1;
 
@@ -528,7 +529,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	 * and the he_stat__add_period() function.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, NULL, mi,
-				cost, cost, 0, true);
+				cost, cost, 0, sample->time, true);
 	if (!he)
 		return -ENOMEM;
 
@@ -630,7 +631,7 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
 				1, bi->flags.cycles ? bi->flags.cycles : 1,
-				0, true);
+				0, iter->sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -668,7 +669,7 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -730,7 +731,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -775,7 +776,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		.hists = evsel__hists(evsel),
 		.cpu = al->cpu,
 		.thread = al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_by_time(al->thread, sample->time),
 		.ip = al->addr,
 		.ms = {
 			.map = al->map,
@@ -804,7 +805,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, false);
+				sample->transaction, sample->time, false);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 8c20a8f6b214..7fbb60857f26 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -115,7 +115,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
 				      u64 weight, u64 transaction,
-				      bool sample_self);
+				      u64 timestamp, bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 int max_stack_depth, void *arg);
 
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 14/38] perf tools: Convert dead thread list into rbtree
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (12 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 13/38] perf tools: Use thread__comm_by_time() when adding hist entries Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time() Namhyung Kim
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Currently perf keeps dead threads in a linked list, but that becomes a
problem when the list needs to be searched, especially in a large
session which might have many dead threads.  Convert it to an rbtree
like the live threads tree; it will be used later by the multi-thread
changes.

The list node is now used to chain dead threads of the same tid, since
that makes it easier to handle such threads in time order.
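
For illustration, visiting every dead thread now looks roughly like the
sketch below.  visit() is a made-up callback; the real iteration is the
machine__for_each_thread() change in this patch.

	struct rb_node *nd;
	struct thread *th, *pos;

	for (nd = rb_first(&machine->dead_threads); nd; nd = rb_next(nd)) {
		th = rb_entry(nd, struct thread, rb_node);
		visit(th);
		/* threads that reused the same tid hang off ->tid_list */
		list_for_each_entry(pos, &th->tid_list, tid_list)
			visit(pos);
	}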

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/machine.c | 82 +++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/machine.h |  2 +-
 tools/perf/util/thread.c  | 18 ++++++++++-
 tools/perf/util/thread.h  | 11 +++----
 4 files changed, 96 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5ef90be2a249..2b9fbe55c896 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -30,8 +30,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 	dsos__init(&machine->dsos);
 
 	machine->threads = RB_ROOT;
+	machine->dead_threads = RB_ROOT;
 	pthread_rwlock_init(&machine->threads_lock, NULL);
-	INIT_LIST_HEAD(&machine->dead_threads);
 	machine->last_match = NULL;
 
 	machine->vdso_info = NULL;
@@ -104,6 +104,28 @@ static void dsos__exit(struct dsos *dsos)
 	pthread_rwlock_destroy(&dsos->lock);
 }
 
+static void machine__delete_dead_threads(struct machine *machine)
+{
+	struct rb_node *nd = rb_first(&machine->dead_threads);
+
+	while (nd) {
+		struct thread *t = rb_entry(nd, struct thread, rb_node);
+		struct thread *pos;
+
+		nd = rb_next(nd);
+		rb_erase_init(&t->rb_node, &machine->dead_threads);
+
+		while (!list_empty(&t->tid_list)) {
+			pos = list_first_entry(&t->tid_list,
+					       struct thread, tid_list);
+			list_del_init(&pos->tid_list);
+			thread__delete(pos);
+		}
+
+		thread__delete(t);
+	}
+}
+
 void machine__delete_threads(struct machine *machine)
 {
 	struct rb_node *nd;
@@ -117,6 +139,8 @@ void machine__delete_threads(struct machine *machine)
 		__machine__remove_thread(machine, t, false);
 	}
 	pthread_rwlock_unlock(&machine->threads_lock);
+
+	machine__delete_dead_threads(machine);
 }
 
 void machine__exit(struct machine *machine)
@@ -1361,6 +1385,10 @@ out_problem:
 
 static void __machine__remove_thread(struct machine *machine, struct thread *th, bool lock)
 {
+	struct rb_node **p = &machine->dead_threads.rb_node;
+	struct rb_node *parent = NULL;
+	struct thread *pos;
+
 	if (machine->last_match == th)
 		machine->last_match = NULL;
 
@@ -1368,16 +1396,45 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
 	if (lock)
 		pthread_rwlock_wrlock(&machine->threads_lock);
 	rb_erase_init(&th->rb_node, &machine->threads);
-	RB_CLEAR_NODE(&th->rb_node);
+
+	th->dead = true;
+
 	/*
-	 * Move it first to the dead_threads list, then drop the reference,
-	 * if this is the last reference, then the thread__delete destructor
-	 * will be called and we will remove it from the dead_threads list.
+	 * No need to have an additional reference for non-index file
+	 * as they can be released when reference holders died and
+	 * there will be no more new references.
 	 */
-	list_add_tail(&th->node, &machine->dead_threads);
+	if (!perf_has_index) {
+		thread__put(th);
+		goto out;
+	}
+
+	/*
+	 * For indexed file, We may have references to this (dead)
+	 * thread, as samples are processed after fork/exit events.
+	 * Just move them to a separate rbtree and keep a reference.
+	 */
+	while (*p != NULL) {
+		parent = *p;
+		pos = rb_entry(parent, struct thread, rb_node);
+
+		if (pos->tid == th->tid) {
+			list_add_tail(&th->tid_list, &pos->tid_list);
+			goto out;
+		}
+
+		if (th->tid < pos->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	rb_link_node(&th->rb_node, parent, p);
+	rb_insert_color(&th->rb_node, &machine->dead_threads);
+
+out:
 	if (lock)
 		pthread_rwlock_unlock(&machine->threads_lock);
-	thread__put(th);
 }
 
 void machine__remove_thread(struct machine *machine, struct thread *th)
@@ -1899,7 +1956,7 @@ int machine__for_each_thread(struct machine *machine,
 			     void *priv)
 {
 	struct rb_node *nd;
-	struct thread *thread;
+	struct thread *thread, *pos;
 	int rc = 0;
 
 	for (nd = rb_first(&machine->threads); nd; nd = rb_next(nd)) {
@@ -1909,10 +1966,17 @@ int machine__for_each_thread(struct machine *machine,
 			return rc;
 	}
 
-	list_for_each_entry(thread, &machine->dead_threads, node) {
+	for (nd = rb_first(&machine->dead_threads); nd; nd = rb_next(nd)) {
+		thread = rb_entry(nd, struct thread, rb_node);
 		rc = fn(thread, priv);
 		if (rc != 0)
 			return rc;
+
+		list_for_each_entry(pos, &thread->tid_list, tid_list) {
+			rc = fn(pos, priv);
+			if (rc != 0)
+				return rc;
+		}
 	}
 	return rc;
 }
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 2c2b443df5ba..23242de926df 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -30,8 +30,8 @@ struct machine {
 	bool		  comm_exec;
 	char		  *root_dir;
 	struct rb_root	  threads;
+	struct rb_root	  dead_threads;
 	pthread_rwlock_t  threads_lock;
-	struct list_head  dead_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
 	struct perf_env   *env;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 8244397753fd..674792e8fa2f 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -9,6 +9,7 @@
 #include "debug.h"
 #include "comm.h"
 #include "unwind.h"
+#include "machine.h"
 
 int thread__init_map_groups(struct thread *thread, struct machine *machine)
 {
@@ -54,6 +55,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 
 		list_add(&comm->list, &thread->comm_list);
 		atomic_set(&thread->refcnt, 0);
+		INIT_LIST_HEAD(&thread->tid_list);
 		RB_CLEAR_NODE(&thread->rb_node);
 	}
 
@@ -69,6 +71,7 @@ void thread__delete(struct thread *thread)
 	struct comm *comm, *tmp;
 
 	BUG_ON(!RB_EMPTY_NODE(&thread->rb_node));
+	BUG_ON(!list_empty(&thread->tid_list));
 
 	thread_stack__free(thread);
 
@@ -95,7 +98,20 @@ struct thread *thread__get(struct thread *thread)
 void thread__put(struct thread *thread)
 {
 	if (thread && atomic_dec_and_test(&thread->refcnt)) {
-		list_del_init(&thread->node);
+		if (!RB_EMPTY_NODE(&thread->rb_node)) {
+			struct machine *machine = thread->mg->machine;
+
+			if (thread->dead) {
+				rb_erase(&thread->rb_node,
+					 &machine->dead_threads);
+			} else {
+				rb_erase(&thread->rb_node,
+					 &machine->threads);
+			}
+			RB_CLEAR_NODE(&thread->rb_node);
+		}
+
+		list_del_init(&thread->tid_list);
 		thread__delete(thread);
 	}
 }
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 33418e6dc64a..b8f794d97b75 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -13,10 +13,8 @@
 struct thread_stack;
 
 struct thread {
-	union {
-		struct rb_node	 rb_node;
-		struct list_head node;
-	};
+	struct rb_node	 	rb_node;
+	struct list_head 	tid_list;
 	struct map_groups	*mg;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
@@ -26,7 +24,8 @@ struct thread {
 	char			shortname[3];
 	bool			comm_set;
 	int			comm_len;
-	bool			dead; /* if set thread has exited */
+	bool			exited; /* if set thread has exited */
+	bool			dead; /* thread is in dead_threads list */
 	struct list_head	comm_list;
 	u64			db_id;
 
@@ -54,7 +53,7 @@ static inline void __thread__zput(struct thread **thread)
 
 static inline void thread__exited(struct thread *thread)
 {
-	thread->dead = true;
+	thread->exited = true;
 }
 
 int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time()
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (13 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 14/38] perf tools: Convert dead thread list into rbtree Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-08 12:20   ` Jiri Olsa
  2015-10-02  5:18 ` [RFC/PATCH 16/38] perf tools: Add a test case for timed thread handling Namhyung Kim
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

With data file indexing enabled, threads need to be looked up by sample
time since sample processing is done after the other (task, comm and
mmap) events are processed.  This can be a problem if a session is very
long and a pid is recycled - in that case only the last thread with
that pid would be found.

So keep the start time in struct thread and search threads based on it.
This patch introduces the machine__find{,new}_thread_by_time()
functions for this.  They first search the current (i.e. recent)
thread rbtree and then the dead thread tree (and its tid list).  If no
matching thread is found, a new (missing) thread is created.

A sample timestamp of 0 means the call comes from a synthesized event,
so just use the current rbtree.  The timestamp will be -1 if the sample
didn't record a timestamp, so current threads are matched automatically.
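
In sample preprocessing this boils down to the following sketch, which
mirrors the event.c hunk below:

	struct thread *thread;

	/*
	 * Resolve the thread that was alive at sample->time, not just
	 * the latest thread that reused this pid/tid.
	 */
	thread = machine__findnew_thread_by_time(machine, sample->pid,
						 sample->tid, sample->time);
	if (thread == NULL)
		return -1;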

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/dwarf-unwind.c |   8 +--
 tools/perf/tests/hists_common.c |   3 +-
 tools/perf/tests/hists_link.c   |   2 +-
 tools/perf/util/event.c         |   6 +-
 tools/perf/util/machine.c       | 126 +++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h       |  10 +++-
 tools/perf/util/thread.c        |   4 ++
 tools/perf/util/thread.h        |   1 +
 8 files changed, 148 insertions(+), 12 deletions(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 40b36c462427..b9ca0a72fc4d 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -16,10 +16,10 @@
 
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
 			union perf_event *event,
-			struct perf_sample *sample __maybe_unused,
+			struct perf_sample *sample,
 			struct machine *machine)
 {
-	return machine__process_mmap2_event(machine, event, NULL);
+	return machine__process_mmap2_event(machine, event, sample);
 }
 
 static int init_live_machine(struct machine *machine)
@@ -66,12 +66,10 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 __attribute__ ((noinline))
 static int unwind_thread(struct thread *thread)
 {
-	struct perf_sample sample;
+	struct perf_sample sample = { .time = -1ULL, };
 	unsigned long cnt = 0;
 	int err = -1;
 
-	memset(&sample, 0, sizeof(sample));
-
 	if (test__arch_unwind_sample(&sample, thread)) {
 		pr_debug("failed to get unwind sample\n");
 		goto out;
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index ce80b274b097..1d657fa2830f 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -80,6 +80,7 @@ static struct {
 struct machine *setup_fake_machine(struct machines *machines)
 {
 	struct machine *machine = machines__find(machines, HOST_KERNEL_ID);
+	struct perf_sample sample = { .time = -1ULL, };
 	size_t i;
 
 	if (machine == NULL) {
@@ -114,7 +115,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 		strcpy(fake_mmap_event.mmap.filename,
 		       fake_mmap_info[i].filename);
 
-		machine__process_mmap_event(machine, &fake_mmap_event, NULL);
+		machine__process_mmap_event(machine, &fake_mmap_event, &sample);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(fake_symbols); i++) {
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 27bae90c9a95..cacc8617bf02 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 	struct perf_evsel *evsel;
 	struct addr_location al;
 	struct hist_entry *he;
-	struct perf_sample sample = { .period = 1, };
+	struct perf_sample sample = { .period = 1, .time = -1ULL, };
 	size_t i = 0, k;
 
 	/*
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index cb98b5af9e17..3dff1b5cd4cc 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -9,6 +9,7 @@
 #include "strlist.h"
 #include "thread.h"
 #include "thread_map.h"
+#include "session.h"
 #include "symbol/kallsyms.h"
 
 static const char *perf_event__names[] = {
@@ -992,9 +993,10 @@ int perf_event__preprocess_sample(const union perf_event *event,
 				  struct perf_sample *sample)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
-	struct thread *thread = machine__findnew_thread(machine, sample->pid,
-							sample->tid);
+	struct thread *thread;
 
+	thread = machine__findnew_thread_by_time(machine, sample->pid,
+						 sample->tid, sample->time);
 	if (thread == NULL)
 		return -1;
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2b9fbe55c896..7cfaa2c3f131 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -475,6 +475,120 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return th;
 }
 
+static struct thread *
+__machine__findnew_thread_by_time(struct machine *machine, pid_t pid, pid_t tid,
+				  u64 timestamp, bool create)
+{
+	struct thread *curr, *pos, *new;
+	struct thread *th = NULL;
+	struct rb_node **p;
+	struct rb_node *parent = NULL;
+
+	if (!perf_has_index)
+		return ____machine__findnew_thread(machine, pid, tid, create);
+
+	/* lookup current thread first */
+	curr = ____machine__findnew_thread(machine, pid, tid, false);
+	if (curr && timestamp >= curr->start_time)
+		return curr;
+
+	/* and then check dead threads tree & list */
+	p = &machine->dead_threads.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		th = rb_entry(parent, struct thread, rb_node);
+
+		if (th->tid == tid) {
+			list_for_each_entry(pos, &th->tid_list, tid_list) {
+				if (timestamp >= pos->start_time &&
+				    pos->start_time > th->start_time) {
+					th = pos;
+					break;
+				}
+			}
+
+			if (timestamp >= th->start_time) {
+				machine__update_thread_pid(machine, th, pid);
+				return th;
+			}
+			break;
+		}
+
+		if (tid < th->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	if (!create)
+		return NULL;
+
+	if (!curr && !*p) {
+		/* found no thread.  create one as current thread */
+		return __machine__findnew_thread(machine, pid, tid);
+	}
+
+	new = thread__new(pid, tid);
+	if (new == NULL)
+		return NULL;
+
+	new->dead = true;
+	new->start_time = timestamp;
+
+	if (*p) {
+		list_for_each_entry(pos, &th->tid_list, tid_list) {
+			/* sort by time */
+			if (timestamp >= pos->start_time) {
+				th = pos;
+				break;
+			}
+		}
+		list_add_tail(&new->tid_list, &th->tid_list);
+	} else {
+		rb_link_node(&new->rb_node, parent, p);
+		rb_insert_color(&new->rb_node, &machine->dead_threads);
+	}
+
+	thread__get(new);
+
+	/*
+	 * We have to initialize map_groups separately
+	 * after rb tree is updated.
+	 *
+	 * The reason is that we call machine__findnew_thread
+	 * within thread__init_map_groups to find the thread
+	 * leader and that would screwed the rb tree.
+	 */
+	if (thread__init_map_groups(new, machine))
+		thread__zput(new);
+
+	return new;
+}
+
+struct thread *machine__find_thread_by_time(struct machine *machine, pid_t pid,
+					    pid_t tid, u64 timestamp)
+{
+	struct thread *th;
+
+	pthread_rwlock_rdlock(&machine->threads_lock);
+	th = thread__get(__machine__findnew_thread_by_time(machine, pid, tid,
+							   timestamp, false));
+	pthread_rwlock_unlock(&machine->threads_lock);
+	return th;
+}
+
+struct thread *machine__findnew_thread_by_time(struct machine *machine, pid_t pid,
+					       pid_t tid, u64 timestamp)
+{
+	struct thread *th;
+
+	pthread_rwlock_wrlock(&machine->threads_lock);
+	th = thread__get(__machine__findnew_thread_by_time(machine, pid, tid,
+							   timestamp, true));
+	pthread_rwlock_unlock(&machine->threads_lock);
+	return th;
+}
+
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread)
 {
@@ -1299,7 +1413,7 @@ int machine__process_mmap2_event(struct machine *machine,
 	}
 
 	thread = machine__findnew_thread(machine, event->mmap2.pid,
-					event->mmap2.tid);
+					 event->mmap2.tid);
 	if (thread == NULL)
 		goto out_problem;
 
@@ -1419,6 +1533,16 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
 		pos = rb_entry(parent, struct thread, rb_node);
 
 		if (pos->tid == th->tid) {
+			struct thread *old;
+
+			/* sort by time */
+			list_for_each_entry(old, &pos->tid_list, tid_list) {
+				if (th->start_time >= old->start_time) {
+					pos = old;
+					break;
+				}
+			}
+
 			list_add_tail(&th->tid_list, &pos->tid_list);
 			goto out;
 		}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 23242de926df..f0c2cc9c90ae 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -75,8 +75,6 @@ static inline bool machine__kernel_ip(struct machine *machine, u64 ip)
 	return ip >= kernel_start;
 }
 
-struct thread *machine__find_thread(struct machine *machine, pid_t pid,
-				    pid_t tid);
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread);
 
@@ -164,6 +162,14 @@ static inline bool machine__is_host(struct machine *machine)
 
 struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid);
+struct thread *machine__find_thread(struct machine *machine, pid_t pid,
+				    pid_t tid);
+struct thread *machine__findnew_thread_by_time(struct machine *machine,
+					       pid_t pid, pid_t tid,
+					       u64 timestamp);
+struct thread *machine__find_thread_by_time(struct machine *machine,
+					    pid_t pid, pid_t tid,
+					    u64 timestamp);
 
 struct dso *machine__findnew_dso(struct machine *machine, const char *filename);
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 674792e8fa2f..ad7c2a00bff8 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -160,6 +160,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 
 	/* Override the default :tid entry */
 	if (!thread->comm_set) {
+		if (!thread->start_time)
+			thread->start_time = timestamp;
+
 		err = comm__override(curr, str, timestamp, exec);
 		if (err)
 			return err;
@@ -266,6 +269,7 @@ int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
 	}
 
 	thread->ppid = parent->tid;
+	thread->start_time = timestamp;
 	return thread__clone_map_groups(thread, parent);
 }
 
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index b8f794d97b75..97026a9660ec 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -28,6 +28,7 @@ struct thread {
 	bool			dead; /* thread is in dead_threads list */
 	struct list_head	comm_list;
 	u64			db_id;
+	u64			start_time;
 
 	void			*priv;
 	struct thread_stack	*ts;
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 16/38] perf tools: Add a test case for timed thread handling
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (14 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time() Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-02  5:18 ` [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

A test case verifying live and dead thread tree management as time
advances, and the new machine__find{,new}_thread_by_time() helpers.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build                |   1 +
 tools/perf/tests/builtin-test.c       |   4 +
 tools/perf/tests/tests.h              |   1 +
 tools/perf/tests/thread-lookup-time.c | 179 ++++++++++++++++++++++++++++++++++
 4 files changed, 185 insertions(+)
 create mode 100644 tools/perf/tests/thread-lookup-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 6bf705c4cd89..fd4cabb9a1a0 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -26,6 +26,7 @@ perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
+perf-y += thread-lookup-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 5e6f4e56113c..027796fa105e 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -195,6 +195,10 @@ static struct test {
 		.func = test__thread_comm,
 	},
 	{
+		.desc = "Test thread lookup with time",
+		.func = test__thread_lookup_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 23552471535a..9c02755e86dd 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -66,6 +66,7 @@ int test__llvm(void);
 int test__insn_x86(void);
 int test_session_topology(void);
 int test__thread_comm(void);
+int test__thread_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
new file mode 100644
index 000000000000..0133a241b9fc
--- /dev/null
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -0,0 +1,179 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
+{
+	printf("thread: %d, start time: %"PRIu64" %s\n",
+	       th->tid, th->start_time,
+	       th->dead ? "(dead)" : th->exited ? "(exited)" : "");
+	return 0;
+}
+
+static int lookup_with_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork_event = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = 50000,
+	};
+
+	/* this is needed to keep dead threads in rbtree */
+	perf_has_index = true;
+
+	/* start_time is set to 0 */
+	t1 = machine__findnew_thread(machine, 0, 0);
+
+	if (verbose > 1) {
+		printf("========= after t1 created ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("wrong start time of old thread", t1->start_time == 0);
+
+	TEST_ASSERT_VAL("cannot find current thread",
+			machine__find_thread(machine, 0, 0) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread with time",
+			machine__findnew_thread_by_time(machine, 0, 0, 10000) == t1);
+
+	/* start_time is overwritten to new value */
+	thread__set_comm(t1, "/usr/bin/perf", 20000);
+
+	if (verbose > 1) {
+		printf("========= after t1 set comm ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("failed to update start time", t1->start_time == 20000);
+
+	TEST_ASSERT_VAL("should not find passed thread",
+			/* this will create yet another dead thread */
+			machine__findnew_thread_by_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find overwritten thread with time",
+			machine__find_thread_by_time(machine, 0, 0, 20000) == t1);
+
+	/* now t1 goes to dead thread tree, and create t2 */
+	machine__process_fork_event(machine, &fork_event, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t2 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t2 = machine__find_thread(machine, 0, 0);
+
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t2->start_time == 50000);
+
+	TEST_ASSERT_VAL("dead thread cannot be found",
+			machine__find_thread_by_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__find_thread_by_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__find_thread_by_time(machine, 0, 0, 50000) == t2);
+
+	/* now t2 goes to dead thread tree, and create t3 */
+	sample.time = 60000;
+	machine__process_fork_event(machine, &fork_event, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t3 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t3->start_time == 60000);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_by_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_by_time(machine, 0, 0, 50000) == t2);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__findnew_thread_by_time(machine, 0, 0, 70000) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+static int lookup_without_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork_event = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = -1ULL,
+	};
+
+	t1 = machine__findnew_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t1 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_by_time(machine, 0, 0, -1ULL) == t1);
+
+	machine__process_fork_event(machine, &fork_event, &sample);
+
+	t2 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__find_thread_by_time(machine, 0, 0, -1ULL) == t2);
+
+	machine__process_fork_event(machine, &fork_event, &sample);
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_by_time(machine, 0, 0, -1ULL) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+int test__thread_lookup_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * thread for a given time.  When multi-file data storage is
+	 * enabled, those task/comm/mmap events are processed first so
+	 * the later sample should find a matching thread properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	if (lookup_with_timestamp(machine) < 0)
+		return -1;
+
+	if (lookup_without_timestamp(machine) < 0)
+		return -1;
+
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (15 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 16/38] perf tools: Add a test case for timed thread handling Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-08 12:51   ` Jiri Olsa
  2015-10-08 12:58   ` Jiri Olsa
  2015-10-02  5:18 ` [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends Namhyung Kim
                   ` (12 subsequent siblings)
  29 siblings, 2 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

To support multi-threaded perf report, we need to maintain time-sorted
map groups.  Add an ->mg_list member to struct thread and keep the list
sorted by time.  A leader thread now holds one more refcount on the map
groups in the list, so update the thread-mg-share test case accordingly.

Currently a new map groups is only added when an exec (comm) event is
received.
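
For example, the intended lookup semantics are sketched below
(illustrative only: the pid and timestamps are made up, and includes,
refcounting and error handling are omitted; the helpers are the ones
introduced by this patch):

  static void mg_list_example(struct machine *machine, pid_t pid)
  {
          struct thread *leader = machine__findnew_thread(machine, pid, pid);
          struct map_groups *before, *after;

          /* an exec at t=1000 installs a new address space for the leader */
          thread__set_map_groups(leader, map_groups__new(machine), 1000);

          /* a sample at t=500 resolves against the pre-exec map groups... */
          before = thread__get_map_groups(leader, 500);

          /* ...while a sample at t=2000 sees the post-exec map groups */
          after = thread__get_map_groups(leader, 2000);
  }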

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/thread-mg-share.c |   7 ++-
 tools/perf/util/event.c            |   2 +
 tools/perf/util/machine.c          |  15 ++++-
 tools/perf/util/map.c              |   3 +
 tools/perf/util/map.h              |   2 +
 tools/perf/util/thread.c           | 111 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h           |   3 +
 7 files changed, 138 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/thread-mg-share.c b/tools/perf/tests/thread-mg-share.c
index 01fabb19d746..b258d5298b9b 100644
--- a/tools/perf/tests/thread-mg-share.c
+++ b/tools/perf/tests/thread-mg-share.c
@@ -23,6 +23,9 @@ int test__thread_mg_share(void)
 	 * with several threads and checks they properly share and
 	 * maintain map groups info (struct map_groups).
 	 *
+	 * Note that a leader thread has one more refcnt for its
+	 * (current) map groups.
+	 *
 	 * thread group (pid: 0, tids: 0, 1, 2, 3)
 	 * other  group (pid: 4, tids: 4, 5)
 	*/
@@ -43,7 +46,7 @@ int test__thread_mg_share(void)
 			leader && t1 && t2 && t3 && other);
 
 	mg = leader->mg;
-	TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 4);
+	TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&mg->refcnt), 5);
 
 	/* test the map groups pointer is shared */
 	TEST_ASSERT_VAL("map groups don't match", mg == t1->mg);
@@ -71,7 +74,7 @@ int test__thread_mg_share(void)
 	machine__remove_thread(machine, other_leader);
 
 	other_mg = other->mg;
-	TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&other_mg->refcnt), 2);
+	TEST_ASSERT_EQUAL("wrong refcnt", atomic_read(&other_mg->refcnt), 3);
 
 	TEST_ASSERT_VAL("map groups don't match", other_mg == other_leader->mg);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 3dff1b5cd4cc..887f18266ab5 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -914,6 +914,8 @@ void thread__find_addr_map(struct thread *thread, u8 cpumode,
 		return;
 	}
 
+	BUG_ON(mg == NULL);
+
 	if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
 		al->level = 'k';
 		mg = &machine->kmaps;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7cfaa2c3f131..3373e8455945 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -349,8 +349,19 @@ static void machine__update_thread_pid(struct machine *machine,
 	if (!leader)
 		goto out_err;
 
-	if (!leader->mg)
-		leader->mg = map_groups__new(machine);
+	if (!leader->mg) {
+		struct map_groups *mg = map_groups__new(machine);
+
+		if (mg == NULL) {
+			pr_err("Not enough memory for map groups\n");
+			return;
+		}
+
+		if (thread__set_map_groups(leader, mg, 0) < 0) {
+			map_groups__put(mg);
+			goto out_err;
+		}
+	}
 
 	if (!leader->mg)
 		goto out_err;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4e38c396a897..addd4b323027 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -471,6 +471,8 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
 	}
 	mg->machine = machine;
 	atomic_set(&mg->refcnt, 1);
+	mg->timestamp = 0;
+	INIT_LIST_HEAD(&mg->list);
 }
 
 static void __maps__purge(struct maps *maps)
@@ -527,6 +529,7 @@ struct map_groups *map_groups__new(struct machine *machine)
 void map_groups__delete(struct map_groups *mg)
 {
 	map_groups__exit(mg);
+	list_del(&mg->list);
 	free(mg);
 }
 
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 7309d64ce39e..1e3313a22d3a 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -68,6 +68,8 @@ struct map_groups {
 	struct maps	 maps[MAP__NR_TYPES];
 	struct machine	 *machine;
 	atomic_t	 refcnt;
+	u64		 timestamp;
+	struct list_head list;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index ad7c2a00bff8..33de8b010282 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -11,13 +11,79 @@
 #include "unwind.h"
 #include "machine.h"
 
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp)
+{
+	struct map_groups *mg;
+	struct thread *leader = thread;
+
+	BUG_ON(thread->mg == NULL);
+
+	if (thread->tid != thread->pid_) {
+		leader = machine__find_thread_by_time(thread->mg->machine,
+						      thread->pid_, thread->pid_,
+						      timestamp);
+		if (leader == NULL)
+			goto out;
+	}
+
+	list_for_each_entry(mg, &leader->mg_list, list)
+		if (timestamp >= mg->timestamp)
+			return mg;
+
+out:
+	return thread->mg;
+}
+
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg,
+			   u64 timestamp)
+{
+	struct list_head *pos;
+	struct map_groups *old;
+
+	if (mg == NULL)
+		return -ENOMEM;
+
+	/*
+	 * Only a leader thread can have map groups list - others
+	 * reference it through map_groups__get.  This means the
+	 * leader thread will have one more refcnt than others.
+	 */
+	if (thread->tid != thread->pid_)
+		return -EINVAL;
+
+	if (thread->mg) {
+		BUG_ON(atomic_read(&thread->mg->refcnt) <= 1);
+		map_groups__put(thread->mg);
+	}
+
+	/* sort by time */
+	list_for_each(pos, &thread->mg_list) {
+		old = list_entry(pos, struct map_groups, list);
+		if (timestamp > old->timestamp)
+			break;
+	}
+
+	list_add_tail(&mg->list, pos);
+	mg->timestamp = timestamp;
+
+	/* set current ->mg to most recent one */
+	thread->mg = list_first_entry(&thread->mg_list, struct map_groups, list);
+	/* increase one more refcnt for current */
+	map_groups__get(thread->mg);
+
+	return 0;
+}
+
 int thread__init_map_groups(struct thread *thread, struct machine *machine)
 {
 	struct thread *leader;
 	pid_t pid = thread->pid_;
 
 	if (pid == thread->tid || pid == -1) {
-		thread->mg = map_groups__new(machine);
+		struct map_groups *mg = map_groups__new(machine);
+
+		if (thread__set_map_groups(thread, mg, 0) < 0)
+			map_groups__put(mg);
 	} else {
 		leader = __machine__findnew_thread(machine, pid, pid);
 		if (leader)
@@ -39,6 +105,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		thread->ppid = -1;
 		thread->cpu = -1;
 		INIT_LIST_HEAD(&thread->comm_list);
+		INIT_LIST_HEAD(&thread->mg_list);
 
 		if (unwind__prepare_access(thread) < 0)
 			goto err_thread;
@@ -69,6 +136,7 @@ err_thread:
 void thread__delete(struct thread *thread)
 {
 	struct comm *comm, *tmp;
+	struct map_groups *mg, *tmp_mg;
 
 	BUG_ON(!RB_EMPTY_NODE(&thread->rb_node));
 	BUG_ON(!list_empty(&thread->tid_list));
@@ -79,6 +147,10 @@ void thread__delete(struct thread *thread)
 		map_groups__put(thread->mg);
 		thread->mg = NULL;
 	}
+	/* only leader threads have mg list */
+	list_for_each_entry_safe(mg, tmp_mg, &thread->mg_list, list)
+		map_groups__put(mg);
+
 	list_for_each_entry_safe(comm, tmp, &thread->comm_list, list) {
 		list_del(&comm->list);
 		comm__free(comm);
@@ -152,6 +224,9 @@ struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp)
 	return list_last_entry(&thread->comm_list, struct comm, list);
 }
 
+static int thread__clone_map_groups(struct thread *thread,
+				    struct thread *parent);
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		       bool exec)
 {
@@ -182,6 +257,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 			unwind__flush_access(thread);
 	}
 
+	if (exec) {
+		struct machine *machine;
+
+		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
+
+		machine = thread->mg->machine;
+
+		if (thread->tid != thread->pid_) {
+			struct map_groups *old = thread->mg;
+			struct thread *leader;
+
+			leader = machine__findnew_thread(machine, thread->pid_,
+							 thread->pid_);
+
+			/* now it'll be a new leader */
+			thread->pid_ = thread->tid;
+
+			thread->mg = map_groups__new(old->machine);
+			if (thread->mg == NULL)
+				return -ENOMEM;
+
+			/* save current mg in the new leader */
+			thread__clone_map_groups(thread, leader);
+
+			/* current mg of leader thread needs one more refcnt */
+			map_groups__get(thread->mg);
+
+			thread__set_map_groups(thread, thread->mg, old->timestamp);
+		}
+
+		/* create a new mg for newly executed binary */
+		thread__set_map_groups(thread, map_groups__new(machine), timestamp);
+	}
+
 	thread->comm_set = true;
 
 	return 0;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 97026a9660ec..c8463d08a6dd 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -16,6 +16,7 @@ struct thread {
 	struct rb_node	 	rb_node;
 	struct list_head 	tid_list;
 	struct map_groups	*mg;
+	struct list_head	mg_list;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
 	pid_t			ppid;
@@ -71,6 +72,8 @@ struct comm *thread__exec_comm(const struct thread *thread);
 struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
 const char *thread__comm_str_by_time(const struct thread *thread, u64 timestamp);
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp);
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (16 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
@ 2015-10-02  5:18 ` Namhyung Kim
  2015-10-12 13:35   ` Jiri Olsa
  2015-10-02  5:19 ` [RFC/PATCH 19/38] perf callchain: Use " Namhyung Kim
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

These new functions find the appropriate map (and symbol) at a given
time when used with an indexed data file.  They rely on the map_groups
list being kept sorted by time, as done in the previous patch.
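
For example, resolving a sample's ip could look like the sketch below
(illustrative only: includes and error handling are omitted and the
thread is assumed to have been looked up already):

  static void resolve_sample_ip(struct thread *thread,
                                struct perf_sample *sample)
  {
          struct addr_location al;

          /*
           * With an indexed data file this picks the map groups that
           * were active at sample->time; otherwise it falls back to
           * thread->mg.
           */
          thread__find_addr_location_by_time(thread, PERF_RECORD_MISC_USER,
                                             MAP__FUNCTION, sample->ip, &al,
                                             sample->time);

          if (al.sym)
                  pr_debug("%#" PRIx64 " is in %s\n", sample->ip, al.sym->name);
  }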

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/event.c   | 59 +++++++++++++++++++++++++++++++++++++++--------
 tools/perf/util/machine.c | 28 ++++++++++++++--------
 tools/perf/util/session.h |  1 +
 tools/perf/util/thread.c  | 26 +++++++++++++++++++++
 tools/perf/util/thread.h  | 11 +++++++++
 5 files changed, 106 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 887f18266ab5..c960cbcd30d4 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -895,16 +895,14 @@ int perf_event__process(struct perf_tool *tool __maybe_unused,
 	return machine__process_event(machine, event, sample);
 }
 
-void thread__find_addr_map(struct thread *thread, u8 cpumode,
-			   enum map_type type, u64 addr,
-			   struct addr_location *al)
+static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
+				      enum map_type type, u64 addr,
+				      struct addr_location *al)
 {
-	struct map_groups *mg = thread->mg;
 	struct machine *machine = mg->machine;
 	bool load_map = false;
 
 	al->machine = machine;
-	al->thread = thread;
 	al->addr = addr;
 	al->cpumode = cpumode;
 	al->filtered = 0;
@@ -973,6 +971,29 @@ try_again:
 	}
 }
 
+void thread__find_addr_map(struct thread *thread, u8 cpumode,
+			   enum map_type type, u64 addr,
+			   struct addr_location *al)
+{
+	al->thread = thread;
+	map_groups__find_addr_map(thread->mg, cpumode, type, addr, al);
+}
+
+void thread__find_addr_map_by_time(struct thread *thread, u8 cpumode,
+				   enum map_type type, u64 addr,
+				   struct addr_location *al, u64 timestamp)
+{
+	struct map_groups *mg;
+
+	if (perf_has_index)
+		mg = thread__get_map_groups(thread, timestamp);
+	else
+		mg = thread->mg;
+
+	al->thread = thread;
+	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+}
+
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al)
@@ -985,6 +1006,23 @@ void thread__find_addr_location(struct thread *thread,
 		al->sym = NULL;
 }
 
+void thread__find_addr_location_by_time(struct thread *thread, u8 cpumode,
+					enum map_type type, u64 addr,
+					struct addr_location *al, u64 timestamp)
+{
+	if (perf_has_index)
+		thread__find_addr_map_by_time(thread, cpumode, type, addr, al,
+					      timestamp);
+	else
+		thread__find_addr_map(thread, cpumode, type, addr, al);
+
+	if (al->map != NULL)
+		al->sym = map__find_symbol(al->map, al->addr,
+					   thread->mg->machine->symbol_filter);
+	else
+		al->sym = NULL;
+}
+
 /*
  * Callers need to drop the reference to al->thread, obtained in
  * machine__findnew_thread()
@@ -1014,7 +1052,9 @@ int perf_event__preprocess_sample(const union perf_event *event,
 	    machine__kernel_map(machine) == NULL)
 		machine__create_kernel_maps(machine);
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->ip, al);
+	thread__find_addr_map_by_time(thread, cpumode, MAP__FUNCTION,
+				      sample->ip, al, sample->time);
+
 	dump_printf(" ...... dso: %s\n",
 		    al->map ? al->map->dso->long_name :
 			al->level == 'H' ? "[hypervisor]" : "<not found>");
@@ -1097,10 +1137,11 @@ void perf_event__preprocess_sample_addr(union perf_event *event,
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->addr, al);
+	thread__find_addr_map_by_time(thread, cpumode, MAP__FUNCTION,
+				      sample->addr, al, sample->time);
 	if (!al->map)
-		thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
-				      sample->addr, al);
+		thread__find_addr_map_by_time(thread, cpumode, MAP__VARIABLE,
+					      sample->addr, al, sample->time);
 
 	al->cpu = sample->cpu;
 	al->sym = NULL;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3373e8455945..d5e5b9de54b0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1688,7 +1688,7 @@ static bool symbol__match_regex(struct symbol *sym, regex_t *regex)
 
 static void ip__resolve_ams(struct thread *thread,
 			    struct addr_map_symbol *ams,
-			    u64 ip)
+			    u64 ip, u64 timestamp)
 {
 	struct addr_location al;
 
@@ -1700,7 +1700,8 @@ static void ip__resolve_ams(struct thread *thread,
 	 * Thus, we have to try consecutively until we find a match
 	 * or else, the symbol is unknown
 	 */
-	thread__find_cpumode_addr_location(thread, MAP__FUNCTION, ip, &al);
+	thread__find_cpumode_addr_location_by_time(thread, MAP__FUNCTION,
+						   ip, &al, timestamp);
 
 	ams->addr = ip;
 	ams->al_addr = al.addr;
@@ -1708,21 +1709,25 @@ static void ip__resolve_ams(struct thread *thread,
 	ams->map = al.map;
 }
 
-static void ip__resolve_data(struct thread *thread,
-			     u8 m, struct addr_map_symbol *ams, u64 addr)
+static void ip__resolve_data(struct thread *thread, u8 m,
+			     struct addr_map_symbol *ams,
+			     u64 addr, u64 timestamp)
 {
 	struct addr_location al;
 
 	memset(&al, 0, sizeof(al));
 
-	thread__find_addr_location(thread, m, MAP__VARIABLE, addr, &al);
+	thread__find_addr_location_by_time(thread, m, MAP__VARIABLE,
+					   addr, &al, timestamp);
+
 	if (al.map == NULL) {
 		/*
 		 * some shared data regions have execute bit set which puts
 		 * their mapping in the MAP__FUNCTION type array.
 		 * Check there as a fallback option before dropping the sample.
 		 */
-		thread__find_addr_location(thread, m, MAP__FUNCTION, addr, &al);
+		thread__find_addr_location_by_time(thread, m, MAP__FUNCTION,
+						   addr, &al, timestamp);
 	}
 
 	ams->addr = addr;
@@ -1739,8 +1744,9 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 	if (!mi)
 		return NULL;
 
-	ip__resolve_ams(al->thread, &mi->iaddr, sample->ip);
-	ip__resolve_data(al->thread, al->cpumode, &mi->daddr, sample->addr);
+	ip__resolve_ams(al->thread, &mi->iaddr, sample->ip, sample->time);
+	ip__resolve_data(al->thread, al->cpumode, &mi->daddr, sample->addr,
+			 sample->time);
 	mi->data_src.val = sample->data_src;
 
 	return mi;
@@ -1814,8 +1820,10 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 		return NULL;
 
 	for (i = 0; i < bs->nr; i++) {
-		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to);
-		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from);
+		ip__resolve_ams(al->thread, &bi[i].to,
+				bs->entries[i].to, sample->time);
+		ip__resolve_ams(al->thread, &bi[i].from,
+				bs->entries[i].from, sample->time);
 		bi[i].flags = bs->entries[i].flags;
 	}
 	return bi;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 3e900c0efc73..1dd864e983d2 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -138,4 +138,5 @@ int perf_event__synthesize_id_index(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    struct machine *machine);
 
+
 #endif /* __PERF_SESSION_H */
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 33de8b010282..efd510d5d966 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -400,3 +400,29 @@ void thread__find_cpumode_addr_location(struct thread *thread,
 			break;
 	}
 }
+
+void thread__find_cpumode_addr_location_by_time(struct thread *thread,
+						enum map_type type, u64 addr,
+						struct addr_location *al,
+						u64 timestamp)
+{
+	size_t i;
+	const u8 cpumodes[] = {
+		PERF_RECORD_MISC_USER,
+		PERF_RECORD_MISC_KERNEL,
+		PERF_RECORD_MISC_GUEST_USER,
+		PERF_RECORD_MISC_GUEST_KERNEL
+	};
+
+	if (!perf_has_index) {
+		thread__find_cpumode_addr_location(thread, type, addr, al);
+		return;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cpumodes); i++) {
+		thread__find_addr_location_by_time(thread, cpumodes[i], type,
+						   addr, al, timestamp);
+		if (al->map)
+			break;
+	}
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index c8463d08a6dd..8815ac7bba3c 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -81,14 +81,25 @@ size_t thread__fprintf(struct thread *thread, FILE *fp);
 void thread__find_addr_map(struct thread *thread,
 			   u8 cpumode, enum map_type type, u64 addr,
 			   struct addr_location *al);
+void thread__find_addr_map_by_time(struct thread *thread, u8 cpumode,
+				   enum map_type type, u64 addr,
+				   struct addr_location *al, u64 timestamp);
 
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al);
+void thread__find_addr_location_by_time(struct thread *thread, u8 cpumode,
+					enum map_type type, u64 addr,
+					struct addr_location *al,
+					u64 timestamp);
 
 void thread__find_cpumode_addr_location(struct thread *thread,
 					enum map_type type, u64 addr,
 					struct addr_location *al);
+void thread__find_cpumode_addr_location_by_time(struct thread *thread,
+						enum map_type type, u64 addr,
+						struct addr_location *al,
+						u64 timestamp);
 
 static inline void *thread__priv(struct thread *thread)
 {
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 19/38] perf callchain: Use thread__find_addr_location_by_time() and friends
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (17 preceding siblings ...)
  2015-10-02  5:18 ` [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 20/38] perf tools: Add a test case for timed map groups handling Namhyung Kim
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Find the correct thread/map/symbol using the time-aware functions
introduced in the previous patches.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/machine.c          | 25 ++++++++++++++++---------
 tools/perf/util/unwind-libdw.c     | 12 +++++++-----
 tools/perf/util/unwind-libunwind.c | 27 ++++++++++++++-------------
 3 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index d5e5b9de54b0..761b04b970ad 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1756,15 +1756,17 @@ static int add_callchain_ip(struct thread *thread,
 			    struct symbol **parent,
 			    struct addr_location *root_al,
 			    u8 *cpumode,
-			    u64 ip)
+			    u64 ip,
+			    u64 timestamp)
 {
 	struct addr_location al;
 
 	al.filtered = 0;
 	al.sym = NULL;
 	if (!cpumode) {
-		thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
-						   ip, &al);
+		thread__find_cpumode_addr_location_by_time(thread,
+							   MAP__FUNCTION, ip,
+							   &al, timestamp);
 	} else {
 		if (ip >= PERF_CONTEXT_MAX) {
 			switch (ip) {
@@ -1789,8 +1791,9 @@ static int add_callchain_ip(struct thread *thread,
 			}
 			return 0;
 		}
-		thread__find_addr_location(thread, *cpumode, MAP__FUNCTION,
-					   ip, &al);
+		thread__find_addr_location_by_time(thread, *cpumode,
+						   MAP__FUNCTION, ip,
+						   &al, timestamp);
 	}
 
 	if (al.sym != NULL) {
@@ -1932,7 +1935,8 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 					ip = lbr_stack->entries[0].to;
 			}
 
-			err = add_callchain_ip(thread, parent, root_al, &cpumode, ip);
+			err = add_callchain_ip(thread, parent, root_al, &cpumode, ip,
+					       sample->time);
 			if (err)
 				return (err < 0) ? err : 0;
 		}
@@ -1953,6 +1957,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 	struct ip_callchain *chain = sample->callchain;
 	int chain_nr = min(max_stack, (int)chain->nr);
 	u8 cpumode = PERF_RECORD_MISC_USER;
+	u64 timestamp = sample->time;
 	int i, j, err;
 	int skip_idx = -1;
 	int first_call = 0;
@@ -2018,10 +2023,11 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		for (i = 0; i < nr; i++) {
 			err = add_callchain_ip(thread, parent, root_al,
-					       NULL, be[i].to);
+					       NULL, be[i].to, timestamp);
 			if (!err)
 				err = add_callchain_ip(thread, parent, root_al,
-						       NULL, be[i].from);
+						       NULL, be[i].from,
+						       timestamp);
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -2050,7 +2056,8 @@ check_calls:
 #endif
 		ip = chain->ips[j];
 
-		err = add_callchain_ip(thread, parent, root_al, &cpumode, ip);
+		err = add_callchain_ip(thread, parent, root_al, &cpumode, ip,
+				       timestamp);
 
 		if (err)
 			return (err < 0) ? err : 0;
diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index 2dcfe9a7c8d0..0dd2b6ff4093 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -26,9 +26,9 @@ static int __report_module(struct addr_location *al, u64 ip,
 	Dwfl_Module *mod;
 	struct dso *dso = NULL;
 
-	thread__find_addr_location(ui->thread,
-				   PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, al);
+	thread__find_addr_location_by_time(ui->thread, PERF_RECORD_MISC_USER,
+					   MAP__FUNCTION, ip, al,
+					   ui->sample->time);
 
 	if (al->map)
 		dso = al->map->dso;
@@ -89,8 +89,10 @@ static int access_dso_mem(struct unwind_info *ui, Dwarf_Addr addr,
 	struct addr_location al;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
+	thread__find_addr_map_by_time(ui->thread, PERF_RECORD_MISC_USER,
+				      MAP__FUNCTION, addr, &al,
+				      ui->sample->time);
+
 	if (!al.map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 4c00507ee3fd..5cac2dd68688 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -317,8 +317,10 @@ static struct map *find_map(unw_word_t ip, struct unwind_info *ui)
 {
 	struct addr_location al;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, ip, &al);
+	thread__find_addr_map_by_time(ui->thread, PERF_RECORD_MISC_USER,
+				      MAP__FUNCTION, ip, &al,
+				      ui->sample->time);
+
 	return al.map;
 }
 
@@ -411,20 +413,19 @@ get_proc_name(unw_addr_space_t __maybe_unused as,
 static int access_dso_mem(struct unwind_info *ui, unw_word_t addr,
 			  unw_word_t *data)
 {
-	struct addr_location al;
+	struct map *map;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
-	if (!al.map) {
+	map = find_map(addr, ui);
+	if (!map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
 	}
 
-	if (!al.map->dso)
+	if (!map->dso)
 		return -1;
 
-	size = dso__data_read_addr(al.map->dso, al.map, ui->machine,
+	size = dso__data_read_addr(map->dso, map, ui->machine,
 				   addr, (u8 *) data, sizeof(*data));
 
 	return !(size == sizeof(*data));
@@ -516,14 +517,14 @@ static void put_unwind_info(unw_addr_space_t __maybe_unused as,
 	pr_debug("unwind: put_unwind_info called\n");
 }
 
-static int entry(u64 ip, struct thread *thread,
+static int entry(u64 ip, struct thread *thread, u64 timestamp,
 		 unwind_entry_cb_t cb, void *arg)
 {
 	struct unwind_entry e;
 	struct addr_location al;
 
-	thread__find_addr_location(thread, PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, &al);
+	thread__find_addr_location_by_time(thread, PERF_RECORD_MISC_USER,
+					   MAP__FUNCTION, ip, &al, timestamp);
 
 	e.ip = ip;
 	e.map = al.map;
@@ -625,7 +626,7 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 		unw_word_t ip;
 
 		unw_get_reg(&c, UNW_REG_IP, &ip);
-		ret = ip ? entry(ip, ui->thread, cb, arg) : 0;
+		ret = ip ? entry(ip, ui->thread, ui->sample->time, cb, arg) : 0;
 	}
 
 	return ret;
@@ -650,7 +651,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 	if (ret)
 		return ret;
 
-	ret = entry(ip, thread, cb, arg);
+	ret = entry(ip, thread, data->time, cb, arg);
 	if (ret)
 		return -ENOMEM;
 
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 20/38] perf tools: Add a test case for timed map groups handling
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (18 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 19/38] perf callchain: Use " Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 21/38] perf tools: Save timestamp of a map creation Namhyung Kim
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Add a test case verifying thread->mg and ->mg_list handling as time
changes, as well as the new thread__find_addr_map_by_time() and friends.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build            |  1 +
 tools/perf/tests/builtin-test.c   |  4 ++
 tools/perf/tests/tests.h          |  1 +
 tools/perf/tests/thread-mg-time.c | 93 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 99 insertions(+)
 create mode 100644 tools/perf/tests/thread-mg-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index fd4cabb9a1a0..d287b99ff3bb 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -27,6 +27,7 @@ perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += thread-lookup-time.o
+perf-y += thread-mg-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 027796fa105e..62de08a89e0e 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -199,6 +199,10 @@ static struct test {
 		.func = test__thread_lookup_time,
 	},
 	{
+		.desc = "Test thread map group handling with time",
+		.func = test__thread_mg_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 9c02755e86dd..03dcaccb570f 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -67,6 +67,7 @@ int test__insn_x86(void);
 int test_session_topology(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
+int test__thread_mg_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-mg-time.c b/tools/perf/tests/thread-mg-time.c
new file mode 100644
index 000000000000..841777125a64
--- /dev/null
+++ b/tools/perf/tests/thread-mg-time.c
@@ -0,0 +1,93 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+#define PERF_MAP_START  0x40000
+
+int test__thread_mg_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+	struct map_groups *mg;
+	struct map *map, *old_map;
+	struct addr_location al = { .map = NULL, };
+
+	/*
+	 * This test is to check whether it can retrieve a correct map
+	 * for a given time.  When multi-file data storage is enabled,
+	 * those task/comm/mmap events are processed first so the
+	 * later sample should find a matching map properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	/* this is needed to add/find map by time */
+	perf_has_index = true;
+
+	t = machine__findnew_thread(machine, 0, 0);
+	mg = t->mg;
+
+	map = dso__new_map("/usr/bin/perf");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x1000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	thread__find_addr_map_by_time(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+				      PERF_MAP_START, &al, -1ULL);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map == map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+
+	pr_debug("simulate EXEC event (generate new mg)\n");
+	__thread__set_comm(t, "perf-test", 10000, true);
+
+	old_map = map;
+
+	map = dso__new_map("/usr/bin/perf-test");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x2000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START + 4, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("invalid mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	pr_debug("searching map in the old map groups\n");
+	thread__find_addr_map_by_time(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+				      PERF_MAP_START, &al, 5000);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map == old_map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != t->mg);
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 21/38] perf tools: Save timestamp of a map creation
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (19 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 20/38] perf tools: Add a test case for timed map groups handling Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 22/38] perf tools: Introduce map_groups__{insert,find}_by_time() Namhyung Kim
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

It'll be used to support multiple maps at the same address, as in the
dlopen() and/or JIT compilation cases.
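
For example (illustrative only: the address, timestamps and the
old_dso/new_dso pointers are made-up placeholders), two mappings
covering the same address can now be told apart by their creation time:

  /* e.g. a text region re-mapped at the same address after dlclose()/dlopen() */
  struct map *first  = map__new2(0x40000, old_dso, MAP__FUNCTION, 1000);
  struct map *second = map__new2(0x40000, new_dso, MAP__FUNCTION, 2000);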

Cc: Stephane Eranian <eranian@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c         |  2 +-
 tools/perf/util/machine.c     | 29 +++++++++++++++++------------
 tools/perf/util/machine.h     |  2 +-
 tools/perf/util/map.c         | 12 +++++++-----
 tools/perf/util/map.h         |  9 ++++++---
 tools/perf/util/probe-event.c |  2 +-
 tools/perf/util/symbol-elf.c  |  2 +-
 tools/perf/util/symbol.c      |  4 ++--
 8 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 7c0c08386a1d..6cf2c0b095cf 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -859,7 +859,7 @@ struct map *dso__new_map(const char *name)
 	struct dso *dso = dso__new(name);
 
 	if (dso)
-		map = map__new2(0, dso, MAP__FUNCTION);
+		map = map__new2(0, dso, MAP__FUNCTION, 0);
 
 	return map;
 }
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 761b04b970ad..57f9aa1800a2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -709,7 +709,7 @@ int machine__process_switch_event(struct machine *machine __maybe_unused,
 }
 
 struct map *machine__findnew_module_map(struct machine *machine, u64 start,
-					const char *filename)
+					const char *filename, u64 timestamp)
 {
 	struct map *map = NULL;
 	struct dso *dso;
@@ -727,7 +727,7 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
 	if (dso == NULL)
 		goto out;
 
-	map = map__new2(start, dso, MAP__FUNCTION);
+	map = map__new2(start, dso, MAP__FUNCTION, timestamp);
 	if (map == NULL)
 		goto out;
 
@@ -892,7 +892,7 @@ int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
 		struct kmap *kmap;
 		struct map *map;
 
-		machine->vmlinux_maps[type] = map__new2(start, kernel, type);
+		machine->vmlinux_maps[type] = map__new2(start, kernel, type, 0);
 		if (machine->vmlinux_maps[type] == NULL)
 			return -1;
 
@@ -1192,7 +1192,7 @@ static int machine__create_module(void *arg, const char *name, u64 start)
 	struct machine *machine = arg;
 	struct map *map;
 
-	map = machine__findnew_module_map(machine, start, name);
+	map = machine__findnew_module_map(machine, start, name, 0);
 	if (map == NULL)
 		return -1;
 
@@ -1293,7 +1293,8 @@ static bool machine__uses_kcore(struct machine *machine)
 }
 
 static int machine__process_kernel_mmap_event(struct machine *machine,
-					      union perf_event *event)
+					      union perf_event *event,
+					      u64 timestamp)
 {
 	struct map *map;
 	char kmmap_prefix[PATH_MAX];
@@ -1316,7 +1317,8 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
 	if (event->mmap.filename[0] == '/' ||
 	    (!is_kernel_mmap && event->mmap.filename[0] == '[')) {
 		map = machine__findnew_module_map(machine, event->mmap.start,
-						  event->mmap.filename);
+						  event->mmap.filename,
+						  timestamp);
 		if (map == NULL)
 			goto out_problem;
 
@@ -1404,7 +1406,7 @@ out_problem:
 
 int machine__process_mmap2_event(struct machine *machine,
 				 union perf_event *event,
-				 struct perf_sample *sample __maybe_unused)
+				 struct perf_sample *sample)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread;
@@ -1417,7 +1419,8 @@ int machine__process_mmap2_event(struct machine *machine,
 
 	if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL ||
 	    cpumode == PERF_RECORD_MISC_KERNEL) {
-		ret = machine__process_kernel_mmap_event(machine, event);
+		ret = machine__process_kernel_mmap_event(machine, event,
+							 sample->time);
 		if (ret < 0)
 			goto out_problem;
 		return 0;
@@ -1440,7 +1443,8 @@ int machine__process_mmap2_event(struct machine *machine,
 			event->mmap2.ino_generation,
 			event->mmap2.prot,
 			event->mmap2.flags,
-			event->mmap2.filename, type, thread);
+			event->mmap2.filename, type, thread,
+			sample->time);
 
 	if (map == NULL)
 		goto out_problem_map;
@@ -1458,7 +1462,7 @@ out_problem:
 }
 
 int machine__process_mmap_event(struct machine *machine, union perf_event *event,
-				struct perf_sample *sample __maybe_unused)
+				struct perf_sample *sample)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread;
@@ -1471,7 +1475,8 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 
 	if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL ||
 	    cpumode == PERF_RECORD_MISC_KERNEL) {
-		ret = machine__process_kernel_mmap_event(machine, event);
+		ret = machine__process_kernel_mmap_event(machine, event,
+							 sample->time);
 		if (ret < 0)
 			goto out_problem;
 		return 0;
@@ -1491,7 +1496,7 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 			event->mmap.len, event->mmap.pgoff,
 			event->mmap.pid, 0, 0, 0, 0, 0, 0,
 			event->mmap.filename,
-			type, thread);
+			type, thread, sample->time);
 
 	if (map == NULL)
 		goto out_problem_map;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index f0c2cc9c90ae..98ade93f433f 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -205,7 +205,7 @@ struct symbol *machine__find_kernel_function_by_name(struct machine *machine,
 }
 
 struct map *machine__findnew_module_map(struct machine *machine, u64 start,
-					const char *filename);
+					const char *filename, u64 timestamp);
 
 int machine__load_kallsyms(struct machine *machine, const char *filename,
 			   enum map_type type, symbol_filter_t filter);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index addd4b323027..2034127ac3f0 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -125,7 +125,7 @@ static inline bool replace_android_lib(const char *filename, char *newfilename)
 }
 
 void map__init(struct map *map, enum map_type type,
-	       u64 start, u64 end, u64 pgoff, struct dso *dso)
+	       u64 start, u64 end, u64 pgoff, struct dso *dso, u64 timestamp)
 {
 	map->type     = type;
 	map->start    = start;
@@ -138,13 +138,14 @@ void map__init(struct map *map, enum map_type type,
 	RB_CLEAR_NODE(&map->rb_node);
 	map->groups   = NULL;
 	map->erange_warned = false;
+	map->timestamp = timestamp;
 	atomic_set(&map->refcnt, 1);
 }
 
 struct map *map__new(struct machine *machine, u64 start, u64 len,
 		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
 		     u64 ino_gen, u32 prot, u32 flags, char *filename,
-		     enum map_type type, struct thread *thread)
+		     enum map_type type, struct thread *thread, u64 timestamp)
 {
 	struct map *map = malloc(sizeof(*map));
 
@@ -184,7 +185,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
 		if (dso == NULL)
 			goto out_delete;
 
-		map__init(map, type, start, start + len, pgoff, dso);
+		map__init(map, type, start, start + len, pgoff, dso, timestamp);
 
 		if (anon || no_dso) {
 			map->map_ip = map->unmap_ip = identity__map_ip;
@@ -210,7 +211,8 @@ out_delete:
  * they are loaded) and for vmlinux, where only after we load all the
  * symbols we'll know where it starts and ends.
  */
-struct map *map__new2(u64 start, struct dso *dso, enum map_type type)
+struct map *map__new2(u64 start, struct dso *dso, enum map_type type,
+		      u64 timestamp)
 {
 	struct map *map = calloc(1, (sizeof(*map) +
 				     (dso->kernel ? sizeof(struct kmap) : 0)));
@@ -218,7 +220,7 @@ struct map *map__new2(u64 start, struct dso *dso, enum map_type type)
 		/*
 		 * ->end will be filled after we load all the symbols
 		 */
-		map__init(map, type, start, 0, 0, dso);
+		map__init(map, type, start, 0, 0, dso, timestamp);
 	}
 
 	return map;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 1e3313a22d3a..858f700ea5a3 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -43,6 +43,7 @@ struct map {
 	u32			maj, min; /* only valid for MMAP2 record */
 	u64			ino;      /* only valid for MMAP2 record */
 	u64			ino_generation;/* only valid for MMAP2 record */
+	u64			timestamp;
 
 	/* ip -> dso rip */
 	u64			(*map_ip)(struct map *, u64);
@@ -143,12 +144,14 @@ typedef int (*symbol_filter_t)(struct map *map, struct symbol *sym);
 
 int arch__compare_symbol_names(const char *namea, const char *nameb);
 void map__init(struct map *map, enum map_type type,
-	       u64 start, u64 end, u64 pgoff, struct dso *dso);
+	       u64 start, u64 end, u64 pgoff, struct dso *dso, u64 timestamp);
 struct map *map__new(struct machine *machine, u64 start, u64 len,
 		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
 		     u64 ino_gen, u32 prot, u32 flags,
-		     char *filename, enum map_type type, struct thread *thread);
-struct map *map__new2(u64 start, struct dso *dso, enum map_type type);
+		     char *filename, enum map_type type, struct thread *thread,
+		     u64 timestamp);
+struct map *map__new2(u64 start, struct dso *dso, enum map_type type,
+		      u64 timestamp);
 void map__delete(struct map *map);
 struct map *map__clone(struct map *map);
 
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 3010abc071ff..01ecfbfe999a 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -167,7 +167,7 @@ static struct map *kernel_get_module_map(const char *module)
 
 	/* A file path -- this is an offline module */
 	if (module && strchr(module, '/'))
-		return machine__findnew_module_map(host_machine, 0, module);
+		return machine__findnew_module_map(host_machine, 0, module, 0);
 
 	if (!module)
 		module = "kernel";
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 475d88d0a1c9..b4443bacf811 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1025,7 +1025,7 @@ int dso__load_sym(struct dso *dso, struct map *map,
 				curr_dso->long_name = dso->long_name;
 				curr_dso->long_name_len = dso->long_name_len;
 				curr_map = map__new2(start, curr_dso,
-						     map->type);
+						     map->type, map->timestamp);
 				if (curr_map == NULL) {
 					dso__put(curr_dso);
 					goto out_elf_end;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index bcda43bee4d4..ff2a298f1780 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -799,7 +799,7 @@ static int dso__split_kallsyms(struct dso *dso, struct map *map, u64 delta,
 
 			ndso->kernel = dso->kernel;
 
-			curr_map = map__new2(pos->start, ndso, map->type);
+			curr_map = map__new2(pos->start, ndso, map->type, 0);
 			if (curr_map == NULL) {
 				dso__put(ndso);
 				return -1;
@@ -1101,7 +1101,7 @@ static int kcore_mapfn(u64 start, u64 len, u64 pgoff, void *data)
 	struct kcore_mapfn_data *md = data;
 	struct map *map;
 
-	map = map__new2(start, md->dso, md->type);
+	map = map__new2(start, md->dso, md->type, 0);
 	if (map == NULL)
 		return -ENOMEM;
 
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 22/38] perf tools: Introduce map_groups__{insert,find}_by_time()
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (20 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 21/38] perf tools: Save timestamp of a map creation Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 23/38] perf tools: Use map_groups__find_addr_by_time() Namhyung Kim
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

These manage maps using timestamps so that the correct map/symbol can
be found for a sample at a given time.  With this API, overlapping
maps can be maintained in a map_groups.
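
For example, the expected behaviour is sketched below (illustrative
only: the addresses, timestamps and dso pointers are made up,
perf_has_index is assumed to be set, and refcounting/cleanup is
omitted):

  static void overlap_example(struct map_groups *mg,
                              struct dso *old_dso, struct dso *new_dso)
  {
          /* two mappings covering the same range, created at different times */
          struct map *m1 = map__new2(0x40000, old_dso, MAP__FUNCTION, 1000);
          struct map *m2 = map__new2(0x40000, new_dso, MAP__FUNCTION, 2000);

          m1->end = m2->end = 0x41000;

          map_groups__insert_by_time(mg, m1);
          map_groups__insert_by_time(mg, m2);

          /* a sample at t=1500 hits the old mapping... */
          BUG_ON(map_groups__find_by_time(mg, MAP__FUNCTION, 0x40100, 1500) != m1);

          /* ...and one at t=2500 hits the new one */
          BUG_ON(map_groups__find_by_time(mg, MAP__FUNCTION, 0x40100, 2500) != m2);
  }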

Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/map.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/map.h | 25 ++++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 2034127ac3f0..c0f8933f29f0 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -777,6 +777,41 @@ void maps__insert(struct maps *maps, struct map *map)
 	pthread_rwlock_unlock(&maps->lock);
 }
 
+static void __maps__insert_by_time(struct maps *maps, struct map *map)
+{
+	struct rb_node **p = &maps->entries.rb_node;
+	struct rb_node *parent = NULL;
+	const u64 ip = map->start;
+	const u64 timestamp = map->timestamp;
+	struct map *m;
+
+	while (*p != NULL) {
+		parent = *p;
+		m = rb_entry(parent, struct map, rb_node);
+		if (ip < m->start)
+			p = &(*p)->rb_left;
+		else if (ip > m->start)
+			p = &(*p)->rb_right;
+		else if (timestamp > m->timestamp)
+			p = &(*p)->rb_left;
+		else if (timestamp < m->timestamp)
+			p = &(*p)->rb_right;
+		else
+			BUG_ON(1);
+	}
+
+	rb_link_node(&map->rb_node, parent, p);
+	rb_insert_color(&map->rb_node, &maps->entries);
+	map__get(map);
+}
+
+void maps__insert_by_time(struct maps *maps, struct map *map)
+{
+	pthread_rwlock_wrlock(&maps->lock);
+	__maps__insert_by_time(maps, map);
+	pthread_rwlock_unlock(&maps->lock);
+}
+
 static void __maps__remove(struct maps *maps, struct map *map)
 {
 	rb_erase_init(&map->rb_node, &maps->entries);
@@ -815,6 +850,35 @@ out:
 	return m;
 }
 
+struct map *maps__find_by_time(struct maps *maps, u64 ip, u64 timestamp)
+{
+	struct rb_node **p;
+	struct rb_node *parent = NULL;
+	struct map *m;
+	struct map *best = NULL;
+
+	pthread_rwlock_rdlock(&maps->lock);
+
+	p = &maps->entries.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		m = rb_entry(parent, struct map, rb_node);
+		if (ip < m->start)
+			p = &(*p)->rb_left;
+		else if (ip >= m->end)
+			p = &(*p)->rb_right;
+		else if (timestamp >= m->timestamp) {
+			if (!best || best->timestamp < m->timestamp)
+				best = m;
+			p = &(*p)->rb_left;
+		} else
+			p = &(*p)->rb_right;
+	}
+
+	pthread_rwlock_unlock(&maps->lock);
+	return best;
+}
+
 struct map *maps__first(struct maps *maps)
 {
 	struct rb_node *first = rb_first(&maps->entries);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 858f700ea5a3..41a9a39f1027 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -10,6 +10,8 @@
 #include <stdbool.h>
 #include <linux/types.h>
 
+#include "perf.h"  /* for perf_has_index */
+
 enum map_type {
 	MAP__FUNCTION = 0,
 	MAP__VARIABLE,
@@ -191,8 +193,10 @@ void map__reloc_vmlinux(struct map *map);
 size_t __map_groups__fprintf_maps(struct map_groups *mg, enum map_type type,
 				  FILE *fp);
 void maps__insert(struct maps *maps, struct map *map);
+void maps__insert_by_time(struct maps *maps, struct map *map);
 void maps__remove(struct maps *maps, struct map *map);
 struct map *maps__find(struct maps *maps, u64 addr);
+struct map *maps__find_by_time(struct maps *maps, u64 addr, u64 timestamp);
 struct map *maps__first(struct maps *maps);
 struct map *map__next(struct map *map);
 struct symbol *maps__find_symbol_by_name(struct maps *maps, const char *name,
@@ -212,6 +216,17 @@ static inline void map_groups__insert(struct map_groups *mg, struct map *map)
 	map->groups = mg;
 }
 
+static inline void map_groups__insert_by_time(struct map_groups *mg,
+					      struct map *map)
+{
+	if (perf_has_index)
+		maps__insert_by_time(&mg->maps[map->type], map);
+	else
+		maps__insert(&mg->maps[map->type], map);
+
+	map->groups = mg;
+}
+
 static inline void map_groups__remove(struct map_groups *mg, struct map *map)
 {
 	maps__remove(&mg->maps[map->type], map);
@@ -223,6 +238,16 @@ static inline struct map *map_groups__find(struct map_groups *mg,
 	return maps__find(&mg->maps[type], addr);
 }
 
+static inline struct map *map_groups__find_by_time(struct map_groups *mg,
+						   enum map_type type, u64 addr,
+						   u64 timestamp)
+{
+	if (!perf_has_index)
+		return maps__find(&mg->maps[type], addr);
+
+	return maps__find_by_time(&mg->maps[type], addr, timestamp);
+}
+
 static inline struct map *map_groups__first(struct map_groups *mg,
 					    enum map_type type)
 {
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 23/38] perf tools: Use map_groups__find_addr_by_time()
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (21 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 22/38] perf tools: Introduce map_groups__{insert,find}_by_time() Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 24/38] perf tools: Add testcase for managing maps with time Namhyung Kim
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Use the timestamp to find the corresponding map so that a matching
symbol can eventually be found.
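
For example (illustrative only: the timestamps are made up, the maps
are assumed to carry their creation time as added earlier in the
series, and perf_has_index is assumed to be set), overlapping maps are
now kept instead of being fixed up:

  static void insert_overlapping_maps(struct thread *thread,
                                      struct map *old_map,
                                      struct map *new_map)
  {
          struct addr_location al;

          /* old_map->timestamp == 1000, new_map covers the same range at t=2000 */
          thread__insert_map(thread, old_map);
          thread__insert_map(thread, new_map);

          /* a sample at t=1500 still resolves against old_map */
          thread__find_addr_map_by_time(thread, PERF_RECORD_MISC_USER,
                                        MAP__FUNCTION, old_map->start,
                                        &al, 1500);
  }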

Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/event.c  | 81 ++++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/thread.c |  8 +++--
 2 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index c960cbcd30d4..d7997105ee7a 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -895,12 +895,11 @@ int perf_event__process(struct perf_tool *tool __maybe_unused,
 	return machine__process_event(machine, event, sample);
 }
 
-static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
-				      enum map_type type, u64 addr,
-				      struct addr_location *al)
+static bool map_groups__set_addr_location(struct map_groups *mg,
+					  struct addr_location *al,
+					  u8 cpumode, u64 addr)
 {
 	struct machine *machine = mg->machine;
-	bool load_map = false;
 
 	al->machine = machine;
 	al->addr = addr;
@@ -909,21 +908,17 @@ static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
 
 	if (machine == NULL) {
 		al->map = NULL;
-		return;
+		return true;
 	}
 
 	BUG_ON(mg == NULL);
 
 	if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
 		al->level = 'k';
-		mg = &machine->kmaps;
-		load_map = true;
 	} else if (cpumode == PERF_RECORD_MISC_USER && perf_host) {
 		al->level = '.';
 	} else if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest) {
 		al->level = 'g';
-		mg = &machine->kmaps;
-		load_map = true;
 	} else if (cpumode == PERF_RECORD_MISC_GUEST_USER && perf_guest) {
 		al->level = 'u';
 	} else {
@@ -939,8 +934,27 @@ static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
 			!perf_host)
 			al->filtered |= (1 << HIST_FILTER__HOST);
 
+		return true;
+	}
+	return false;
+}
+
+static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
+				      enum map_type type, u64 addr,
+				      struct addr_location *al)
+{
+	struct machine *machine = mg->machine;
+	bool load_map = false;
+
+	if (map_groups__set_addr_location(mg, al, cpumode, addr))
 		return;
+
+	if ((cpumode == PERF_RECORD_MISC_KERNEL && perf_host) ||
+	    (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest)) {
+		mg = &machine->kmaps;
+		load_map = true;
 	}
+
 try_again:
 	al->map = map_groups__find(mg, type, al->addr);
 	if (al->map == NULL) {
@@ -971,6 +985,53 @@ try_again:
 	}
 }
 
+static void map_groups__find_addr_map_by_time(struct map_groups *mg, u8 cpumode,
+					      enum map_type type, u64 addr,
+					      struct addr_location *al,
+					      u64 timestamp)
+{
+	struct machine *machine = mg->machine;
+	bool load_map = false;
+
+	if (map_groups__set_addr_location(mg, al, cpumode, addr))
+		return;
+
+	if ((cpumode == PERF_RECORD_MISC_KERNEL && perf_host) ||
+	    (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest)) {
+		mg = &machine->kmaps;
+		load_map = true;
+	}
+
+try_again:
+	al->map = map_groups__find_by_time(mg, type, al->addr, timestamp);
+	if (al->map == NULL) {
+		/*
+		 * If this is outside of all known maps, and is a negative
+		 * address, try to look it up in the kernel dso, as it might be
+		 * a vsyscall or vdso (which executes in user-mode).
+		 *
+		 * XXX This is nasty, we should have a symbol list in the
+		 * "[vdso]" dso, but for now lets use the old trick of looking
+		 * in the whole kernel symbol list.
+		 */
+		if (cpumode == PERF_RECORD_MISC_USER && machine &&
+		    mg != &machine->kmaps &&
+		    machine__kernel_ip(machine, al->addr)) {
+			mg = &machine->kmaps;
+			load_map = true;
+			goto try_again;
+		}
+	} else {
+		/*
+		 * Kernel maps might be changed when loading symbols so loading
+		 * must be done prior to using kernel maps.
+		 */
+		if (load_map)
+			map__load(al->map, machine->symbol_filter);
+		al->addr = al->map->map_ip(al->map, al->addr);
+	}
+}
+
 void thread__find_addr_map(struct thread *thread, u8 cpumode,
 			   enum map_type type, u64 addr,
 			   struct addr_location *al)
@@ -991,7 +1052,7 @@ void thread__find_addr_map_by_time(struct thread *thread, u8 cpumode,
 		mg = thread->mg;
 
 	al->thread = thread;
-	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+	map_groups__find_addr_map_by_time(mg, cpumode, type, addr, al, timestamp);
 }
 
 void thread__find_addr_location(struct thread *thread,
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index efd510d5d966..21de681415f4 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -337,8 +337,12 @@ size_t thread__fprintf(struct thread *thread, FILE *fp)
 
 void thread__insert_map(struct thread *thread, struct map *map)
 {
-	map_groups__fixup_overlappings(thread->mg, map, stderr);
-	map_groups__insert(thread->mg, map);
+	if (perf_has_index) {
+		map_groups__insert_by_time(thread->mg, map);
+	} else {
+		map_groups__fixup_overlappings(thread->mg, map, stderr);
+		map_groups__insert(thread->mg, map);
+	}
 }
 
 static int thread__clone_map_groups(struct thread *thread,
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 24/38] perf tools: Add testcase for managing maps with time
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (22 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 23/38] perf tools: Use map_groups__find_addr_by_time() Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 25/38] perf callchain: Maintain libunwind's address space in map_groups Namhyung Kim
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

This tests that the new map_groups__{insert,find}_by_time() API works
correctly, using 3 * 100 maps.
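
For reference, once applied the test can be run standalone with
something like 'perf test -v "thread map lookup"', relying on perf
test's usual substring matching of test descriptions.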

Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build             |  1 +
 tools/perf/tests/builtin-test.c    |  4 ++
 tools/perf/tests/tests.h           |  1 +
 tools/perf/tests/thread-map-time.c | 90 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+)
 create mode 100644 tools/perf/tests/thread-map-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index d287b99ff3bb..cc4e3af3e0fd 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -28,6 +28,7 @@ perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += thread-lookup-time.o
 perf-y += thread-mg-time.o
+perf-y += thread-map-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 62de08a89e0e..d5f9fcef5571 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -203,6 +203,10 @@ static struct test {
 		.func = test__thread_mg_time,
 	},
 	{
+		.desc = "Test thread map lookup with time",
+		.func = test__thread_map_lookup_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 03dcaccb570f..e498b23f1580 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -68,6 +68,7 @@ int test_session_topology(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
 int test__thread_mg_time(void);
+int test__thread_map_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-map-time.c b/tools/perf/tests/thread-map-time.c
new file mode 100644
index 000000000000..6f28975faeb5
--- /dev/null
+++ b/tools/perf/tests/thread-map-time.c
@@ -0,0 +1,90 @@
+#include "debug.h"
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+
+#define PERF_MAP_START  0x40000
+#define LIBC_MAP_START  0x80000
+#define VDSO_MAP_START  0x7F000
+
+#define NR_MAPS  100
+
+static int lookup_maps(struct map_groups *mg)
+{
+	struct map *map;
+	int i, ret = -1;
+	size_t n;
+	struct {
+		const char *path;
+		u64 start;
+	} maps[] = {
+		{ "/usr/bin/perf",	PERF_MAP_START },
+		{ "/usr/lib/libc.so",	LIBC_MAP_START },
+		{ "[vdso]",		VDSO_MAP_START },
+	};
+
+	/* this is needed to insert/find map by time */
+	perf_has_index = true;
+
+	for (n = 0; n < ARRAY_SIZE(maps); n++) {
+		for (i = 0; i < NR_MAPS; i++) {
+			map = map__new2(maps[n].start, dso__new(maps[n].path),
+					MAP__FUNCTION, i * 10000);
+			if (map == NULL) {
+				pr_debug("memory allocation failed\n");
+				goto out;
+			}
+
+			map->end = map->start + 0x1000;
+			map_groups__insert_by_time(mg, map);
+		}
+	}
+
+	if (verbose > 1)
+		map_groups__fprintf(mg, stderr);
+
+	for (n = 0; n < ARRAY_SIZE(maps); n++) {
+		for (i = 0; i < NR_MAPS; i++) {
+			u64 timestamp = i * 10000;
+
+			map = map_groups__find_by_time(mg, MAP__FUNCTION,
+						       maps[n].start,
+						       timestamp);
+
+			TEST_ASSERT_VAL("cannot find map", map);
+			TEST_ASSERT_VAL("addr not matched",
+					map->start == maps[n].start);
+			TEST_ASSERT_VAL("pathname not matched",
+					!strcmp(map->dso->name, maps[n].path));
+			TEST_ASSERT_VAL("timestamp not matched",
+					map->timestamp == timestamp);
+		}
+	}
+
+	ret = 0;
+out:
+	return ret;
+}
+
+/*
+ * This test creates large number of overlapping maps for increasing
+ * time and find a map based on timestamp.
+ */
+int test__thread_map_lookup_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+	int ret;
+
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 0, 0);
+
+	ret = lookup_maps(t->mg);
+
+	machine__delete_threads(machine);
+	return ret;
+}
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 25/38] perf callchain: Maintain libunwind's address space in map_groups
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (23 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 24/38] perf tools: Add testcase for managing maps with time Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 26/38] perf session: Pass struct events stats to event processing functions Namhyung Kim
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Currently the address_space is kept in the thread struct, but it's more
appropriate to keep it in map_groups since those are maintained across
execs with timestamps.  Also, we should not flush the address space
after exec since it can still be accessed when working with an indexed
data file.
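
The resulting lifecycle is roughly this (call-flow sketch, not verbatim
code):

  map_groups__init(mg, machine)
    -> unwind__prepare_access(mg)   /* create unw_addr_space_t, keep it in mg->priv */

  get_entries(ui, ...)
    -> mg = thread__get_map_groups(ui->thread, ui->sample->time)
    -> addr_space = mg->priv        /* per-map_groups, so it survives exec */

  map_groups__exit(mg)
    -> unwind__finish_access(mg)    /* unw_destroy_addr_space(), mg->priv = NULL */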

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/dwarf-unwind.c    |  5 +++--
 tools/perf/util/map.c              |  5 +++++
 tools/perf/util/map.h              |  1 +
 tools/perf/util/thread.c           |  7 -------
 tools/perf/util/unwind-libunwind.c | 28 +++++++++++++---------------
 tools/perf/util/unwind.h           | 15 ++++++---------
 6 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index b9ca0a72fc4d..c3774e0ebcf4 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -143,6 +143,9 @@ int test__dwarf_unwind(void)
 	struct thread *thread;
 	int err = -1;
 
+	/* The record_mode should be set before calling map_groups__init() */
+	callchain_param.record_mode = CALLCHAIN_DWARF;
+
 	machines__init(&machines);
 
 	machine = machines__find(&machines, HOST_KERNEL_ID);
@@ -151,8 +154,6 @@ int test__dwarf_unwind(void)
 		return -1;
 	}
 
-	callchain_param.record_mode = CALLCHAIN_DWARF;
-
 	if (init_live_machine(machine)) {
 		pr_err("Could not init machine\n");
 		goto out;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index c0f8933f29f0..c458a40f8d26 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -14,6 +14,7 @@
 #include "util.h"
 #include "debug.h"
 #include "machine.h"
+#include "unwind.h"
 #include <linux/string.h>
 
 static void __maps__insert(struct maps *maps, struct map *map);
@@ -475,6 +476,8 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
 	atomic_set(&mg->refcnt, 1);
 	mg->timestamp = 0;
 	INIT_LIST_HEAD(&mg->list);
+
+	unwind__prepare_access(mg);
 }
 
 static void __maps__purge(struct maps *maps)
@@ -504,6 +507,8 @@ void map_groups__exit(struct map_groups *mg)
 
 	for (i = 0; i < MAP__NR_TYPES; ++i)
 		maps__exit(&mg->maps[i]);
+
+	unwind__finish_access(mg);
 }
 
 bool map_groups__empty(struct map_groups *mg)
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 41a9a39f1027..ef3f8b1649ab 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -73,6 +73,7 @@ struct map_groups {
 	atomic_t	 refcnt;
 	u64		 timestamp;
 	struct list_head list;
+	void		 *priv;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 21de681415f4..05aaf7d0ad18 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -107,9 +107,6 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		INIT_LIST_HEAD(&thread->comm_list);
 		INIT_LIST_HEAD(&thread->mg_list);
 
-		if (unwind__prepare_access(thread) < 0)
-			goto err_thread;
-
 		comm_str = malloc(32);
 		if (!comm_str)
 			goto err_thread;
@@ -155,7 +152,6 @@ void thread__delete(struct thread *thread)
 		list_del(&comm->list);
 		comm__free(comm);
 	}
-	unwind__finish_access(thread);
 
 	free(thread);
 }
@@ -252,9 +248,6 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 				break;
 		}
 		list_add_tail(&new->list, &curr->list);
-
-		if (exec)
-			unwind__flush_access(thread);
 	}
 
 	if (exec) {
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 5cac2dd68688..5cad1aecf051 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -32,6 +32,7 @@
 #include "symbol.h"
 #include "util.h"
 #include "debug.h"
+#include "map.h"
 
 extern int
 UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as,
@@ -566,7 +567,7 @@ static unw_accessors_t accessors = {
 	.get_proc_name		= get_proc_name,
 };
 
-int unwind__prepare_access(struct thread *thread)
+int unwind__prepare_access(struct map_groups *mg)
 {
 	unw_addr_space_t addr_space;
 
@@ -580,41 +581,38 @@ int unwind__prepare_access(struct thread *thread)
 	}
 
 	unw_set_caching_policy(addr_space, UNW_CACHE_GLOBAL);
-	thread__set_priv(thread, addr_space);
+	mg->priv = addr_space;
 
 	return 0;
 }
 
-void unwind__flush_access(struct thread *thread)
+void unwind__finish_access(struct map_groups *mg)
 {
-	unw_addr_space_t addr_space;
+	unw_addr_space_t addr_space = mg->priv;
 
 	if (callchain_param.record_mode != CALLCHAIN_DWARF)
 		return;
 
-	addr_space = thread__priv(thread);
-	unw_flush_cache(addr_space, 0, 0);
-}
-
-void unwind__finish_access(struct thread *thread)
-{
-	unw_addr_space_t addr_space;
-
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (addr_space == NULL)
 		return;
 
-	addr_space = thread__priv(thread);
 	unw_destroy_addr_space(addr_space);
+	mg->priv = NULL;
 }
 
 static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 		       void *arg, int max_stack)
 {
+	struct map_groups *mg;
 	unw_addr_space_t addr_space;
 	unw_cursor_t c;
 	int ret;
 
-	addr_space = thread__priv(ui->thread);
+	mg = thread__get_map_groups(ui->thread, ui->sample->time);
+	if (mg == NULL)
+		return -1;
+
+	addr_space = mg->priv;
 	if (addr_space == NULL)
 		return -1;
 
diff --git a/tools/perf/util/unwind.h b/tools/perf/util/unwind.h
index 12790cf94618..c6860b481d07 100644
--- a/tools/perf/util/unwind.h
+++ b/tools/perf/util/unwind.h
@@ -21,17 +21,15 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 /* libunwind specific */
 #ifdef HAVE_LIBUNWIND_SUPPORT
 int libunwind__arch_reg_id(int regnum);
-int unwind__prepare_access(struct thread *thread);
-void unwind__flush_access(struct thread *thread);
-void unwind__finish_access(struct thread *thread);
+int unwind__prepare_access(struct map_groups *mg);
+void unwind__finish_access(struct map_groups *mg);
 #else
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused)
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif
 #else
 static inline int
@@ -44,12 +42,11 @@ unwind__get_entries(unwind_entry_cb_t cb __maybe_unused,
 	return 0;
 }
 
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused)
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif /* HAVE_DWARF_UNWIND_SUPPORT */
 #endif /* __UNWIND_H */
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 26/38] perf session: Pass struct events stats to event processing functions
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (24 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 25/38] perf callchain: Maintain libunwind's address space in map_groups Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 27/38] perf hists: Pass hists struct to hist_entry_iter struct Namhyung Kim
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Pass the stats structure explicitly so that it can point to a separate
object when used in a multi-threaded environment.
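
A worker could then do something like this (hypothetical sketch: the
actual per-thread wiring comes later in the series, and 'worker_stats'
is just a placeholder name):

	struct events_stats worker_stats;

	memset(&worker_stats, 0, sizeof(worker_stats));

	/* counts go into the worker's private stats, not evlist->stats */
	perf_session__process_event(session, &worker_stats, event, file_offset);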

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/session.c | 71 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 45 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7546c4d147b9..8cafc679096b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -19,6 +19,7 @@
 #include "thread-stack.h"
 
 static int perf_session__deliver_event(struct perf_session *session,
+				       struct events_stats *stats,
 				       union perf_event *event,
 				       struct perf_sample *sample,
 				       struct perf_tool *tool,
@@ -107,7 +108,8 @@ static int ordered_events__deliver_event(struct ordered_events *oe,
 		return ret;
 	}
 
-	return perf_session__deliver_event(session, event->event, &sample,
+	return perf_session__deliver_event(session, &session->evlist->stats,
+					   event->event, &sample,
 					   session->tool, event->file_offset);
 }
 
@@ -981,6 +983,7 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
 }
 
 static int deliver_sample_value(struct perf_evlist *evlist,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
@@ -996,7 +999,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
 	}
 
 	if (!sid || sid->evsel == NULL) {
-		++evlist->stats.nr_unknown_id;
+		++stats->nr_unknown_id;
 		return 0;
 	}
 
@@ -1004,6 +1007,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
 }
 
 static int deliver_sample_group(struct perf_evlist *evlist,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union  perf_event *event,
 				struct perf_sample *sample,
@@ -1013,7 +1017,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
 	u64 i;
 
 	for (i = 0; i < sample->read.group.nr; i++) {
-		ret = deliver_sample_value(evlist, tool, event, sample,
+		ret = deliver_sample_value(evlist, stats, tool, event, sample,
 					   &sample->read.group.values[i],
 					   machine);
 		if (ret)
@@ -1025,6 +1029,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
 
 static int
  perf_evlist__deliver_sample(struct perf_evlist *evlist,
+			     struct events_stats *stats,
 			     struct perf_tool *tool,
 			     union  perf_event *event,
 			     struct perf_sample *sample,
@@ -1041,14 +1046,15 @@ static int
 
 	/* For PERF_SAMPLE_READ we have either single or group mode. */
 	if (read_format & PERF_FORMAT_GROUP)
-		return deliver_sample_group(evlist, tool, event, sample,
+		return deliver_sample_group(evlist, stats, tool, event, sample,
 					    machine);
 	else
-		return deliver_sample_value(evlist, tool, event, sample,
+		return deliver_sample_value(evlist, stats, tool, event, sample,
 					    &sample->read.one, machine);
 }
 
 static int machines__deliver_event(struct machines *machines,
+				   struct events_stats *stats,
 				   struct perf_evlist *evlist,
 				   union perf_event *event,
 				   struct perf_sample *sample,
@@ -1066,15 +1072,16 @@ static int machines__deliver_event(struct machines *machines,
 	switch (event->header.type) {
 	case PERF_RECORD_SAMPLE:
 		if (evsel == NULL) {
-			++evlist->stats.nr_unknown_id;
+			++stats->nr_unknown_id;
 			return 0;
 		}
 		dump_sample(evsel, event, sample);
 		if (machine == NULL) {
-			++evlist->stats.nr_unprocessable_samples;
+			++stats->nr_unprocessable_samples;
 			return 0;
 		}
-		return perf_evlist__deliver_sample(evlist, tool, event, sample, evsel, machine);
+		return perf_evlist__deliver_sample(evlist, stats, tool, event,
+						   sample, evsel, machine);
 	case PERF_RECORD_MMAP:
 		return tool->mmap(tool, event, sample, machine);
 	case PERF_RECORD_MMAP2:
@@ -1089,7 +1096,7 @@ static int machines__deliver_event(struct machines *machines,
 		return tool->exit(tool, event, sample, machine);
 	case PERF_RECORD_LOST:
 		if (tool->lost == perf_event__process_lost)
-			evlist->stats.total_lost += event->lost.lost;
+			stats->total_lost += event->lost.lost;
 		return tool->lost(tool, event, sample, machine);
 	case PERF_RECORD_LOST_SAMPLES:
 		if (tool->lost_samples == perf_event__process_lost_samples)
@@ -1112,12 +1119,13 @@ static int machines__deliver_event(struct machines *machines,
 	case PERF_RECORD_SWITCH_CPU_WIDE:
 		return tool->context_switch(tool, event, sample, machine);
 	default:
-		++evlist->stats.nr_unknown_events;
+		++stats->nr_unknown_events;
 		return -1;
 	}
 }
 
 static int perf_session__deliver_event(struct perf_session *session,
+				       struct events_stats *stats,
 				       union perf_event *event,
 				       struct perf_sample *sample,
 				       struct perf_tool *tool,
@@ -1131,8 +1139,9 @@ static int perf_session__deliver_event(struct perf_session *session,
 	if (ret > 0)
 		return 0;
 
-	return machines__deliver_event(&session->machines, session->evlist,
-				       event, sample, tool, file_offset);
+	return machines__deliver_event(&session->machines, stats,
+				       session->evlist, event, sample,
+				       tool, file_offset);
 }
 
 static s64 perf_session__process_user_event(struct perf_session *session,
@@ -1197,7 +1206,8 @@ int perf_session__deliver_synth_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, 0);
 
-	return machines__deliver_event(&session->machines, evlist, event, sample, tool, 0);
+	return machines__deliver_event(&session->machines, &evlist->stats,
+				       evlist, event, sample, tool, 0);
 }
 
 static void event_swap(union perf_event *event, bool sample_id_all)
@@ -1265,7 +1275,9 @@ out_parse_sample:
 }
 
 static s64 perf_session__process_event(struct perf_session *session,
-				       union perf_event *event, u64 file_offset)
+				       struct events_stats *stats,
+				       union perf_event *event,
+				       u64 file_offset)
 {
 	struct perf_evlist *evlist = session->evlist;
 	struct perf_tool *tool = session->tool;
@@ -1278,7 +1290,7 @@ static s64 perf_session__process_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_HEADER_MAX)
 		return -EINVAL;
 
-	events_stats__inc(&evlist->stats, event->header.type);
+	events_stats__inc(stats, event->header.type);
 
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, file_offset);
@@ -1296,8 +1308,8 @@ static s64 perf_session__process_event(struct perf_session *session,
 			return ret;
 	}
 
-	return perf_session__deliver_event(session, event, &sample, tool,
-					   file_offset);
+	return perf_session__deliver_event(session, stats, event,
+					   &sample, tool, file_offset);
 }
 
 void perf_event_header__bswap(struct perf_event_header *hdr)
@@ -1325,9 +1337,9 @@ struct thread *perf_session__register_idle_thread(struct perf_session *session)
 	return thread;
 }
 
-static void perf_session__warn_about_errors(const struct perf_session *session)
+static void perf_session__warn_about_errors(const struct perf_session *session,
+					    const struct events_stats *stats)
 {
-	const struct events_stats *stats = &session->evlist->stats;
 	const struct ordered_events *oe = &session->ordered_events;
 
 	if (session->tool->lost == perf_event__process_lost &&
@@ -1420,6 +1432,7 @@ volatile int session_done;
 static int __perf_session__process_pipe_events(struct perf_session *session)
 {
 	struct ordered_events *oe = &session->ordered_events;
+	struct events_stats *stats = &session->evlist->stats;
 	struct perf_tool *tool = session->tool;
 	int fd = perf_data_file__fd(session->file);
 	union perf_event *event;
@@ -1484,7 +1497,8 @@ more:
 		}
 	}
 
-	if ((skip = perf_session__process_event(session, event, head)) < 0) {
+	if ((skip = perf_session__process_event(session, stats, event,
+						head)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       head, event->header.size, event->header.type);
 		err = -EINVAL;
@@ -1509,7 +1523,7 @@ done:
 	err = perf_session__flush_thread_stacks(session);
 out_err:
 	free(buf);
-	perf_session__warn_about_errors(session);
+	perf_session__warn_about_errors(session, stats);
 	ordered_events__free(&session->ordered_events);
 	auxtrace__free_events(session);
 	return err;
@@ -1556,6 +1570,7 @@ fetch_mmaped_event(struct perf_session *session,
 #endif
 
 static int __perf_session__process_events(struct perf_session *session,
+					  struct events_stats *stats,
 					  u64 data_offset, u64 data_size,
 					  u64 file_size)
 {
@@ -1634,7 +1649,8 @@ more:
 	size = event->header.size;
 
 	if (size < sizeof(struct perf_event_header) ||
-	    (skip = perf_session__process_event(session, event, file_pos)) < 0) {
+	    (skip = perf_session__process_event(session, stats, event,
+						file_pos)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       file_offset + head, event->header.size,
 		       event->header.type);
@@ -1678,6 +1694,7 @@ static int __perf_session__process_indexed_events(struct perf_session *session)
 	struct perf_data_file *file = session->file;
 	struct perf_tool *tool = session->tool;
 	u64 size = perf_data_file__size(file);
+	struct events_stats *stats = &session->evlist->stats;
 	int err = 0, i;
 
 	for (i = 0; i < (int)session->header.nr_index; i++) {
@@ -1694,13 +1711,14 @@ static int __perf_session__process_indexed_events(struct perf_session *session)
 		if (i > 0)
 			tool->ordered_events = false;
 
-		err = __perf_session__process_events(session, idx->offset,
+		err = __perf_session__process_events(session, stats,
+						     idx->offset,
 						     idx->size, size);
 		if (err < 0)
 			break;
 	}
 
-	perf_session__warn_about_errors(session);
+	perf_session__warn_about_errors(session, stats);
 	return err;
 }
 
@@ -1708,6 +1726,7 @@ int perf_session__process_events(struct perf_session *session)
 {
 	struct perf_data_file *file = session->file;
 	u64 size = perf_data_file__size(file);
+	struct events_stats *stats = &session->evlist->stats;
 	int err;
 
 	if (perf_session__register_idle_thread(session) == NULL)
@@ -1718,12 +1737,12 @@ int perf_session__process_events(struct perf_session *session)
 	if (perf_has_index)
 		return __perf_session__process_indexed_events(session);
 
-	err = __perf_session__process_events(session,
+	err = __perf_session__process_events(session, stats,
 					     session->header.data_offset,
 					     session->header.data_size,
 					     size);
 
-	perf_session__warn_about_errors(session);
+	perf_session__warn_about_errors(session, stats);
 	return err;
 }
 
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 27/38] perf hists: Pass hists struct to hist_entry_iter struct
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (25 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 26/38] perf session: Pass struct events stats to event processing functions Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:19 ` [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

This is a preparation for perf report multi-thread support.  When
multi-threading is enabled, each thread will have its own hists during
sample processing.
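
The multi-threaded sample processing can then simply do (hypothetical
sketch, 'per_thread_hists' is just a placeholder):

	struct hist_entry_iter iter = {
		.evsel	= evsel,
		.hists	= per_thread_hists,	/* instead of evsel__hists(evsel) */
		.sample	= sample,
	};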

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c       |  1 +
 tools/perf/builtin-top.c          |  1 +
 tools/perf/tests/hists_cumulate.c |  1 +
 tools/perf/tests/hists_filter.c   |  1 +
 tools/perf/tests/hists_output.c   |  1 +
 tools/perf/util/hist.c            | 22 ++++++++--------------
 tools/perf/util/hist.h            |  1 +
 7 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index aeced4fa27e8..f6e000b87108 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -145,6 +145,7 @@ static int process_sample_event(struct perf_tool *tool,
 	struct addr_location al;
 	struct hist_entry_iter iter = {
 		.evsel 			= evsel,
+		.hists 			= evsel__hists(evsel),
 		.sample 		= sample,
 		.hide_unresolved 	= rep->hide_unresolved,
 		.add_entry_cb 		= hist_iter__report_callback,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6f641fd68296..e1ead08755c8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -787,6 +787,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
+			.hists 		= evsel__hists(evsel),
 			.sample 	= sample,
 			.add_entry_cb 	= hist_iter__top_callback,
 		};
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 7ed737019de7..0c046dde9a64 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -88,6 +88,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		};
 		struct hist_entry_iter iter = {
 			.evsel = evsel,
+			.hists = evsel__hists(evsel),
 			.sample	= &sample,
 			.hide_unresolved = false,
 		};
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 818acf875dd0..b866a5389a1d 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -65,6 +65,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 			};
 			struct hist_entry_iter iter = {
 				.evsel = evsel,
+				.hists = evsel__hists(evsel),
 				.sample = &sample,
 				.ops = &hist_iter_normal,
 				.hide_unresolved = false,
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index adbebc852cc8..bf9efe145260 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -58,6 +58,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		};
 		struct hist_entry_iter iter = {
 			.evsel = evsel,
+			.hists = evsel__hists(evsel),
 			.sample = &sample,
 			.ops = &hist_iter_normal,
 			.hide_unresolved = false,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 10454197a508..045313c219b6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -511,7 +511,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	u64 cost;
 	struct mem_info *mi = iter->priv;
 	struct perf_sample *sample = iter->sample;
-	struct hists *hists = evsel__hists(iter->evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he;
 
 	if (mi == NULL)
@@ -541,8 +541,7 @@ static int
 iter_finish_mem_entry(struct hist_entry_iter *iter,
 		      struct addr_location *al __maybe_unused)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = iter->he;
 	int err = -EINVAL;
 
@@ -614,8 +613,7 @@ static int
 iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
 	struct branch_info *bi;
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = NULL;
 	int i = iter->curr;
 	int err = 0;
@@ -663,11 +661,10 @@ iter_prepare_normal_entry(struct hist_entry_iter *iter __maybe_unused,
 static int
 iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry *he;
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, true);
 	if (he == NULL)
@@ -682,7 +679,6 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 			 struct addr_location *al __maybe_unused)
 {
 	struct hist_entry *he = iter->he;
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 
 	if (he == NULL)
@@ -690,7 +686,7 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 
 	iter->he = NULL;
 
-	hists__inc_nr_samples(evsel__hists(evsel), he->filtered);
+	hists__inc_nr_samples(iter->hists, he->filtered);
 
 	return hist_entry__append_callchain(he, sample);
 }
@@ -722,8 +718,7 @@ static int
 iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 				 struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
@@ -768,12 +763,11 @@ static int
 iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			       struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
 	struct hist_entry he_tmp = {
-		.hists = evsel__hists(evsel),
+		.hists = iter->hists,
 		.cpu = al->cpu,
 		.thread = al->thread,
 		.comm = thread__comm_by_time(al->thread, sample->time),
@@ -803,7 +797,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		}
 	}
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, false);
 	if (he == NULL)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 7fbb60857f26..081a1f7ffb84 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -92,6 +92,7 @@ struct hist_entry_iter {
 	bool hide_unresolved;
 	int max_stack;
 
+	struct hists *hists;
 	struct perf_evsel *evsel;
 	struct perf_sample *sample;
 	struct hist_entry *he;
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (26 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 27/38] perf hists: Pass hists struct to hist_entry_iter struct Namhyung Kim
@ 2015-10-02  5:19 ` Namhyung Kim
  2015-10-02  5:22 ` Namhyung Kim
  2015-10-02  6:58 ` Namhyung Kim
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The util/event.h header includes util/build-id.h only for BUILD_ID_SIZE.
This is a problem when I include util/event.h from util/tool.h, which
is also included by util/build-id.h, since it creates a circular
dependency resulting in an incomplete type error.
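
The cycle that would be created looks like this (each arrow is an
#include):

  util/tool.h -> util/event.h -> util/build-id.h -> util/tool.h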

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.h          | 3 +++
 tools/perf/util/build-id.h | 3 ---
 tools/perf/util/dso.h      | 1 +
 tools/perf/util/event.h    | 1 -
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index df7c208abb74..d21b5c63f244 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -31,6 +31,9 @@ static inline unsigned long long rdclock(void)
 
 #define MAX_NR_CPUS			1024
 
+#define BUILD_ID_SIZE			20
+#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
+
 extern const char *input_name;
 extern bool perf_host, perf_guest;
 extern const char perf_version_string[];
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8a945b..8f9a5720bc5e 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,9 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE	20
-#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
-
 #include "tool.h"
 #include "strlist.h"
 #include <linux/types.h>
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index fc8db9c764ac..416b9a57fcb9 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a0dbcbd4f6d8..3812d645362c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (27 preceding siblings ...)
  2015-10-02  5:19 ` [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
@ 2015-10-02  5:22 ` Namhyung Kim
  2015-10-02  6:58 ` Namhyung Kim
  29 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  5:22 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The util/event.h header includes util/build-id.h only for BUILD_ID_SIZE.
This is a problem when I include util/event.h from util/tool.h, which
is also included by util/build-id.h, since it creates a circular
dependency resulting in an incomplete type error.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.h          | 3 +++
 tools/perf/util/build-id.h | 3 ---
 tools/perf/util/dso.h      | 1 +
 tools/perf/util/event.h    | 1 -
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index df7c208abb74..d21b5c63f244 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -31,6 +31,9 @@ static inline unsigned long long rdclock(void)
 
 #define MAX_NR_CPUS			1024
 
+#define BUILD_ID_SIZE			20
+#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
+
 extern const char *input_name;
 extern bool perf_host, perf_guest;
 extern const char perf_version_string[];
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8a945b..8f9a5720bc5e 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,9 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE	20
-#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
-
 #include "tool.h"
 #include "strlist.h"
 #include <linux/types.h>
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index fc8db9c764ac..416b9a57fcb9 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a0dbcbd4f6d8..3812d645362c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
                   ` (28 preceding siblings ...)
  2015-10-02  5:22 ` Namhyung Kim
@ 2015-10-02  6:58 ` Namhyung Kim
  2015-10-12 14:32   ` Jiri Olsa
  29 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-02  6:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

The util/event.h header includes util/build-id.h only for BUILD_ID_SIZE.
This is a problem when I include util/event.h from util/tool.h, which
is also included by util/build-id.h, since it creates a circular
dependency resulting in an incomplete type error.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.h          | 3 +++
 tools/perf/util/build-id.h | 3 ---
 tools/perf/util/dso.h      | 1 +
 tools/perf/util/event.h    | 1 -
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index df7c208abb74..d21b5c63f244 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -31,6 +31,9 @@ static inline unsigned long long rdclock(void)
 
 #define MAX_NR_CPUS			1024
 
+#define BUILD_ID_SIZE			20
+#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
+
 extern const char *input_name;
 extern bool perf_host, perf_guest;
 extern const char perf_version_string[];
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 27a14a8a945b..8f9a5720bc5e 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,9 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE	20
-#define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
-
 #include "tool.h"
 #include "strlist.h"
 #include <linux/types.h>
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index fc8db9c764ac..416b9a57fcb9 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a0dbcbd4f6d8..3812d645362c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.6.0


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-02  5:18 ` [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask Namhyung Kim
@ 2015-10-02 18:44   ` Arnaldo Carvalho de Melo
  2015-10-06  8:34     ` Namhyung Kim
  2015-10-08 10:17   ` Jiri Olsa
  1 sibling, 1 reply; 63+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-02 18:44 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:43PM +0900, Namhyung Kim wrote:
> It is more convenient saving mmap length rather than (bit) mask.  With
> this patch, we can eliminate dependency to perf_evlist other than
> getting mmap_desc for dealing with mmaps.  The mask and length can be
> converted using perf_evlist__mmap_mask/len().
> 
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/evlist.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index c5180a29db1b..e46adcd5b408 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -29,6 +29,8 @@
>  
>  static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
> +static size_t perf_evlist__mmap_mask(size_t len);
> +static size_t perf_evlist__mmap_len(size_t mask);

Are these "perf_evlist" methods? I don't think so, those are related to
"perf_mmap".

>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
>  #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
> @@ -871,7 +873,9 @@ void __weak auxtrace_mmap_params__set_idx(
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>  {
>  	if (evlist->mmap[idx].base != NULL) {
> -		munmap(evlist->mmap[idx].base, evlist->mmap_len);
> +		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);

I.e. here you could have it as:

		size_t mmap_len = perf_mmap__len(evlist->mmap[idx]);

> +
> +		munmap(evlist->mmap[idx].base, mmap_len);
>  		evlist->mmap[idx].base = NULL;
>  		atomic_set(&evlist->mmap[idx].refcnt, 0);
>  	}
> @@ -901,8 +905,8 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>  }
>  
>  struct mmap_params {
> -	int prot;
> -	int mask;
> +	int	prot;
> +	size_t	len;
>  	struct auxtrace_mmap_params auxtrace_mp;
>  };
>  
> @@ -924,8 +928,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>  	 */
>  	atomic_set(&evlist->mmap[idx].refcnt, 2);
>  	evlist->mmap[idx].prev = 0;
> -	evlist->mmap[idx].mask = mp->mask;
> -	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
> +	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);

Here, since you're not using a perf_mmap instance, but the calculation
is relative to a perf_mmap property, we would use:

	evlist->mmap[idx].mask = __perf_mmap__mask(mp->len);
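
I.e. something along these lines (untested sketch, keeping your
len/mask arithmetic, probably taking a struct perf_mmap pointer):

	static size_t __perf_mmap__mask(size_t len)
	{
		return len - page_size - 1;
	}

	static size_t perf_mmap__len(struct perf_mmap *m)
	{
		return m->mask + 1 + page_size;
	}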
	

> +	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
>  				      MAP_SHARED, fd, 0);
>  	if (evlist->mmap[idx].base == MAP_FAILED) {
>  		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
> @@ -1071,6 +1075,21 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
>  	return (pages + 1) * page_size;
>  }
>  
> +static size_t perf_evlist__mmap_mask(size_t len)
> +{
> +	BUG_ON(len <= page_size);
> +	BUG_ON((len % page_size) != 0);
> +
> +	return len - page_size - 1;
> +}
> +
> +static size_t perf_evlist__mmap_len(size_t mask)
> +{
> +	BUG_ON(((mask + 1) % page_size) != 0);
> +
> +	return mask + 1 + page_size;
> +}
> +
>  static long parse_pages_arg(const char *str, unsigned long min,
>  			    unsigned long max)
>  {
> @@ -1176,7 +1195,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  	evlist->overwrite = overwrite;
>  	evlist->mmap_len = perf_evlist__mmap_size(pages);
>  	pr_debug("mmap size %zuB\n", evlist->mmap_len);
> -	mp.mask = evlist->mmap_len - page_size - 1;
> +	mp.len = evlist->mmap_len;
>  
>  	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
>  				   auxtrace_pages, auxtrace_overwrite);
> -- 
> 2.6.0

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
@ 2015-10-02 18:45   ` Arnaldo Carvalho de Melo
  2015-10-05 11:29     ` Adrian Hunter
  2015-10-06  8:56     ` Namhyung Kim
  2015-10-05 13:14   ` Jiri Olsa
  2015-10-08 10:18   ` Jiri Olsa
  2 siblings, 2 replies; 63+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-02 18:45 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim wrote:
> Since it's gonna share struct mmap with dummy tracking evsel to track
> meta events only, let's move auxtrace out of struct perf_mmap.

Is this moving around _strictly_ needed?

- Arnaldo
 
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-record.c |  4 ++--
>  tools/perf/util/evlist.c    | 30 +++++++++++++++++++++---------
>  tools/perf/util/evlist.h    |  2 +-
>  3 files changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 5e01c070dbf2..0accac6e0812 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -220,7 +220,7 @@ static int record__auxtrace_read_snapshot_all(struct record *rec)
>  
>  	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
>  		struct auxtrace_mmap *mm =
> -				&rec->evlist->mmap[i].auxtrace_mmap;
> +				&rec->evlist->auxtrace_mmap[i];
>  
>  		if (!mm->base)
>  			continue;
> @@ -405,7 +405,7 @@ static int record__mmap_read_all(struct record *rec)
>  	int rc = 0;
>  
>  	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
> -		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
> +		struct auxtrace_mmap *mm = &rec->evlist->auxtrace_mmap[i];
>  
>  		if (rec->evlist->mmap[i].base) {
>  			if (record__mmap_read(rec, i) != 0) {
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index e46adcd5b408..042dffc67986 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -810,9 +810,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>  	return event;
>  }
>  
> -static bool perf_mmap__empty(struct perf_mmap *md)
> +static bool perf_evlist__mmap_empty(struct perf_evlist *evlist, int idx)
>  {
> -	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
> +	struct perf_mmap *md = &evlist->mmap[idx];
> +
> +	return perf_mmap__read_head(md) == md->prev &&
> +		evlist->auxtrace_mmap[idx].base == NULL;
>  }
>  
>  static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
> @@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
>  		perf_mmap__write_tail(md, old);
>  	}
>  
> -	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
> +	if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
>  		perf_evlist__mmap_put(evlist, idx);
>  }
>  
> @@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>  		evlist->mmap[idx].base = NULL;
>  		atomic_set(&evlist->mmap[idx].refcnt, 0);
>  	}
> -	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
> +	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
>  }
>  
>  void perf_evlist__munmap(struct perf_evlist *evlist)
> @@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>  	if (cpu_map__empty(evlist->cpus))
>  		evlist->nr_mmaps = thread_map__nr(evlist->threads);
>  	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
> -	return evlist->mmap != NULL ? 0 : -ENOMEM;
> +	if (evlist->mmap == NULL)
> +		return -ENOMEM;
> +	evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
> +				       sizeof(struct auxtrace_mmap));
> +	if (evlist->auxtrace_mmap == NULL) {
> +		zfree(&evlist->mmap);
> +		return -ENOMEM;
> +	}
> +	return 0;
>  }
>  
>  struct mmap_params {
> @@ -938,10 +949,6 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>  		return -1;
>  	}
>  
> -	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
> -				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
> -		return -1;
> -
>  	return 0;
>  }
>  
> @@ -963,6 +970,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  			*output = fd;
>  			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
>  				return -1;
> +
> +			if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
> +						&mp->auxtrace_mp,
> +						evlist->mmap[idx].base, fd))
> +				return -1;
>  		} else {
>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>  				return -1;
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 414e383885f5..51574ce8ac69 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -30,7 +30,6 @@ struct perf_mmap {
>  	int		 mask;
>  	atomic_t	 refcnt;
>  	u64		 prev;
> -	struct auxtrace_mmap auxtrace_mmap;
>  	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
>  };
>  
> @@ -53,6 +52,7 @@ struct perf_evlist {
>  	} workload;
>  	struct fdarray	 pollfd;
>  	struct perf_mmap *mmap;
> +	struct auxtrace_mmap *auxtrace_mmap;
>  	struct thread_map *threads;
>  	struct cpu_map	  *cpus;
>  	struct perf_evsel *selected;
> -- 
> 2.6.0

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly
  2015-10-02  5:18 ` [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly Namhyung Kim
@ 2015-10-02 18:47   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 63+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-02 18:47 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:45PM +0900, Namhyung Kim wrote:
> Pass struct perf_mmap to mmap handling functions directly.  This will
> be used by both of normal mmap and track mmap later.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/evlist.c | 24 +++++++++++++++---------
>  tools/perf/util/evlist.h |  1 +
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 042dffc67986..8d31883cbeb8 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -921,8 +921,8 @@ struct mmap_params {
>  	struct auxtrace_mmap_params auxtrace_mp;
>  };
>  
> -static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
> -			       struct mmap_params *mp, int fd)
> +static int perf_mmap__mmap(struct perf_mmap *desc,
> +			   struct mmap_params *mp, int fd)
>  {
>  	/*
>  	 * The last one will be done at perf_evlist__mmap_consume(), so that we
> @@ -937,21 +937,26 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>  	 * evlist layer can't just drop it when filtering events in
>  	 * perf_evlist__filter_pollfd().
>  	 */
> -	atomic_set(&evlist->mmap[idx].refcnt, 2);
> -	evlist->mmap[idx].prev = 0;
> -	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
> -	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
> +	atomic_set(&desc->refcnt, 2);
> +	desc->prev = 0;
> +	desc->mask = perf_evlist__mmap_mask(mp->len);
> +	desc->base = mmap(NULL, mp->len, mp->prot,
>  				      MAP_SHARED, fd, 0);
> -	if (evlist->mmap[idx].base == MAP_FAILED) {
> +	if (desc->base == MAP_FAILED) {
>  		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
>  			  errno);
> -		evlist->mmap[idx].base = NULL;
> +		desc->base = NULL;
>  		return -1;
>  	}
>  
>  	return 0;
>  }
>  
> +struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
> +{
> +	return &evlist->mmap[idx];
> +}
> +

What does 'desc' stand for? Why not just use evlist->mmap[idx]?

>  static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  				       struct mmap_params *mp, int cpu,
>  				       int thread, int *output)
> @@ -960,6 +965,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  
>  	evlist__for_each(evlist, evsel) {
>  		int fd;
> +		struct perf_mmap *desc = perf_evlist__mmap_desc(evlist, idx);
>  
>  		if (evsel->system_wide && thread)
>  			continue;
> @@ -968,7 +974,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>  
>  		if (*output == -1) {
>  			*output = fd;
> -			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
> +			if (perf_mmap__mmap(desc, mp, *output) < 0)
>  				return -1;
>  
>  			if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 51574ce8ac69..79f8245300ad 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -145,6 +145,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
>  		      bool overwrite);
>  void perf_evlist__munmap(struct perf_evlist *evlist);
> +struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx);
>  
>  void perf_evlist__disable(struct perf_evlist *evlist);
>  void perf_evlist__enable(struct perf_evlist *evlist);
> -- 
> 2.6.0

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 09/38] perf record: Add --index option for building index table
  2015-10-02  5:18 ` [RFC/PATCH 09/38] perf record: Add --index option for building index table Namhyung Kim
@ 2015-10-02 18:58   ` Arnaldo Carvalho de Melo
  2015-10-05 13:46   ` Jiri Olsa
  1 sibling, 0 replies; 63+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-02 18:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:50PM +0900, Namhyung Kim wrote:
> The new --index option will create indexed data file which can be
> processed by multiple threads parallelly.  It saves meta event and
> sample data in separate files and merges them with an index table.
> 
> If there's an index table in the data file, the HEADER_DATA_INDEX
> feature bit is set and session->header.index[0] will point to the meta
> event area, and rest are sample data.  It'd look like below:

So this is all about perf.data files, i.e. we will traverse it all
looking for metadata events, then the samples themselves will be processed
using multiple threads.  I.e. two stages: touching the whole perf.data
file looking for the metadata events, then touching it all again to
process the samples, right?
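
(For illustration, a minimal sketch of that two-pass flow over the index
sections; the struct here is only a stand-in for the perf_file_section
offset/size pairs that record__merge_index_files() fills in, the function
names are made up, and the real code would pread/mmap each section of
perf.data instead of printing it:)

#include <pthread.h>
#include <stdio.h>

/* stands in for perf_file_section: one entry per indexed area */
struct file_section { long offset; long size; };

/* pass 1: walk the meta events (index[0]) serially to build machine state */
static void process_meta(struct file_section *sec)
{
        printf("meta events: offset=%ld size=%ld\n", sec->offset, sec->size);
}

/* pass 2: the remaining sections hold only samples, process them in parallel */
static void *process_samples(void *arg)
{
        struct file_section *sec = arg;

        printf("samples: offset=%ld size=%ld\n", sec->offset, sec->size);
        return NULL;
}

int main(void)
{
        struct file_section idx[] = { { 4096, 128 }, { 8192, 256 }, { 16384, 512 } };
        int nr_index = 3, i;
        pthread_t th[3];

        process_meta(&idx[0]);

        for (i = 1; i < nr_index; i++)
                pthread_create(&th[i], NULL, process_samples, &idx[i]);
        for (i = 1; i < nr_index; i++)
                pthread_join(th[i], NULL);

        return 0;
}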

The model for processing events for 'perf report' and for 'perf top' will
differ, right?

Can't we have one thread reading the events, another batching them to
merge up to those FINISHED_ROUND events when it needs to sort things (in
parallel with reading more events for the next round), and then pass the
sorted batch to one thread per CPU to actually process the samples, as in
hist processing, etc.?
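
(Also just a sketch, not anything from the patches: one reader/sorter
thread handing finished-round batches to a small pool of workers through a
bounded queue.  All names and sizes are made up, and the actual reading and
sorting are elided:)

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define QSIZE 8
#define NR_WORKERS 4

/* stands in for a sorted batch of events up to one FINISHED_ROUND */
struct batch { int round; };

static struct batch *queue[QSIZE];
static int qhead, qtail, qcount, done;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t qcond = PTHREAD_COND_INITIALIZER;

/* reader/sorter side: block while the queue is full */
static void put_batch(struct batch *b)
{
        pthread_mutex_lock(&qlock);
        while (qcount == QSIZE)
                pthread_cond_wait(&qcond, &qlock);
        queue[qtail] = b;
        qtail = (qtail + 1) % QSIZE;
        qcount++;
        pthread_cond_broadcast(&qcond);
        pthread_mutex_unlock(&qlock);
}

/* worker side: NULL means no more batches */
static struct batch *get_batch(void)
{
        struct batch *b = NULL;

        pthread_mutex_lock(&qlock);
        while (qcount == 0 && !done)
                pthread_cond_wait(&qcond, &qlock);
        if (qcount) {
                b = queue[qhead];
                qhead = (qhead + 1) % QSIZE;
                qcount--;
                pthread_cond_broadcast(&qcond);
        }
        pthread_mutex_unlock(&qlock);
        return b;
}

static void *worker(void *arg)
{
        struct batch *b;

        while ((b = get_batch()) != NULL) {
                /* resolve thread/dso/symbol and feed a per-worker hists here */
                printf("worker %ld: round %d\n", (long)arg, b->round);
                free(b);
        }
        return NULL;
}

int main(void)
{
        pthread_t th[NR_WORKERS];
        long i;

        for (i = 0; i < NR_WORKERS; i++)
                pthread_create(&th[i], NULL, worker, (void *)i);

        /* the reader/sorter: emit already-sorted batches, one per round */
        for (i = 0; i < 16; i++) {
                struct batch *b = malloc(sizeof(*b));

                b->round = i;
                put_batch(b);
        }

        pthread_mutex_lock(&qlock);
        done = 1;
        pthread_cond_broadcast(&qcond);
        pthread_mutex_unlock(&qlock);

        for (i = 0; i < NR_WORKERS; i++)
                pthread_join(th[i], NULL);

        return 0;
}

In this kind of split the FINISHED_ROUND sorting itself would happen on the
reader side, before put_batch().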

Need to try this, see if 'perf top' works, which I think it will, but
you haven't mentioned anything about it in the cover letter for this
patchkit.

But no speedups should be expected there, as no 'perf.data' file is
involved...

- Arnaldo
 
>         +---------------------+
>         |     file header     |
>         |---------------------|
>         |                     |
>         |    meta events[0] <-+--+
>         |                     |  |
>         |---------------------|  |
>         |                     |  |
>         |    sample data[1] <-+--+
>         |                     |  |
>         |---------------------|  |
>         |                     |  |
>         |    sample data[2] <-|--+
>         |                     |  |
>         |---------------------|  |
>         |         ...         | ...
>         |---------------------|  |
>         |     feature data    |  |
>         |   (contains index) -+--+
>         +---------------------+
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/Documentation/perf-record.txt |   4 +
>  tools/perf/builtin-record.c              | 178 ++++++++++++++++++++++++++++---
>  tools/perf/perf.h                        |   1 +
>  tools/perf/util/header.c                 |   2 +
>  tools/perf/util/session.c                |   1 +
>  5 files changed, 173 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 2e9ce77b5e14..71a9520b10b0 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -308,6 +308,10 @@ This option sets the time out limit. The default value is 500 ms.
>  Record context switch events i.e. events of type PERF_RECORD_SWITCH or
>  PERF_RECORD_SWITCH_CPU_WIDE.
>  
> +--index::
> +Build an index table for sample data.  This will speed up perf report by
> +parallel processing.
> +
>  SEE ALSO
>  --------
>  linkperf:perf-stat[1], linkperf:perf-list[1]
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 623984c81478..096634c4c5ea 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -43,6 +43,7 @@ struct record {
>  	u64			bytes_written;
>  	struct perf_data_file	file;
>  	struct auxtrace_record	*itr;
> +	int			*fds;
>  	struct perf_evlist	*evlist;
>  	struct perf_session	*session;
>  	const char		*progname;
> @@ -52,9 +53,16 @@ struct record {
>  	long			samples;
>  };
>  
> -static int record__write(struct record *rec, void *bf, size_t size)
> +static int record__write(struct record *rec, void *bf, size_t size, int idx)
>  {
> -	if (perf_data_file__write(rec->session->file, bf, size) < 0) {
> +	int fd;
> +
> +	if (rec->fds && idx >= 0)
> +		fd = rec->fds[idx];
> +	else
> +		fd = perf_data_file__fd(rec->session->file);
> +
> +	if (writen(fd, bf, size) < 0) {
>  		pr_err("failed to write perf data, error: %m\n");
>  		return -1;
>  	}
> @@ -69,7 +77,7 @@ static int process_synthesized_event(struct perf_tool *tool,
>  				     struct machine *machine __maybe_unused)
>  {
>  	struct record *rec = container_of(tool, struct record, tool);
> -	return record__write(rec, event, event->header.size);
> +	return record__write(rec, event, event->header.size, -1);
>  }
>  
>  static int record__mmap_read(struct record *rec, int idx)
> @@ -94,7 +102,7 @@ static int record__mmap_read(struct record *rec, int idx)
>  		size = md->mask + 1 - (old & md->mask);
>  		old += size;
>  
> -		if (record__write(rec, buf, size) < 0) {
> +		if (record__write(rec, buf, size, idx) < 0) {
>  			rc = -1;
>  			goto out;
>  		}
> @@ -104,7 +112,7 @@ static int record__mmap_read(struct record *rec, int idx)
>  	size = head - old;
>  	old += size;
>  
> -	if (record__write(rec, buf, size) < 0) {
> +	if (record__write(rec, buf, size, idx) < 0) {
>  		rc = -1;
>  		goto out;
>  	}
> @@ -151,6 +159,7 @@ static int record__process_auxtrace(struct perf_tool *tool,
>  	struct perf_data_file *file = &rec->file;
>  	size_t padding;
>  	u8 pad[8] = {0};
> +	int idx = event->auxtrace.idx;
>  
>  	if (!perf_data_file__is_pipe(file)) {
>  		off_t file_offset;
> @@ -171,11 +180,11 @@ static int record__process_auxtrace(struct perf_tool *tool,
>  	if (padding)
>  		padding = 8 - padding;
>  
> -	record__write(rec, event, event->header.size);
> -	record__write(rec, data1, len1);
> +	record__write(rec, event, event->header.size, idx);
> +	record__write(rec, data1, len1, idx);
>  	if (len2)
> -		record__write(rec, data2, len2);
> -	record__write(rec, &pad, padding);
> +		record__write(rec, data2, len2, idx);
> +	record__write(rec, &pad, padding, idx);
>  
>  	return 0;
>  }
> @@ -268,6 +277,110 @@ int auxtrace_record__snapshot_start(struct auxtrace_record *itr __maybe_unused)
>  
>  #endif
>  
> +#define INDEX_FILE_FMT  "%s.dir/perf.data.%d"
> +
> +static int record__create_index_files(struct record *rec, int nr_index)
> +{
> +	int i = 0;
> +	int ret = -1;
> +	char path[PATH_MAX];
> +	struct perf_data_file *file = &rec->file;
> +
> +	rec->fds = malloc(nr_index * sizeof(int));
> +	if (rec->fds == NULL)
> +		return -ENOMEM;
> +
> +	scnprintf(path, sizeof(path), "%s.dir", file->path);
> +	if (rm_rf(path) < 0 || mkdir(path, S_IRWXU) < 0)
> +		goto out_err;
> +
> +	for (i = 0; i < nr_index; i++) {
> +		scnprintf(path, sizeof(path), INDEX_FILE_FMT, file->path, i);
> +		ret = open(path, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR);
> +		if (ret < 0)
> +			goto out_err;
> +
> +		rec->fds[i] = ret;
> +	}
> +	return 0;
> +
> +out_err:
> +	while (--i >= 1)
> +		close(rec->fds[i]);
> +	zfree(&rec->fds);
> +
> +	scnprintf(path, sizeof(path), "%s.dir", file->path);
> +	rm_rf(path);
> +
> +	return ret;
> +}
> +
> +static int record__merge_index_files(struct record *rec, int nr_index)
> +{
> +	int i;
> +	int ret = -ENOMEM;
> +	u64 offset;
> +	char path[PATH_MAX];
> +	struct perf_file_section *idx;
> +	struct perf_data_file *file = &rec->file;
> +	struct perf_session *session = rec->session;
> +	int output_fd = perf_data_file__fd(file);
> +
> +	/* +1 for header file itself */
> +	nr_index++;
> +
> +	idx = calloc(nr_index, sizeof(*idx));
> +	if (idx == NULL)
> +		goto out_close;
> +
> +	offset = lseek(output_fd, 0, SEEK_END);
> +
> +	idx[0].offset = session->header.data_offset;
> +	idx[0].size   = offset - idx[0].offset;
> +
> +	for (i = 1; i < nr_index; i++) {
> +		struct stat stbuf;
> +		int fd = rec->fds[i - 1];
> +
> +		ret = fstat(fd, &stbuf);
> +		if (ret < 0)
> +			goto out_close;
> +
> +		idx[i].offset = offset;
> +		idx[i].size   = stbuf.st_size;
> +
> +		offset += stbuf.st_size;
> +
> +		if (idx[i].size == 0)
> +			continue;
> +
> +		ret = copyfile_offset(fd, 0, output_fd, idx[i].offset,
> +				      idx[i].size);
> +		if (ret < 0)
> +			goto out_close;
> +	}
> +
> +	session->header.index = idx;
> +	session->header.nr_index = nr_index;
> +
> +	perf_has_index = true;
> +
> +	ret = 0;
> +
> +out_close:
> +	if (ret < 0)
> +		pr_err("failed to merge index files: %d\n", ret);
> +
> +	for (i = 0; i < nr_index - 1; i++)
> +		close(rec->fds[i]);
> +
> +	scnprintf(path, sizeof(path), "%s.dir", file->path);
> +	rm_rf(path);
> +
> +	zfree(&rec->fds);
> +	return ret;
> +}
> +
>  static int record__open(struct record *rec)
>  {
>  	char msg[512];
> @@ -306,7 +419,8 @@ try_again:
>  
>  	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
>  				 opts->auxtrace_mmap_pages,
> -				 opts->auxtrace_snapshot_mode, false) < 0) {
> +				 opts->auxtrace_snapshot_mode,
> +				 opts->index) < 0) {
>  		if (errno == EPERM) {
>  			pr_err("Permission error mapping pages.\n"
>  			       "Consider increasing "
> @@ -323,6 +437,14 @@ try_again:
>  		goto out;
>  	}
>  
> +	if (opts->index) {
> +		rc = record__create_index_files(rec, evlist->nr_mmaps);
> +		if (rc < 0) {
> +			pr_err("failed to create index file: %d\n", rc);
> +			goto out;
> +		}
> +	}
> +
>  	session->evlist = evlist;
>  	perf_session__set_id_hdr_size(session);
>  out:
> @@ -347,7 +469,9 @@ static int process_buildids(struct record *rec)
>  	struct perf_data_file *file  = &rec->file;
>  	struct perf_session *session = rec->session;
>  
> -	if (file->size == 0)
> +	/* update file size after merging sample files with index */
> +	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_END);
> +	if (size == 0)
>  		return 0;
>  
>  	/*
> @@ -414,6 +538,13 @@ static int record__mmap_read_all(struct record *rec)
>  			}
>  		}
>  
> +		if (rec->evlist->track_mmap && rec->evlist->track_mmap[i].base) {
> +			if (record__mmap_read(rec, track_mmap_idx(i)) != 0) {
> +				rc = -1;
> +				goto out;
> +			}
> +		}
> +
>  		if (mm->base && !rec->opts.auxtrace_snapshot_mode &&
>  		    record__auxtrace_mmap_read(rec, mm) != 0) {
>  			rc = -1;
> @@ -426,7 +557,8 @@ static int record__mmap_read_all(struct record *rec)
>  	 * at least one event.
>  	 */
>  	if (bytes_written != rec->bytes_written)
> -		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
> +		rc = record__write(rec, &finished_round_event,
> +				   sizeof(finished_round_event), -1);
>  
>  out:
>  	return rc;
> @@ -452,7 +584,8 @@ static void record__init_features(struct record *rec)
>  	if (!rec->opts.full_auxtrace)
>  		perf_header__clear_feat(&session->header, HEADER_AUXTRACE);
>  
> -	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
> +	if (!rec->opts.index)
> +		perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
>  }
>  
>  static volatile int workload_exec_errno;
> @@ -520,6 +653,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
>  		}
>  	}
>  
> +	if (file->is_pipe && opts->index) {
> +		pr_warning("Indexing is disabled for pipe output\n");
> +		opts->index = false;
> +	}
> +
>  	if (record__open(rec) != 0) {
>  		err = -1;
>  		goto out_child;
> @@ -753,6 +891,9 @@ out_child:
>  		rec->session->header.data_size += rec->bytes_written;
>  		file->size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
>  
> +		if (rec->opts.index)
> +			record__merge_index_files(rec, rec->evlist->nr_mmaps);
> +
>  		if (!rec->no_buildid) {
>  			process_buildids(rec);
>  			/*
> @@ -1119,6 +1260,8 @@ struct option __record_options[] = {
>  			"per thread proc mmap processing timeout in ms"),
>  	OPT_BOOLEAN(0, "switch-events", &record.opts.record_switch_events,
>  		    "Record context switch events"),
> +	OPT_BOOLEAN(0, "index", &record.opts.index,
> +		    "make index for sample data to speed-up processing"),
>  	OPT_END()
>  };
>  
> @@ -1186,6 +1329,15 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
>  		goto out_symbol_exit;
>  	}
>  
> +	if (rec->opts.index) {
> +		if (!rec->opts.sample_time) {
> +			pr_err("Sample timestamp is required for indexing\n");
> +			goto out_symbol_exit;
> +		}
> +
> +		perf_evlist__add_dummy_tracking(rec->evlist);
> +	}
> +
>  	if (rec->opts.target.tid && !rec->opts.no_inherit_set)
>  		rec->opts.no_inherit = true;
>  
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index f4b4d7d8752c..df7c208abb74 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -60,6 +60,7 @@ struct record_opts {
>  	bool	     full_auxtrace;
>  	bool	     auxtrace_snapshot_mode;
>  	bool	     record_switch_events;
> +	bool	     index;
>  	unsigned int freq;
>  	unsigned int mmap_pages;
>  	unsigned int auxtrace_mmap_pages;
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index c357f7f47d32..13ba1402ec1b 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -2706,6 +2706,8 @@ int perf_session__read_header(struct perf_session *session)
>  						   session->tevent.pevent))
>  		goto out_delete_evlist;
>  
> +	perf_has_index = perf_header__has_feat(&session->header, HEADER_DATA_INDEX);
> +
>  	return 0;
>  out_errno:
>  	return -errno;
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 91fa9647f565..7546c4d147b9 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -182,6 +182,7 @@ void perf_session__delete(struct perf_session *session)
>  	machines__exit(&session->machines);
>  	if (session->file)
>  		perf_data_file__close(session->file);
> +	free(session->header.index);
>  	free(session);
>  }
>  
> -- 
> 2.6.0

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02 18:45   ` Arnaldo Carvalho de Melo
@ 2015-10-05 11:29     ` Adrian Hunter
  2015-10-06  9:03       ` Namhyung Kim
  2015-10-06  8:56     ` Namhyung Kim
  1 sibling, 1 reply; 63+ messages in thread
From: Adrian Hunter @ 2015-10-05 11:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>> Since it's gonna share struct mmap with dummy tracking evsel to track
>> meta events only, let's move auxtrace out of struct perf_mmap.
> Is this moving around _strictly_ needed?

Also, what if you wanted to capture AUX data and tracking together.

In addition, currently Intel PT can have either 1 dummy event for tracking
plus sched_switch, or 2 dummy events to allow for system-wide tracking of
context switches.  I.e. there are multiple tracking events.

>
> - Arnaldo
>  
>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/perf/builtin-record.c |  4 ++--
>>  tools/perf/util/evlist.c    | 30 +++++++++++++++++++++---------
>>  tools/perf/util/evlist.h    |  2 +-
>>  3 files changed, 24 insertions(+), 12 deletions(-)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 5e01c070dbf2..0accac6e0812 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -220,7 +220,7 @@ static int record__auxtrace_read_snapshot_all(struct record *rec)
>>  
>>  	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
>>  		struct auxtrace_mmap *mm =
>> -				&rec->evlist->mmap[i].auxtrace_mmap;
>> +				&rec->evlist->auxtrace_mmap[i];
>>  
>>  		if (!mm->base)
>>  			continue;
>> @@ -405,7 +405,7 @@ static int record__mmap_read_all(struct record *rec)
>>  	int rc = 0;
>>  
>>  	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
>> -		struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
>> +		struct auxtrace_mmap *mm = &rec->evlist->auxtrace_mmap[i];
>>  
>>  		if (rec->evlist->mmap[i].base) {
>>  			if (record__mmap_read(rec, i) != 0) {
>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>> index e46adcd5b408..042dffc67986 100644
>> --- a/tools/perf/util/evlist.c
>> +++ b/tools/perf/util/evlist.c
>> @@ -810,9 +810,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>>  	return event;
>>  }
>>  
>> -static bool perf_mmap__empty(struct perf_mmap *md)
>> +static bool perf_evlist__mmap_empty(struct perf_evlist *evlist, int idx)
>>  {
>> -	return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
>> +	struct perf_mmap *md = &evlist->mmap[idx];
>> +
>> +	return perf_mmap__read_head(md) == md->prev &&
>> +		evlist->auxtrace_mmap[idx].base == NULL;
>>  }
>>  
>>  static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
>> @@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
>>  		perf_mmap__write_tail(md, old);
>>  	}
>>  
>> -	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
>> +	if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
>>  		perf_evlist__mmap_put(evlist, idx);
>>  }
>>  
>> @@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>>  		evlist->mmap[idx].base = NULL;
>>  		atomic_set(&evlist->mmap[idx].refcnt, 0);
>>  	}
>> -	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
>> +	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
>>  }
>>  
>>  void perf_evlist__munmap(struct perf_evlist *evlist)
>> @@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>>  	if (cpu_map__empty(evlist->cpus))
>>  		evlist->nr_mmaps = thread_map__nr(evlist->threads);
>>  	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
>> -	return evlist->mmap != NULL ? 0 : -ENOMEM;
>> +	if (evlist->mmap == NULL)
>> +		return -ENOMEM;
>> +	evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
>> +				       sizeof(struct auxtrace_mmap));
>> +	if (evlist->auxtrace_mmap == NULL) {
>> +		zfree(&evlist->mmap);
>> +		return -ENOMEM;
>> +	}
>> +	return 0;
>>  }
>>  
>>  struct mmap_params {
>> @@ -938,10 +949,6 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>>  		return -1;
>>  	}
>>  
>> -	if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
>> -				&mp->auxtrace_mp, evlist->mmap[idx].base, fd))
>> -		return -1;
>> -
>>  	return 0;
>>  }
>>  
>> @@ -963,6 +970,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>>  			*output = fd;
>>  			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
>>  				return -1;
>> +
>> +			if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
>> +						&mp->auxtrace_mp,
>> +						evlist->mmap[idx].base, fd))
>> +				return -1;
>>  		} else {
>>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>>  				return -1;
>> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
>> index 414e383885f5..51574ce8ac69 100644
>> --- a/tools/perf/util/evlist.h
>> +++ b/tools/perf/util/evlist.h
>> @@ -30,7 +30,6 @@ struct perf_mmap {
>>  	int		 mask;
>>  	atomic_t	 refcnt;
>>  	u64		 prev;
>> -	struct auxtrace_mmap auxtrace_mmap;
>>  	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
>>  };
>>  
>> @@ -53,6 +52,7 @@ struct perf_evlist {
>>  	} workload;
>>  	struct fdarray	 pollfd;
>>  	struct perf_mmap *mmap;
>> +	struct auxtrace_mmap *auxtrace_mmap;
>>  	struct thread_map *threads;
>>  	struct cpu_map	  *cpus;
>>  	struct perf_evsel *selected;
>> -- 
>> 2.6.0


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events
  2015-10-02  5:18 ` [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
@ 2015-10-05 12:51   ` Jiri Olsa
  2015-10-06  8:31     ` Namhyung Kim
  0 siblings, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-05 12:51 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:42PM +0900, Namhyung Kim wrote:

SNIP

>  
> +/**
> + * perf_evsel__is_dummy_tracking - Return whether given evsel is a dummy
> + * event for tracking meta events only
> + *
> + * @evsel - evsel selector to be tested
> + *
> + * Return %true if event is a dummy tracking event
> + */
> +static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
> +{
> +	return evsel->attr.type == PERF_TYPE_SOFTWARE &&
> +		evsel->attr.config == PERF_COUNT_SW_DUMMY &&
> +		evsel->attr.task == 1 && evsel->attr.mmap == 1;

should this now check for evsel->tracking?
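
(i.e. roughly the sketch below -- untested, just to illustrate the
suggestion, and it assumes the same headers and fields as the quoted patch:)

static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
{
        /* rely on the tracking bit instead of checking attr.task/attr.mmap */
        return evsel->attr.type == PERF_TYPE_SOFTWARE &&
               evsel->attr.config == PERF_COUNT_SW_DUMMY &&
               evsel->tracking;
}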

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
  2015-10-02 18:45   ` Arnaldo Carvalho de Melo
@ 2015-10-05 13:14   ` Jiri Olsa
  2015-10-06  8:40     ` Namhyung Kim
  2015-10-08 10:18   ` Jiri Olsa
  2 siblings, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-05 13:14 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim wrote:

SNIP

> @@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
>  		perf_mmap__write_tail(md, old);
>  	}
>  
> -	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
> +	if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
>  		perf_evlist__mmap_put(evlist, idx);
>  }
>  
> @@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>  		evlist->mmap[idx].base = NULL;
>  		atomic_set(&evlist->mmap[idx].refcnt, 0);
>  	}
> -	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
> +	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
>  }
>  
>  void perf_evlist__munmap(struct perf_evlist *evlist)
> @@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>  	if (cpu_map__empty(evlist->cpus))
>  		evlist->nr_mmaps = thread_map__nr(evlist->threads);
>  	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
> -	return evlist->mmap != NULL ? 0 : -ENOMEM;
> +	if (evlist->mmap == NULL)
> +		return -ENOMEM;
> +	evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
> +				       sizeof(struct auxtrace_mmap));
> +	if (evlist->auxtrace_mmap == NULL) {
> +		zfree(&evlist->mmap);
> +		return -ENOMEM;
> +	}

can't see evlist->auxtrace_mmap being freed 

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 09/38] perf record: Add --index option for building index table
  2015-10-02  5:18 ` [RFC/PATCH 09/38] perf record: Add --index option for building index table Namhyung Kim
  2015-10-02 18:58   ` Arnaldo Carvalho de Melo
@ 2015-10-05 13:46   ` Jiri Olsa
  2015-10-07  8:21     ` Namhyung Kim
  1 sibling, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-05 13:46 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:50PM +0900, Namhyung Kim wrote:

SNIP

> +static int record__merge_index_files(struct record *rec, int nr_index)
> +{
> +	int i;
> +	int ret = -ENOMEM;
> +	u64 offset;
> +	char path[PATH_MAX];
> +	struct perf_file_section *idx;
> +	struct perf_data_file *file = &rec->file;
> +	struct perf_session *session = rec->session;
> +	int output_fd = perf_data_file__fd(file);
> +
> +	/* +1 for header file itself */
> +	nr_index++;
> +
> +	idx = calloc(nr_index, sizeof(*idx));
> +	if (idx == NULL)
> +		goto out_close;
> +
> +	offset = lseek(output_fd, 0, SEEK_END);
> +
> +	idx[0].offset = session->header.data_offset;
> +	idx[0].size   = offset - idx[0].offset;
> +
> +	for (i = 1; i < nr_index; i++) {
> +		struct stat stbuf;
> +		int fd = rec->fds[i - 1];
> +
> +		ret = fstat(fd, &stbuf);
> +		if (ret < 0)
> +			goto out_close;
> +
> +		idx[i].offset = offset;
> +		idx[i].size   = stbuf.st_size;
> +
> +		offset += stbuf.st_size;
> +
> +		if (idx[i].size == 0)
> +			continue;
> +
> +		ret = copyfile_offset(fd, 0, output_fd, idx[i].offset,
> +				      idx[i].size);
> +		if (ret < 0)
> +			goto out_close;
> +	}
> +
> +	session->header.index = idx;
> +	session->header.nr_index = nr_index;
> +
> +	perf_has_index = true;

I might have asked earlier, but why is this global? seems like
perf_session member to me..

thanks,
jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events
  2015-10-05 12:51   ` Jiri Olsa
@ 2015-10-06  8:31     ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-06  8:31 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

Hi Jiri,

On Mon, Oct 05, 2015 at 02:51:37PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:42PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> >  
> > +/**
> > + * perf_evsel__is_dummy_tracking - Return whether given evsel is a dummy
> > + * event for tracking meta events only
> > + *
> > + * @evsel - evsel selector to be tested
> > + *
> > + * Return %true if event is a dummy tracking event
> > + */
> > +static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
> > +{
> > +	return evsel->attr.type == PERF_TYPE_SOFTWARE &&
> > +		evsel->attr.config == PERF_COUNT_SW_DUMMY &&
> > +		evsel->attr.task == 1 && evsel->attr.mmap == 1;
> 
> should this now check for evsel->tracking?

Originally I thought it needed to differentiate the dummy tracking events
from possible other dummy events.  But it seems there's no need for that.
So yes, maybe I can just check the tracking bit.

Anyway, I need to check the Intel PT code as Adrian said it might have
multiple tracking events.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-02 18:44   ` Arnaldo Carvalho de Melo
@ 2015-10-06  8:34     ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-06  8:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

Hi Arnaldo,

On Fri, Oct 02, 2015 at 03:44:33PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Oct 02, 2015 at 02:18:43PM +0900, Namhyung Kim escreveu:
> > It is more convenient saving mmap length rather than (bit) mask.  With
> > this patch, we can eliminate dependency to perf_evlist other than
> > getting mmap_desc for dealing with mmaps.  The mask and length can be
> > converted using perf_evlist__mmap_mask/len().
> > 
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/evlist.c | 31 +++++++++++++++++++++++++------
> >  1 file changed, 25 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> > index c5180a29db1b..e46adcd5b408 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -29,6 +29,8 @@
> >  
> >  static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
> >  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
> > +static size_t perf_evlist__mmap_mask(size_t len);
> > +static size_t perf_evlist__mmap_len(size_t mask);
> 
> Are these "perf_evlist" methods? I don't think so, those are related to
> "perf_mmap".

Agreed.

> 
> >  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> >  #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
> > @@ -871,7 +873,9 @@ void __weak auxtrace_mmap_params__set_idx(
> >  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
> >  {
> >  	if (evlist->mmap[idx].base != NULL) {
> > -		munmap(evlist->mmap[idx].base, evlist->mmap_len);
> > +		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);
> 
> I.e. here you could have it as:
> 
> 		size_t mmap_len = perf_mmap__len(evlist->mmap[idx]);

OK.

> 
> > +
> > +		munmap(evlist->mmap[idx].base, mmap_len);
> >  		evlist->mmap[idx].base = NULL;
> >  		atomic_set(&evlist->mmap[idx].refcnt, 0);
> >  	}
> > @@ -901,8 +905,8 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
> >  }
> >  
> >  struct mmap_params {
> > -	int prot;
> > -	int mask;
> > +	int	prot;
> > +	size_t	len;
> >  	struct auxtrace_mmap_params auxtrace_mp;
> >  };
> >  
> > @@ -924,8 +928,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
> >  	 */
> >  	atomic_set(&evlist->mmap[idx].refcnt, 2);
> >  	evlist->mmap[idx].prev = 0;
> > -	evlist->mmap[idx].mask = mp->mask;
> > -	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
> > +	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
> 
> Here, since you're not using a perf_mmap instance, but the calculation
> is relative to a perf_mmap property, we would use:
> 
> 	evlist->mmap[idx].mask = __perf_mmap__mask(mp->len);

Will change.

Thanks,
Namhyung


> 	
> 
> > +	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
> >  				      MAP_SHARED, fd, 0);
> >  	if (evlist->mmap[idx].base == MAP_FAILED) {
> >  		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
> > @@ -1071,6 +1075,21 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
> >  	return (pages + 1) * page_size;
> >  }
> >  
> > +static size_t perf_evlist__mmap_mask(size_t len)
> > +{
> > +	BUG_ON(len <= page_size);
> > +	BUG_ON((len % page_size) != 0);
> > +
> > +	return len - page_size - 1;
> > +}
> > +
> > +static size_t perf_evlist__mmap_len(size_t mask)
> > +{
> > +	BUG_ON(((mask + 1) % page_size) != 0);
> > +
> > +	return mask + 1 + page_size;
> > +}
> > +
> >  static long parse_pages_arg(const char *str, unsigned long min,
> >  			    unsigned long max)
> >  {
> > @@ -1176,7 +1195,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
> >  	evlist->overwrite = overwrite;
> >  	evlist->mmap_len = perf_evlist__mmap_size(pages);
> >  	pr_debug("mmap size %zuB\n", evlist->mmap_len);
> > -	mp.mask = evlist->mmap_len - page_size - 1;
> > +	mp.len = evlist->mmap_len;
> >  
> >  	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
> >  				   auxtrace_pages, auxtrace_overwrite);
> > -- 
> > 2.6.0

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-05 13:14   ` Jiri Olsa
@ 2015-10-06  8:40     ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-06  8:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Mon, Oct 05, 2015 at 03:14:34PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> > @@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
> >  		perf_mmap__write_tail(md, old);
> >  	}
> >  
> > -	if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
> > +	if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
> >  		perf_evlist__mmap_put(evlist, idx);
> >  }
> >  
> > @@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
> >  		evlist->mmap[idx].base = NULL;
> >  		atomic_set(&evlist->mmap[idx].refcnt, 0);
> >  	}
> > -	auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
> > +	auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
> >  }
> >  
> >  void perf_evlist__munmap(struct perf_evlist *evlist)
> > @@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
> >  	if (cpu_map__empty(evlist->cpus))
> >  		evlist->nr_mmaps = thread_map__nr(evlist->threads);
> >  	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
> > -	return evlist->mmap != NULL ? 0 : -ENOMEM;
> > +	if (evlist->mmap == NULL)
> > +		return -ENOMEM;
> > +	evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
> > +				       sizeof(struct auxtrace_mmap));
> > +	if (evlist->auxtrace_mmap == NULL) {
> > +		zfree(&evlist->mmap);
> > +		return -ENOMEM;
> > +	}
> 
> can't see evlist->auxtrace_mmap being freed 

Ooops, will add.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02 18:45   ` Arnaldo Carvalho de Melo
  2015-10-05 11:29     ` Adrian Hunter
@ 2015-10-06  8:56     ` Namhyung Kim
  1 sibling, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-06  8:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Sat, Oct 3, 2015 at 3:45 AM, Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>> Since it's gonna share struct mmap with dummy tracking evsel to track
>> meta events only, let's move auxtrace out of struct perf_mmap.
>
> Is this moving around _strictly_ needed?

In a later patch, I added another perf_mmap instance for dummy
tracking events.  So keeping auxtrace_mmap in struct perf_mmap would be a
duplication.  It's not strictly needed, but it would waste some memory.
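
(A toy illustration of the memory argument, with made-up sizes: embedding
the aux descriptor in every ring buffer descriptor costs an unused copy in
each tracking mmap, while keeping it in a separate per-index array only
covers the main mmaps:)

#include <stdio.h>

struct aux_desc  { char state[4096]; };                /* stands in for auxtrace_mmap    */
struct ring_full { void *base; struct aux_desc aux; }; /* aux embedded in each perf_mmap */
struct ring_min  { void *base; };                      /* aux kept in a separate array   */

int main(void)
{
        int nr = 8;     /* nr_mmaps; with tracking mmaps there are 2 * nr descriptors */

        printf("embedded: %zu bytes\n", 2 * nr * sizeof(struct ring_full));
        printf("split   : %zu bytes\n",
               2 * nr * sizeof(struct ring_min) + nr * sizeof(struct aux_desc));
        return 0;
}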

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-05 11:29     ` Adrian Hunter
@ 2015-10-06  9:03       ` Namhyung Kim
  2015-10-06  9:26         ` Adrian Hunter
  0 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-06  9:03 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
	LKML, Frederic Weisbecker, Stephane Eranian, David Ahern,
	Andi Kleen

Hi Adrian,

On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
>> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>>> Since it's gonna share struct mmap with dummy tracking evsel to track
>>> meta events only, let's move auxtrace out of struct perf_mmap.
>> Is this moving around _strictly_ needed?
>
> Also, what if you wanted to capture AUX data and tracking together.

Hmm.. I don't know what's the problem.  It should be orthogonal and
support doing that together IMHO.  Maybe I'm missing something about
the aux data processing and Intel PT.  I'll take a look at it..


>
> In addition, currently Intel PT can have either 1 dummy event for tracking
> plus sched_switch, or 2 dummy events to allow for system-wide tracking of
> context switches.  I.e. there are multiple tracking events.

Again, I don't have a good idea of what's going on in this area.  I need to
look at the code and think about what I can do.

Thanks for your review!
Namhyung



>
>>
>> - Arnaldo
>>
>>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>>> ---
>>>  tools/perf/builtin-record.c |  4 ++--
>>>  tools/perf/util/evlist.c    | 30 +++++++++++++++++++++---------
>>>  tools/perf/util/evlist.h    |  2 +-
>>>  3 files changed, 24 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>> index 5e01c070dbf2..0accac6e0812 100644
>>> --- a/tools/perf/builtin-record.c
>>> +++ b/tools/perf/builtin-record.c
>>> @@ -220,7 +220,7 @@ static int record__auxtrace_read_snapshot_all(struct record *rec)
>>>
>>>      for (i = 0; i < rec->evlist->nr_mmaps; i++) {
>>>              struct auxtrace_mmap *mm =
>>> -                            &rec->evlist->mmap[i].auxtrace_mmap;
>>> +                            &rec->evlist->auxtrace_mmap[i];
>>>
>>>              if (!mm->base)
>>>                      continue;
>>> @@ -405,7 +405,7 @@ static int record__mmap_read_all(struct record *rec)
>>>      int rc = 0;
>>>
>>>      for (i = 0; i < rec->evlist->nr_mmaps; i++) {
>>> -            struct auxtrace_mmap *mm = &rec->evlist->mmap[i].auxtrace_mmap;
>>> +            struct auxtrace_mmap *mm = &rec->evlist->auxtrace_mmap[i];
>>>
>>>              if (rec->evlist->mmap[i].base) {
>>>                      if (record__mmap_read(rec, i) != 0) {
>>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>>> index e46adcd5b408..042dffc67986 100644
>>> --- a/tools/perf/util/evlist.c
>>> +++ b/tools/perf/util/evlist.c
>>> @@ -810,9 +810,12 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>>>      return event;
>>>  }
>>>
>>> -static bool perf_mmap__empty(struct perf_mmap *md)
>>> +static bool perf_evlist__mmap_empty(struct perf_evlist *evlist, int idx)
>>>  {
>>> -    return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
>>> +    struct perf_mmap *md = &evlist->mmap[idx];
>>> +
>>> +    return perf_mmap__read_head(md) == md->prev &&
>>> +            evlist->auxtrace_mmap[idx].base == NULL;
>>>  }
>>>
>>>  static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
>>> @@ -838,7 +841,7 @@ void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
>>>              perf_mmap__write_tail(md, old);
>>>      }
>>>
>>> -    if (atomic_read(&md->refcnt) == 1 && perf_mmap__empty(md))
>>> +    if (atomic_read(&md->refcnt) == 1 && perf_evlist__mmap_empty(evlist, idx))
>>>              perf_evlist__mmap_put(evlist, idx);
>>>  }
>>>
>>> @@ -879,7 +882,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>>>              evlist->mmap[idx].base = NULL;
>>>              atomic_set(&evlist->mmap[idx].refcnt, 0);
>>>      }
>>> -    auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
>>> +    auxtrace_mmap__munmap(&evlist->auxtrace_mmap[idx]);
>>>  }
>>>
>>>  void perf_evlist__munmap(struct perf_evlist *evlist)
>>> @@ -901,7 +904,15 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>>>      if (cpu_map__empty(evlist->cpus))
>>>              evlist->nr_mmaps = thread_map__nr(evlist->threads);
>>>      evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
>>> -    return evlist->mmap != NULL ? 0 : -ENOMEM;
>>> +    if (evlist->mmap == NULL)
>>> +            return -ENOMEM;
>>> +    evlist->auxtrace_mmap = calloc(evlist->nr_mmaps,
>>> +                                   sizeof(struct auxtrace_mmap));
>>> +    if (evlist->auxtrace_mmap == NULL) {
>>> +            zfree(&evlist->mmap);
>>> +            return -ENOMEM;
>>> +    }
>>> +    return 0;
>>>  }
>>>
>>>  struct mmap_params {
>>> @@ -938,10 +949,6 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>>>              return -1;
>>>      }
>>>
>>> -    if (auxtrace_mmap__mmap(&evlist->mmap[idx].auxtrace_mmap,
>>> -                            &mp->auxtrace_mp, evlist->mmap[idx].base, fd))
>>> -            return -1;
>>> -
>>>      return 0;
>>>  }
>>>
>>> @@ -963,6 +970,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
>>>                      *output = fd;
>>>                      if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
>>>                              return -1;
>>> +
>>> +                    if (auxtrace_mmap__mmap(&evlist->auxtrace_mmap[idx],
>>> +                                            &mp->auxtrace_mp,
>>> +                                            evlist->mmap[idx].base, fd))
>>> +                            return -1;
>>>              } else {
>>>                      if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>>>                              return -1;
>>> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
>>> index 414e383885f5..51574ce8ac69 100644
>>> --- a/tools/perf/util/evlist.h
>>> +++ b/tools/perf/util/evlist.h
>>> @@ -30,7 +30,6 @@ struct perf_mmap {
>>>      int              mask;
>>>      atomic_t         refcnt;
>>>      u64              prev;
>>> -    struct auxtrace_mmap auxtrace_mmap;
>>>      char             event_copy[PERF_SAMPLE_MAX_SIZE] __attribute__((aligned(8)));
>>>  };
>>>
>>> @@ -53,6 +52,7 @@ struct perf_evlist {
>>>      } workload;
>>>      struct fdarray   pollfd;
>>>      struct perf_mmap *mmap;
>>> +    struct auxtrace_mmap *auxtrace_mmap;
>>>      struct thread_map *threads;
>>>      struct cpu_map    *cpus;
>>>      struct perf_evsel *selected;
>>> --
>>> 2.6.0
>



-- 
Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-06  9:03       ` Namhyung Kim
@ 2015-10-06  9:26         ` Adrian Hunter
  2015-10-07  9:06           ` Namhyung Kim
  0 siblings, 1 reply; 63+ messages in thread
From: Adrian Hunter @ 2015-10-06  9:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
	LKML, Frederic Weisbecker, Stephane Eranian, David Ahern,
	Andi Kleen

On 06/10/15 12:03, Namhyung Kim wrote:
> Hi Adrian,
> 
> On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
>>> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>>>> Since it's gonna share struct mmap with dummy tracking evsel to track
>>>> meta events only, let's move auxtrace out of struct perf_mmap.
>>> Is this moving around _strictly_ needed?
>>
>> Also, what if you wanted to capture AUX data and tracking together.
> 
> Hmm.. I don't know what's the problem.  It should be orthogonal and
> support doing that together IMHO.  Maybe I'm missing something about
> the aux data processing and Intel PT.  I'll take a look at it..
> 

It is only orthogonal if you assume we will never want to support parallel
processing with Intel PT.

The only change that needs to be made is not to assume there is only 1
tracking event.
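
(i.e. instead of grabbing a single tracking evsel, walk the evlist and
handle every one that matches -- just a sketch written in terms of the
helpers from the quoted patches:)

        evlist__for_each(evlist, evsel) {
                if (!perf_evsel__is_dummy_tracking(evsel))
                        continue;
                /* set up or share a tracking mmap for this evsel as well */
        }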

IMHO there could be separate mmap_params also, which would allow for
different mmap sizes for the tracking and main mmaps.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 09/38] perf record: Add --index option for building index table
  2015-10-05 13:46   ` Jiri Olsa
@ 2015-10-07  8:21     ` Namhyung Kim
  2015-10-07 12:10       ` Jiri Olsa
  0 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-07  8:21 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

Hi Jiri,

On Mon, Oct 5, 2015 at 10:46 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> On Fri, Oct 02, 2015 at 02:18:50PM +0900, Namhyung Kim wrote:
>
> SNIP
>
>> +static int record__merge_index_files(struct record *rec, int nr_index)
>> +{
>> +     int i;
>> +     int ret = -ENOMEM;
>> +     u64 offset;
>> +     char path[PATH_MAX];
>> +     struct perf_file_section *idx;
>> +     struct perf_data_file *file = &rec->file;
>> +     struct perf_session *session = rec->session;
>> +     int output_fd = perf_data_file__fd(file);
>> +
>> +     /* +1 for header file itself */
>> +     nr_index++;
>> +
>> +     idx = calloc(nr_index, sizeof(*idx));
>> +     if (idx == NULL)
>> +             goto out_close;
>> +
>> +     offset = lseek(output_fd, 0, SEEK_END);
>> +
>> +     idx[0].offset = session->header.data_offset;
>> +     idx[0].size   = offset - idx[0].offset;
>> +
>> +     for (i = 1; i < nr_index; i++) {
>> +             struct stat stbuf;
>> +             int fd = rec->fds[i - 1];
>> +
>> +             ret = fstat(fd, &stbuf);
>> +             if (ret < 0)
>> +                     goto out_close;
>> +
>> +             idx[i].offset = offset;
>> +             idx[i].size   = stbuf.st_size;
>> +
>> +             offset += stbuf.st_size;
>> +
>> +             if (idx[i].size == 0)
>> +                     continue;
>> +
>> +             ret = copyfile_offset(fd, 0, output_fd, idx[i].offset,
>> +                                   idx[i].size);
>> +             if (ret < 0)
>> +                     goto out_close;
>> +     }
>> +
>> +     session->header.index = idx;
>> +     session->header.nr_index = nr_index;
>> +
>> +     perf_has_index = true;
>
> I might have asked earlier, but why is this global? seems like
> perf_session member to me..

Yes you did. :-)

https://lkml.org/lkml/2015/5/19/110


Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-06  9:26         ` Adrian Hunter
@ 2015-10-07  9:06           ` Namhyung Kim
  2015-10-08 16:07             ` Adrian Hunter
  0 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-07  9:06 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
	LKML, Frederic Weisbecker, Stephane Eranian, David Ahern,
	Andi Kleen

Hi Adrian,

On Tue, Oct 6, 2015 at 6:26 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 06/10/15 12:03, Namhyung Kim wrote:
>> Hi Adrian,
>>
>> On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>> On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
>>>> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>>>>> Since it's gonna share struct mmap with dummy tracking evsel to track
>>>>> meta events only, let's move auxtrace out of struct perf_mmap.
>>>> Is this moving around _strictly_ needed?
>>>
>>> Also, what if you wanted to capture AUX data and tracking together.
>>
>> Hmm.. I don't know what's the problem.  It should be orthogonal and
>> support doing that together IMHO.  Maybe I'm missing something about
>> the aux data processing and Intel PT.  I'll take a look at it..
>>
>
> It is only orthogonal if you assume we will never want to support parallel
> processing with Intel PT.

We'll definitely want it. :)

>
> The only change that needs to be made is not to assume there is only 1
> tracking event.

IIUC Intel PT (and BTS?) needs maximum 2 dummy events - one is to
track task/mmap and another is to track context switches.  The latter
is basically a light-weight version of the sched_switch event, right?

For parallel processing, each cpu needs to keep current thread to
synthesize events from auxtrace data.  So if it processed the switch
events before processing samples, it'd need to build long lists of
current thread per cpu.  IMHO it'd be better to process the switch
events with samples using multi-thread rather than processing them
prior to samples.

So how about this?  It'd use *always* 2 dummy (or 1 dummy + 1
sched_switch) events.  The tracking dummy events would be recorded on
the tracking mmaps and switch (dummy) event would be recorded on the
main mmaps.  This way we can parallelize the auxtrace processing
without the list of current thread IMHO.

Do I miss something?
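
(A tiny sketch of that idea: switch events interleaved with samples in one
time-sorted per-cpu stream, so a single current-thread slot per cpu is
enough and no per-cpu list of (time, thread) pairs is needed.  Everything
below is made up for the example:)

#include <stdio.h>

enum ev_type { EV_SWITCH, EV_SAMPLE };

struct ev { enum ev_type type; int cpu; int tid; };   /* toy stand-in for perf events */

int main(void)
{
        /* one cpu's stream, already sorted by time */
        struct ev stream[] = {
                { EV_SWITCH, 0, 100 },
                { EV_SAMPLE, 0, 0 },     /* attributed to current[0] == 100 */
                { EV_SWITCH, 0, 200 },
                { EV_SAMPLE, 0, 0 },     /* attributed to current[0] == 200 */
        };
        int current[1] = { -1 };         /* current thread per cpu */
        unsigned int i;

        for (i = 0; i < sizeof(stream) / sizeof(stream[0]); i++) {
                if (stream[i].type == EV_SWITCH)
                        current[stream[i].cpu] = stream[i].tid;
                else
                        printf("sample on cpu %d -> tid %d\n",
                               stream[i].cpu, current[stream[i].cpu]);
        }
        return 0;
}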

>
> IMHO there could be separate mmap_params also, which would allow for
> different mmap sizes for the tracking and main mmaps.

Currently, the tracking mmap size is fixed at an arbitrary size
(128KiB) regardless of the main mmaps.  I can add an option to change
the tracking mmap size too.
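
(Such an option could be as simple as the sketch below in
__record_options[]; 'tracking-mmap-pages' and the opts field are
hypothetical names here, and a real version would probably reuse the size
parsing of --mmap-pages:)

        OPT_UINTEGER(0, "tracking-mmap-pages", &record.opts.tracking_mmap_pages,
                     "number of pages for the tracking mmaps"),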

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 09/38] perf record: Add --index option for building index table
  2015-10-07  8:21     ` Namhyung Kim
@ 2015-10-07 12:10       ` Jiri Olsa
  0 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-07 12:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Wed, Oct 07, 2015 at 05:21:46PM +0900, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Mon, Oct 5, 2015 at 10:46 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> > On Fri, Oct 02, 2015 at 02:18:50PM +0900, Namhyung Kim wrote:
> >
> > SNIP
> >
> >> +static int record__merge_index_files(struct record *rec, int nr_index)
> >> +{
> >> +     int i;
> >> +     int ret = -ENOMEM;
> >> +     u64 offset;
> >> +     char path[PATH_MAX];
> >> +     struct perf_file_section *idx;
> >> +     struct perf_data_file *file = &rec->file;
> >> +     struct perf_session *session = rec->session;
> >> +     int output_fd = perf_data_file__fd(file);
> >> +
> >> +     /* +1 for header file itself */
> >> +     nr_index++;
> >> +
> >> +     idx = calloc(nr_index, sizeof(*idx));
> >> +     if (idx == NULL)
> >> +             goto out_close;
> >> +
> >> +     offset = lseek(output_fd, 0, SEEK_END);
> >> +
> >> +     idx[0].offset = session->header.data_offset;
> >> +     idx[0].size   = offset - idx[0].offset;
> >> +
> >> +     for (i = 1; i < nr_index; i++) {
> >> +             struct stat stbuf;
> >> +             int fd = rec->fds[i - 1];
> >> +
> >> +             ret = fstat(fd, &stbuf);
> >> +             if (ret < 0)
> >> +                     goto out_close;
> >> +
> >> +             idx[i].offset = offset;
> >> +             idx[i].size   = stbuf.st_size;
> >> +
> >> +             offset += stbuf.st_size;
> >> +
> >> +             if (idx[i].size == 0)
> >> +                     continue;
> >> +
> >> +             ret = copyfile_offset(fd, 0, output_fd, idx[i].offset,
> >> +                                   idx[i].size);
> >> +             if (ret < 0)
> >> +                     goto out_close;
> >> +     }
> >> +
> >> +     session->header.index = idx;
> >> +     session->header.nr_index = nr_index;
> >> +
> >> +     perf_has_index = true;
> >
> > I might have asked earlier, but why is this global? seems like
> > perf_session member to me..
> 
> Yes you did. :-)
> 
> https://lkml.org/lkml/2015/5/19/110

ah right ;-) ok

thanks,
jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-02  5:18 ` [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask Namhyung Kim
  2015-10-02 18:44   ` Arnaldo Carvalho de Melo
@ 2015-10-08 10:17   ` Jiri Olsa
  2015-10-09  6:03     ` Namhyung Kim
  1 sibling, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-08 10:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:43PM +0900, Namhyung Kim wrote:
> It is more convenient saving mmap length rather than (bit) mask.  With
> this patch, we can eliminate dependency to perf_evlist other than
> getting mmap_desc for dealing with mmaps.  The mask and length can be
> converted using perf_evlist__mmap_mask/len().
> 
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

after this patch I'm hitting:

[jolsa@krava perf]$ ./perf record  kill
kill: not enough arguments
perf: util/evlist.c:1003: perf_evlist__mmap_len: Assertion `!((mask & page_size) != 0)' failed.
Aborted (core dumped)
[jolsa@krava perf]$ 


jirka

> ---
>  tools/perf/util/evlist.c | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index c5180a29db1b..e46adcd5b408 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -29,6 +29,8 @@
>  
>  static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
> +static size_t perf_evlist__mmap_mask(size_t len);
> +static size_t perf_evlist__mmap_len(size_t mask);
>  
>  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
>  #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
> @@ -871,7 +873,9 @@ void __weak auxtrace_mmap_params__set_idx(
>  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
>  {
>  	if (evlist->mmap[idx].base != NULL) {
> -		munmap(evlist->mmap[idx].base, evlist->mmap_len);
> +		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);
> +
> +		munmap(evlist->mmap[idx].base, mmap_len);
>  		evlist->mmap[idx].base = NULL;
>  		atomic_set(&evlist->mmap[idx].refcnt, 0);
>  	}
> @@ -901,8 +905,8 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
>  }
>  
>  struct mmap_params {
> -	int prot;
> -	int mask;
> +	int	prot;
> +	size_t	len;
>  	struct auxtrace_mmap_params auxtrace_mp;
>  };
>  
> @@ -924,8 +928,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>  	 */
>  	atomic_set(&evlist->mmap[idx].refcnt, 2);
>  	evlist->mmap[idx].prev = 0;
> -	evlist->mmap[idx].mask = mp->mask;
> -	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
> +	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
> +	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
>  				      MAP_SHARED, fd, 0);
>  	if (evlist->mmap[idx].base == MAP_FAILED) {
>  		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
> @@ -1071,6 +1075,21 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
>  	return (pages + 1) * page_size;
>  }
>  
> +static size_t perf_evlist__mmap_mask(size_t len)
> +{
> +	BUG_ON(len <= page_size);
> +	BUG_ON((len % page_size) != 0);
> +
> +	return len - page_size - 1;
> +}
> +
> +static size_t perf_evlist__mmap_len(size_t mask)
> +{
> +	BUG_ON(((mask + 1) % page_size) != 0);
> +
> +	return mask + 1 + page_size;
> +}
> +
>  static long parse_pages_arg(const char *str, unsigned long min,
>  			    unsigned long max)
>  {
> @@ -1176,7 +1195,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
>  	evlist->overwrite = overwrite;
>  	evlist->mmap_len = perf_evlist__mmap_size(pages);
>  	pr_debug("mmap size %zuB\n", evlist->mmap_len);
> -	mp.mask = evlist->mmap_len - page_size - 1;
> +	mp.len = evlist->mmap_len;
>  
>  	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
>  				   auxtrace_pages, auxtrace_overwrite);
> -- 
> 2.6.0
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
  2015-10-02 18:45   ` Arnaldo Carvalho de Melo
  2015-10-05 13:14   ` Jiri Olsa
@ 2015-10-08 10:18   ` Jiri Olsa
  2 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-08 10:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim wrote:
> Since it's gonna share struct mmap with dummy tracking evsel to track
> meta events only, let's move auxtrace out of struct perf_mmap.

after applying this one I got:

[jolsa@krava perf]$ ./perf record  kill
failed to mmap with 13 (Permission denied)

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time()
  2015-10-02  5:18 ` [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time() Namhyung Kim
@ 2015-10-08 12:20   ` Jiri Olsa
  2015-10-09  6:04     ` Namhyung Kim
  0 siblings, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-08 12:20 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:56PM +0900, Namhyung Kim wrote:

SNIP

> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 674792e8fa2f..ad7c2a00bff8 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -160,6 +160,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
>  
>  	/* Override the default :tid entry */
>  	if (!thread->comm_set) {
> +		if (!thread->start_time)
> +			thread->start_time = timestamp;
> +
>  		err = comm__override(curr, str, timestamp, exec);
>  		if (err)
>  			return err;
> @@ -266,6 +269,7 @@ int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
>  	}
>  
>  	thread->ppid = parent->tid;
> +	thread->start_time = timestamp;
>  	return thread__clone_map_groups(thread, parent);
>  }
>  
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> index b8f794d97b75..97026a9660ec 100644
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -28,6 +28,7 @@ struct thread {
>  	bool			dead; /* thread is in dead_threads list */
>  	struct list_head	comm_list;
>  	u64			db_id;
> +	u64			start_time;

introducing start_time in a separate patch
would ease up the review a bit

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-02  5:18 ` [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
@ 2015-10-08 12:51   ` Jiri Olsa
  2015-10-09  6:24     ` Namhyung Kim
  2015-10-08 12:58   ` Jiri Olsa
  1 sibling, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-08 12:51 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:58PM +0900, Namhyung Kim wrote:

SNIP

> +static int thread__clone_map_groups(struct thread *thread,
> +				    struct thread *parent);
> +
>  int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
>  		       bool exec)
>  {
> @@ -182,6 +257,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
>  			unwind__flush_access(thread);
>  	}
>  
> +	if (exec) {
> +		struct machine *machine;
> +
> +		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
> +
> +		machine = thread->mg->machine;
> +
> +		if (thread->tid != thread->pid_) {
> +			struct map_groups *old = thread->mg;
> +			struct thread *leader;
> +
> +			leader = machine__findnew_thread(machine, thread->pid_,
> +							 thread->pid_);
> +
> +			/* now it'll be a new leader */
> +			thread->pid_ = thread->tid;
> +
> +			thread->mg = map_groups__new(old->machine);
> +			if (thread->mg == NULL)
> +				return -ENOMEM;

hum, isn't this leaking thread->mg?
should we call map_groups__put(old) at the end of the block?


jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-02  5:18 ` [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
  2015-10-08 12:51   ` Jiri Olsa
@ 2015-10-08 12:58   ` Jiri Olsa
  2015-10-09  6:58     ` Namhyung Kim
  1 sibling, 1 reply; 63+ messages in thread
From: Jiri Olsa @ 2015-10-08 12:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:58PM +0900, Namhyung Kim wrote:

SNIP

>  int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
>  		       bool exec)
>  {
> @@ -182,6 +257,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
>  			unwind__flush_access(thread);
>  	}
>  
> +	if (exec) {
> +		struct machine *machine;
> +
> +		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
> +
> +		machine = thread->mg->machine;
> +
> +		if (thread->tid != thread->pid_) {
> +			struct map_groups *old = thread->mg;
> +			struct thread *leader;
> +
> +			leader = machine__findnew_thread(machine, thread->pid_,
> +							 thread->pid_);
> +
> +			/* now it'll be a new leader */
> +			thread->pid_ = thread->tid;
> +
> +			thread->mg = map_groups__new(old->machine);
> +			if (thread->mg == NULL)
> +				return -ENOMEM;
> +
> +			/* save current mg in the new leader */
> +			thread__clone_map_groups(thread, leader);
> +
> +			/* current mg of leader thread needs one more refcnt */
> +			map_groups__get(thread->mg);
> +
> +			thread__set_map_groups(thread, thread->mg, old->timestamp);
> +		}
> +
> +		/* create a new mg for newly executed binary */
> +		thread__set_map_groups(thread, map_groups__new(machine), timestamp);

should this     ^^^^ be in the else case of above condition?

also, thread__fork calls thread__clone_map_groups once again;
I have some difficulty sorting this out ATM.. is that correct?

some comment on how we treat map groups in general (for fork/clone/exit)
would be awesome ;-)

thanks,
jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-07  9:06           ` Namhyung Kim
@ 2015-10-08 16:07             ` Adrian Hunter
  2015-10-09  7:54               ` Namhyung Kim
  0 siblings, 1 reply; 63+ messages in thread
From: Adrian Hunter @ 2015-10-08 16:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
	LKML, Frederic Weisbecker, Stephane Eranian, David Ahern,
	Andi Kleen

On 7/10/2015 12:06 p.m., Namhyung Kim wrote:
> Hi Adrian,
>
> On Tue, Oct 6, 2015 at 6:26 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 06/10/15 12:03, Namhyung Kim wrote:
>>> Hi Adrian,
>>>
>>> On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>> On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
>>>>> Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
>>>>>> Since it's gonna share struct mmap with dummy tracking evsel to track
>>>>>> meta events only, let's move auxtrace out of struct perf_mmap.
>>>>> Is this moving around _strictly_ needed?
>>>>
>>>> Also, what if you wanted to capture AUX data and tracking together.
>>>
>>> Hmm.. I don't know what's the problem.  It should be orthogonal and
>>> support doing that together IMHO.  Maybe I'm missing something about
>>> the aux data processing and Intel PT.  I'll take a look at it..
>>>
>>
>> It is only orthogonal if you assume we will never want to support parallel
>> processing with Intel PT.
>
> We'll definitely want it. :)
>
>>
>> The only change that needs to be made is not to assume there is only 1
>> tracking event.

Sorry for the slow reply.

>
> IIUC Intel PT (and BTS?) needs maximum 2 dummy events - one is to
> track task/mmap and another is to track context switches.  The latter
> is basically a light-weight version of the sched_switch event, right?

Yes

>
> For parallel processing, each cpu needs to keep current thread to
> synthesize events from auxtrace data.  So if it processed the switch
> events before processing samples, it'd need to build long lists of
> current thread per cpu.  IMHO it'd be better to process the switch
> events with samples using multi-thread rather than processing them
> prior to samples.

That is a good point.

But that would be limited to dividing the data by cpu.  It would be more
useful to divide it any which way.  Does 'perf report' care if the
data is not in order?

> So how about this?  It'd use *always* 2 dummy (or 1 dummy + 1
> sched_switch) events.  The tracking dummy events would be recorded on
> the tracking mmaps and switch (dummy) event would be recorded on the
> main mmaps.  This way we can parallelize the auxtrace processing
> without the list of current thread IMHO.
>
> Do I miss something?

Thinking about it now, it would probably make sense to put the AUX
event with the tracking events as well, so the data can be queued up
ready for processing, then the AUX index would not be needed.  But of
course, if there were no other events, then there would be no main
mmap at all.

From that point of view, I guess I don't need to worry about splitting
up the mmaps at all, just process them more than once if need be.

>
>>
>> IMHO there could be separate mmap_params also, which would allow for
>> different mmap sizes for the tracking and main mmaps.
>
> Currently, the tracking mmap size is fixed at an arbitrary size
> (128KiB) regardless of the main mmaps.  I can add an option to change
> the tracking mmap size too.

I meant more from the program point of view, to allow different parameters.
Such as allowing one mmap to be PROT_READ and the other PROT_READ|PROT_WRITE
i.e. collect all the tracking events but let the other events overwrite
- perhaps as some kind of snapshot mode like we do with Intel PT.

It seemed to me that it would be more flexible to put evsels into mmap
groups.  Then those groups could have any events or be used in various ways.
I also thought it might make the mmap code more readable, instead of having
lots of "if tracking event do something different".

On the other hand, it is just a thought.  As I mentioned above, I realized
I could probably manage without splitting the mmaps.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-08 10:17   ` Jiri Olsa
@ 2015-10-09  6:03     ` Namhyung Kim
  2015-10-12 12:42       ` Jiri Olsa
  0 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-09  6:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

Hi Jiri,

On Thu, Oct 08, 2015 at 12:17:11PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:43PM +0900, Namhyung Kim wrote:
> > It is more convenient saving mmap length rather than (bit) mask.  With
> > this patch, we can eliminate dependency to perf_evlist other than
> > getting mmap_desc for dealing with mmaps.  The mask and length can be
> > converted using perf_evlist__mmap_mask/len().
> > 
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> 
> after this patch I'm hitting:
> 
> [jolsa@krava perf]$ ./perf record  kill
> kill: not enough arguments
> perf: util/evlist.c:1003: perf_evlist__mmap_len: Assertion `!((mask & page_size) != 0)' failed.
> Aborted (core dumped)
> [jolsa@krava perf]$ 

This is strange..  I think I fixed it already.  And the expression in
the assertion is different from the code in the patch:

  static size_t perf_evlist__mmap_len(size_t mask)
  {
         BUG_ON(((mask + 1) % page_size) != 0);
  
         return mask + 1 + page_size;
  }
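
For reference, the round trip the two helpers are meant to preserve,
as a quick standalone sketch (not the actual tools code; page_size is
the runtime page size):

  #include <assert.h>
  #include <unistd.h>

  int main(void)
  {
          size_t page_size = sysconf(_SC_PAGESIZE);
          size_t len  = 9 * page_size;            /* 8 data pages + 1 header page */
          size_t mask = len - page_size - 1;      /* perf_evlist__mmap_mask(len)  */

          assert(mask + 1 + page_size == len);    /* perf_evlist__mmap_len(mask)  */
          return 0;
  }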

Could you please double check?

Thanks,
Namhyung


> 
> 
> jirka
> 
> > ---
> >  tools/perf/util/evlist.c | 31 +++++++++++++++++++++++++------
> >  1 file changed, 25 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> > index c5180a29db1b..e46adcd5b408 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -29,6 +29,8 @@
> >  
> >  static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
> >  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
> > +static size_t perf_evlist__mmap_mask(size_t len);
> > +static size_t perf_evlist__mmap_len(size_t mask);
> >  
> >  #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
> >  #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
> > @@ -871,7 +873,9 @@ void __weak auxtrace_mmap_params__set_idx(
> >  static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
> >  {
> >  	if (evlist->mmap[idx].base != NULL) {
> > -		munmap(evlist->mmap[idx].base, evlist->mmap_len);
> > +		size_t mmap_len = perf_evlist__mmap_len(evlist->mmap[idx].mask);
> > +
> > +		munmap(evlist->mmap[idx].base, mmap_len);
> >  		evlist->mmap[idx].base = NULL;
> >  		atomic_set(&evlist->mmap[idx].refcnt, 0);
> >  	}
> > @@ -901,8 +905,8 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
> >  }
> >  
> >  struct mmap_params {
> > -	int prot;
> > -	int mask;
> > +	int	prot;
> > +	size_t	len;
> >  	struct auxtrace_mmap_params auxtrace_mp;
> >  };
> >  
> > @@ -924,8 +928,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
> >  	 */
> >  	atomic_set(&evlist->mmap[idx].refcnt, 2);
> >  	evlist->mmap[idx].prev = 0;
> > -	evlist->mmap[idx].mask = mp->mask;
> > -	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
> > +	evlist->mmap[idx].mask = perf_evlist__mmap_mask(mp->len);
> > +	evlist->mmap[idx].base = mmap(NULL, mp->len, mp->prot,
> >  				      MAP_SHARED, fd, 0);
> >  	if (evlist->mmap[idx].base == MAP_FAILED) {
> >  		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
> > @@ -1071,6 +1075,21 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
> >  	return (pages + 1) * page_size;
> >  }
> >  
> > +static size_t perf_evlist__mmap_mask(size_t len)
> > +{
> > +	BUG_ON(len <= page_size);
> > +	BUG_ON((len % page_size) != 0);
> > +
> > +	return len - page_size - 1;
> > +}
> > +
> > +static size_t perf_evlist__mmap_len(size_t mask)
> > +{
> > +	BUG_ON(((mask + 1) % page_size) != 0);
> > +
> > +	return mask + 1 + page_size;
> > +}
> > +
> >  static long parse_pages_arg(const char *str, unsigned long min,
> >  			    unsigned long max)
> >  {
> > @@ -1176,7 +1195,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
> >  	evlist->overwrite = overwrite;
> >  	evlist->mmap_len = perf_evlist__mmap_size(pages);
> >  	pr_debug("mmap size %zuB\n", evlist->mmap_len);
> > -	mp.mask = evlist->mmap_len - page_size - 1;
> > +	mp.len = evlist->mmap_len;
> >  
> >  	auxtrace_mmap_params__init(&mp.auxtrace_mp, evlist->mmap_len,
> >  				   auxtrace_pages, auxtrace_overwrite);
> > -- 
> > 2.6.0
> > 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time()
  2015-10-08 12:20   ` Jiri Olsa
@ 2015-10-09  6:04     ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-09  6:04 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Thu, Oct 08, 2015 at 02:20:11PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:56PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> > diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> > index 674792e8fa2f..ad7c2a00bff8 100644
> > --- a/tools/perf/util/thread.c
> > +++ b/tools/perf/util/thread.c
> > @@ -160,6 +160,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
> >  
> >  	/* Override the default :tid entry */
> >  	if (!thread->comm_set) {
> > +		if (!thread->start_time)
> > +			thread->start_time = timestamp;
> > +
> >  		err = comm__override(curr, str, timestamp, exec);
> >  		if (err)
> >  			return err;
> > @@ -266,6 +269,7 @@ int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
> >  	}
> >  
> >  	thread->ppid = parent->tid;
> > +	thread->start_time = timestamp;
> >  	return thread__clone_map_groups(thread, parent);
> >  }
> >  
> > diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> > index b8f794d97b75..97026a9660ec 100644
> > --- a/tools/perf/util/thread.h
> > +++ b/tools/perf/util/thread.h
> > @@ -28,6 +28,7 @@ struct thread {
> >  	bool			dead; /* thread is in dead_threads list */
> >  	struct list_head	comm_list;
> >  	u64			db_id;
> > +	u64			start_time;
> 
> introducing start_time could be in a separate patch;
> that would ease up the review a bit

Will split.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-08 12:51   ` Jiri Olsa
@ 2015-10-09  6:24     ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-09  6:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Thu, Oct 08, 2015 at 02:51:43PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:58PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> > +static int thread__clone_map_groups(struct thread *thread,
> > +				    struct thread *parent);
> > +
> >  int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
> >  		       bool exec)
> >  {
> > @@ -182,6 +257,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
> >  			unwind__flush_access(thread);
> >  	}
> >  
> > +	if (exec) {
> > +		struct machine *machine;
> > +
> > +		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
> > +
> > +		machine = thread->mg->machine;
> > +
> > +		if (thread->tid != thread->pid_) {
> > +			struct map_groups *old = thread->mg;
> > +			struct thread *leader;
> > +
> > +			leader = machine__findnew_thread(machine, thread->pid_,
> > +							 thread->pid_);
> > +
> > +			/* now it'll be a new leader */
> > +			thread->pid_ = thread->tid;
> > +
> > +			thread->mg = map_groups__new(old->machine);
> > +			if (thread->mg == NULL)
> > +				return -ENOMEM;
> 
> hum, isn't this leaking thread->mg?
> should we call map_groups__put(old) at the end of the block?

You're right!  Will fix.
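
i.e. dropping the old reference at the end of that block, roughly
(sketch only):

  	/* after thread__set_map_groups(thread, thread->mg, old->timestamp) */
  	map_groups__put(old);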

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-08 12:58   ` Jiri Olsa
@ 2015-10-09  6:58     ` Namhyung Kim
  2015-10-12 12:43       ` Jiri Olsa
  0 siblings, 1 reply; 63+ messages in thread
From: Namhyung Kim @ 2015-10-09  6:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Thu, Oct 08, 2015 at 02:58:00PM +0200, Jiri Olsa wrote:
> On Fri, Oct 02, 2015 at 02:18:58PM +0900, Namhyung Kim wrote:
> 
> SNIP
> 
> >  int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
> >  		       bool exec)
> >  {
> > @@ -182,6 +257,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
> >  			unwind__flush_access(thread);
> >  	}
> >  
> > +	if (exec) {
> > +		struct machine *machine;
> > +
> > +		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
> > +
> > +		machine = thread->mg->machine;
> > +
> > +		if (thread->tid != thread->pid_) {
> > +			struct map_groups *old = thread->mg;
> > +			struct thread *leader;
> > +
> > +			leader = machine__findnew_thread(machine, thread->pid_,
> > +							 thread->pid_);
> > +
> > +			/* now it'll be a new leader */
> > +			thread->pid_ = thread->tid;
> > +
> > +			thread->mg = map_groups__new(old->machine);
> > +			if (thread->mg == NULL)
> > +				return -ENOMEM;
> > +
> > +			/* save current mg in the new leader */
> > +			thread__clone_map_groups(thread, leader);
> > +
> > +			/* current mg of leader thread needs one more refcnt */
> > +			map_groups__get(thread->mg);
> > +
> > +			thread__set_map_groups(thread, thread->mg, old->timestamp);
> > +		}
> > +
> > +		/* create a new mg for newly executed binary */
> > +		thread__set_map_groups(thread, map_groups__new(machine), timestamp);
> 
> should this     ^^^^ be in the else case of above condition?

Nope.  The condition above is there to make the thread a new leader;
for that purpose it clones the old thread->mg and adds it into the
mg_list, because a non-leader thread doesn't have an mg_list.

After that, we can add a new mg to the now-available mg_list.


> 
> also, thread__fork calls thread__clone_map_groups once again;
> I have some difficulty sorting this out ATM.. is that correct?

In the fork case, the code above will not be called since it's only
for the exec path.


> 
> some comment on how we treat map groups in general (for fork/clone/exit)
> would be awesome ;-)

I admit that this code is subtle and confusing..  How about this?


Managing map groups is subtle in that we basically want to share map
groups between the threads of a process.  When a new process is created
(forked), the child clones the (current) map groups of the parent.  But
when a new thread is created, it only gets a reference to the leader's mg.

Complication comes from exec, as we also want to keep the history of a
thread's execution, so the map groups are now managed in an mg_list.
This mg_list is maintained by leader threads only, and a non-leader
thread holds a reference to one mg in that list.  The timestamp of an
event is used to find the correct mg in the mg_list.

One corner case is when exec is called from a non-leader thread.  We
want to add a new mg to that thread's mg_list, but it doesn't have an
mg_list since it was not a leader.  So it sets up an mg_list and
inserts a clone of the old leader's mg.  Now it can handle exec as
usual: create a new mg and insert it into the mg_list.
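
To make the by-time lookup concrete, the idea is roughly this (a sketch
only; the list ordering and the exact field/helper names are
assumptions, the patch code differs):

  #include <linux/list.h>          /* tools/include */
  #include <linux/types.h>

  struct map_groups;

  /* each mg on the leader's list records when it became active */
  struct mg_node {
          struct list_head   node;     /* linked into the leader thread's mg_list */
          u64                start;    /* time this mg became active (exec time)  */
          struct map_groups  *mg;
  };

  /* list kept newest-first: pick the newest mg not younger than the event */
  static struct map_groups *mg_list__find_by_time(struct list_head *mg_list,
                                                  u64 timestamp)
  {
          struct mg_node *n;

          list_for_each_entry(n, mg_list, node) {
                  if (n->start <= timestamp)
                          return n->mg;
          }
          return NULL;   /* event predates the first exec; caller falls back */
  }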


Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist
  2015-10-08 16:07             ` Adrian Hunter
@ 2015-10-09  7:54               ` Namhyung Kim
  0 siblings, 0 replies; 63+ messages in thread
From: Namhyung Kim @ 2015-10-09  7:54 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, Jiri Olsa,
	LKML, Frederic Weisbecker, Stephane Eranian, David Ahern,
	Andi Kleen

Hi Adrian,

On Thu, Oct 08, 2015 at 07:07:43PM +0300, Adrian Hunter wrote:
> On 7/10/2015 12:06 p.m., Namhyung Kim wrote:
> >Hi Adrian,
> >
> >On Tue, Oct 6, 2015 at 6:26 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>On 06/10/15 12:03, Namhyung Kim wrote:
> >>>Hi Adrian,
> >>>
> >>>On Mon, Oct 5, 2015 at 8:29 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>On 02/10/15 21:45, Arnaldo Carvalho de Melo wrote:
> >>>>>Em Fri, Oct 02, 2015 at 02:18:44PM +0900, Namhyung Kim escreveu:
> >>>>>>Since it's gonna share struct mmap with dummy tracking evsel to track
> >>>>>>meta events only, let's move auxtrace out of struct perf_mmap.
> >>>>>Is this moving around _strictly_ needed?
> >>>>
> >>>>Also, what if you wanted to capture AUX data and tracking together.
> >>>
> >>>Hmm.. I don't know what's the problem.  It should be orthogonal and
> >>>support doing that together IMHO.  Maybe I'm missing something about
> >>>the aux data processing and Intel PT.  I'll take a look at it..
> >>>
> >>
> >>It is only orthogonal if you assume we will never want to support parallel
> >>processing with Intel PT.
> >
> >We'll definitely want it. :)
> >
> >>
> >>The only change that needs to be made is not to assume there is only 1
> >>tracking event.
> 
> Sorry for the slow reply.

No problem at all.  JFYI I'm travelling now.. :)


> 
> >
> >IIUC Intel PT (and BTS?) needs maximum 2 dummy events - one is to
> >track task/mmap and another is to track context switches.  The latter
> >is basically a light-weight version of the sched_switch event, right?
> 
> Yes
> 
> >
> >For parallel processing, each cpu needs to keep current thread to
> >synthesize events from auxtrace data.  So if it processed the switch
> >events before processing samples, it'd need to build long lists of
> >current thread per cpu.  IMHO it'd be better to process the switch
> >events with samples using multi-thread rather than processing them
> >prior to samples.
> 
> That is a good point.
> 
> But that would be limited to dividing the data by cpu.  It would be more
> useful to divide it any which way.  Does 'perf report' care if the
> data is not in order?

It doesn't, as long as it can find the correct thread/dso/symbol ...

Btw, I thought it'd also work if the targets are tasks, since it'd
still be able to follow the tasks' context switches, as the switch
events are recorded along with the auxtrace events per task, no?

> 
> >So how about this?  It'd use *always* 2 dummy (or 1 dummy + 1
> >sched_switch) events.  The tracking dummy events would be recorded on
> >the tracking mmaps and switch (dummy) event would be recorded on the
> >main mmaps.  This way we can parallelize the auxtrace processing
> >without the list of current thread IMHO.
> >
> >Do I miss something?
> 
> Thinking about it now, it would probably make sense to put the AUX
> event with the tracking events as well, so the data can be queued up
> ready for processing, then the AUX index would not be needed.  But of
> course, if there were no other events, then there would be no main
> mmap at all.

Hmm.. let me try to follow. :)

So we can have 3 types of mmap in this case:

  1. track mmap for task/mmap events - it'll be saved in a separate
     file (for the time being).
  2. main mmap for samples - it'll be saved in per-index (cpu or task)
     file.  For Intel PT, the switch events will be saved here too.
  3. auxtrace mmap - it'll be saved in per-index file (with switch events).

> 
> From that point of view, I guess I don't need to worry about splitting
> up the mmaps at all, just process them more than once if need be.

OK, I don't quite follow.. Can you elaborate a bit more?  Do you think
it's not necessary to use two dummy events?  What would be processed
more than once?

> 
> >
> >>
> >>IMHO there could be separate mmap_params also, which would allow for
> >>different mmap sizes for the tracking and main mmaps.
> >
> >Currently, the tracking mmap size is fixed at an arbitrary size
> >(128KiB) regardless of the main mmaps.  I can add an option to change
> >the tracking mmap size too.
> 
> I meant more from the program point of view, to allow different parameters.
> Such as allowing one mmap to be PROT_READ and the other PROT_READ|PROT_WRITE
> i.e. collect all the tracking events but let the other events overwrite
> - perhaps as some kind of snapshot mode like we do with Intel PT.

Ah, I see.

> 
> It seemed to me that it would be more flexible to put evsels into mmap
> groups.  Then those groups could have any events or be used in various ways.
> I also thought it might make the mmap code more readable, instead of having
> lots of "if tracking event do something different".

Hmm.. good idea.  I'll think about it.

> 
> On the other hand, it is just a thought.  As I mentioned above, I realized
> I could probably manage without splitting the mmaps.

It'd be nice if you'd explain your thoughts in more detail.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask
  2015-10-09  6:03     ` Namhyung Kim
@ 2015-10-12 12:42       ` Jiri Olsa
  0 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-12 12:42 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen,
	Adrian Hunter

On Fri, Oct 09, 2015 at 03:03:33PM +0900, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Thu, Oct 08, 2015 at 12:17:11PM +0200, Jiri Olsa wrote:
> > On Fri, Oct 02, 2015 at 02:18:43PM +0900, Namhyung Kim wrote:
> > > It is more convenient saving mmap length rather than (bit) mask.  With
> > > this patch, we can eliminate dependency to perf_evlist other than
> > > getting mmap_desc for dealing with mmaps.  The mask and length can be
> > > converted using perf_evlist__mmap_mask/len().
> > > 
> > > Cc: Jiri Olsa <jolsa@redhat.com>
> > > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > 
> > after this patch I'm hitting:
> > 
> > [jolsa@krava perf]$ ./perf record  kill
> > kill: not enough arguments
> > perf: util/evlist.c:1003: perf_evlist__mmap_len: Assertion `!((mask & page_size) != 0)' failed.
> > Aborted (core dumped)
> > [jolsa@krava perf]$ 
> 
> This is strange..  I think I fixed it already.  And the expression in
> the assertion is different than the code in the patch:
> 
>   static size_t perf_evlist__mmap_len(size_t mask)
>   {
>          BUG_ON(((mask + 1) % page_size) != 0);
>   
>          return mask + 1 + page_size;
>   }
> 
> Could you please double check?

yep, that one still works.. I probably forked some older version
of your perf/threaded-v5.. the current one has it fixed already

will check the new one

thanks,
jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread
  2015-10-09  6:58     ` Namhyung Kim
@ 2015-10-12 12:43       ` Jiri Olsa
  0 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-12 12:43 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 09, 2015 at 03:58:49PM +0900, Namhyung Kim wrote:

SNIP

> 
> 
> > 
> > some comment on how we treat map groups in general (for fork/clone/exit)
> > would be awesome ;-)
> 
> I admit that this code is subtle and confusing..  How about this?
> 
> 
> Managing map groups is subtle in that we basically want to share map
> groups between the threads of a process.  When a new process is created
> (forked), the child clones the (current) map groups of the parent.  But
> when a new thread is created, it only gets a reference to the leader's mg.
> 
> Complication comes from exec, as we also want to keep the history of a
> thread's execution, so the map groups are now managed in an mg_list.
> This mg_list is maintained by leader threads only, and a non-leader
> thread holds a reference to one mg in that list.  The timestamp of an
> event is used to find the correct mg in the mg_list.
> 
> One corner case is when exec is called from a non-leader thread.  We
> want to add a new mg to that thread's mg_list, but it doesn't have an
> mg_list since it was not a leader.  So it sets up an mg_list and
> inserts a clone of the old leader's mg.  Now it can handle exec as
> usual: create a new mg and insert it into the mg_list.

seems ok, thanks

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends
  2015-10-02  5:18 ` [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends Namhyung Kim
@ 2015-10-12 13:35   ` Jiri Olsa
  0 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-12 13:35 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 02:18:59PM +0900, Namhyung Kim wrote:

SNIP

> +void thread__find_addr_map(struct thread *thread, u8 cpumode,
> +			   enum map_type type, u64 addr,
> +			   struct addr_location *al)
> +{
> +	al->thread = thread;
> +	map_groups__find_addr_map(thread->mg, cpumode, type, addr, al);
> +}
> +
> +void thread__find_addr_map_by_time(struct thread *thread, u8 cpumode,
> +				   enum map_type type, u64 addr,
> +				   struct addr_location *al, u64 timestamp)
> +{
> +	struct map_groups *mg;
> +
> +	if (perf_has_index)
> +		mg = thread__get_map_groups(thread, timestamp);
> +	else
> +		mg = thread->mg;
> +
> +	al->thread = thread;
> +	map_groups__find_addr_map(mg, cpumode, type, addr, al);
> +}
> +
>  void thread__find_addr_location(struct thread *thread,
>  				u8 cpumode, enum map_type type, u64 addr,
>  				struct addr_location *al)
> @@ -985,6 +1006,23 @@ void thread__find_addr_location(struct thread *thread,
>  		al->sym = NULL;
>  }
>  
> +void thread__find_addr_location_by_time(struct thread *thread, u8 cpumode,
> +					enum map_type type, u64 addr,
> +					struct addr_location *al, u64 timestamp)
> +{
> +	if (perf_has_index)
> +		thread__find_addr_map_by_time(thread, cpumode, type, addr, al,
> +					      timestamp);
> +	else
> +		thread__find_addr_map(thread, cpumode, type, addr, al);

hum, we make the 'perf_has_index' decision here and also in
thread__find_addr_map_by_time, which has the same code as
thread__find_addr_map for the !perf_has_index case

it seems redundant to have both the original and the _by_time versions;
why not just thread__find_cpumode_addr_location with a time arg?
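
i.e. keep a single lookup that takes the time argument, roughly like
this (a sketch based on the two functions quoted above):

  void thread__find_addr_map(struct thread *thread, u8 cpumode,
                             enum map_type type, u64 addr,
                             struct addr_location *al, u64 timestamp)
  {
          struct map_groups *mg = thread->mg;

          if (perf_has_index)
                  mg = thread__get_map_groups(thread, timestamp);

          al->thread = thread;
          map_groups__find_addr_map(mg, cpumode, type, addr, al);
  }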

also, what about other callers of the original versions, like the
thread__find_addr_location call in util/unwind-libunwind.c?

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-10-02  6:58 ` Namhyung Kim
@ 2015-10-12 14:32   ` Jiri Olsa
  0 siblings, 0 replies; 63+ messages in thread
From: Jiri Olsa @ 2015-10-12 14:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	Frederic Weisbecker, Stephane Eranian, David Ahern, Andi Kleen

On Fri, Oct 02, 2015 at 03:58:49PM +0900, Namhyung Kim wrote:
> The util/event.h includes util/build-id.h only for BUILD_ID_SIZE.
> This is a problem when I include util/event.h from util/tool.h which
> is also included by util/build-id.h since it now makes a circular
> dependency resulting in incomplete type error.

BUILD_ID_SIZE is build-id.h specific though.. ;-)

how about removing the tool.h include from build-id.h
and adding just a 'struct perf_tool;' forward declaration?

or some other similar fix..
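
i.e. something like this at the top of util/build-id.h (just a sketch
of the idea):

  /* util/build-id.h */
  #define BUILD_ID_SIZE 20

  struct perf_tool;       /* forward declaration instead of #include "tool.h";
                             the prototypes below only need the type name */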

jirka

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2015-10-12 14:32 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-02  5:18 [RFC/PATCH 00/38] perf tools: Speed-up perf report by using multi thread (v5) Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
2015-10-05 12:51   ` Jiri Olsa
2015-10-06  8:31     ` Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 02/38] perf tools: Save mmap_param.len instead of mask Namhyung Kim
2015-10-02 18:44   ` Arnaldo Carvalho de Melo
2015-10-06  8:34     ` Namhyung Kim
2015-10-08 10:17   ` Jiri Olsa
2015-10-09  6:03     ` Namhyung Kim
2015-10-12 12:42       ` Jiri Olsa
2015-10-02  5:18 ` [RFC/PATCH 03/38] perf tools: Move auxtrace_mmap field to struct perf_evlist Namhyung Kim
2015-10-02 18:45   ` Arnaldo Carvalho de Melo
2015-10-05 11:29     ` Adrian Hunter
2015-10-06  9:03       ` Namhyung Kim
2015-10-06  9:26         ` Adrian Hunter
2015-10-07  9:06           ` Namhyung Kim
2015-10-08 16:07             ` Adrian Hunter
2015-10-09  7:54               ` Namhyung Kim
2015-10-06  8:56     ` Namhyung Kim
2015-10-05 13:14   ` Jiri Olsa
2015-10-06  8:40     ` Namhyung Kim
2015-10-08 10:18   ` Jiri Olsa
2015-10-02  5:18 ` [RFC/PATCH 04/38] perf tools: pass perf_mmap desc directly Namhyung Kim
2015-10-02 18:47   ` Arnaldo Carvalho de Melo
2015-10-02  5:18 ` [RFC/PATCH 05/38] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 06/38] perf tools: Extend perf_evlist__mmap_ex() to use track mmap Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 07/38] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 08/38] perf tools: Handle indexed data file properly Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 09/38] perf record: Add --index option for building index table Namhyung Kim
2015-10-02 18:58   ` Arnaldo Carvalho de Melo
2015-10-05 13:46   ` Jiri Olsa
2015-10-07  8:21     ` Namhyung Kim
2015-10-07 12:10       ` Jiri Olsa
2015-10-02  5:18 ` [RFC/PATCH 10/38] perf report: Skip dummy tracking event Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 11/38] perf tools: Introduce thread__comm(_str)_by_time() helpers Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 12/38] perf tools: Add a test case for thread comm handling Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 13/38] perf tools: Use thread__comm_by_time() when adding hist entries Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 14/38] perf tools: Convert dead thread list into rbtree Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 15/38] perf tools: Introduce machine__find*_thread_by_time() Namhyung Kim
2015-10-08 12:20   ` Jiri Olsa
2015-10-09  6:04     ` Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 16/38] perf tools: Add a test case for timed thread handling Namhyung Kim
2015-10-02  5:18 ` [RFC/PATCH 17/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
2015-10-08 12:51   ` Jiri Olsa
2015-10-09  6:24     ` Namhyung Kim
2015-10-08 12:58   ` Jiri Olsa
2015-10-09  6:58     ` Namhyung Kim
2015-10-12 12:43       ` Jiri Olsa
2015-10-02  5:18 ` [RFC/PATCH 18/38] perf tools: Introduce thread__find_addr_location_by_time() and friends Namhyung Kim
2015-10-12 13:35   ` Jiri Olsa
2015-10-02  5:19 ` [RFC/PATCH 19/38] perf callchain: Use " Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 20/38] perf tools: Add a test case for timed map groups handling Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 21/38] perf tools: Save timestamp of a map creation Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 22/38] perf tools: Introduce map_groups__{insert,find}_by_time() Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 23/38] perf tools: Use map_groups__find_addr_by_time() Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 24/38] perf tools: Add testcase for managing maps with time Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 25/38] perf callchain: Maintain libunwind's address space in map_groups Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 26/38] perf session: Pass struct events stats to event processing functions Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 27/38] perf hists: Pass hists struct to hist_entry_iter struct Namhyung Kim
2015-10-02  5:19 ` [RFC/PATCH 28/38] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
2015-10-02  5:22 ` Namhyung Kim
2015-10-02  6:58 ` Namhyung Kim
2015-10-12 14:32   ` Jiri Olsa
