* [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2)
@ 2015-01-29  8:06 Namhyung Kim
  2015-01-29  8:06 ` [PATCH 01/42] perf tools: Support to read compressed module from build-id cache Namhyung Kim
                   ` (42 more replies)
  0 siblings, 43 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Hello,

This patchset converts perf report to use multiple threads in order to
speed up the processing of large data files.  I see a speedup of at
least ~30% with this change.  The code is still experimental and has
many rough edges, but I'd like to share it and get some feedback.

The main change in this version is using a single data file with an
index table rather than multiple files.  Single-thread performance
seems to have improved over the previous version as a result, but
multi-thread performance remains almost the same.

The perf report processes (sample) events like below:

  1. preprocess sample to get matching thread/dso/symbol info
  2. insert it to hists rbtree (with callchain tree) based on the info
  3. optionally collapse hist entries that match given sort key(s)
  4. resort hist entries (by overhead) for output
  5. display the hist entries

Stage 1 is preprocessing and mostly acts like a read-only operation in
that it doesn't change the machine state during sample processing.
Meta events like fork, comm and mmap can change the machine/thread
state, though, and symbols can be loaded during the processing
(stage 2).

Stage 2 consumes most of the time, especially when callchains and the
--children option are enabled.  This work can be easily partitioned
since each sample is independent of the others.  But the resulting
hists must be combined/collapsed into a single global hists before
going on to further steps.

Stage 3 is optional and only needed by certain sort keys - but with
stage 2 parallelized, it always needs to be done.

Stages 4 and 5 work on the whole hists, so they must be done serially.
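
As an illustration only (this is not code from the patchset), the
partition-then-collapse idea for stages 2 and 3 can be sketched as
below.  The struct layouts, NR_SYMS and all function names are
hypothetical, and the workers run one after another here; a threaded
version would hand each range to its own thread.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SYMS 4	/* hypothetical: number of distinct symbols */

struct sample { int sym; unsigned long period; };
struct hists  { unsigned long period[NR_SYMS]; };

/* Stage 2 for one partition: samples are independent, so no locking. */
static void hists_add_range(struct hists *h, const struct sample *s,
			    size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		h->period[s[i].sym] += s[i].period;
}

/* Stage 3: collapse one worker's hists into the global one. */
static void hists_collapse(struct hists *global, const struct hists *part)
{
	for (int i = 0; i < NR_SYMS; i++)
		global->period[i] += part->period[i];
}

/* Split nr samples across nr_workers partitions and merge the results. */
static void process_samples(struct hists *global, const struct sample *s,
			    size_t nr, size_t nr_workers)
{
	assert(nr_workers > 0);

	for (size_t w = 0; w < nr_workers; w++) {
		struct hists part = { { 0 } };
		size_t start = nr * w / nr_workers;
		size_t end   = nr * (w + 1) / nr_workers;

		/* in a threaded version each range would run in its own thread */
		hists_add_range(&part, s, start, end);
		hists_collapse(global, &part);
	}
}
```

The merged result should not depend on the number of partitions, which
is what makes the collapse step mandatory once stage 2 is split up.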

So my approach is like this:

Partially do stage 1 first - but only for meta events that change
machine state.  To do this I add a dummy tracking event to perf record
and make it collect such meta events only.  They are saved as normal
data and processed before sample events at perf report time.

This also requires handling multiple sets of sample data concurrently
and finding the corresponding machine state when processing samples.
In a large profiling session, many tasks are created and exit, so a
pid might be recycled (even more than once!).  To deal with this, I
keep thread, map_groups and comm sorted by time.  The only remaining
thing is symbol loading, as it's done lazily when a sample requires
it.
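
To make the pid recycling problem concrete, here is a minimal sketch
of a time-based lookup.  It is an assumption-laden illustration, not
the machine__find*_thread_time() code from this series: struct
thread_entry and find_thread_time() are hypothetical names, and a flat
array sorted by start time stands in for the real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one incarnation of a pid, valid from 'start' on. */
struct thread_entry {
	int         pid;
	uint64_t    start;	/* timestamp of the fork event */
	const char *comm;
};

/*
 * Entries are sorted by start time.  Return the latest entry for 'pid'
 * that started at or before 'time', or NULL if there is none.
 */
static const struct thread_entry *
find_thread_time(const struct thread_entry *e, size_t nr, int pid, uint64_t time)
{
	const struct thread_entry *found = NULL;

	assert(e != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (e[i].start > time)
			break;		/* sorted: later entries are all newer */
		if (e[i].pid == pid)
			found = &e[i];
	}
	return found;
}
```

With two entries for the same pid, a sample timestamp between the two
start times resolves to the older incarnation, and a later timestamp
resolves to the newer one.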

With that done, stage 2 can be handled by multiple threads.  I also
save each sample data (per-cpu or per-thread) in separate files during
record and then merge them into a single data file with an index
table.  At perf report time, each region of sample data is processed
by its own thread.  Symbol loading is protected by a mutex lock.

For DWARF post-unwinding, dso cache data also needs to be protected by
a lock, and this caused huge contention.  I made it search the rbtree
speculatively first and then, if nothing is found, search it again
under the dso lock.  Please take a look and see if it's acceptable.
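
The "search without the lock, then search again under the lock"
pattern can be sketched as follows.  This is a simplified stand-in and
not the dso cache code: it uses an insert-only linked list instead of
an rbtree so that the lockless read is easy to reason about, and
cache_find(), cache_get() and struct cache_entry are hypothetical
names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical insert-only cache keyed by file offset (a list, not an rbtree). */
struct cache_entry {
	uint64_t offset;
	struct cache_entry *next;
};

static _Atomic(struct cache_entry *) cache_head;
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lockless search: safe here only because entries are never removed. */
static struct cache_entry *cache_find(uint64_t offset)
{
	struct cache_entry *e = atomic_load_explicit(&cache_head,
						     memory_order_acquire);

	for (; e; e = e->next)
		if (e->offset == offset)
			return e;
	return NULL;
}

static struct cache_entry *cache_get(uint64_t offset)
{
	struct cache_entry *e = cache_find(offset);	/* speculative, no lock */

	if (e)
		return e;

	pthread_mutex_lock(&cache_lock);
	e = cache_find(offset);		/* someone may have added it meanwhile */
	if (!e) {
		e = malloc(sizeof(*e));
		if (e) {
			e->offset = offset;
			e->next = atomic_load_explicit(&cache_head,
						       memory_order_relaxed);
			/* publish the fully initialized entry to lockless readers */
			atomic_store_explicit(&cache_head, e,
					      memory_order_release);
		}
	}
	pthread_mutex_unlock(&cache_lock);

	assert(!e || e->offset == offset);
	return e;
}
```

The second search under the lock is what keeps two threads that both
missed in the speculative pass from inserting the same entry twice.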

Patches 1-4 are independent fixes and cleanups.  Patches 5-14 add
support for indexing the data file.  With the --index option, perf
record will create an intermediate directory and then save meta events
and sample data to separate files.  Finally it'll build an index table
and concatenate the data files.
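
As a rough sketch of what "build an index table and concatenate" means
in terms of offsets (the actual HEADER_DATA_INDEX layout is defined in
the patches, not here), the regions can be laid out back to back.
struct index_entry and build_index() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: where one region lives in the merged file. */
struct index_entry {
	uint64_t offset;
	uint64_t size;
};

/*
 * Lay out 'nr' per-file regions back to back, starting at 'data_offset',
 * and record each region's position.  Returns the end offset of the data.
 */
static uint64_t build_index(struct index_entry *idx, const uint64_t *sizes,
			    size_t nr, uint64_t data_offset)
{
	uint64_t pos = data_offset;

	assert(idx != NULL && sizes != NULL);

	for (size_t i = 0; i < nr; i++) {
		idx[i].offset = pos;
		idx[i].size   = sizes[i];
		pos += sizes[i];
	}
	return pos;
}
```

Given such a table, each report thread can read its own region with
an offset and a size, without scanning the rest of the file.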

Patches 15-26 manage machine and thread state using timestamps so
that it can be looked up when processing samples.  Patches 27-40
implement the parallel report.  And finally I implemented the 'perf
data index' command to build an index table for a given data file.

This patchset doesn't change perf record to use multiple threads.  But
I think it can easily be done later if needed.

Note that the output differs slightly from the original version when
an indexed data file is used.  The differences are mostly unresolved
symbols in callchains.

Here is the result:

This is just elapsed time measured by 'perf stat -r 5'.

The data file was recorded during a kernel build with fp callchains
and its size is 2.1GB.  The machine has 6 cores with hyper-threading
enabled, and I got a similar result on my laptop too.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           285.708340593   94.317412961      36.707232978  
 with index        253.322717665   77.079748639      24.892021523
 + --multi-thread  174.037760271   44.717308080       8.300466711


This result is with a 7.7GB data file using libunwind for callchains.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           247.070444039  196.393820003       5.068489333
 with index        149.456483830  108.917644447       3.642109876
 + --multi-thread   43.990095636   28.342798882       1.829218561

I guess the speedup with the indexed data file comes from skipping the
ordered events layer.

This result is with the same file but using libdw for callchain unwinding.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           465.661321115  496.153153039       4.629841428
 with index        445.712762188  462.146612217       3.535147499
 + --multi-thread  215.264706814   29.279996335       1.938137940

On my Arch Linux system, callchain unwinding using libdw is much slower
than libunwind.  I'm using elfutils version 0.160.  Also I don't know
why --children takes less time than --no-children.  Anyway we can see
that --multi-thread performance is much better in each case.


You can get it from 'perf/threaded-v2' branch on my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Please take a look and play with it.  Any comments are welcome! :)

Thanks,
Namhyung


Jiri Olsa (1):
  perf tools: Add new perf data command

Namhyung Kim (41):
  perf tools: Support to read compressed module from build-id cache
  perf tools: Do not use __perf_session__process_events() directly
  perf record: Show precise number of samples
  perf header: Set header version correctly
  perf tools: Set attr.task bit for a tracking event
  perf tools: Use a software dummy event to track task/mmap events
  perf tools: Use perf_data_file__fd() consistently
  perf tools: Add rm_rf() utility function
  perf tools: Introduce copyfile_offset() function
  perf tools: Create separate mmap for dummy tracking event
  perf tools: Introduce perf_evlist__mmap_track()
  perf tools: Add HEADER_DATA_INDEX feature
  perf tools: Handle indexed data file properly
  perf record: Add --index option for building index table
  perf report: Skip dummy tracking event
  perf tools: Pass session arg to perf_event__preprocess_sample()
  perf script: Pass session arg to ->process_event callback
  perf tools: Introduce thread__comm_time() helpers
  perf tools: Add a test case for thread comm handling
  perf tools: Use thread__comm_time() when adding hist entries
  perf tools: Convert dead thread list into rbtree
  perf tools: Introduce machine__find*_thread_time()
  perf tools: Add a test case for timed thread handling
  perf tools: Maintain map groups list in a leader thread
  perf tools: Introduce thread__find_addr_location_time() and friends
  perf tools: Add a test case for timed map groups handling
  perf tools: Protect dso symbol loading using a mutex
  perf tools: Protect dso cache tree using dso->lock
  perf tools: Protect dso cache fd with a mutex
  perf session: Pass struct events stats to event processing functions
  perf hists: Pass hists struct to hist_entry_iter functions
  perf tools: Move BUILD_ID_SIZE definition to perf.h
  perf report: Parallelize perf report using multi-thread
  perf tools: Add missing_threads rb tree
  perf record: Synthesize COMM event for a command line workload
  perf tools: Fix progress ui to support multi thread
  perf report: Add --multi-thread option and config item
  perf session: Handle index files generally
  perf tools: Convert lseek + read to pread
  perf callchain: Save eh/debug frame offset for dwarf unwind
  perf data: Implement 'index' subcommand

 tools/perf/Documentation/perf-data.txt             |  44 +++
 tools/perf/Documentation/perf-record.txt           |   4 +
 tools/perf/Documentation/perf-report.txt           |   3 +
 tools/perf/Makefile.perf                           |   4 +
 tools/perf/builtin-annotate.c                      |   8 +-
 tools/perf/builtin-data.c                          | 428 +++++++++++++++++++++
 tools/perf/builtin-diff.c                          |  21 +-
 tools/perf/builtin-inject.c                        |   5 +-
 tools/perf/builtin-mem.c                           |   6 +-
 tools/perf/builtin-record.c                        | 261 +++++++++++--
 tools/perf/builtin-report.c                        |  74 +++-
 tools/perf/builtin-script.c                        |  54 ++-
 tools/perf/builtin-timechart.c                     |  10 +-
 tools/perf/builtin-top.c                           |   7 +-
 tools/perf/builtin.h                               |   1 +
 tools/perf/command-list.txt                        |   1 +
 tools/perf/perf.c                                  |   1 +
 tools/perf/perf.h                                  |   2 +
 tools/perf/tests/builtin-test.c                    |  12 +
 tools/perf/tests/dso-data.c                        |   5 +
 tools/perf/tests/dwarf-unwind.c                    |   8 +-
 tools/perf/tests/hists_common.c                    |   3 +-
 tools/perf/tests/hists_cumulate.c                  |   6 +-
 tools/perf/tests/hists_filter.c                    |   5 +-
 tools/perf/tests/hists_link.c                      |  10 +-
 tools/perf/tests/hists_output.c                    |   6 +-
 tools/perf/tests/tests.h                           |   3 +
 tools/perf/tests/thread-comm.c                     |  47 +++
 tools/perf/tests/thread-lookup-time.c              | 180 +++++++++
 tools/perf/tests/thread-mg-share.c                 |   7 +-
 tools/perf/tests/thread-mg-time.c                  |  88 +++++
 tools/perf/ui/browsers/hists.c                     |  30 +-
 tools/perf/ui/gtk/hists.c                          |   3 +
 tools/perf/util/build-id.c                         |   9 +-
 tools/perf/util/build-id.h                         |   2 -
 tools/perf/util/db-export.c                        |   6 +-
 tools/perf/util/db-export.h                        |   4 +-
 tools/perf/util/dso.c                              | 159 +++++---
 tools/perf/util/dso.h                              |   3 +
 tools/perf/util/event.c                            | 106 ++++-
 tools/perf/util/event.h                            |  13 +-
 tools/perf/util/evlist.c                           | 161 ++++++--
 tools/perf/util/evlist.h                           |  22 +-
 tools/perf/util/evsel.c                            |   1 +
 tools/perf/util/evsel.h                            |  15 +
 tools/perf/util/header.c                           |  63 ++-
 tools/perf/util/header.h                           |   3 +
 tools/perf/util/hist.c                             | 121 ++++--
 tools/perf/util/hist.h                             |  12 +-
 tools/perf/util/machine.c                          | 258 +++++++++++--
 tools/perf/util/machine.h                          |  12 +-
 tools/perf/util/map.c                              |   1 +
 tools/perf/util/map.h                              |   2 +
 tools/perf/util/ordered-events.c                   |   4 +-
 .../perf/util/scripting-engines/trace-event-perl.c |   3 +-
 .../util/scripting-engines/trace-event-python.c    |   5 +-
 tools/perf/util/session.c                          | 356 ++++++++++++++---
 tools/perf/util/session.h                          |  11 +-
 tools/perf/util/symbol-elf.c                       |  13 +-
 tools/perf/util/symbol.c                           |  34 +-
 tools/perf/util/thread.c                           | 139 ++++++-
 tools/perf/util/thread.h                           |  28 +-
 tools/perf/util/tool.h                             |  14 +
 tools/perf/util/trace-event-scripting.c            |   3 +-
 tools/perf/util/trace-event.h                      |   3 +-
 tools/perf/util/unwind-libdw.c                     |  11 +-
 tools/perf/util/unwind-libunwind.c                 |  49 ++-
 tools/perf/util/util.c                             |  81 +++-
 tools/perf/util/util.h                             |   2 +
 69 files changed, 2662 insertions(+), 414 deletions(-)
 create mode 100644 tools/perf/Documentation/perf-data.txt
 create mode 100644 tools/perf/builtin-data.c
 create mode 100644 tools/perf/tests/thread-comm.c
 create mode 100644 tools/perf/tests/thread-lookup-time.c
 create mode 100644 tools/perf/tests/thread-mg-time.c

-- 
2.2.2



* [PATCH 01/42] perf tools: Support to read compressed module from build-id cache
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 14:32   ` Jiri Olsa
  2015-01-30 18:33   ` [tip:perf/core] perf symbols: " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 02/42] perf tools: Do not use __perf_session__process_events() directly Namhyung Kim
                   ` (41 subsequent siblings)
  42 siblings, 2 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The commit c00c48fc6e6e ("perf symbols: Preparation for compressed
kernel module support") added support for compressed kernel modules
but it only supports system path DSOs.  When a dso is read from the
build-id cache, its filename doesn't end with ".gz" but is a build-id
instead.  In this case, we should fall back to the original dso->name.
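
The fallback can be illustrated in isolation with the sketch below.
It is not the decompress_kmodule() code itself: is_compressed_ext() is
a stand-in that only knows "gz", and kmod_comp_ext() is a hypothetical
helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for perf's compression check; only "gz" is assumed here. */
static bool is_compressed_ext(const char *ext)
{
	return ext && !strcmp(ext, "gz");
}

/*
 * Return the compression extension (without the dot), looking at the file
 * being opened first and falling back to the original DSO name.
 */
static const char *kmod_comp_ext(const char *filename, const char *dso_name)
{
	const char *ext;

	assert(filename != NULL && dso_name != NULL);

	ext = strrchr(filename, '.');
	if (ext && is_compressed_ext(ext + 1))
		return ext + 1;

	/* build-id cache paths carry no ".gz"; consult the DSO name instead */
	ext = strrchr(dso_name, '.');
	if (ext && is_compressed_ext(ext + 1))
		return ext + 1;

	return NULL;
}
```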

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/symbol-elf.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 06fcd1bf98b6..b24f9d8727a8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -574,13 +574,16 @@ static int decompress_kmodule(struct dso *dso, const char *name,
 	const char *ext = strrchr(name, '.');
 	char tmpbuf[] = "/tmp/perf-kmod-XXXXXX";
 
-	if ((type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
-	     type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP) ||
-	    type != dso->symtab_type)
+	if (type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
+	    type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP &&
+	    type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
 		return -1;
 
-	if (!ext || !is_supported_compression(ext + 1))
-		return -1;
+	if (!ext || !is_supported_compression(ext + 1)) {
+		ext = strrchr(dso->name, '.');
+		if (!ext || !is_supported_compression(ext + 1))
+			return -1;
+	}
 
 	fd = mkstemp(tmpbuf);
 	if (fd < 0)
-- 
2.2.2



* [PATCH 02/42] perf tools: Do not use __perf_session__process_events() directly
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
  2015-01-29  8:06 ` [PATCH 01/42] perf tools: Support to read compressed module from build-id cache Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 18:32   ` [tip:perf/core] " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 03/42] perf record: Show precise number of samples Namhyung Kim
                   ` (40 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

It's only used by perf record to process build-ids, because the file
size is not fixed at that point due to remaining header features.
However the data offset and size are available, so we can use
perf_session__process_events() once we set the file size to the
current offset.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c | 7 +++----
 tools/perf/util/session.c   | 6 +++---
 tools/perf/util/session.h   | 3 ---
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8648c6d3003d..1134de22979e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -194,12 +194,13 @@ static int process_buildids(struct record *rec)
 {
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
-	u64 start = session->header.data_offset;
 
 	u64 size = lseek(file->fd, 0, SEEK_CUR);
 	if (size == 0)
 		return 0;
 
+	file->size = size;
+
 	/*
 	 * During this process, it'll load kernel map and replace the
 	 * dso->long_name to a real pathname it found.  In this case
@@ -211,9 +212,7 @@ static int process_buildids(struct record *rec)
 	 */
 	symbol_conf.ignore_vmlinux_buildid = true;
 
-	return __perf_session__process_events(session, start,
-					      size - start,
-					      size, &build_id__mark_dso_hit_ops);
+	return perf_session__process_events(session, &build_id__mark_dso_hit_ops);
 }
 
 static void perf_event__synthesize_guest_os(struct machine *machine, void *data)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b0ce3d6e6231..0baf75f12b7c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1251,9 +1251,9 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-int __perf_session__process_events(struct perf_session *session,
-				   u64 data_offset, u64 data_size,
-				   u64 file_size, struct perf_tool *tool)
+static int __perf_session__process_events(struct perf_session *session,
+					  u64 data_offset, u64 data_size,
+					  u64 file_size, struct perf_tool *tool)
 {
 	int fd = perf_data_file__fd(session->file);
 	u64 head, page_offset, file_offset, file_pos, size;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index dc26ebf60fe4..6d663dc76404 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -49,9 +49,6 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 			     union perf_event **event_ptr,
 			     struct perf_sample *sample);
 
-int __perf_session__process_events(struct perf_session *session,
-				   u64 data_offset, u64 data_size, u64 size,
-				   struct perf_tool *tool);
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool);
 
-- 
2.2.2



* [PATCH 03/42] perf record: Show precise number of samples
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
  2015-01-29  8:06 ` [PATCH 01/42] perf tools: Support to read compressed module from build-id cache Namhyung Kim
  2015-01-29  8:06 ` [PATCH 02/42] perf tools: Do not use __perf_session__process_events() directly Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 18:32   ` [tip:perf/core] " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 04/42] perf header: Set header version correctly Namhyung Kim
                   ` (39 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker,
	Milian Wolff, Jiri Olsa

After perf record finishes, it prints the file size and the number of
samples in the file, but this info is wrong since it assumes a typical
sample size of 24 bytes and divides the file size by that value.

However, as we post-process recorded samples for build-ids, it can show
the correct number like below.  If build-id post-processing is not
requested, just omit the wrong number of samples.

  $ perf record noploop 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.159 MB perf.data (3989 samples) ]

  $ perf report --stdio -n
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Samples: 3K of event 'cycles'
  # Event count (approx.): 3771330663
  #
  # Overhead       Samples  Command  Shared Object     Symbol
  # ........  ............  .......  ................  ..........................
  #
      99.90%          3982  noploop  noploop           [.] main
       0.09%             1  noploop  ld-2.17.so        [.] _dl_check_map_versions
       0.01%             1  noploop  [kernel.vmlinux]  [k] setup_arg_pages
       0.00%             5  noploop  [kernel.vmlinux]  [k] intel_pmu_enable_all
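
For a sense of how far off the old estimate can be, the arithmetic can
be redone on the numbers above.  This is a sketch that assumes the old
message would have seen roughly the 0.159 MB shown here; the helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define APPROX_SAMPLE_SIZE 24	/* the "typical" size the old message assumed */

/* Old behaviour: guess the sample count from the number of bytes written. */
static uint64_t estimate_samples(uint64_t bytes_written)
{
	return bytes_written / APPROX_SAMPLE_SIZE;
}

/* Size in bytes of a value printed as "%.3f MB" (rounded down). */
static uint64_t mb_to_bytes(double mb)
{
	assert(mb >= 0.0);
	return (uint64_t)(mb * 1024.0 * 1024.0);
}
```

With these inputs the estimate comes out at about 6946 samples, while
3989 were actually counted.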

Reported-by: Milian Wolff <mail@milianw.de>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c | 51 ++++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1134de22979e..9900b433e861 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -190,6 +190,19 @@ static int record__open(struct record *rec)
 	return rc;
 }
 
+static int process_sample_event(struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct perf_evsel *evsel,
+				struct machine *machine)
+{
+	struct record *rec = container_of(tool, struct record, tool);
+
+	rec->samples++;
+
+	return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
+}
+
 static int process_buildids(struct record *rec)
 {
 	struct perf_data_file *file  = &rec->file;
@@ -212,7 +225,7 @@ static int process_buildids(struct record *rec)
 	 */
 	symbol_conf.ignore_vmlinux_buildid = true;
 
-	return perf_session__process_events(session, &build_id__mark_dso_hit_ops);
+	return perf_session__process_events(session, &rec->tool);
 }
 
 static void perf_event__synthesize_guest_os(struct machine *machine, void *data)
@@ -503,19 +516,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
-	if (!quiet) {
+	if (!quiet)
 		fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
 
-		/*
-		 * Approximate RIP event size: 24 bytes.
-		 */
-		fprintf(stderr,
-			"[ perf record: Captured and wrote %.3f MB %s (~%" PRIu64 " samples) ]\n",
-			(double)rec->bytes_written / 1024.0 / 1024.0,
-			file->path,
-			rec->bytes_written / 24);
-	}
-
 out_child:
 	if (forks) {
 		int exit_status;
@@ -534,6 +537,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	} else
 		status = err;
 
+	/* this will be recalculated during process_buildids() */
+	rec->samples = 0;
+
 	if (!err && !file->is_pipe) {
 		rec->session->header.data_size += rec->bytes_written;
 
@@ -543,6 +549,20 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 					   file->fd, true);
 	}
 
+	if (!err && !quiet) {
+		char samples[128];
+
+		if (rec->samples)
+			scnprintf(samples, sizeof(samples),
+				  " (%" PRIu64 " samples)", rec->samples);
+		else
+			samples[0] = '\0';
+
+		fprintf(stderr,	"[ perf record: Captured and wrote %.3f MB %s%s ]\n",
+			perf_data_file__size(file) / 1024.0 / 1024.0,
+			file->path, samples);
+	}
+
 out_delete_session:
 	perf_session__delete(session);
 	return status;
@@ -719,6 +739,13 @@ static struct record record = {
 			.default_per_cpu = true,
 		},
 	},
+	.tool = {
+		.sample		= process_sample_event,
+		.fork		= perf_event__process_fork,
+		.comm		= perf_event__process_comm,
+		.mmap		= perf_event__process_mmap,
+		.mmap2		= perf_event__process_mmap2,
+	},
 };
 
 #define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "
-- 
2.2.2



* [PATCH 04/42] perf header: Set header version correctly
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (2 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 03/42] perf record: Show precise number of samples Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 18:33   ` [tip:perf/core] " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 05/42] perf tools: Set attr.task bit for a tracking event Namhyung Kim
                   ` (38 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When check_magic_endian() is called, it checks the magic number in the
perf data file to determine the version and endianness.  But if the
file uses the same endianness as the host, the version number wasn't
updated, which causes confusion.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index b20e40c74468..1f407f7352a7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2237,6 +2237,7 @@ static int check_magic_endian(u64 magic, uint64_t hdr_sz,
 	 * - unique number to identify actual perf.data files
 	 * - encode endianness of file
 	 */
+	ph->version = PERF_HEADER_VERSION_2;
 
 	/* check magic number with one endianness */
 	if (magic == __perf_magic2)
@@ -2247,7 +2248,6 @@ static int check_magic_endian(u64 magic, uint64_t hdr_sz,
 		return -1;
 
 	ph->needs_swap = true;
-	ph->version = PERF_HEADER_VERSION_2;
 
 	return 0;
 }
-- 
2.2.2



* [PATCH 05/42] perf tools: Set attr.task bit for a tracking event
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (3 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 04/42] perf header: Set header version correctly Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 18:33   ` [tip:perf/core] perf evsel: " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 06/42] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
                   ` (37 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The perf_event_attr.task bit is for tracking task (fork and exit)
events, but perf_evsel__config() failed to set it.  While this was not
a problem in practice, since setting other bits (comm/mmap) ended up
with the same result, it'd be good to set it explicitly anyway.

The attr->task bit tracks task related events (fork/exit) only, but
other meta events like comm and mmap[2] also need the task events.
So setting attr->comm and/or attr->mmap causes the kernel to emit the
task events anyway.  So attr->task is only meaningful when the other
bits are off, but I'd like to set it for completeness.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1d826d63bc20..ea51a90e20a0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -709,6 +709,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 	if (opts->sample_weight)
 		perf_evsel__set_sample_bit(evsel, WEIGHT);
 
+	attr->task  = track;
 	attr->mmap  = track;
 	attr->mmap2 = track && !perf_missing_features.mmap2;
 	attr->comm  = track;
-- 
2.2.2



* [PATCH 06/42] perf tools: Use a software dummy event to track task/mmap events
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (4 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 05/42] perf tools: Set attr.task bit for a tracking event Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 07/42] perf tools: Use perf_data_file__fd() consistently Namhyung Kim
                   ` (36 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Add APIs for a software dummy event to track task/comm/mmap events
separately.  perf record will use them to save such events in a
separate mmap buffer to make them easy to index.  This is preparation
for the multi-thread support which will come later.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/evsel.h  | 15 +++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 28b8ce86bf12..2d81b4d154f4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -195,6 +195,36 @@ int perf_evlist__add_default(struct perf_evlist *evlist)
 	return -ENOMEM;
 }
 
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist)
+{
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_DUMMY,
+		.exclude_kernel = 1,
+	};
+	struct perf_evsel *evsel;
+
+	event_attr_init(&attr);
+
+	evsel = perf_evsel__new(&attr);
+	if (evsel == NULL)
+		goto error;
+
+	/* use strdup() because free(evsel) assumes name is allocated */
+	evsel->name = strdup("dummy");
+	if (!evsel->name)
+		goto error_free;
+
+	perf_evlist__add(evlist, evsel);
+	perf_evlist__set_tracking_event(evlist, evsel);
+
+	return 0;
+error_free:
+	perf_evsel__delete(evsel);
+error:
+	return -ENOMEM;
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index c94a9e03ecf1..771175e70d2f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -67,6 +67,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
 
 void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
 int perf_evlist__add_default(struct perf_evlist *evlist);
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist);
 int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 38622747d130..9764e9456546 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -331,6 +331,21 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+/**
+ * perf_evsel__is_dummy_tracking - Return whether given evsel is a dummy
+ * event for tracking meta events only
+ *
+ * @evsel - evsel selector to be tested
+ *
+ * Return %true if event is a dummy tracking event
+ */
+static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
+{
+	return evsel->attr.type == PERF_TYPE_SOFTWARE &&
+		evsel->attr.config == PERF_COUNT_SW_DUMMY &&
+		evsel->attr.task == 1 && evsel->attr.mmap == 1;
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
-- 
2.2.2



* [PATCH 07/42] perf tools: Use perf_data_file__fd() consistently
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (5 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 06/42] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 18:33   ` [tip:perf/core] " tip-bot for Namhyung Kim
  2015-01-29  8:06 ` [PATCH 08/42] perf tools: Add rm_rf() utility function Namhyung Kim
                   ` (35 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Do not reference file->fd directly, since we want to hide the
implementation details from callers to allow for future changes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-inject.c |  5 +++--
 tools/perf/builtin-record.c | 14 +++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 84df2deed988..a13641e066f5 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -343,6 +343,7 @@ static int __cmd_inject(struct perf_inject *inject)
 	int ret = -EINVAL;
 	struct perf_session *session = inject->session;
 	struct perf_data_file *file_out = &inject->output;
+	int fd = perf_data_file__fd(file_out);
 
 	signal(SIGINT, sig_handler);
 
@@ -376,7 +377,7 @@ static int __cmd_inject(struct perf_inject *inject)
 	}
 
 	if (!file_out->is_pipe)
-		lseek(file_out->fd, session->header.data_offset, SEEK_SET);
+		lseek(fd, session->header.data_offset, SEEK_SET);
 
 	ret = perf_session__process_events(session, &inject->tool);
 
@@ -385,7 +386,7 @@ static int __cmd_inject(struct perf_inject *inject)
 			perf_header__set_feat(&session->header,
 					      HEADER_BUILD_ID);
 		session->header.data_size = inject->bytes_written;
-		perf_session__write_header(session, session->evlist, file_out->fd, true);
+		perf_session__write_header(session, session->evlist, fd, true);
 	}
 
 	return ret;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9900b433e861..404ab3434052 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -208,7 +208,7 @@ static int process_buildids(struct record *rec)
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
 
-	u64 size = lseek(file->fd, 0, SEEK_CUR);
+	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
 	if (size == 0)
 		return 0;
 
@@ -334,6 +334,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	struct perf_data_file *file = &rec->file;
 	struct perf_session *session;
 	bool disabled = false, draining = false;
+	int fd;
 
 	rec->progname = argv[0];
 
@@ -348,6 +349,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		return -1;
 	}
 
+	fd = perf_data_file__fd(file);
 	rec->session = session;
 
 	record__init_features(rec);
@@ -372,12 +374,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_header__clear_feat(&session->header, HEADER_GROUP_DESC);
 
 	if (file->is_pipe) {
-		err = perf_header__write_pipe(file->fd);
+		err = perf_header__write_pipe(fd);
 		if (err < 0)
 			goto out_child;
 	} else {
-		err = perf_session__write_header(session, rec->evlist,
-						 file->fd, false);
+		err = perf_session__write_header(session, rec->evlist, fd, false);
 		if (err < 0)
 			goto out_child;
 	}
@@ -409,7 +410,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			 * return this more properly and also
 			 * propagate errors that now are calling die()
 			 */
-			err = perf_event__synthesize_tracing_data(tool, file->fd, rec->evlist,
+			err = perf_event__synthesize_tracing_data(tool, fd, rec->evlist,
 								  process_synthesized_event);
 			if (err <= 0) {
 				pr_err("Couldn't record tracing data.\n");
@@ -545,8 +546,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 
 		if (!rec->no_buildid)
 			process_buildids(rec);
-		perf_session__write_header(rec->session, rec->evlist,
-					   file->fd, true);
+		perf_session__write_header(rec->session, rec->evlist, fd, true);
 	}
 
 	if (!err && !quiet) {
-- 
2.2.2



* [PATCH 08/42] perf tools: Add rm_rf() utility function
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (6 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 07/42] perf tools: Use perf_data_file__fd() consistently Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-30 15:02   ` Jiri Olsa
  2015-01-29  8:06 ` [PATCH 09/42] perf tools: Introduce copyfile_offset() function Namhyung Kim
                   ` (34 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The rm_rf() function does the same as the shell command 'rm -rf': it
removes a directory and all of its entries recursively.  Only regular
files and directories are handled; any other file type makes it fail.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/util.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/util.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index b86744f29eef..de1307a6ff4b 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -72,6 +72,49 @@ int mkdir_p(char *path, mode_t mode)
 	return (stat(path, &st) && mkdir(path, mode)) ? -1 : 0;
 }
 
+int rm_rf(char *path)
+{
+	DIR *dir;
+	int ret = 0;
+	struct dirent *d;
+	char namebuf[PATH_MAX];
+
+	dir = opendir(path);
+	if (dir == NULL)
+		return 0;
+
+	while ((d = readdir(dir)) != NULL && !ret) {
+		struct stat statbuf;
+
+		if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
+			continue;
+
+		scnprintf(namebuf, sizeof(namebuf), "%s/%s",
+			  path, d->d_name);
+
+		ret = stat(namebuf, &statbuf);
+		if (ret < 0) {
+			pr_debug("stat failed: %s\n", namebuf);
+			break;
+		}
+
+		if (S_ISREG(statbuf.st_mode))
+			ret = unlink(namebuf);
+		else if (S_ISDIR(statbuf.st_mode))
+			ret = rm_rf(namebuf);
+		else {
+			pr_debug("unknown file: %s\n", namebuf);
+			ret = -1;
+		}
+	}
+	closedir(dir);
+
+	if (ret < 0)
+		return ret;
+
+	return rmdir(path);
+}
+
 static int slow_copyfile(const char *from, const char *to, mode_t mode)
 {
 	int err = -1;
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 027a5153495c..7b71ad87cdc3 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -248,6 +248,7 @@ static inline int sane_case(int x, int high)
 }
 
 int mkdir_p(char *path, mode_t mode);
+int rm_rf(char *path);
 int copyfile(const char *from, const char *to);
 int copyfile_mode(const char *from, const char *to, mode_t mode);
 
-- 
2.2.2



* [PATCH 09/42] perf tools: Introduce copyfile_offset() function
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (7 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 08/42] perf tools: Add rm_rf() utility function Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 10/42] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
                   ` (33 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The copyfile_offset() function copies data from a given offset in the
source file to a given offset in the destination file.  It'll be used
to build an indexed data file.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
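For illustration only, a standalone sketch of the same technique: map
the source from the enclosing page boundary, then pwrite() the bytes to
the destination offset, advancing on short writes.  The name
copy_range() is hypothetical, and the page size is queried with
sysconf() here instead of perf's global page_size.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy @size bytes from @ifd at @off_in to @ofd at @off_out. */
static int copy_range(int ifd, off_t off_in, int ofd, off_t off_out,
		      uint64_t size)
{
	off_t page = sysconf(_SC_PAGESIZE);
	off_t pgoff = off_in & ~(page - 1);	/* mmap() wants a page-aligned offset */
	size_t in_page = off_in - pgoff;	/* start of the data inside the mapping */
	size_t maplen = in_page + size;
	size_t done = 0;
	char *ptr;

	assert(off_in >= 0 && off_out >= 0);

	if (size == 0)
		return 0;	/* mmap() rejects a zero length */

	ptr = mmap(NULL, maplen, PROT_READ, MAP_PRIVATE, ifd, pgoff);
	if (ptr == MAP_FAILED)
		return -1;

	while (done < size) {
		ssize_t ret = pwrite(ofd, ptr + in_page + done, size - done,
				     off_out + (off_t)done);

		if (ret < 0 && errno == EINTR)
			continue;
		if (ret <= 0)
			break;
		done += ret;	/* source and destination advance together */
	}
	munmap(ptr, maplen);

	return done == size ? 0 : -1;
}
```

Tracking a single "done" counter keeps the source and destination
positions in step, which is the invariant the loop in the patch has to
maintain with its separate off_in and off_out updates.
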
 tools/perf/util/util.c | 38 +++++++++++++++++++++++++++++---------
 tools/perf/util/util.h |  1 +
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index de1307a6ff4b..cc1aa309d643 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -145,11 +145,38 @@ static int slow_copyfile(const char *from, const char *to, mode_t mode)
 	return err;
 }
 
+int copyfile_offset(int ifd, loff_t off_in, int ofd, loff_t off_out, u64 size)
+{
+	void *ptr;
+	loff_t pgoff;
+
+	pgoff = off_in & ~(page_size - 1);
+	off_in -= pgoff;
+
+	ptr = mmap(NULL, off_in + size, PROT_READ, MAP_PRIVATE, ifd, pgoff);
+	if (ptr == MAP_FAILED)
+		return -1;
+
+	while (size) {
+		ssize_t ret = pwrite(ofd, ptr + off_in, size, off_out);
+		if (ret < 0 && errno == EINTR)
+			continue;
+		if (ret <= 0)
+			break;
+
+		size -= ret;
+		off_in += ret;
+		off_out += ret;
+	}
+	munmap(ptr, off_in + size);
+
+	return size ? -1 : 0;
+}
+
 int copyfile_mode(const char *from, const char *to, mode_t mode)
 {
 	int fromfd, tofd;
 	struct stat st;
-	void *addr;
 	int err = -1;
 
 	if (stat(from, &st))
@@ -166,15 +193,8 @@ int copyfile_mode(const char *from, const char *to, mode_t mode)
 	if (tofd < 0)
 		goto out_close_from;
 
-	addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fromfd, 0);
-	if (addr == MAP_FAILED)
-		goto out_close_to;
-
-	if (write(tofd, addr, st.st_size) == st.st_size)
-		err = 0;
+	err = copyfile_offset(fromfd, 0, tofd, 0, st.st_size);
 
-	munmap(addr, st.st_size);
-out_close_to:
 	close(tofd);
 	if (err)
 		unlink(to);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 7b71ad87cdc3..2291f08f11fe 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -251,6 +251,7 @@ int mkdir_p(char *path, mode_t mode);
 int rm_rf(char *path);
 int copyfile(const char *from, const char *to);
 int copyfile_mode(const char *from, const char *to, mode_t mode);
+int copyfile_offset(int fromfd, loff_t from_ofs, int tofd, loff_t to_ofs, u64 size);
 
 s64 perf_atoll(const char *str);
 char **argv_split(const char *str, int *argcp);
-- 
2.2.2



* [PATCH 10/42] perf tools: Create separate mmap for dummy tracking event
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (8 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 09/42] perf tools: Introduce copyfile_offset() function Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 11/42] perf tools: Introduce perf_evlist__mmap_track() Namhyung Kim
                   ` (32 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When indexed data file support is enabled, a dummy tracking event is
used to track metadata (task, comm and mmap events) for a session.  The
actual samples are recorded in separate (intermediate) files and then
merged, together with an index table.

Provide a separate mmap for the dummy tracking event.  Its size is
fixed at 128KiB (+ 1 page) since its event rate is lower than that of
samples.  I originally wanted to use a single mmap for this, but
cross-cpu sharing is prohibited, so it's per-cpu (or per-task) like the
normal mmaps.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
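For illustration only, a small sketch of the arithmetic this patch
relies on: the negative index encoding used for tracking mmaps, the
fixed tracking mmap size, and the ring-buffer mask derived from it.
track_mmap_idx() matches the helper added below; track_mmap_size() and
ring_mask() are hypothetical helpers that take page_size as a parameter
instead of reading perf's global.

```c
#include <assert.h>
#include <stddef.h>

/* 0 <-> -1, 1 <-> -2, ...; applying the mapping twice gives back idx. */
static inline int track_mmap_idx(int idx)
{
	return -idx - 1;
}

/* 128KiB of data pages plus the one control page, as in TRACK_MMAP_SIZE. */
static inline size_t track_mmap_size(size_t page_size)
{
	/* the mask below only works for power-of-two page sizes */
	assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
	return ((128 * 1024 / page_size) + 1) * page_size;
}

/* Mask as computed in __perf_evlist__mmap(): data area size minus one. */
static inline size_t ring_mask(size_t len, size_t page_size)
{
	return len - page_size - 1;
}
```

With 4KiB pages this gives a 135168-byte mapping and a mask of 0x1ffff.
Because the encoding is its own inverse, the same helper converts an
index in both directions, which is how perf_evlist__mmap_per_evsel()
marks and then restores idx.
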
 tools/perf/builtin-record.c |   9 +++-
 tools/perf/util/evlist.c    | 122 +++++++++++++++++++++++++++++++++++---------
 tools/perf/util/evlist.h    |  11 +++-
 3 files changed, 117 insertions(+), 25 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 404ab3434052..adb3eefb51ed 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -69,7 +69,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 
 static int record__mmap_read(struct record *rec, int idx)
 {
-	struct perf_mmap *md = &rec->evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(rec->evlist, idx);
 	unsigned int head = perf_mmap__read_head(md);
 	unsigned int old = md->prev;
 	unsigned char *data = md->base + page_size;
@@ -105,6 +105,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	}
 
 	md->prev = old;
+
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
@@ -275,6 +276,12 @@ static int record__mmap_read_all(struct record *rec)
 				goto out;
 			}
 		}
+		if (rec->evlist->track_mmap) {
+			if (record__mmap_read(rec, track_mmap_idx(i)) != 0) {
+				rc = -1;
+				goto out;
+			}
+		}
 	}
 
 	/*
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 2d81b4d154f4..ac31edecffaf 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -29,6 +29,7 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -729,22 +730,39 @@ static bool perf_mmap__empty(struct perf_mmap *md)
 	return perf_mmap__read_head(md) != md->prev;
 }
 
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
+{
+	if (idx >= 0)
+		return &evlist->mmap[idx];
+	else
+		return &evlist->track_mmap[track_mmap_idx(idx)];
+}
+
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-	++evlist->mmap[idx].refcnt;
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	++md->refcnt;
 }
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
-	BUG_ON(evlist->mmap[idx].refcnt == 0);
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	BUG_ON(md->refcnt == 0);
+
+	if (--md->refcnt != 0)
+		return;
 
-	if (--evlist->mmap[idx].refcnt == 0)
+	if (idx >= 0)
 		__perf_evlist__munmap(evlist, idx);
+	else
+		__perf_evlist__munmap_track(evlist, track_mmap_idx(idx));
 }
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
 	if (!evlist->overwrite) {
 		unsigned int old = md->prev;
@@ -765,6 +783,15 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	}
 }
 
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx)
+{
+	if (evlist->track_mmap[idx].base != NULL) {
+		munmap(evlist->track_mmap[idx].base, TRACK_MMAP_SIZE);
+		evlist->track_mmap[idx].base = NULL;
+		evlist->track_mmap[idx].refcnt = 0;
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -776,23 +803,43 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
+
+	if (evlist->track_mmap == NULL)
+		return;
+
+	for (i = 0; i < evlist->nr_mmaps; i++)
+		__perf_evlist__munmap_track(evlist, i);
+
+	zfree(&evlist->track_mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static int perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool track_mmap)
 {
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
 	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (evlist->mmap == NULL)
+		return -ENOMEM;
+
+	if (track_mmap) {
+		evlist->track_mmap = calloc(evlist->nr_mmaps,
+					    sizeof(struct perf_mmap));
+		if (evlist->track_mmap == NULL) {
+			zfree(&evlist->mmap);
+			return -ENOMEM;
+		}
+	}
+	return 0;
 }
 
 struct mmap_params {
-	int prot;
-	int mask;
+	int	prot;
+	size_t	len;
 };
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
+static int __perf_evlist__mmap(struct perf_evlist *evlist __maybe_unused,
+			       struct perf_mmap *pmmap,
 			       struct mmap_params *mp, int fd)
 {
 	/*
@@ -808,15 +855,14 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 * evlist layer can't just drop it when filtering events in
 	 * perf_evlist__filter_pollfd().
 	 */
-	evlist->mmap[idx].refcnt = 2;
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
-				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
+	pmmap->refcnt = 2;
+	pmmap->prev = 0;
+	pmmap->mask = mp->len - page_size - 1;
+	pmmap->base = mmap(NULL, mp->len, mp->prot, MAP_SHARED, fd, 0);
+	if (pmmap->base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
-		evlist->mmap[idx].base = NULL;
+		pmmap->base = NULL;
 		return -1;
 	}
 
@@ -825,7 +871,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *output,
+				       int *track_output)
 {
 	struct perf_evsel *evsel;
 
@@ -837,9 +884,30 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
+		if (perf_evsel__is_dummy_tracking(evsel)) {
+			struct mmap_params track_mp = {
+				.prot	= mp->prot,
+				.len	= TRACK_MMAP_SIZE,
+			};
+
+			if (*track_output == -1) {
+				*track_output = fd;
+				if (__perf_evlist__mmap(evlist,
+							&evlist->track_mmap[idx],
+							&track_mp, fd) < 0)
+					return -1;
+			} else {
+				if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT,
+					  *track_output) != 0)
+					return -1;
+			}
+
+			/* mark idx as track mmap idx (negative) */
+			idx = track_mmap_idx(idx);
+		} else if (*output == -1) {
 			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+			if (__perf_evlist__mmap(evlist, &evlist->mmap[idx],
+						mp, *output) < 0)
 				return -1;
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
@@ -868,6 +936,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
 						 thread);
 		}
+
+		if (perf_evsel__is_dummy_tracking(evsel)) {
+			/* restore idx as normal idx (positive) */
+			idx = track_mmap_idx(idx);
+		}
 	}
 
 	return 0;
@@ -883,10 +956,12 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
+		int track_output = -1;
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, &output,
+							&track_output))
 				goto out_unmap;
 		}
 	}
@@ -908,9 +983,10 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
+		int track_output = -1;
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						&output, &track_output))
 			goto out_unmap;
 	}
 
@@ -1033,7 +1109,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
+	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, true) < 0)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
@@ -1042,7 +1118,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
-	mp.mask = evlist->mmap_len - page_size - 1;
+	mp.len = evlist->mmap_len;
 
 	evlist__for_each(evlist, evsel) {
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 771175e70d2f..bf697632458d 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -48,11 +48,14 @@ struct perf_evlist {
 	bool		 overwrite;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *track_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
 };
 
+#define TRACK_MMAP_SIZE  (((128 * 1024 / page_size) + 1) * page_size)
+
 struct perf_evsel_str_handler {
 	const char *name;
 	void	   *handler;
@@ -100,8 +103,8 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
-
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx);
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
@@ -211,6 +214,12 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str);
 void perf_evlist__to_front(struct perf_evlist *evlist,
 			   struct perf_evsel *move_evsel);
 
+/* convert from/to negative idx for track mmaps */
+static inline int track_mmap_idx(int idx)
+{
+	return -idx - 1;
+}
+
 /**
  * __evlist__for_each - iterate thru all the evsels
  * @list: list_head instance to iterate
-- 
2.2.2



* [PATCH 11/42] perf tools: Introduce perf_evlist__mmap_track()
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (9 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 10/42] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 12/42] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
                   ` (31 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The perf_evlist__mmap_track() function creates data mmaps and,
optionally, tracking mmaps for events.  It'll be used by perf record to
save events in separate files and build an index table.  Checking for a
dummy tracking event in perf_evlist__mmap() alone is not enough, as
users can specify a dummy event (like in the keep tracking testcase)
without the index option.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 15 +++++++++------
 tools/perf/util/evlist.h    | 10 ++++++++--
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index adb3eefb51ed..56118d8cf74a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -169,7 +169,8 @@ static int record__open(struct record *rec)
 		goto out;
 	}
 
-	if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+	if (perf_evlist__mmap_track(evlist, opts->mmap_pages, false,
+				    false) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ac31edecffaf..fb27b468427d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -835,6 +835,7 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool track_mmap)
 
 struct mmap_params {
 	int	prot;
+	bool	track;
 	size_t	len;
 };
 
@@ -884,7 +885,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (perf_evsel__is_dummy_tracking(evsel)) {
+		if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
 			struct mmap_params track_mp = {
 				.prot	= mp->prot,
 				.len	= TRACK_MMAP_SIZE,
@@ -937,7 +938,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 						 thread);
 		}
 
-		if (perf_evsel__is_dummy_tracking(evsel)) {
+		if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
 			/* restore idx as normal idx (positive) */
 			idx = track_mmap_idx(idx);
 		}
@@ -1088,10 +1089,11 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
 }
 
 /**
- * perf_evlist__mmap - Create mmaps to receive events.
+ * perf_evlist__mmap_track - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
  * @overwrite: overwrite older events?
+ * @use_track_mmap: use another mmaps to track meta events
  *
  * If @overwrite is %false the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
@@ -1099,17 +1101,18 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  *
  * Return: %0 on success, negative error code otherwise.
  */
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap_track(struct perf_evlist *evlist, unsigned int pages,
+			    bool overwrite, bool use_track_mmap)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
 	struct mmap_params mp = {
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
+		.track = use_track_mmap,
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, true) < 0)
+	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, mp.track) < 0)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bf697632458d..c3b7b3428cbb 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -127,10 +127,16 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  const char *str,
 				  int unset);
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite);
+int perf_evlist__mmap_track(struct perf_evlist *evlist, unsigned int pages,
+			    bool overwrite, bool use_track_mmap);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
+static inline int perf_evlist__mmap(struct perf_evlist *evlist,
+				    unsigned int pages, bool overwrite)
+{
+	return perf_evlist__mmap_track(evlist, pages, overwrite, false);
+}
+
 void perf_evlist__disable(struct perf_evlist *evlist);
 void perf_evlist__enable(struct perf_evlist *evlist);
 
-- 
2.2.2



* [PATCH 12/42] perf tools: Add HEADER_DATA_INDEX feature
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (10 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 11/42] perf tools: Introduce perf_evlist__mmap_track() Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 13/42] perf tools: Handle indexed data file properly Namhyung Kim
                   ` (30 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The HEADER_DATA_INDEX feature records an index table for the sample
data so that it can be processed by multiple threads concurrently.
Each item is a struct perf_file_section, which consists of an offset
and a size.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
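For illustration only, a sketch of the layout that write_data_index()
and process_data_index() agree on: a u64 entry count followed by that
many {offset, size} pairs.  It works on a memory buffer rather than a
file descriptor, omits the byte swapping done for needs_swap, and uses
hypothetical names (file_section, index_write(), index_read()).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the two u64 fields of perf's struct perf_file_section. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Serialize a count followed by @nr entries, in host byte order.
 * Returns the number of bytes written, or 0 if @buf is too small.
 */
static size_t index_write(void *buf, size_t buflen,
			  const struct file_section *idx, uint64_t nr)
{
	size_t need = sizeof(nr) + nr * sizeof(*idx);

	if (buflen < need)
		return 0;
	memcpy(buf, &nr, sizeof(nr));
	memcpy((char *)buf + sizeof(nr), idx, nr * sizeof(*idx));
	return need;
}

/*
 * Parse the layout above; the caller frees the returned array.
 * The length is validated before allocating, so no error path has to
 * release the array.
 */
static struct file_section *index_read(const void *buf, size_t buflen,
				       uint64_t *nr_out)
{
	struct file_section *idx;
	uint64_t nr;

	assert(nr_out != NULL);

	if (buflen < sizeof(nr))
		return NULL;
	memcpy(&nr, buf, sizeof(nr));
	if ((buflen - sizeof(nr)) / sizeof(*idx) < nr)
		return NULL;

	idx = calloc(nr ? nr : 1, sizeof(*idx));
	if (idx == NULL)
		return NULL;
	memcpy(idx, (const char *)buf + sizeof(nr), nr * sizeof(*idx));
	*nr_out = nr;
	return idx;
}
```

Two entries occupy 8 + 2 * 16 = 40 bytes.  Each entry describes one
contiguous region of sample data, so a reader can hand entry i to
worker i without scanning the rest of the file.
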
 tools/perf/builtin-data.c   |  0
 tools/perf/builtin-record.c |  2 ++
 tools/perf/util/header.c    | 61 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/header.h    |  3 +++
 4 files changed, 66 insertions(+)
 create mode 100644 tools/perf/builtin-data.c

diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 56118d8cf74a..b057e2caa5f1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -312,6 +312,8 @@ static void record__init_features(struct record *rec)
 
 	if (!rec->opts.branch_stack)
 		perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
+
+	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 1f407f7352a7..77206a6cbf65 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -869,6 +869,24 @@ static int write_branch_stack(int fd __maybe_unused,
 	return 0;
 }
 
+static int write_data_index(int fd, struct perf_header *h,
+			    struct perf_evlist *evlist __maybe_unused)
+{
+	int ret;
+	unsigned i;
+
+	ret = do_write(fd, &h->nr_index, sizeof(h->nr_index));
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < h->nr_index; i++) {
+		ret = do_write(fd, &h->index[i], sizeof(*h->index));
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
 static void print_hostname(struct perf_header *ph, int fd __maybe_unused,
 			   FILE *fp)
 {
@@ -1225,6 +1243,12 @@ static void print_group_desc(struct perf_header *ph, int fd __maybe_unused,
 	}
 }
 
+static void print_data_index(struct perf_header *ph __maybe_unused,
+			     int fd __maybe_unused, FILE *fp)
+{
+	fprintf(fp, "# contains data index for parallel processing\n");
+}
+
 static int __event_process_build_id(struct build_id_event *bev,
 				    char *filename,
 				    struct perf_session *session)
@@ -1833,6 +1857,42 @@ static int process_group_desc(struct perf_file_section *section __maybe_unused,
 	return ret;
 }
 
+static int process_data_index(struct perf_file_section *section __maybe_unused,
+			      struct perf_header *ph, int fd,
+			      void *data __maybe_unused)
+{
+	ssize_t ret;
+	u64 nr_index;
+	unsigned i;
+	struct perf_file_section *index;
+
+	ret = readn(fd, &nr_index, sizeof(nr_index));
+	if (ret != sizeof(nr_index))
+		return -1;
+
+	if (ph->needs_swap)
+		nr_index = bswap_64(nr_index);
+
+	index = calloc(nr_index, sizeof(*index));
+	if (index == NULL)
+		return -1;
+
+	for (i = 0; i < nr_index; i++) {
+		ret = readn(fd, &index[i], sizeof(*index));
+		if (ret != sizeof(*index))
+			return ret;
+
+		if (ph->needs_swap) {
+			index[i].offset = bswap_64(index[i].offset);
+			index[i].size   = bswap_64(index[i].size);
+		}
+	}
+
+	ph->index = index;
+	ph->nr_index = nr_index;
+	return 0;
+}
+
 struct feature_ops {
 	int (*write)(int fd, struct perf_header *h, struct perf_evlist *evlist);
 	void (*print)(struct perf_header *h, int fd, FILE *fp);
@@ -1873,6 +1933,7 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
 	FEAT_OPA(HEADER_BRANCH_STACK,	branch_stack),
 	FEAT_OPP(HEADER_PMU_MAPPINGS,	pmu_mappings),
 	FEAT_OPP(HEADER_GROUP_DESC,	group_desc),
+	FEAT_OPP(HEADER_DATA_INDEX,	data_index),
 };
 
 struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 3bb90ac172a1..e5594f0d6dcd 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -30,6 +30,7 @@ enum {
 	HEADER_BRANCH_STACK,
 	HEADER_PMU_MAPPINGS,
 	HEADER_GROUP_DESC,
+	HEADER_DATA_INDEX,
 	HEADER_LAST_FEATURE,
 	HEADER_FEAT_BITS	= 256,
 };
@@ -94,6 +95,8 @@ struct perf_header {
 	bool				needs_swap;
 	u64				data_offset;
 	u64				data_size;
+	struct perf_file_section	*index;
+	u64				nr_index;
 	u64				feat_offset;
 	DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
 	struct perf_session_env 	env;
-- 
2.2.2



* [PATCH 13/42] perf tools: Handle indexed data file properly
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (11 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 12/42] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 14/42] perf record: Add --index option for building index table Namhyung Kim
                   ` (29 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When perf detects that a data file has an index table, it processes the
header (meta event) section first and then the remaining indexed
sections one after another.  Because the indexed data is recorded
separately for each cpu/thread, each section is already ordered
internally, so there is no need to use the ordered event queue
interface.
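
As a rough illustration of that processing order (not the code in this
patch), the sketch below walks a header region and then each non-empty
indexed region.  The names struct file_section, section_fn,
process_indexed() and sum_sizes() are hypothetical stand-ins; the real
code uses struct perf_file_section and
__perf_session__process_events().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_file_section. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/* Hypothetical callback: consume one [offset, offset + size) region. */
typedef int (*section_fn)(uint64_t offset, uint64_t size, void *ctx);

/*
 * Process the header (meta event) region first, then each indexed
 * region in table order.  Empty regions are skipped.  Processing
 * stops at the first error, which is returned to the caller.
 */
static int process_indexed(uint64_t data_offset, uint64_t data_size,
			   const struct file_section *index, size_t nr_index,
			   section_fn fn, void *ctx)
{
	size_t i;
	int err;

	assert(fn != NULL);

	err = fn(data_offset, data_size, ctx);
	if (err < 0)
		return err;

	for (i = 0; i < nr_index; i++) {
		if (index[i].size == 0)
			continue;

		err = fn(index[i].offset, index[i].size, ctx);
		if (err < 0)
			break;
	}
	return err;
}

/* Example callback: add up the bytes that would be processed. */
static int sum_sizes(uint64_t offset, uint64_t size, void *ctx)
{
	(void)offset;
	*(uint64_t *)ctx += size;
	return 0;
}
```

Since each region is consumed independently, a later change could hand
the regions to separate worker threads without touching this loop's
skip-empty and stop-on-error behaviour.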

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/session.c | 47 ++++++++++++++++++++++++++++++++++++-----------
 tools/perf/util/session.h |  5 +++++
 2 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0baf75f12b7c..ff4d5913220c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1251,11 +1251,10 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-static int __perf_session__process_events(struct perf_session *session,
+static int __perf_session__process_events(struct perf_session *session, int fd,
 					  u64 data_offset, u64 data_size,
 					  u64 file_size, struct perf_tool *tool)
 {
-	int fd = perf_data_file__fd(session->file);
 	u64 head, page_offset, file_offset, file_pos, size;
 	int err, mmap_prot, mmap_flags, map_idx = 0;
 	size_t	mmap_size;
@@ -1278,7 +1277,9 @@ static int __perf_session__process_events(struct perf_session *session,
 	mmap_size = MMAP_SIZE;
 	if (mmap_size > file_size) {
 		mmap_size = file_size;
-		session->one_mmap = true;
+
+		if (!perf_session__has_index(session))
+			session->one_mmap = true;
 	}
 
 	memset(mmaps, 0, sizeof(mmaps));
@@ -1360,19 +1361,43 @@ static int __perf_session__process_events(struct perf_session *session,
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool)
 {
-	u64 size = perf_data_file__size(session->file);
-	int err;
+	struct perf_data_file *file = session->file;
+	u64 size = perf_data_file__size(file);
+	int err, i;
 
 	if (perf_session__register_idle_thread(session) == NULL)
 		return -ENOMEM;
 
-	if (!perf_data_file__is_pipe(session->file))
+	if (perf_data_file__is_pipe(file))
+		return __perf_session__process_pipe_events(session, tool);
+
+	err = __perf_session__process_events(session,
+					     perf_data_file__fd(file),
+					     session->header.data_offset,
+					     session->header.data_size,
+					     size, tool);
+
+	if (err < 0 || !perf_session__has_index(session))
+		return err;
+
+	/*
+	 * For indexed data file, events are processed for
+	 * each cpu/thread so it's already ordered.
+	 */
+	tool->ordered_events = false;
+
+	for (i = 0; i < (int)session->header.nr_index; i++) {
+		if (!session->header.index[i].size)
+			continue;
+
 		err = __perf_session__process_events(session,
-						     session->header.data_offset,
-						     session->header.data_size,
-						     size, tool);
-	else
-		err = __perf_session__process_pipe_events(session, tool);
+						perf_data_file__fd(file),
+						session->header.index[i].offset,
+						session->header.index[i].size,
+						size, tool);
+		if (err < 0)
+			break;
+	}
 
 	return err;
 }
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 6d663dc76404..419976d74b51 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -138,4 +138,9 @@ int perf_event__synthesize_id_index(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    struct machine *machine);
 
+static inline bool perf_session__has_index(struct perf_session *session)
+{
+	return perf_header__has_feat(&session->header, HEADER_DATA_INDEX);
+}
+
 #endif /* __PERF_SESSION_H */
-- 
2.2.2



* [PATCH 14/42] perf record: Add --index option for building index table
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (12 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 13/42] perf tools: Handle indexed data file properly Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-02-01 18:06   ` Jiri Olsa
  2015-01-29  8:06 ` [PATCH 15/42] perf report: Skip dummy tracking event Namhyung Kim
                   ` (28 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The new --index option creates an indexed data file which can be
processed by multiple threads in parallel.  It saves meta events and
sample data in separate files and then merges them into a single file
with an index table.

To build the index table, it needs to know the exact offset and size of
each sample data section.  However, the offsets can only be calculated
after the feature data is fixed, and saving the feature data requires
access to the sample data, because the DSOs that were hit must be
marked for the build-id table.

So I ended up reserving a 1MB hole for the feature data area and
placing the sample data after it, at the calculated offsets.  An
indexed perf data file now looks like this:

        +---------------------+
        |     file header     |
        |---------------------|
        |                     |
        |     meta events     |
        |                     |
        |---------------------|
        |     feature data    |
        |   (contains index) -+--+
        |---------------------|  |
        |      ~1MB hole      |  |
        |---------------------|  |
        |                     |  |
        |    sample data[1] <-+--+
        |                     |  |
        |---------------------|  |
        |                     |  |
        |    sample data[2] <-+--+
        |                     |  |
        |---------------------|  |
        |         ...         | ...
        +---------------------+
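
The offset arithmetic behind this layout can be sketched roughly as
follows.  This is only an illustration under stated assumptions:
align_up(), layout_index(), HOLE_SIZE and struct file_section are
hypothetical names, and the page size is passed in explicitly rather
than taken from the running system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOLE_SIZE (1024 * 1024)	/* reserved for feature data */

/* Hypothetical stand-in for struct perf_file_section. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Fill idx[] with the offset and size each sample region gets in the
 * merged file.  end_of_meta is the file size after the meta events.
 * Returns the offset just past the last (page-aligned) region.
 */
static uint64_t layout_index(uint64_t end_of_meta, uint64_t page_size,
			     const uint64_t *sizes, size_t nr,
			     struct file_section *idx)
{
	uint64_t offset = align_up(end_of_meta + HOLE_SIZE, page_size);
	size_t i;

	for (i = 0; i < nr; i++) {
		idx[i].offset = offset;
		idx[i].size = sizes[i];
		offset += align_up(sizes[i], page_size);
	}
	return offset;
}
```

An empty region takes no space, so it shares its offset with the next
region; a reader has to skip entries whose size is zero.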

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-record.txt |   4 +
 tools/perf/builtin-data.c                |   0
 tools/perf/builtin-record.c              | 165 +++++++++++++++++++++++++++++--
 tools/perf/perf.h                        |   1 +
 tools/perf/util/session.c                |   1 +
 5 files changed, 161 insertions(+), 10 deletions(-)
 delete mode 100644 tools/perf/builtin-data.c

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 31e977459c51..1fe8736cc0ff 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -235,6 +235,10 @@ Capture machine state (registers) at interrupt, i.e., on counter overflows for
 each sample. List of captured registers depends on the architecture. This option
 is off by default.
 
+--index::
+Build an index table for sample data.  This will speed up perf report by
+parallel processing.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
deleted file mode 100644
index e69de29bb2d1..000000000000
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b057e2caa5f1..0db47c97446b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -38,6 +38,7 @@ struct record {
 	struct record_opts	opts;
 	u64			bytes_written;
 	struct perf_data_file	file;
+	int			*fds;
 	struct perf_evlist	*evlist;
 	struct perf_session	*session;
 	const char		*progname;
@@ -47,14 +48,23 @@ struct record {
 	long			samples;
 };
 
-static int record__write(struct record *rec, void *bf, size_t size)
+static int record__write(struct record *rec, void *bf, size_t size, int idx)
 {
-	if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+	int fd;
+
+	if (rec->fds && idx >= 0) {
+		fd = rec->fds[idx];
+		/* do not update data size for index files */
+	} else {
+		fd = perf_data_file__fd(rec->session->file);
+		rec->bytes_written += size;
+	}
+
+	if (writen(fd, bf, size) < 0) {
 		pr_err("failed to write perf data, error: %m\n");
 		return -1;
 	}
 
-	rec->bytes_written += size;
 	return 0;
 }
 
@@ -64,7 +74,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 				     struct machine *machine __maybe_unused)
 {
 	struct record *rec = container_of(tool, struct record, tool);
-	return record__write(rec, event, event->header.size);
+	return record__write(rec, event, event->header.size, -1);
 }
 
 static int record__mmap_read(struct record *rec, int idx)
@@ -89,7 +99,7 @@ static int record__mmap_read(struct record *rec, int idx)
 		size = md->mask + 1 - (old & md->mask);
 		old += size;
 
-		if (record__write(rec, buf, size) < 0) {
+		if (record__write(rec, buf, size, idx) < 0) {
 			rc = -1;
 			goto out;
 		}
@@ -99,7 +109,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	size = head - old;
 	old += size;
 
-	if (record__write(rec, buf, size) < 0) {
+	if (record__write(rec, buf, size, idx) < 0) {
 		rc = -1;
 		goto out;
 	}
@@ -111,6 +121,113 @@ static int record__mmap_read(struct record *rec, int idx)
 	return rc;
 }
 
+#define INDEX_FILE_FMT  "%s.dir/perf.data.%d"
+
+static int record__create_index_files(struct record *rec, int nr_index)
+{
+	int i = 0;
+	int ret = -1;
+	char path[PATH_MAX];
+	struct perf_data_file *file = &rec->file;
+
+	rec->fds = malloc(nr_index * sizeof(int));
+	if (rec->fds == NULL)
+		return -ENOMEM;
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	if (mkdir(path, S_IRWXU) < 0)
+		goto out_err;
+
+	for (i = 0; i < nr_index; i++) {
+		scnprintf(path, sizeof(path), INDEX_FILE_FMT, file->path, i);
+		ret = open(path, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR);
+		if (ret < 0)
+			goto out_err;
+
+		rec->fds[i] = ret;
+	}
+	return 0;
+
+out_err:
+	while (--i >= 0)
+		close(rec->fds[i]);
+	zfree(&rec->fds);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	return ret;
+}
+
+static int record__merge_index_files(struct record *rec, int nr_index)
+{
+	int i;
+	int ret = -1;
+	u64 offset;
+	char path[PATH_MAX];
+	struct perf_file_section *idx;
+	struct perf_data_file *file = &rec->file;
+	struct perf_session *session = rec->session;
+	int output_fd = perf_data_file__fd(file);
+
+	idx = calloc(nr_index, sizeof(*idx));
+	if (idx == NULL)
+		goto out_close;
+
+	/* index data will be placed after header */
+	offset = lseek(output_fd, 0, SEEK_END);
+	if (offset == (u64)(loff_t) -1)
+		goto out_close;
+
+	/*
+	 * increase the offset for header features (including index),
+	 * which are set later.  we cannot know the exact size at this
+	 * stage, but I guess 1MB should be enough..
+	 */
+	offset += 1024 * 1024;
+	offset = PERF_ALIGN(offset, page_size);
+
+	for (i = 0; i < nr_index; i++) {
+		struct stat stbuf;
+		int fd = rec->fds[i];
+
+		if (fstat(fd, &stbuf) < 0)
+			goto out_close;
+
+		idx[i].offset = offset;
+		idx[i].size   = stbuf.st_size;
+
+		offset += PERF_ALIGN(stbuf.st_size, page_size);
+	}
+
+	session->header.index = idx;
+	session->header.nr_index = nr_index;
+
+	/* copy sample events */
+	for (i = 0; i < nr_index; i++) {
+		int fd = rec->fds[i];
+
+		if (idx[i].size == 0)
+			continue;
+
+		if (copyfile_offset(fd, 0, output_fd, idx[i].offset,
+				    idx[i].size) < 0)
+			goto out_close;
+	}
+
+	ret = 0;
+
+out_close:
+	for (i = 0; i < nr_index; i++)
+		close(rec->fds[i]);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	zfree(&rec->fds);
+	return ret;
+}
+
 static volatile int done = 0;
 static volatile int signr = -1;
 static volatile int child_finished = 0;
@@ -170,7 +287,7 @@ static int record__open(struct record *rec)
 	}
 
 	if (perf_evlist__mmap_track(evlist, opts->mmap_pages, false,
-				    false) < 0) {
+				    opts->index) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
@@ -186,6 +303,12 @@ static int record__open(struct record *rec)
 		goto out;
 	}
 
+	if (opts->index) {
+		rc = record__create_index_files(rec, evlist->nr_mmaps);
+		if (rc < 0)
+			goto out;
+	}
+
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
@@ -210,7 +333,8 @@ static int process_buildids(struct record *rec)
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
 
-	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+	/* update file size after merging sample files with index */
+	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_END);
 	if (size == 0)
 		return 0;
 
@@ -290,7 +414,8 @@ static int record__mmap_read_all(struct record *rec)
 	 * at least one event.
 	 */
 	if (bytes_written != rec->bytes_written)
-		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
+		rc = record__write(rec, &finished_round_event,
+				   sizeof(finished_round_event), -1);
 
 out:
 	return rc;
@@ -313,7 +438,8 @@ static void record__init_features(struct record *rec)
 	if (!rec->opts.branch_stack)
 		perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
 
-	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
+	if (!rec->opts.index)
+		perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
@@ -375,6 +501,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 	}
 
+	if (file->is_pipe && opts->index) {
+		pr_warning("Indexing is disabled for pipe output\n");
+		opts->index = false;
+	}
+
 	if (record__open(rec) != 0) {
 		err = -1;
 		goto out_child;
@@ -554,6 +685,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	if (!err && !file->is_pipe) {
 		rec->session->header.data_size += rec->bytes_written;
 
+		if (rec->opts.index)
+			record__merge_index_files(rec, rec->evlist->nr_mmaps);
+
 		if (!rec->no_buildid)
 			process_buildids(rec);
 		perf_session__write_header(rec->session, rec->evlist, fd, true);
@@ -849,6 +983,8 @@ struct option __record_options[] = {
 		    "use per-thread mmaps"),
 	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
 		    "Sample machine registers on interrupt"),
+	OPT_BOOLEAN(0, "index", &record.opts.index,
+		    "make index for sample data to speed-up processing"),
 	OPT_END()
 };
 
@@ -898,6 +1034,15 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out_symbol_exit;
 	}
 
+	if (rec->opts.index) {
+		if (!rec->opts.sample_time) {
+			pr_err("Sample timestamp is required for indexing\n");
+			goto out_symbol_exit;
+		}
+
+		perf_evlist__add_dummy_tracking(rec->evlist);
+	}
+
 	if (rec->opts.target.tid && !rec->opts.no_inherit_set)
 		rec->opts.no_inherit = true;
 
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1dabb8553499..b0fad99c9252 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -53,6 +53,7 @@ struct record_opts {
 	bool	     sample_time;
 	bool	     period;
 	bool	     sample_intr_regs;
+	bool	     index;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int user_freq;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index ff4d5913220c..e7b59fbebbc4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -173,6 +173,7 @@ void perf_session__delete(struct perf_session *session)
 	machines__exit(&session->machines);
 	if (session->file)
 		perf_data_file__close(session->file);
+	free(session->header.index);
 	free(session);
 }
 
-- 
2.2.2



* [PATCH 15/42] perf report: Skip dummy tracking event
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (13 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 14/42] perf record: Add --index option for building index table Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 16/42] perf tools: Pass session arg to perf_event__preprocess_sample() Namhyung Kim
                   ` (27 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The dummy tracking event is only for tracking task/comm/mmap events
and has no sample data of its own.  So there is no need to report it;
just skip it.
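
The counting logic this implies can be sketched roughly as below.  It
is a simplified illustration, not the browser code itself: struct
event_sel, its is_dummy flag and count_reportable() are hypothetical
stand-ins for struct perf_evsel and perf_evsel__is_dummy_tracking().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_evsel. */
struct event_sel {
	const char *name;
	bool is_dummy;	/* true for the dummy tracking event */
};

/*
 * Count the events that should be shown and remember the first one,
 * so a caller can fall back to a single-event view when only one
 * reportable event is left after skipping the dummy.
 */
static size_t count_reportable(const struct event_sel *evs, size_t nr,
			       const struct event_sel **first)
{
	size_t i, count = 0;

	assert(first != NULL);
	*first = NULL;

	for (i = 0; i < nr; i++) {
		if (evs[i].is_dummy)
			continue;
		if (count++ == 0)
			*first = &evs[i];
	}
	return count;
}
```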

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c    |  3 +++
 tools/perf/ui/browsers/hists.c | 30 ++++++++++++++++++++++++------
 tools/perf/ui/gtk/hists.c      |  3 +++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2f91094e228b..4cac79ad3085 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -318,6 +318,9 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 		struct hists *hists = evsel__hists(pos);
 		const char *evname = perf_evsel__name(pos);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group &&
 		    !perf_evsel__is_group_leader(pos))
 			continue;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 788506eef567..7d33d7dc0824 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1947,14 +1947,17 @@ static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
 	return key;
 }
 
-static bool filter_group_entries(struct ui_browser *browser __maybe_unused,
-				 void *entry)
+static bool filter_entries(struct ui_browser *browser __maybe_unused,
+			   void *entry)
 {
 	struct perf_evsel *evsel = list_entry(entry, struct perf_evsel, node);
 
 	if (symbol_conf.event_group && !perf_evsel__is_group_leader(evsel))
 		return true;
 
+	if (perf_evsel__is_dummy_tracking(evsel))
+		return true;
+
 	return false;
 }
 
@@ -1971,7 +1974,7 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
 			.refresh    = ui_browser__list_head_refresh,
 			.seek	    = ui_browser__list_head_seek,
 			.write	    = perf_evsel_menu__write,
-			.filter	    = filter_group_entries,
+			.filter	    = filter_entries,
 			.nr_entries = nr_entries,
 			.priv	    = evlist,
 		},
@@ -1998,21 +2001,22 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 				  struct perf_session_env *env)
 {
 	int nr_entries = evlist->nr_entries;
+	struct perf_evsel *first = perf_evlist__first(evlist);
+	struct perf_evsel *pos;
 
 single_entry:
 	if (nr_entries == 1) {
-		struct perf_evsel *first = perf_evlist__first(evlist);
-
 		return perf_evsel__hists_browse(first, nr_entries, help,
 						false, hbt, min_pcnt,
 						env);
 	}
 
 	if (symbol_conf.event_group) {
-		struct perf_evsel *pos;
 
 		nr_entries = 0;
 		evlist__for_each(evlist, pos) {
+			if (perf_evsel__is_dummy_tracking(pos))
+				continue;
 			if (perf_evsel__is_group_leader(pos))
 				nr_entries++;
 		}
@@ -2021,6 +2025,20 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 			goto single_entry;
 	}
 
+	evlist__for_each(evlist, pos) {
+		if (perf_evsel__is_dummy_tracking(pos))
+			nr_entries--;
+	}
+
+	if (nr_entries == 1) {
+		evlist__for_each(evlist, pos) {
+			if (!perf_evsel__is_dummy_tracking(pos)) {
+				first = pos;
+				goto single_entry;
+			}
+		}
+	}
+
 	return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
 					       hbt, min_pcnt, env);
 }
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..83a7ecd5cda8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -317,6 +317,9 @@ int perf_evlist__gtk_browse_hists(struct perf_evlist *evlist,
 		char buf[512];
 		size_t size = sizeof(buf);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group) {
 			if (!perf_evsel__is_group_leader(pos))
 				continue;
-- 
2.2.2



* [PATCH 16/42] perf tools: Pass session arg to perf_event__preprocess_sample()
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (14 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 15/42] perf report: Skip dummy tracking event Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 17/42] perf script: Pass session arg to ->process_event callback Namhyung Kim
                   ` (26 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

perf_event__preprocess_sample() translates a given ip into a matching
symbol.  To do that, it first finds the corresponding thread and map in
the current thread tree.  For indexed data files, however, it needs to
find the thread (and map) through slightly different APIs that take the
sample timestamp.  So it needs a way to know whether the session deals
with an indexed data file or not.
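
Most callers reach the session from their tool callback by embedding
the tool in a larger struct and recovering it with container_of().  The
sketch below shows that pattern in isolation; struct tool, struct
session, struct report and session_has_index() are simplified,
hypothetical versions of the perf types, and a NULL session (as passed
by the tests) is treated as "not indexed" here by assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel macro, without the type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct tool {
	int unused;		/* callbacks would live here */
};

struct session {
	int has_index;		/* nonzero for an indexed data file */
};

/* A command embeds the tool and keeps a pointer to its session. */
struct report {
	struct tool tool;
	struct session *session;
};

/* Given only the tool pointer, find out if the session is indexed. */
static int session_has_index(struct tool *tool)
{
	struct report *rep;

	assert(tool != NULL);
	rep = container_of(tool, struct report, tool);

	return rep->session ? rep->session->has_index : 0;
}
```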

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c     |  3 ++-
 tools/perf/builtin-diff.c         | 13 +++++++++----
 tools/perf/builtin-mem.c          |  6 +++++-
 tools/perf/builtin-report.c       |  3 ++-
 tools/perf/builtin-script.c       | 20 +++++++++++---------
 tools/perf/builtin-timechart.c    | 10 +++++++---
 tools/perf/builtin-top.c          |  3 ++-
 tools/perf/tests/hists_cumulate.c |  2 +-
 tools/perf/tests/hists_filter.c   |  2 +-
 tools/perf/tests/hists_link.c     |  4 ++--
 tools/perf/tests/hists_output.c   |  2 +-
 tools/perf/util/event.c           |  3 ++-
 tools/perf/util/event.h           |  4 +++-
 13 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 747f86103599..b89e4c6ed488 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -85,7 +85,8 @@ static int process_sample_event(struct perf_tool *tool,
 	struct perf_annotate *ann = container_of(tool, struct perf_annotate, tool);
 	struct addr_location al;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  ann->session) < 0) {
 		pr_warning("problem processing %d event, skipping it.\n",
 			   event->header.type);
 		return -1;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 74aada554b12..3e2229227062 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -42,6 +42,7 @@ struct diff_hpp_fmt {
 };
 
 struct data__file {
+	struct perf_tool	tool;
 	struct perf_session	*session;
 	struct perf_data_file	file;
 	int			 idx;
@@ -320,16 +321,18 @@ static int hists__add_entry(struct hists *hists,
 	return -ENOMEM;
 }
 
-static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
+static int diff__process_sample_event(struct perf_tool *tool,
 				      union perf_event *event,
 				      struct perf_sample *sample,
 				      struct perf_evsel *evsel,
 				      struct machine *machine)
 {
 	struct addr_location al;
+	struct data__file *d = container_of(tool, struct data__file, tool);
 	struct hists *hists = evsel__hists(evsel);
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  d->session) < 0) {
 		pr_warning("problem processing %d event, skipping it.\n",
 			   event->header.type);
 		return -1;
@@ -740,14 +743,16 @@ static int __cmd_diff(void)
 	int ret = -EINVAL, i;
 
 	data__for_each_file(i, d) {
-		d->session = perf_session__new(&d->file, false, &tool);
+		memcpy(&d->tool, &tool, sizeof(tool));
+
+		d->session = perf_session__new(&d->file, false, &d->tool);
 		if (!d->session) {
 			pr_err("Failed to open %s\n", d->file.path);
 			ret = -1;
 			goto out_delete;
 		}
 
-		ret = perf_session__process_events(d->session, &tool);
+		ret = perf_session__process_events(d->session, &d->tool);
 		if (ret) {
 			pr_err("Failed to process %s\n", d->file.path);
 			goto out_delete;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9b5663950a4d..21d46918860e 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -12,6 +12,7 @@
 
 struct perf_mem {
 	struct perf_tool	tool;
+	struct perf_session	*session;
 	char const		*input_name;
 	bool			hide_unresolved;
 	bool			dump_raw;
@@ -66,7 +67,8 @@ dump_raw_samples(struct perf_tool *tool,
 	struct addr_location al;
 	const char *fmt;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  mem->session) < 0) {
 		fprintf(stderr, "problem processing %d event, skipping it.\n",
 				event->header.type);
 		return -1;
@@ -129,6 +131,8 @@ static int report_raw_events(struct perf_mem *mem)
 	if (session == NULL)
 		return -1;
 
+	mem->session = session;
+
 	if (mem->cpu_list) {
 		ret = perf_session__cpu_bitmap(session, mem->cpu_list,
 					       mem->cpu_bitmap);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4cac79ad3085..68d06bc02266 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -142,7 +142,8 @@ static int process_sample_event(struct perf_tool *tool,
 	};
 	int ret;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  rep->session) < 0) {
 		pr_debug("problem processing %d event, skipping it.\n",
 			 event->header.type);
 		return -1;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ce304dfd962a..ab920f8cded6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -542,13 +542,21 @@ static int cleanup_scripting(void)
 	return scripting_ops->stop_script();
 }
 
-static int process_sample_event(struct perf_tool *tool __maybe_unused,
+struct perf_script {
+	struct perf_tool	tool;
+	struct perf_session	*session;
+	bool			show_task_events;
+	bool			show_mmap_events;
+};
+
+static int process_sample_event(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_evsel *evsel,
 				struct machine *machine)
 {
 	struct addr_location al;
+	struct perf_script *script = container_of(tool, struct perf_script, tool);
 	struct thread *thread = machine__findnew_thread(machine, sample->pid,
 							sample->tid);
 
@@ -569,7 +577,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 		return 0;
 	}
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  script->session) < 0) {
 		pr_err("problem processing %d event, skipping it.\n",
 		       event->header.type);
 		return -1;
@@ -586,13 +595,6 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 	return 0;
 }
 
-struct perf_script {
-	struct perf_tool	tool;
-	struct perf_session	*session;
-	bool			show_task_events;
-	bool			show_mmap_events;
-};
-
 static int process_attr(struct perf_tool *tool, union perf_event *event,
 			struct perf_evlist **pevlist)
 {
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index f3bb1a4bf060..4178727be12c 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -48,6 +48,7 @@ struct wake_event;
 
 struct timechart {
 	struct perf_tool	tool;
+	struct perf_session	*session;
 	struct per_pid		*all_data;
 	struct power_event	*power_events;
 	struct wake_event	*wake_events;
@@ -469,7 +470,8 @@ static void sched_switch(struct timechart *tchart, int cpu, u64 timestamp,
 
 static const char *cat_backtrace(union perf_event *event,
 				 struct perf_sample *sample,
-				 struct machine *machine)
+				 struct machine *machine,
+				 struct perf_session *session)
 {
 	struct addr_location al;
 	unsigned int i;
@@ -488,7 +490,8 @@ static const char *cat_backtrace(union perf_event *event,
 	if (!chain)
 		goto exit;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  session) < 0) {
 		fprintf(stderr, "problem processing %d event, skipping it.\n",
 			event->header.type);
 		goto exit;
@@ -567,7 +570,7 @@ static int process_sample_event(struct perf_tool *tool,
 	if (evsel->handler != NULL) {
 		tracepoint_handler f = evsel->handler;
 		return f(tchart, evsel, sample,
-			 cat_backtrace(event, sample, machine));
+			 cat_backtrace(event, sample, machine, tchart->session));
 	}
 
 	return 0;
@@ -1623,6 +1626,7 @@ static int __cmd_timechart(struct timechart *tchart, const char *output_name)
 		goto out_delete;
 	}
 
+	tchart->session = session;
 	ret = perf_session__process_events(session, &tchart->tool);
 	if (ret)
 		goto out_delete;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index c4c7eac69de4..69a0badfb745 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -723,7 +723,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	if (event->header.misc & PERF_RECORD_MISC_EXACT_IP)
 		top->exact_samples++;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0)
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  top->session) < 0)
 		return;
 
 	if (!top->kptr_restrict_warned &&
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 18619966454c..60682e62d9de 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -101,7 +101,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		sample.callchain = (struct ip_callchain *)fake_callchains[i];
 
 		if (perf_event__preprocess_sample(&event, machine, &al,
-						  &sample) < 0)
+						  &sample, NULL) < 0)
 			goto out;
 
 		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 59e53db7914c..1c4e495d5137 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -78,7 +78,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 			sample.ip = fake_samples[i].ip;
 
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 278ba8344c23..a731a531a3e2 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -86,7 +86,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			sample.tid = fake_common_samples[k].pid;
 			sample.ip = fake_common_samples[k].ip;
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
@@ -110,7 +110,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			sample.tid = fake_samples[i][k].pid;
 			sample.ip = fake_samples[i][k].ip;
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index b52c9faea224..f4e3286cd496 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -67,7 +67,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		sample.ip = fake_samples[i].ip;
 
 		if (perf_event__preprocess_sample(&event, machine, &al,
-						  &sample) < 0)
+						  &sample, NULL) < 0)
 			goto out;
 
 		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 6c6d044e959a..7a90c62ad07a 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -822,7 +822,8 @@ void thread__find_addr_location(struct thread *thread,
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
-				  struct perf_sample *sample)
+				  struct perf_sample *sample,
+				  struct perf_session *session __maybe_unused)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread = machine__findnew_thread(machine, sample->pid,
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c4ffe2bd0738..19814f70292b 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -353,11 +353,13 @@ int perf_event__process(struct perf_tool *tool,
 			struct machine *machine);
 
 struct addr_location;
+struct perf_session;
 
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
-				  struct perf_sample *sample);
+				  struct perf_sample *sample,
+				  struct perf_session *session);
 
 struct thread;
 
-- 
2.2.2



* [PATCH 17/42] perf script: Pass session arg to ->process_event callback
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (15 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 16/42] perf tools: Pass session arg to perf_event__preprocess_sample() Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:06 ` [PATCH 18/42] perf tools: Introduce thread__comm_time() helpers Namhyung Kim
                   ` (25 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

A script engine sometimes needs to retrieve symbol info, so pass the
session pointer to the ->process_event callback so that the symbol can
be found correctly, as in the previous patch.
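
As a rough illustration of the pattern only (not the perf code itself),
the sketch below threads a context pointer through an ops callback.
All names here (session_sketch, sample_sketch, ops_sketch,
count_lookup, dispatch) are hypothetical stand-ins, not perf types or
functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are perf's real types. */
struct session_sketch { int nr_lookups; };
struct sample_sketch  { unsigned long addr; };

struct ops_sketch {
	/* the callback receives the session as an explicit argument */
	void (*process_event)(const struct sample_sketch *sample,
			      struct session_sketch *session);
};

/* Example callback: only touches the session when one was supplied. */
static void count_lookup(const struct sample_sketch *sample,
			 struct session_sketch *session)
{
	(void)sample;
	if (session)		/* callers without a session pass NULL */
		session->nr_lookups++;
}

/* The dispatcher simply forwards the session it was given. */
static void dispatch(const struct ops_sketch *ops,
		     const struct sample_sketch *sample,
		     struct session_sketch *session)
{
	assert(ops && ops->process_event);
	ops->process_event(sample, session);
}
```

Callbacks that do not need the session can ignore the argument, which
is what the __maybe_unused annotations in the diff below express.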

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c                        | 23 ++++++++++++----------
 tools/perf/util/db-export.c                        |  6 ++++--
 tools/perf/util/db-export.h                        |  4 +++-
 tools/perf/util/event.c                            |  3 ++-
 tools/perf/util/event.h                            |  3 ++-
 .../perf/util/scripting-engines/trace-event-perl.c |  3 ++-
 .../util/scripting-engines/trace-event-python.c    |  5 +++--
 tools/perf/util/trace-event-scripting.c            |  3 ++-
 tools/perf/util/trace-event.h                      |  3 ++-
 9 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ab920f8cded6..4a007110d2f7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -377,9 +377,10 @@ static void print_sample_start(struct perf_sample *sample,
 }
 
 static void print_sample_addr(union perf_event *event,
-			  struct perf_sample *sample,
-			  struct thread *thread,
-			  struct perf_event_attr *attr)
+			      struct perf_sample *sample,
+			      struct thread *thread,
+			      struct perf_event_attr *attr,
+			      struct perf_session *session)
 {
 	struct addr_location al;
 
@@ -388,7 +389,7 @@ static void print_sample_addr(union perf_event *event,
 	if (!sample_addr_correlates_sym(attr))
 		return;
 
-	perf_event__preprocess_sample_addr(event, sample, thread, &al);
+	perf_event__preprocess_sample_addr(event, sample, thread, &al, session);
 
 	if (PRINT_FIELD(SYM)) {
 		printf(" ");
@@ -409,7 +410,8 @@ static void print_sample_bts(union perf_event *event,
 			     struct perf_sample *sample,
 			     struct perf_evsel *evsel,
 			     struct thread *thread,
-			     struct addr_location *al)
+			     struct addr_location *al,
+			     struct perf_session *session)
 {
 	struct perf_event_attr *attr = &evsel->attr;
 	bool print_srcline_last = false;
@@ -436,7 +438,7 @@ static void print_sample_bts(union perf_event *event,
 	    ((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
 	     !output[attr->type].user_set)) {
 		printf(" => ");
-		print_sample_addr(event, sample, thread, attr);
+		print_sample_addr(event, sample, thread, attr, session);
 	}
 
 	if (print_srcline_last)
@@ -447,7 +449,7 @@ static void print_sample_bts(union perf_event *event,
 
 static void process_event(union perf_event *event, struct perf_sample *sample,
 			  struct perf_evsel *evsel, struct thread *thread,
-			  struct addr_location *al)
+			  struct addr_location *al, struct perf_session *session)
 {
 	struct perf_event_attr *attr = &evsel->attr;
 
@@ -465,7 +467,7 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 	}
 
 	if (is_bts_event(attr)) {
-		print_sample_bts(event, sample, evsel, thread, al);
+		print_sample_bts(event, sample, evsel, thread, al, session);
 		return;
 	}
 
@@ -473,7 +475,7 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 		event_format__print(evsel->tp_format, sample->cpu,
 				    sample->raw_data, sample->raw_size);
 	if (PRINT_FIELD(ADDR))
-		print_sample_addr(event, sample, thread, attr);
+		print_sample_addr(event, sample, thread, attr, session);
 
 	if (PRINT_FIELD(IP)) {
 		if (!symbol_conf.use_callchain)
@@ -590,7 +592,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
 		return 0;
 
-	scripting_ops->process_event(event, sample, evsel, thread, &al);
+	scripting_ops->process_event(event, sample, evsel, thread, &al,
+				     script->session);
 
 	return 0;
 }
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index c81dae399763..e9ad11fe2e16 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -282,7 +282,8 @@ int db_export__branch_type(struct db_export *dbe, u32 branch_type,
 
 int db_export__sample(struct db_export *dbe, union perf_event *event,
 		      struct perf_sample *sample, struct perf_evsel *evsel,
-		      struct thread *thread, struct addr_location *al)
+		      struct thread *thread, struct addr_location *al,
+		      struct perf_session *session)
 {
 	struct export_sample es = {
 		.event = event,
@@ -328,7 +329,8 @@ int db_export__sample(struct db_export *dbe, union perf_event *event,
 	    sample_addr_correlates_sym(&evsel->attr)) {
 		struct addr_location addr_al;
 
-		perf_event__preprocess_sample_addr(event, sample, thread, &addr_al);
+		perf_event__preprocess_sample_addr(event, sample, thread,
+						   &addr_al, session);
 		err = db_ids_from_al(dbe, &addr_al, &es.addr_dso_db_id,
 				     &es.addr_sym_db_id, &es.addr_offset);
 		if (err)
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index adbd22d66798..b994f1041d19 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -29,6 +29,7 @@ struct addr_location;
 struct call_return_processor;
 struct call_path;
 struct call_return;
+struct perf_session;
 
 struct export_sample {
 	union perf_event	*event;
@@ -97,7 +98,8 @@ int db_export__branch_type(struct db_export *dbe, u32 branch_type,
 			   const char *name);
 int db_export__sample(struct db_export *dbe, union perf_event *event,
 		      struct perf_sample *sample, struct perf_evsel *evsel,
-		      struct thread *thread, struct addr_location *al);
+		      struct thread *thread, struct addr_location *al,
+		      struct perf_session *session);
 
 int db_export__branch_types(struct db_export *dbe);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 7a90c62ad07a..186960a09024 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -904,7 +904,8 @@ bool sample_addr_correlates_sym(struct perf_event_attr *attr)
 void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
-					struct addr_location *al)
+					struct addr_location *al,
+					struct perf_session *session __maybe_unused)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 19814f70292b..27261320249a 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -368,7 +368,8 @@ bool sample_addr_correlates_sym(struct perf_event_attr *attr);
 void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
-					struct addr_location *al);
+					struct addr_location *al,
+					struct perf_session *session);
 
 const char *perf_event__name(unsigned int id);
 
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 22ebc46226e7..dd69fbaf03b8 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -356,7 +356,8 @@ static void perl_process_event(union perf_event *event,
 			       struct perf_sample *sample,
 			       struct perf_evsel *evsel,
 			       struct thread *thread,
-			       struct addr_location *al __maybe_unused)
+			       struct addr_location *al __maybe_unused,
+			       struct perf_session *session __maybe_unused)
 {
 	perl_process_tracepoint(sample, evsel, thread);
 	perl_process_event_generic(event, sample, evsel);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 0c815a40a6e8..802def46af7b 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -839,7 +839,8 @@ static void python_process_event(union perf_event *event,
 				 struct perf_sample *sample,
 				 struct perf_evsel *evsel,
 				 struct thread *thread,
-				 struct addr_location *al)
+				 struct addr_location *al,
+				 struct perf_session *session)
 {
 	struct tables *tables = &tables_global;
 
@@ -851,7 +852,7 @@ static void python_process_event(union perf_event *event,
 	default:
 		if (tables->db_export_mode)
 			db_export__sample(&tables->dbe, event, sample, evsel,
-					  thread, al);
+					  thread, al, session);
 		else
 			python_process_general_event(sample, evsel, thread, al);
 	}
diff --git a/tools/perf/util/trace-event-scripting.c b/tools/perf/util/trace-event-scripting.c
index 5c9bdd1591a9..36ed50d71171 100644
--- a/tools/perf/util/trace-event-scripting.c
+++ b/tools/perf/util/trace-event-scripting.c
@@ -44,7 +44,8 @@ static void process_event_unsupported(union perf_event *event __maybe_unused,
 				      struct perf_sample *sample __maybe_unused,
 				      struct perf_evsel *evsel __maybe_unused,
 				      struct thread *thread __maybe_unused,
-				      struct addr_location *al __maybe_unused)
+				      struct addr_location *al __maybe_unused,
+				      struct perf_session *session __maybe_unused)
 {
 }
 
diff --git a/tools/perf/util/trace-event.h b/tools/perf/util/trace-event.h
index 52aaa19e1eb1..c5870e57eee9 100644
--- a/tools/perf/util/trace-event.h
+++ b/tools/perf/util/trace-event.h
@@ -70,7 +70,8 @@ struct scripting_ops {
 			       struct perf_sample *sample,
 			       struct perf_evsel *evsel,
 			       struct thread *thread,
-				   struct addr_location *al);
+			       struct addr_location *al,
+			       struct perf_session *session);
 	int (*generate_script) (struct pevent *pevent, const char *outfile);
 };
 
-- 
2.2.2



* [PATCH 18/42] perf tools: Introduce thread__comm_time() helpers
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (16 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 17/42] perf script: Pass session arg to ->process_event callback Namhyung Kim
@ 2015-01-29  8:06 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 19/42] perf tools: Add a test case for thread comm handling Namhyung Kim
                   ` (24 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When data file indexing is enabled, perf processes all task, comm and
mmap events first and only then the sample events.  As a result, a
sample sees only the last comm of its thread, even though the comm
that was in effect at the time of the sample is known.

Keep a thread's comms sorted by time so that the appropriate comm at
the sample time can be found.  thread__comm_time() mostly works even
if the PERF_SAMPLE_TIME bit is off: in that case sample->time is -1,
so it takes the last comm anyway.
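
For illustration, here is a minimal standalone sketch of the
time-sorted lookup idea.  It uses a hand-rolled singly linked list
rather than the kernel list helpers, and the names comm_sketch,
comm_insert and comm_at are hypothetical; it is not the code in the
diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's struct comm; not the real definition. */
struct comm_sketch {
	const char *str;
	uint64_t start;			/* time at which this comm took effect */
	struct comm_sketch *next;	/* next older entry */
};

/* Insert while keeping the list sorted newest-first. */
static void comm_insert(struct comm_sketch **head, struct comm_sketch *new)
{
	struct comm_sketch **p = head;

	assert(head && new);
	while (*p && new->start < (*p)->start)
		p = &(*p)->next;
	new->next = *p;
	*p = new;
}

/*
 * Return the newest comm that started at or before the timestamp.
 * If every entry is newer, fall back to the oldest one; an empty
 * list yields NULL.
 */
static const char *comm_at(const struct comm_sketch *head, uint64_t timestamp)
{
	const struct comm_sketch *c, *last = NULL;

	for (c = head; c; c = c->next) {
		if (timestamp >= c->start)
			return c->str;
		last = c;
	}
	return last ? last->str : NULL;
}
```

With a timestamp of (uint64_t)-1 the first comparison always succeeds,
so the newest entry is returned, which matches the behaviour described
above for samples without PERF_SAMPLE_TIME.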

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/thread.c | 33 ++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h |  2 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 9ebc8b1f9be5..ad96725105c2 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -103,6 +103,21 @@ struct comm *thread__exec_comm(const struct thread *thread)
 	return last;
 }
 
+struct comm *thread__comm_time(const struct thread *thread, u64 timestamp)
+{
+	struct comm *comm;
+
+	list_for_each_entry(comm, &thread->comm_list, list) {
+		if (timestamp >= comm->start)
+			return comm;
+	}
+
+	if (list_empty(&thread->comm_list))
+		return NULL;
+
+	return list_last_entry(&thread->comm_list, struct comm, list);
+}
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		       bool exec)
 {
@@ -118,7 +133,13 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		new = comm__new(str, timestamp, exec);
 		if (!new)
 			return -ENOMEM;
-		list_add(&new->list, &thread->comm_list);
+
+		/* sort by time */
+		list_for_each_entry(curr, &thread->comm_list, list) {
+			if (timestamp >= curr->start)
+				break;
+		}
+		list_add_tail(&new->list, &curr->list);
 
 		if (exec)
 			unwind__flush_access(thread);
@@ -139,6 +160,16 @@ const char *thread__comm_str(const struct thread *thread)
 	return comm__str(comm);
 }
 
+const char *thread__comm_str_time(const struct thread *thread, u64 timestamp)
+{
+	const struct comm *comm = thread__comm_time(thread, timestamp);
+
+	if (!comm)
+		return NULL;
+
+	return comm__str(comm);
+}
+
 /* CHECKME: it should probably better return the max comm len from its comm list */
 int thread__comm_len(struct thread *thread)
 {
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 160fd066a7d1..be67c3bad5e7 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -53,7 +53,9 @@ static inline int thread__set_comm(struct thread *thread, const char *comm,
 int thread__comm_len(struct thread *thread);
 struct comm *thread__comm(const struct thread *thread);
 struct comm *thread__exec_comm(const struct thread *thread);
+struct comm *thread__comm_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
+const char *thread__comm_str_time(const struct thread *thread, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.2.2



* [PATCH 19/42] perf tools: Add a test case for thread comm handling
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (17 preceding siblings ...)
  2015-01-29  8:06 ` [PATCH 18/42] perf tools: Introduce thread__comm_time() helpers Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 20/42] perf tools: Use thread__comm_time() when adding hist entries Namhyung Kim
                   ` (23 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The new test case checks thread comm handling, such as overriding a
comm and time-based sorting.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Makefile.perf        |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/thread-comm.c  | 47 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)
 create mode 100644 tools/perf/tests/thread-comm.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index aa6a50447c32..8507891db69d 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -458,6 +458,7 @@ endif
 LIB_OBJS += $(OUTPUT)tests/mmap-thread-lookup.o
 LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
 LIB_OBJS += $(OUTPUT)tests/switch-tracking.o
+LIB_OBJS += $(OUTPUT)tests/thread-comm.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 4b7d9ab0f049..1b463d82a71a 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -167,6 +167,10 @@ static struct test {
 		.func = test__fdarray__add,
 	},
 	{
+		.desc = "Test thread comm handling",
+		.func = test__thread_comm,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 00e776a87a9c..43ac17780629 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -51,6 +51,7 @@ int test__hists_cumulate(void);
 int test__switch_tracking(void);
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
+int test__thread_comm(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-comm.c b/tools/perf/tests/thread-comm.c
new file mode 100644
index 000000000000..44ee85d71581
--- /dev/null
+++ b/tools/perf/tests/thread-comm.c
@@ -0,0 +1,47 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "debug.h"
+
+int test__thread_comm(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * comm for a given time.  When multi-file data storage is
+	 * enabled, those task/comm events are processed first so the
+	 * later sample should find a matching comm properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 100, 100);
+	TEST_ASSERT_VAL("wrong init thread comm",
+			!strcmp(thread__comm_str(t), ":100"));
+
+	thread__set_comm(t, "perf-test1", 10000);
+	TEST_ASSERT_VAL("failed to override thread comm",
+			!strcmp(thread__comm_str(t), "perf-test1"));
+
+	thread__set_comm(t, "perf-test2", 20000);
+	thread__set_comm(t, "perf-test3", 30000);
+	thread__set_comm(t, "perf-test4", 40000);
+
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 20000), "perf-test2"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 35000), "perf-test3"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 50000), "perf-test4"));
+
+	thread__set_comm(t, "perf-test1.5", 15000);
+	TEST_ASSERT_VAL("failed to sort timed comm",
+			!strcmp(thread__comm_str_time(t, 15000), "perf-test1.5"));
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2



* [PATCH 20/42] perf tools: Use thread__comm_time() when adding hist entries
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (18 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 19/42] perf tools: Add a test case for thread comm handling Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 21/42] perf tools: Convert dead thread list into rbtree Namhyung Kim
                   ` (22 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Now that a thread's comm can be looked up by time, use
thread__comm_time() to find the correct comm when adding hist entries.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c |  5 +++--
 tools/perf/builtin-diff.c     |  8 ++++----
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c        | 19 ++++++++++---------
 tools/perf/util/hist.h        |  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b89e4c6ed488..50628900f9fa 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -47,7 +47,7 @@ struct perf_annotate {
 };
 
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
-				  struct perf_sample *sample __maybe_unused,
+				  struct perf_sample *sample,
 				  struct addr_location *al,
 				  struct perf_annotate *ann)
 {
@@ -67,7 +67,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return 0;
 	}
 
-	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0,
+				sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 3e2229227062..ddf6f0999838 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -313,10 +313,10 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
 
 static int hists__add_entry(struct hists *hists,
 			    struct addr_location *al, u64 period,
-			    u64 weight, u64 transaction)
+			    u64 weight, u64 transaction, u64 timestamp)
 {
 	if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-			       transaction, true) != NULL)
+			       transaction, timestamp, true) != NULL)
 		return 0;
 	return -ENOMEM;
 }
@@ -338,8 +338,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 		return -1;
 	}
 
-	if (hists__add_entry(hists, &al, sample->period,
-			     sample->weight, sample->transaction)) {
+	if (hists__add_entry(hists, &al, sample->period, sample->weight,
+			     sample->transaction, sample->time)) {
 		pr_warning("problem incrementing symbol period, skipping event\n");
 		return -1;
 	}
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index a731a531a3e2..4f3d45692acb 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL)
 				goto out;
 
@@ -114,7 +114,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL)
 				goto out;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 70b48a65064c..4badf2491fbf 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -447,11 +447,11 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi,
 				      u64 period, u64 weight, u64 transaction,
-				      bool sample_self)
+				      u64 timestamp, bool sample_self)
 {
 	struct hist_entry entry = {
 		.thread	= al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_time(al->thread, timestamp),
 		.ms = {
 			.map	= al->map,
 			.sym	= al->sym,
@@ -509,13 +509,14 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 {
 	u64 cost;
 	struct mem_info *mi = iter->priv;
+	struct perf_sample *sample = iter->sample;
 	struct hists *hists = evsel__hists(iter->evsel);
 	struct hist_entry *he;
 
 	if (mi == NULL)
 		return -EINVAL;
 
-	cost = iter->sample->weight;
+	cost = sample->weight;
 	if (!cost)
 		cost = 1;
 
@@ -527,7 +528,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	 * and the he_stat__add_period() function.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, NULL, mi,
-				cost, cost, 0, true);
+				cost, cost, 0, sample->time, true);
 	if (!he)
 		return -ENOMEM;
 
@@ -628,7 +629,7 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0, true);
+				1, 1, 0, iter->sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -666,7 +667,7 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -728,7 +729,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -772,7 +773,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	struct hist_entry he_tmp = {
 		.cpu = al->cpu,
 		.thread = al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_time(al->thread, sample->time),
 		.ip = al->addr,
 		.ms = {
 			.map = al->map,
@@ -801,7 +802,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, false);
+				sample->transaction, sample->time, false);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2b690d028907..0eed50a5b1f0 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -109,7 +109,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
 				      u64 weight, u64 transaction,
-				      bool sample_self);
+				      u64 timestamp, bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth, void *arg);
-- 
2.2.2



* [PATCH 21/42] perf tools: Convert dead thread list into rbtree
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (19 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 20/42] perf tools: Use thread__comm_time() when adding hist entries Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 22/42] perf tools: Introduce machine__find*_thread_time() Namhyung Kim
                   ` (21 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Currently perf keeps dead threads in a linked list, which becomes a
problem when the list has to be searched, especially in a large
session that may have many dead threads.  Convert it to an rbtree, as
is done for normal threads; it will be used later by the multi-file
changes.

The list node is now used to chain dead threads that share a tid,
since it's easier to handle such threads in time order.
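
To show the shape of the data structure, here is a simplified
standalone sketch: a tree keyed by tid, with threads that share a tid
chained in time order.  It makes several assumptions that differ from
the diff below: it uses a plain binary search tree with no rebalancing
instead of an rbtree, it keeps the newest thread in the tree node, and
the lookup helper is an illustration only.  The names dead_thread,
dead_insert and dead_find are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dead thread; not perf's struct thread. */
struct dead_thread {
	int tid;
	uint64_t start_time;
	struct dead_thread *left, *right;	/* tree links, keyed by tid */
	struct dead_thread *older;		/* chain of same-tid threads */
};

/*
 * Insert into a plain binary search tree (no rebalancing, unlike an
 * rbtree).  Threads sharing a tid hang off one tree node, and that
 * chain is kept sorted newest-first.
 */
static void dead_insert(struct dead_thread **root, struct dead_thread *th)
{
	struct dead_thread **p = root;
	struct dead_thread *pos;

	assert(root && th);
	th->left = th->right = th->older = NULL;

	while (*p && (*p)->tid != th->tid)
		p = th->tid < (*p)->tid ? &(*p)->left : &(*p)->right;

	if (*p == NULL) {		/* first dead thread with this tid */
		*p = th;
		return;
	}

	if (th->start_time >= (*p)->start_time) {
		/* newest so far: take over the tree node's position */
		th->left = (*p)->left;
		th->right = (*p)->right;
		th->older = *p;
		(*p)->left = (*p)->right = NULL;
		*p = th;
		return;
	}

	/* walk the chain to find the slot that keeps it newest-first */
	pos = *p;
	while (pos->older && th->start_time < pos->older->start_time)
		pos = pos->older;
	th->older = pos->older;
	pos->older = th;
}

/* Find the newest dead thread with this tid that started by timestamp. */
static struct dead_thread *dead_find(struct dead_thread *root, int tid,
				     uint64_t timestamp)
{
	while (root && root->tid != tid)
		root = tid < root->tid ? root->left : root->right;

	for (; root; root = root->older)
		if (timestamp >= root->start_time)
			return root;
	return NULL;
}
```

The tree narrows the search to one tid, so only the (usually short)
chain of threads that reused that tid has to be scanned linearly.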

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/machine.c | 70 +++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/machine.h |  2 +-
 tools/perf/util/thread.c  |  1 +
 tools/perf/util/thread.h  | 11 ++++----
 4 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 1bca3a9f2b16..d4050fcba851 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -28,7 +28,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 	dsos__init(&machine->kernel_dsos);
 
 	machine->threads = RB_ROOT;
-	INIT_LIST_HEAD(&machine->dead_threads);
+	machine->dead_threads = RB_ROOT;
 	machine->last_match = NULL;
 
 	machine->vdso_info = NULL;
@@ -91,10 +91,22 @@ static void dsos__delete(struct dsos *dsos)
 
 void machine__delete_dead_threads(struct machine *machine)
 {
-	struct thread *n, *t;
+	struct rb_node *nd = rb_first(&machine->dead_threads);
+
+	while (nd) {
+		struct thread *t = rb_entry(nd, struct thread, rb_node);
+		struct thread *pos;
+
+		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->dead_threads);
+
+		while (!list_empty(&t->tid_node)) {
+			pos = list_first_entry(&t->tid_node,
+					       struct thread, tid_node);
+			list_del(&pos->tid_node);
+			thread__delete(pos);
+		}
 
-	list_for_each_entry_safe(t, n, &machine->dead_threads, node) {
-		list_del(&t->node);
 		thread__delete(t);
 	}
 }
@@ -106,8 +118,8 @@ void machine__delete_threads(struct machine *machine)
 	while (nd) {
 		struct thread *t = rb_entry(nd, struct thread, rb_node);
 
-		rb_erase(&t->rb_node, &machine->threads);
 		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->threads);
 		thread__delete(t);
 	}
 }
@@ -1238,13 +1250,46 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 
 static void machine__remove_thread(struct machine *machine, struct thread *th)
 {
+	struct rb_node **p = &machine->dead_threads.rb_node;
+	struct rb_node *parent = NULL;
+	struct thread *pos;
+
 	machine->last_match = NULL;
 	rb_erase(&th->rb_node, &machine->threads);
+
+	th->dead = true;
+
 	/*
 	 * We may have references to this thread, for instance in some hist_entry
-	 * instances, so just move them to a separate list.
+	 * instances, so just move them to a separate list in rbtree.
 	 */
-	list_add_tail(&th->node, &machine->dead_threads);
+	while (*p != NULL) {
+		parent = *p;
+		pos = rb_entry(parent, struct thread, rb_node);
+
+		if (pos->tid == th->tid) {
+			struct thread *old;
+
+			/* sort by time */
+			list_for_each_entry(old, &pos->tid_node, tid_node) {
+				if (th->start_time >= old->start_time) {
+					pos = old;
+					break;
+				}
+			}
+
+			list_add_tail(&th->tid_node, &pos->tid_node);
+			return;
+		}
+
+		if (th->tid < pos->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	rb_link_node(&th->rb_node, parent, p);
+	rb_insert_color(&th->rb_node, &machine->dead_threads);
 }
 
 int machine__process_fork_event(struct machine *machine, union perf_event *event,
@@ -1649,7 +1694,7 @@ int machine__for_each_thread(struct machine *machine,
 			     void *priv)
 {
 	struct rb_node *nd;
-	struct thread *thread;
+	struct thread *thread, *pos;
 	int rc = 0;
 
 	for (nd = rb_first(&machine->threads); nd; nd = rb_next(nd)) {
@@ -1659,10 +1704,17 @@ int machine__for_each_thread(struct machine *machine,
 			return rc;
 	}
 
-	list_for_each_entry(thread, &machine->dead_threads, node) {
+	for (nd = rb_first(&machine->dead_threads); nd; nd = rb_next(nd)) {
+		thread = rb_entry(nd, struct thread, rb_node);
 		rc = fn(thread, priv);
 		if (rc != 0)
 			return rc;
+
+		list_for_each_entry(pos, &thread->tid_node, tid_node) {
+			rc = fn(pos, priv);
+			if (rc != 0)
+				return rc;
+		}
 	}
 	return rc;
 }
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e8b7779a0a3f..4349946a38ff 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -30,7 +30,7 @@ struct machine {
 	bool		  comm_exec;
 	char		  *root_dir;
 	struct rb_root	  threads;
-	struct list_head  dead_threads;
+	struct rb_root	  dead_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
 	struct dsos	  user_dsos;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index ad96725105c2..c9ae0e1599da 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		thread->ppid = -1;
 		thread->cpu = -1;
 		INIT_LIST_HEAD(&thread->comm_list);
+		INIT_LIST_HEAD(&thread->tid_node);
 
 		if (unwind__prepare_access(thread) < 0)
 			goto err_thread;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index be67c3bad5e7..21268e66b2ad 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -11,10 +11,8 @@
 struct thread_stack;
 
 struct thread {
-	union {
-		struct rb_node	 rb_node;
-		struct list_head node;
-	};
+	struct rb_node	 	rb_node;
+	struct list_head 	tid_node;
 	struct map_groups	*mg;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
@@ -22,7 +20,8 @@ struct thread {
 	int			cpu;
 	char			shortname[3];
 	bool			comm_set;
-	bool			dead; /* if set thread has exited */
+	bool			exited; /* if set thread has exited */
+	bool			dead; /* thread is in dead_threads list */
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
@@ -39,7 +38,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine);
 void thread__delete(struct thread *thread);
 static inline void thread__exited(struct thread *thread)
 {
-	thread->dead = true;
+	thread->exited = true;
 }
 
 int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
-- 
2.2.2


* [PATCH 22/42] perf tools: Introduce machine__find*_thread_time()
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (20 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 21/42] perf tools: Convert dead thread list into rbtree Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 23/42] perf tools: Add a test case for timed thread handling Namhyung Kim
                   ` (20 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When data file indexing is enabled, threads must be looked up by sample
time, since samples are processed after the other (task, comm and mmap)
events.  This is a problem if a session is very long and a pid is
recycled - in that case a lookup by pid/tid alone only sees the last
thread.

So keep the start time in each thread and search for threads by time.
This patch introduces the machine__find{,new}_thread_time() functions
for this.  They first search the current thread rbtree and then the dead
thread tree and list.  If no thread is found, the findnew variant
creates a new one.

A sample timestamp of 0 means that the call comes from a synthesized
event, so just use the current rbtree.  The timestamp is -1 if the
sample didn't record a timestamp, so such samples see the current
threads automatically.
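
The core of the time-based lookup can be sketched in isolation as
follows.  This is only an illustration under simplifying assumptions:
struct toy_gen and toy_find_time() are hypothetical names, and all
threads that ever used one tid are assumed to sit in a single array
ordered newest first, instead of the live rbtree plus the dead thread
tree and list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one "generation" of a recycled tid. */
struct toy_gen {
	uint64_t start_time;
	const char *comm;
};

/*
 * gens[] holds every thread that used one tid, newest first (gens[0]
 * is the live thread).  Return the newest entry that had already
 * started at @timestamp, or NULL if the sample predates all of them.
 */
static const struct toy_gen *toy_find_time(const struct toy_gen *gens,
					   size_t nr, uint64_t timestamp)
{
	size_t i;

	assert(gens != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		if (timestamp >= gens[i].start_time)
			return &gens[i];

	return NULL;
}
```

With a timestamp of -1 (UINT64_MAX as u64) the very first comparison
succeeds, so the live thread is returned, which is how samples without a
recorded timestamp end up seeing the current threads.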

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c     |  11 ++++-
 tools/perf/tests/dwarf-unwind.c |   8 ++--
 tools/perf/tests/hists_common.c |   3 +-
 tools/perf/tests/hists_link.c   |   2 +-
 tools/perf/util/event.c         |  14 ++++--
 tools/perf/util/machine.c       | 102 +++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h       |   8 +++-
 tools/perf/util/thread.c        |   4 ++
 tools/perf/util/thread.h        |   1 +
 9 files changed, 138 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4a007110d2f7..65b3a07be2bf 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -559,8 +559,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct addr_location al;
 	struct perf_script *script = container_of(tool, struct perf_script, tool);
-	struct thread *thread = machine__findnew_thread(machine, sample->pid,
-							sample->tid);
+	struct thread *thread;
+
+	if (perf_session__has_index(script->session))
+		thread = machine__findnew_thread_time(machine, sample->pid,
+						      sample->tid,
+						      sample->time);
+	else
+		thread = machine__findnew_thread(machine, sample->pid,
+						 sample->tid);
 
 	if (thread == NULL) {
 		pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 0bf06bec68c7..7e04feb431cb 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -16,10 +16,10 @@
 
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
 			union perf_event *event,
-			struct perf_sample *sample __maybe_unused,
+			struct perf_sample *sample,
 			struct machine *machine)
 {
-	return machine__process_mmap2_event(machine, event, NULL);
+	return machine__process_mmap2_event(machine, event, sample);
 }
 
 static int init_live_machine(struct machine *machine)
@@ -66,12 +66,10 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 __attribute__ ((noinline))
 static int unwind_thread(struct thread *thread)
 {
-	struct perf_sample sample;
+	struct perf_sample sample = { .time = -1ULL, };
 	unsigned long cnt = 0;
 	int err = -1;
 
-	memset(&sample, 0, sizeof(sample));
-
 	if (test__arch_unwind_sample(&sample, thread)) {
 		pr_debug("failed to get unwind sample\n");
 		goto out;
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index a62c09134516..86a8fdb41804 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -80,6 +80,7 @@ static struct {
 struct machine *setup_fake_machine(struct machines *machines)
 {
 	struct machine *machine = machines__find(machines, HOST_KERNEL_ID);
+	struct perf_sample sample = { .time = -1ULL, };
 	size_t i;
 
 	if (machine == NULL) {
@@ -113,7 +114,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 		strcpy(fake_mmap_event.mmap.filename,
 		       fake_mmap_info[i].filename);
 
-		machine__process_mmap_event(machine, &fake_mmap_event, NULL);
+		machine__process_mmap_event(machine, &fake_mmap_event, &sample);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(fake_symbols); i++) {
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 4f3d45692acb..1237cc87e8d5 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 	struct perf_evsel *evsel;
 	struct addr_location al;
 	struct hist_entry *he;
-	struct perf_sample sample = { .period = 1, };
+	struct perf_sample sample = { .period = 1, .time = -1ULL, };
 	size_t i = 0, k;
 
 	/*
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 186960a09024..8b9fe0a908e8 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -9,6 +9,7 @@
 #include "strlist.h"
 #include "thread.h"
 #include "thread_map.h"
+#include "session.h"
 #include "symbol/kallsyms.h"
 
 static const char *perf_event__names[] = {
@@ -823,11 +824,18 @@ int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
 				  struct perf_sample *sample,
-				  struct perf_session *session __maybe_unused)
+				  struct perf_session *session)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
-	struct thread *thread = machine__findnew_thread(machine, sample->pid,
-							sample->tid);
+	struct thread *thread;
+
+	if (session && perf_session__has_index(session))
+		thread = machine__findnew_thread_time(machine, sample->pid,
+						      sample->tid,
+						      sample->time);
+	else
+		thread = machine__findnew_thread(machine, sample->pid,
+						 sample->tid);
 
 	if (thread == NULL)
 		return -1;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index d4050fcba851..f8bc2f67b515 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -434,6 +434,106 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return __machine__findnew_thread(machine, pid, tid, false);
 }
 
+static struct thread *__machine__findnew_thread_time(struct machine *machine,
+						     pid_t pid, pid_t tid,
+						     u64 timestamp, bool create)
+{
+	struct thread *curr, *pos, *new;
+	struct thread *th = NULL;
+	struct rb_node **p;
+	struct rb_node *parent = NULL;
+
+	curr = __machine__findnew_thread(machine, pid, tid, false);
+	if (curr && timestamp >= curr->start_time)
+		return curr;
+
+	p = &machine->dead_threads.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		th = rb_entry(parent, struct thread, rb_node);
+
+		if (th->tid == tid) {
+			list_for_each_entry(pos, &th->tid_node, tid_node) {
+				if (timestamp >= pos->start_time &&
+				    pos->start_time > th->start_time) {
+					th = pos;
+					break;
+				}
+			}
+
+			if (timestamp >= th->start_time) {
+				machine__update_thread_pid(machine, th, pid);
+				return th;
+			}
+			break;
+		}
+
+		if (tid < th->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	if (!create)
+		return NULL;
+
+	if (!curr && !*p)
+		return __machine__findnew_thread(machine, pid, tid, true);
+
+	new = thread__new(pid, tid);
+	if (new == NULL)
+		return NULL;
+
+	new->dead = true;
+	new->start_time = timestamp;
+
+	if (*p) {
+		list_for_each_entry(pos, &th->tid_node, tid_node) {
+			/* sort by time */
+			if (timestamp >= pos->start_time) {
+				th = pos;
+				break;
+			}
+		}
+		list_add_tail(&new->tid_node, &th->tid_node);
+	} else {
+		rb_link_node(&new->rb_node, parent, p);
+		rb_insert_color(&new->rb_node, &machine->dead_threads);
+	}
+
+	/*
+	 * We have to initialize map_groups separately
+	 * after rb tree is updated.
+	 *
+	 * The reason is that we call machine__findnew_thread
+	 * within thread__init_map_groups to find the thread
+	 * leader and that would screw up the rb tree.
+	 */
+	if (thread__init_map_groups(new, machine)) {
+		if (!list_empty(&new->tid_node))
+			list_del(&new->tid_node);
+		else
+			rb_erase(&new->rb_node, &machine->dead_threads);
+
+		thread__delete(new);
+		return NULL;
+	}
+
+	return new;
+}
+
+struct thread *machine__find_thread_time(struct machine *machine, pid_t pid,
+					 pid_t tid, u64 timestamp)
+{
+	return __machine__findnew_thread_time(machine, pid, tid, timestamp, false);
+}
+
+struct thread *machine__findnew_thread_time(struct machine *machine, pid_t pid,
+					    pid_t tid, u64 timestamp)
+{
+	return __machine__findnew_thread_time(machine, pid, tid, timestamp, true);
+}
+
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread)
 {
@@ -1172,7 +1272,7 @@ int machine__process_mmap2_event(struct machine *machine,
 	}
 
 	thread = machine__findnew_thread(machine, event->mmap2.pid,
-					event->mmap2.tid);
+					 event->mmap2.tid);
 	if (thread == NULL)
 		goto out_problem;
 
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 4349946a38ff..9571b6b1c5b5 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -68,8 +68,6 @@ static inline bool machine__kernel_ip(struct machine *machine, u64 ip)
 	return ip >= kernel_start;
 }
 
-struct thread *machine__find_thread(struct machine *machine, pid_t pid,
-				    pid_t tid);
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread);
 
@@ -149,6 +147,12 @@ static inline bool machine__is_host(struct machine *machine)
 
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
 				       pid_t tid);
+struct thread *machine__find_thread(struct machine *machine, pid_t pid,
+				    pid_t tid);
+struct thread *machine__findnew_thread_time(struct machine *machine, pid_t pid,
+					    pid_t tid, u64 timestamp);
+struct thread *machine__find_thread_time(struct machine *machine, pid_t pid,
+					 pid_t tid, u64 timestamp);
 
 size_t machine__fprintf(struct machine *machine, FILE *fp);
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index c9ae0e1599da..306bdaede019 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -127,6 +127,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 
 	/* Override the default :tid entry */
 	if (!thread->comm_set) {
+		if (!thread->start_time)
+			thread->start_time = timestamp;
+
 		err = comm__override(curr, str, timestamp, exec);
 		if (err)
 			return err;
@@ -228,6 +231,7 @@ int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
 	}
 
 	thread->ppid = parent->tid;
+	thread->start_time = timestamp;
 	return thread__clone_map_groups(thread, parent);
 }
 
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 21268e66b2ad..e5d7abd255ea 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -25,6 +25,7 @@ struct thread {
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
+	u64			start_time;
 
 	void			*priv;
 	struct thread_stack	*ts;
-- 
2.2.2


* [PATCH 23/42] perf tools: Add a test case for timed thread handling
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (21 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 22/42] perf tools: Introduce machine__find*_thread_time() Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 24/42] perf tools: Maintain map groups list in a leader thread Namhyung Kim
                   ` (19 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Add a test case that verifies live and dead thread tree management as
time changes, along with the new machine__find{,new}_thread_time()
functions.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Makefile.perf              |   1 +
 tools/perf/tests/builtin-test.c       |   4 +
 tools/perf/tests/tests.h              |   1 +
 tools/perf/tests/thread-lookup-time.c | 174 ++++++++++++++++++++++++++++++++++
 4 files changed, 180 insertions(+)
 create mode 100644 tools/perf/tests/thread-lookup-time.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 8507891db69d..8bb35ee91fd5 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -459,6 +459,7 @@ LIB_OBJS += $(OUTPUT)tests/mmap-thread-lookup.o
 LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
 LIB_OBJS += $(OUTPUT)tests/switch-tracking.o
 LIB_OBJS += $(OUTPUT)tests/thread-comm.o
+LIB_OBJS += $(OUTPUT)tests/thread-lookup-time.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 1b463d82a71a..e4d335de19ea 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -171,6 +171,10 @@ static struct test {
 		.func = test__thread_comm,
 	},
 	{
+		.desc = "Test thread lookup with time",
+		.func = test__thread_lookup_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 43ac17780629..1090337f63e5 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -52,6 +52,7 @@ int test__switch_tracking(void);
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__thread_comm(void);
+int test__thread_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
new file mode 100644
index 000000000000..6237ecf8caae
--- /dev/null
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -0,0 +1,174 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
+{
+	printf("thread: %d, start time: %"PRIu64" %s\n",
+	       th->tid, th->start_time, th->dead ? "(dead)" : "");
+	return 0;
+}
+
+static int lookup_with_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = 50000,
+	};
+
+	/* start_time is set to 0 */
+	t1 = machine__findnew_thread(machine, 0, 0);
+
+	if (verbose > 1) {
+		printf("========= after t1 created ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("wrong start time of old thread", t1->start_time == 0);
+
+	TEST_ASSERT_VAL("cannot find current thread",
+			machine__find_thread(machine, 0, 0) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread with time",
+			machine__findnew_thread_time(machine, 0, 0, 10000) == t1);
+
+	/* start_time is overwritten to new value */
+	thread__set_comm(t1, "/usr/bin/perf", 20000);
+
+	if (verbose > 1) {
+		printf("========= after t1 set comm ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("failed to update start time", t1->start_time == 20000);
+
+	TEST_ASSERT_VAL("should not find passed thread",
+			/* this will create yet another dead thread */
+			machine__findnew_thread_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find overwritten thread with time",
+			machine__find_thread_time(machine, 0, 0, 20000) == t1);
+
+	/* now t1 goes to dead thread tree, and create t2 */
+	machine__process_fork_event(machine, &fork, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t2 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t2 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t2->start_time == 50000);
+
+	TEST_ASSERT_VAL("dead thread cannot be found",
+			machine__find_thread_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__find_thread_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__find_thread_time(machine, 0, 0, 50000) == t2);
+
+	/* now t2 goes to dead thread tree, and create t3 */
+	sample.time = 60000;
+	machine__process_fork_event(machine, &fork, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t3 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t3->start_time == 60000);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 50000) == t2);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 70000) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+static int lookup_without_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = -1ULL,
+	};
+
+	t1 = machine__findnew_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t1 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t1);
+
+	machine__process_fork_event(machine, &fork, &sample);
+
+	t2 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__find_thread_time(machine, 0, 0, -1ULL) == t2);
+
+	machine__process_fork_event(machine, &fork, &sample);
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+int test__thread_lookup_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * thread for a given time.  When data file indexing is
+	 * enabled, those task/comm/mmap events are processed first so
+	 * the later sample should find a matching thread properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	if (lookup_with_timestamp(machine) < 0)
+		return -1;
+
+	if (lookup_without_timestamp(machine) < 0)
+		return -1;
+
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2


* [PATCH 24/42] perf tools: Maintain map groups list in a leader thread
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (22 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 23/42] perf tools: Add a test case for timed thread handling Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 25/42] perf tools: Introduce thread__find_addr_location_time() and friends Namhyung Kim
                   ` (18 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

To support multi-threaded perf report, we need to maintain time-sorted
map groups.  Add a ->mg_list member to struct thread and sort the list
by time.  Leader threads now hold one more refcnt for the map groups in
the list, so also update the thread-mg-share test case.

Currently a new map groups is only added when an exec (comm) event is
received.
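
The ordering rule can be shown with a small standalone sketch.  The
names here (struct toy_mg, toy_mg_insert(), toy_mg_at()) are
hypothetical and a singly linked list replaces list_head; refcounting is
left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct map_groups: only the time ordering. */
struct toy_mg {
	uint64_t timestamp;	/* time at which these maps became valid */
	struct toy_mg *next;	/* next older map groups */
};

/* Keep the list newest first, like the "sort by time" loop in the patch. */
static void toy_mg_insert(struct toy_mg **head, struct toy_mg *mg)
{
	struct toy_mg **pp = head;

	assert(mg != NULL);

	/* stop at the first entry that is strictly older than the new one */
	while (*pp && (*pp)->timestamp >= mg->timestamp)
		pp = &(*pp)->next;

	mg->next = *pp;
	*pp = mg;
}

/* Return the map groups that were in effect at @timestamp. */
static struct toy_mg *toy_mg_at(struct toy_mg *head, uint64_t timestamp)
{
	struct toy_mg *mg;

	for (mg = head; mg; mg = mg->next)
		if (timestamp >= mg->timestamp)
			return mg;

	/* nothing old enough: fall back to the current (newest) one */
	return head;
}
```

Since the head is always the newest entry, it plays the role of the
current thread->mg, and a lookup for an older sample walks towards the
map groups that existed before each exec.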

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/thread-mg-share.c |  7 +++-
 tools/perf/util/event.c            |  2 +
 tools/perf/util/machine.c          |  2 +-
 tools/perf/util/map.c              |  1 +
 tools/perf/util/map.h              |  2 +
 tools/perf/util/thread.c           | 80 +++++++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h           |  3 ++
 7 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/thread-mg-share.c b/tools/perf/tests/thread-mg-share.c
index b028499dd3cf..8933e01d0549 100644
--- a/tools/perf/tests/thread-mg-share.c
+++ b/tools/perf/tests/thread-mg-share.c
@@ -23,6 +23,9 @@ int test__thread_mg_share(void)
 	 * with several threads and checks they properly share and
 	 * maintain map groups info (struct map_groups).
 	 *
+	 * Note that a leader thread has one more refcnt for its
+	 * (current) map groups.
+	 *
 	 * thread group (pid: 0, tids: 0, 1, 2, 3)
 	 * other  group (pid: 4, tids: 4, 5)
 	*/
@@ -43,7 +46,7 @@ int test__thread_mg_share(void)
 			leader && t1 && t2 && t3 && other);
 
 	mg = leader->mg;
-	TEST_ASSERT_VAL("wrong refcnt", mg->refcnt == 4);
+	TEST_ASSERT_VAL("wrong refcnt", mg->refcnt == 5);
 
 	/* test the map groups pointer is shared */
 	TEST_ASSERT_VAL("map groups don't match", mg == t1->mg);
@@ -59,7 +62,7 @@ int test__thread_mg_share(void)
 	TEST_ASSERT_VAL("failed to find other leader", other_leader);
 
 	other_mg = other->mg;
-	TEST_ASSERT_VAL("wrong refcnt", other_mg->refcnt == 2);
+	TEST_ASSERT_VAL("wrong refcnt", other_mg->refcnt == 3);
 
 	TEST_ASSERT_VAL("map groups don't match", other_mg == other_leader->mg);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 8b9fe0a908e8..1558a7085c7f 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -751,6 +751,8 @@ void thread__find_addr_map(struct thread *thread, u8 cpumode,
 		return;
 	}
 
+	BUG_ON(mg == NULL);
+
 	if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
 		al->level = 'k';
 		mg = &machine->kmaps;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f8bc2f67b515..09c2edccccd9 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -331,7 +331,7 @@ static void machine__update_thread_pid(struct machine *machine,
 		goto out_err;
 
 	if (!leader->mg)
-		leader->mg = map_groups__new(machine);
+		thread__set_map_groups(leader, map_groups__new(machine), 0);
 
 	if (!leader->mg)
 		goto out_err;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 62ca9f2607d5..f0c1e2a24fee 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -422,6 +422,7 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
 	}
 	mg->machine = machine;
 	mg->refcnt = 1;
+	mg->timestamp = 0;
 }
 
 static void maps__delete(struct rb_root *maps)
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e42438b1e59..f33d49029ac0 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -61,7 +61,9 @@ struct map_groups {
 	struct rb_root	 maps[MAP__NR_TYPES];
 	struct list_head removed_maps[MAP__NR_TYPES];
 	struct machine	 *machine;
+	u64		 timestamp;
 	int		 refcnt;
+	struct list_head list;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 306bdaede019..895c74683c81 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -10,13 +10,64 @@
 #include "comm.h"
 #include "unwind.h"
 
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp)
+{
+	struct map_groups *mg;
+
+	list_for_each_entry(mg, &thread->mg_list, list)
+		if (timestamp >= mg->timestamp)
+			return mg;
+
+	return thread->mg;
+}
+
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg,
+			   u64 timestamp)
+{
+	struct list_head *pos;
+	struct map_groups *old;
+
+	if (mg == NULL)
+		return -ENOMEM;
+
+	/*
+	 * Only a leader thread can have map groups list - others
+	 * reference it through map_groups__get.  This means the
+	 * leader thread will have one more refcnt than others.
+	 */
+	if (thread->tid != thread->pid_)
+		return -EINVAL;
+
+	if (thread->mg) {
+		BUG_ON(thread->mg->refcnt <= 1);
+		map_groups__put(thread->mg);
+	}
+
+	/* sort by time */
+	list_for_each(pos, &thread->mg_list) {
+		old = list_entry(pos, struct map_groups, list);
+		if (timestamp > old->timestamp)
+			break;
+	}
+
+	list_add_tail(&mg->list, pos);
+	mg->timestamp = timestamp;
+
+	/* set current ->mg to most recent one */
+	thread->mg = list_first_entry(&thread->mg_list, struct map_groups, list);
+	/* increase one more refcnt for current */
+	map_groups__get(thread->mg);
+
+	return 0;
+}
+
 int thread__init_map_groups(struct thread *thread, struct machine *machine)
 {
 	struct thread *leader;
 	pid_t pid = thread->pid_;
 
 	if (pid == thread->tid || pid == -1) {
-		thread->mg = map_groups__new(machine);
+		thread__set_map_groups(thread, map_groups__new(machine), 0);
 	} else {
 		leader = machine__findnew_thread(machine, pid, pid);
 		if (leader)
@@ -39,6 +90,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		thread->cpu = -1;
 		INIT_LIST_HEAD(&thread->comm_list);
 		INIT_LIST_HEAD(&thread->tid_node);
+		INIT_LIST_HEAD(&thread->mg_list);
 
 		if (unwind__prepare_access(thread) < 0)
 			goto err_thread;
@@ -67,6 +119,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 void thread__delete(struct thread *thread)
 {
 	struct comm *comm, *tmp;
+	struct map_groups *mg, *tmp_mg;
 
 	thread_stack__free(thread);
 
@@ -74,6 +127,11 @@ void thread__delete(struct thread *thread)
 		map_groups__put(thread->mg);
 		thread->mg = NULL;
 	}
+	/* only leader threads have mg list */
+	list_for_each_entry_safe(mg, tmp_mg, &thread->mg_list, list) {
+		list_del(&mg->list);
+		map_groups__put(mg);
+	}
 	list_for_each_entry_safe(comm, tmp, &thread->comm_list, list) {
 		list_del(&comm->list);
 		comm__free(comm);
@@ -149,6 +207,26 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 			unwind__flush_access(thread);
 	}
 
+	if (exec) {
+		struct machine *machine;
+
+		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
+
+		if (thread->tid != thread->pid_) {
+			/* now it'll be a new leader */
+			thread->pid_ = thread->tid;
+
+			/* current mg of leader thread needs one more refcnt */
+			map_groups__get(thread->mg);
+
+			thread__set_map_groups(thread, thread->mg,
+					       thread->mg->timestamp);
+		}
+
+		machine = thread->mg->machine;
+		thread__set_map_groups(thread, map_groups__new(machine), timestamp);
+	}
+
 	thread->comm_set = true;
 
 	return 0;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index e5d7abd255ea..08cafa2d97f9 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -14,6 +14,7 @@ struct thread {
 	struct rb_node	 	rb_node;
 	struct list_head 	tid_node;
 	struct map_groups	*mg;
+	struct list_head	mg_list;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
 	pid_t			ppid;
@@ -56,6 +57,8 @@ struct comm *thread__exec_comm(const struct thread *thread);
 struct comm *thread__comm_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
 const char *thread__comm_str_time(const struct thread *thread, u64 timestamp);
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp);
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.2.2



* [PATCH 25/42] perf tools: Introduce thread__find_addr_location_time() and friends
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (23 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 24/42] perf tools: Maintain map groups list in a leader thread Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 26/42] perf tools: Add a test case for timed map groups handling Namhyung Kim
                   ` (17 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The *_time() variants find the appropriate map (and symbol) at the
given time.  They rely on the map_groups list being kept sorted by
time, as introduced in the previous patch.
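
For illustration only, the time-based lookup can be sketched as below.
This is not the code from the patch: struct mg_sketch and
mg_sketch__find() are hypothetical names, and the sketch assumes the
list is ordered newest first, which the real thread__get_map_groups()
may or may not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct map_groups; only what the sketch needs. */
struct mg_sketch {
	uint64_t timestamp;      /* time at which this address space became current */
	struct mg_sketch *older; /* next entry, carrying an older timestamp */
};

/*
 * Return the newest entry whose timestamp is not later than @when,
 * or NULL if @when predates every recorded entry.
 * Assumption of this sketch: the list is sorted newest first.
 */
static struct mg_sketch *mg_sketch__find(struct mg_sketch *newest, uint64_t when)
{
	struct mg_sketch *mg;

	for (mg = newest; mg != NULL; mg = mg->older) {
		/* the lookup is only valid on a time-sorted list */
		assert(mg->older == NULL ||
		       mg->older->timestamp <= mg->timestamp);

		if (when >= mg->timestamp)
			return mg;
	}
	return NULL;
}
```

With an entry created at time 0 and a second one created by an exec at
time 10000, a sample at time 5000 resolves against the older entry
while a sample at 10000 or later resolves against the newer one.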

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/event.c            | 80 +++++++++++++++++++++++++++++++++-----
 tools/perf/util/machine.c          | 51 ++++++++++++++----------
 tools/perf/util/thread.c           | 21 ++++++++++
 tools/perf/util/thread.h           | 10 +++++
 tools/perf/util/unwind-libdw.c     | 11 +++---
 tools/perf/util/unwind-libunwind.c | 18 ++++-----
 6 files changed, 146 insertions(+), 45 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 1558a7085c7f..e7152a6e3043 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -732,16 +732,14 @@ int perf_event__process(struct perf_tool *tool __maybe_unused,
 	return machine__process_event(machine, event, sample);
 }
 
-void thread__find_addr_map(struct thread *thread, u8 cpumode,
-			   enum map_type type, u64 addr,
-			   struct addr_location *al)
+static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
+				      enum map_type type, u64 addr,
+				      struct addr_location *al)
 {
-	struct map_groups *mg = thread->mg;
 	struct machine *machine = mg->machine;
 	bool load_map = false;
 
 	al->machine = machine;
-	al->thread = thread;
 	al->addr = addr;
 	al->cpumode = cpumode;
 	al->filtered = 0;
@@ -810,6 +808,36 @@ void thread__find_addr_map(struct thread *thread, u8 cpumode,
 	}
 }
 
+void thread__find_addr_map(struct thread *thread, u8 cpumode,
+			   enum map_type type, u64 addr,
+			   struct addr_location *al)
+{
+	al->thread = thread;
+	map_groups__find_addr_map(thread->mg, cpumode, type, addr, al);
+}
+
+void thread__find_addr_map_time(struct thread *thread, u8 cpumode,
+				enum map_type type, u64 addr,
+				struct addr_location *al, u64 timestamp)
+{
+	struct map_groups *mg;
+	struct thread *leader;
+
+	if (thread->tid == thread->pid_)
+		leader = thread;
+	else
+		leader = machine__findnew_thread_time(thread->mg->machine,
+						      thread->pid_,
+						      thread->pid_,
+						      timestamp);
+	BUG_ON(leader == NULL);
+
+	mg = thread__get_map_groups(leader, timestamp);
+
+	al->thread = thread;
+	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+}
+
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al)
@@ -822,6 +850,21 @@ void thread__find_addr_location(struct thread *thread,
 		al->sym = NULL;
 }
 
+void thread__find_addr_location_time(struct thread *thread, u8 cpumode,
+				     enum map_type type, u64 addr,
+				     struct addr_location *al, u64 timestamp)
+{
+	struct map_groups *mg;
+
+	mg = thread__get_map_groups(thread, timestamp);
+	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+	if (al->map != NULL)
+		al->sym = map__find_symbol(al->map, al->addr,
+					   mg->machine->symbol_filter);
+	else
+		al->sym = NULL;
+}
+
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
@@ -854,7 +897,13 @@ int perf_event__preprocess_sample(const union perf_event *event,
 	    machine->vmlinux_maps[MAP__FUNCTION] == NULL)
 		machine__create_kernel_maps(machine);
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->ip, al);
+	if (session && perf_session__has_index(session))
+		thread__find_addr_map_time(thread, cpumode, MAP__FUNCTION,
+					   sample->ip, al, sample->time);
+	else
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION,
+				      sample->ip, al);
+
 	dump_printf(" ...... dso: %s\n",
 		    al->map ? al->map->dso->long_name :
 			al->level == 'H' ? "[hypervisor]" : "<not found>");
@@ -915,15 +964,26 @@ void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
 					struct addr_location *al,
-					struct perf_session *session __maybe_unused)
+					struct perf_session *session)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->addr, al);
-	if (!al->map)
-		thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
+	if (session && perf_session__has_index(session))
+		thread__find_addr_map_time(thread, cpumode, MAP__FUNCTION,
+					   sample->addr, al, sample->time);
+	else
+		thread__find_addr_map(thread, cpumode, MAP__FUNCTION,
 				      sample->addr, al);
 
+	if (!al->map) {
+		if (session && perf_session__has_index(session))
+			thread__find_addr_map_time(thread, cpumode, MAP__VARIABLE,
+						   sample->addr, al, sample->time);
+		else
+			thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
+					      sample->addr, al);
+	}
+
 	al->cpu = sample->cpu;
 	al->sym = NULL;
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 09c2edccccd9..7dc044b93cf8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1471,7 +1471,7 @@ static bool symbol__match_regex(struct symbol *sym, regex_t *regex)
 
 static void ip__resolve_ams(struct thread *thread,
 			    struct addr_map_symbol *ams,
-			    u64 ip)
+			    u64 ip, u64 timestamp)
 {
 	struct addr_location al;
 
@@ -1483,7 +1483,8 @@ static void ip__resolve_ams(struct thread *thread,
 	 * Thus, we have to try consecutively until we find a match
 	 * or else, the symbol is unknown
 	 */
-	thread__find_cpumode_addr_location(thread, MAP__FUNCTION, ip, &al);
+	thread__find_cpumode_addr_location_time(thread, MAP__FUNCTION, ip, &al,
+						timestamp);
 
 	ams->addr = ip;
 	ams->al_addr = al.addr;
@@ -1491,21 +1492,24 @@ static void ip__resolve_ams(struct thread *thread,
 	ams->map = al.map;
 }
 
-static void ip__resolve_data(struct thread *thread,
-			     u8 m, struct addr_map_symbol *ams, u64 addr)
+static void ip__resolve_data(struct thread *thread, u8 m,
+			     struct addr_map_symbol *ams,
+			     u64 addr, u64 timestamp)
 {
 	struct addr_location al;
 
 	memset(&al, 0, sizeof(al));
 
-	thread__find_addr_location(thread, m, MAP__VARIABLE, addr, &al);
+	thread__find_addr_location_time(thread, m, MAP__VARIABLE, addr,
+					&al, timestamp);
 	if (al.map == NULL) {
 		/*
 		 * some shared data regions have execute bit set which puts
 		 * their mapping in the MAP__FUNCTION type array.
 		 * Check there as a fallback option before dropping the sample.
 		 */
-		thread__find_addr_location(thread, m, MAP__FUNCTION, addr, &al);
+		thread__find_addr_location_time(thread, m, MAP__FUNCTION, addr,
+						&al, timestamp);
 	}
 
 	ams->addr = addr;
@@ -1522,8 +1526,9 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 	if (!mi)
 		return NULL;
 
-	ip__resolve_ams(al->thread, &mi->iaddr, sample->ip);
-	ip__resolve_data(al->thread, al->cpumode, &mi->daddr, sample->addr);
+	ip__resolve_ams(al->thread, &mi->iaddr, sample->ip, sample->time);
+	ip__resolve_data(al->thread, al->cpumode, &mi->daddr, sample->addr,
+			 sample->time);
 	mi->data_src.val = sample->data_src;
 
 	return mi;
@@ -1533,15 +1538,16 @@ static int add_callchain_ip(struct thread *thread,
 			    struct symbol **parent,
 			    struct addr_location *root_al,
 			    bool branch_history,
-			    u64 ip)
+			    u64 ip, u64 timestamp)
 {
 	struct addr_location al;
 
 	al.filtered = 0;
 	al.sym = NULL;
+
 	if (branch_history)
-		thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
-						   ip, &al);
+		thread__find_cpumode_addr_location_time(thread, MAP__FUNCTION,
+							ip, &al, timestamp);
 	else {
 		u8 cpumode = PERF_RECORD_MISC_USER;
 
@@ -1568,8 +1574,8 @@ static int add_callchain_ip(struct thread *thread,
 			}
 			return 0;
 		}
-		thread__find_addr_location(thread, cpumode, MAP__FUNCTION,
-				   ip, &al);
+		thread__find_addr_location_time(thread, cpumode, MAP__FUNCTION,
+						ip, &al, timestamp);
 	}
 
 	if (al.sym != NULL) {
@@ -1599,8 +1605,10 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 		return NULL;
 
 	for (i = 0; i < bs->nr; i++) {
-		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to);
-		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from);
+		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to,
+				sample->time);
+		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from,
+				sample->time);
 		bi[i].flags = bs->entries[i].flags;
 	}
 	return bi;
@@ -1652,7 +1660,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 					     struct branch_stack *branch,
 					     struct symbol **parent,
 					     struct addr_location *root_al,
-					     int max_stack)
+					     int max_stack, u64 timestamp)
 {
 	int chain_nr = min(max_stack, (int)chain->nr);
 	int i, j, err;
@@ -1713,10 +1721,10 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 
 		for (i = 0; i < nr; i++) {
 			err = add_callchain_ip(thread, parent, root_al,
-					       true, be[i].to);
+					       true, be[i].to, timestamp);
 			if (!err)
 				err = add_callchain_ip(thread, parent, root_al,
-						       true, be[i].from);
+						true, be[i].from, timestamp);
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -1745,8 +1753,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 #endif
 		ip = chain->ips[j];
 
-		err = add_callchain_ip(thread, parent, root_al, false, ip);
-
+		err = add_callchain_ip(thread, parent, root_al, false, ip,
+				       timestamp);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -1770,7 +1778,8 @@ int thread__resolve_callchain(struct thread *thread,
 {
 	int ret = thread__resolve_callchain_sample(thread, sample->callchain,
 						   sample->branch_stack,
-						   parent, root_al, max_stack);
+						   parent, root_al, max_stack,
+						   sample->time);
 	if (ret)
 		return ret;
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 895c74683c81..293157dafd2c 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -331,3 +331,24 @@ void thread__find_cpumode_addr_location(struct thread *thread,
 			break;
 	}
 }
+
+void thread__find_cpumode_addr_location_time(struct thread *thread,
+					     enum map_type type, u64 addr,
+					     struct addr_location *al,
+					     u64 timestamp)
+{
+	size_t i;
+	const u8 cpumodes[] = {
+		PERF_RECORD_MISC_USER,
+		PERF_RECORD_MISC_KERNEL,
+		PERF_RECORD_MISC_GUEST_USER,
+		PERF_RECORD_MISC_GUEST_KERNEL
+	};
+
+	for (i = 0; i < ARRAY_SIZE(cpumodes); i++) {
+		thread__find_addr_location_time(thread, cpumodes[i], type,
+						addr, al, timestamp);
+		if (al->map)
+			break;
+	}
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 08cafa2d97f9..5209ad5adadf 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -66,14 +66,24 @@ size_t thread__fprintf(struct thread *thread, FILE *fp);
 void thread__find_addr_map(struct thread *thread,
 			   u8 cpumode, enum map_type type, u64 addr,
 			   struct addr_location *al);
+void thread__find_addr_map_time(struct thread *thread, u8 cpumode,
+				enum map_type type, u64 addr,
+				struct addr_location *al, u64 timestamp);
 
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al);
+void thread__find_addr_location_time(struct thread *thread, u8 cpumode,
+				     enum map_type type, u64 addr,
+				     struct addr_location *al, u64 timestamp);
 
 void thread__find_cpumode_addr_location(struct thread *thread,
 					enum map_type type, u64 addr,
 					struct addr_location *al);
+void thread__find_cpumode_addr_location_time(struct thread *thread,
+					     enum map_type type, u64 addr,
+					     struct addr_location *al,
+					     u64 timestamp);
 
 static inline void *thread__priv(struct thread *thread)
 {
diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index 2dcfe9a7c8d0..ba8d8e41d680 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -26,9 +26,10 @@ static int __report_module(struct addr_location *al, u64 ip,
 	Dwfl_Module *mod;
 	struct dso *dso = NULL;
 
-	thread__find_addr_location(ui->thread,
-				   PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, al);
+	thread__find_addr_location_time(ui->thread,
+					PERF_RECORD_MISC_USER,
+					MAP__FUNCTION, ip, al,
+					ui->sample->time);
 
 	if (al->map)
 		dso = al->map->dso;
@@ -89,8 +90,8 @@ static int access_dso_mem(struct unwind_info *ui, Dwarf_Addr addr,
 	struct addr_location al;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
+	thread__find_addr_map_time(ui->thread, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, addr, &al, ui->sample->time);
 	if (!al.map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 6edf535f65c2..7ed6eaf232b6 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -306,8 +306,8 @@ static struct map *find_map(unw_word_t ip, struct unwind_info *ui)
 {
 	struct addr_location al;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, ip, &al);
+	thread__find_addr_map_time(ui->thread, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, ip, &al, ui->sample->time);
 	return al.map;
 }
 
@@ -400,8 +400,8 @@ static int access_dso_mem(struct unwind_info *ui, unw_word_t addr,
 	struct addr_location al;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
+	thread__find_addr_map_time(ui->thread, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, addr, &al, ui->sample->time);
 	if (!al.map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
@@ -502,14 +502,14 @@ static void put_unwind_info(unw_addr_space_t __maybe_unused as,
 	pr_debug("unwind: put_unwind_info called\n");
 }
 
-static int entry(u64 ip, struct thread *thread,
+static int entry(u64 ip, struct thread *thread, u64 timestamp,
 		 unwind_entry_cb_t cb, void *arg)
 {
 	struct unwind_entry e;
 	struct addr_location al;
 
-	thread__find_addr_location(thread, PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, &al);
+	thread__find_addr_location_time(thread, PERF_RECORD_MISC_USER,
+					MAP__FUNCTION, ip, &al, timestamp);
 
 	e.ip = ip;
 	e.map = al.map;
@@ -611,7 +611,7 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 		unw_word_t ip;
 
 		unw_get_reg(&c, UNW_REG_IP, &ip);
-		ret = ip ? entry(ip, ui->thread, cb, arg) : 0;
+		ret = ip ? entry(ip, ui->thread, ui->sample->time, cb, arg) : 0;
 	}
 
 	return ret;
@@ -636,7 +636,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 	if (ret)
 		return ret;
 
-	ret = entry(ip, thread, cb, arg);
+	ret = entry(ip, thread, data->time, cb, arg);
 	if (ret)
 		return -ENOMEM;
 
-- 
2.2.2



* [PATCH 26/42] perf tools: Add a test case for timed map groups handling
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (24 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 25/42] perf tools: Introduce thread__find_addr_location_time() and friends Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex Namhyung Kim
                   ` (16 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Add a test case that verifies thread->mg and ->mg_list handling across
an exec (which creates a new map groups) and exercises the new
thread__find_addr_map_time() and friends.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Makefile.perf          |  1 +
 tools/perf/tests/builtin-test.c   |  4 ++
 tools/perf/tests/tests.h          |  1 +
 tools/perf/tests/thread-mg-time.c | 88 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 94 insertions(+)
 create mode 100644 tools/perf/tests/thread-mg-time.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 8bb35ee91fd5..2f8c8b918cac 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -460,6 +460,7 @@ LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
 LIB_OBJS += $(OUTPUT)tests/switch-tracking.o
 LIB_OBJS += $(OUTPUT)tests/thread-comm.o
 LIB_OBJS += $(OUTPUT)tests/thread-lookup-time.o
+LIB_OBJS += $(OUTPUT)tests/thread-mg-time.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index e4d335de19ea..8f61a7e291ee 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -175,6 +175,10 @@ static struct test {
 		.func = test__thread_lookup_time,
 	},
 	{
+		.desc = "Test thread map group handling with time",
+		.func = test__thread_mg_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 1090337f63e5..03557563f31d 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -53,6 +53,7 @@ int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
+int test__thread_mg_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-mg-time.c b/tools/perf/tests/thread-mg-time.c
new file mode 100644
index 000000000000..69fd13752c1d
--- /dev/null
+++ b/tools/perf/tests/thread-mg-time.c
@@ -0,0 +1,88 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+#define PERF_MAP_START  0x40000
+
+int test__thread_mg_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+	struct map_groups *mg;
+	struct map *map;
+	struct addr_location al = { .map = NULL, };
+
+	/*
+	 * This test is to check whether it can retrieve a correct map
+	 * for a given time.  When multi-file data storage is enabled,
+	 * those task/comm/mmap events are processed first so the
+	 * later sample should find a matching map properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 0, 0);
+	mg = t->mg;
+
+	map = dso__new_map("/usr/bin/perf");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x1000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	thread__find_addr_map_time(t, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, PERF_MAP_START, &al, -1ULL);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map == map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+
+	pr_debug("simulate EXEC event (generate new mg)\n");
+	__thread__set_comm(t, "perf-test", 10000, true);
+
+	map = dso__new_map("/usr/bin/perf-test");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x2000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START + 4, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("invalid mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	pr_debug("searching map in the old map groups\n");
+	thread__find_addr_map_time(t, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, PERF_MAP_START, &al, 5000);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map != map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != t->mg);
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2



* [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (25 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 26/42] perf tools: Add a test case for timed map groups handling Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29 12:34   ` Arnaldo Carvalho de Melo
  2015-01-29  8:07 ` [PATCH 28/42] perf tools: Protect dso cache tree using dso->lock Namhyung Kim
                   ` (15 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When multi-thread support for perf report is enabled, a dso can be
accessed concurrently.  Add a pthread mutex to struct dso to serialize
concurrent dso__load() calls; the loaded state is checked again under
the lock.
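
The "check again under the lock" pattern can be sketched in isolation
as follows.  This is only an illustration under assumptions: struct
dso_sketch and its helpers are hypothetical names, and the expensive
symbol loading is reduced to a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dso. */
struct dso_sketch {
	pthread_mutex_t lock;
	bool loaded;
	int nr_loads;	/* how many times the expensive load really ran */
};

static void dso_sketch__init(struct dso_sketch *dso)
{
	pthread_mutex_init(&dso->lock, NULL);
	dso->loaded = false;
	dso->nr_loads = 0;
}

/* Returns 1 if the dso was already loaded, 0 if this call loaded it. */
static int dso_sketch__load(struct dso_sketch *dso)
{
	int ret = 0;

	pthread_mutex_lock(&dso->lock);

	/* another thread may have finished the load while we waited */
	if (dso->loaded) {
		ret = 1;
		goto out;
	}

	dso->nr_loads++;	/* placeholder for the real symbol loading */
	dso->loaded = true;
out:
	pthread_mutex_unlock(&dso->lock);
	return ret;
}
```

Every exit path goes through the single unlock at "out", which is the
same reason the patch converts the early returns in dso__load() into
gotos.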

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c    |  2 ++
 tools/perf/util/dso.h    |  1 +
 tools/perf/util/symbol.c | 34 ++++++++++++++++++++++++----------
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 45be944d450a..3da75816b8f8 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -888,6 +888,7 @@ struct dso *dso__new(const char *name)
 		RB_CLEAR_NODE(&dso->rb_node);
 		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
+		pthread_mutex_init(&dso->lock, NULL);
 	}
 
 	return dso;
@@ -917,6 +918,7 @@ void dso__delete(struct dso *dso)
 	dso_cache__free(&dso->data.cache);
 	dso__free_a2l(dso);
 	zfree(&dso->symsrc_filename);
+	pthread_mutex_destroy(&dso->lock);
 	free(dso);
 }
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3782c82c6e44..ac753594a469 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -102,6 +102,7 @@ struct dsos {
 };
 
 struct dso {
+	pthread_mutex_t	 lock;
 	struct list_head node;
 	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
 	struct rb_root	 symbols[MAP__NR_TYPES];
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a69066865a55..714e20c99354 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1357,12 +1357,22 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	struct symsrc *syms_ss = NULL, *runtime_ss = NULL;
 	bool kmod;
 
-	dso__set_loaded(dso, map->type);
+	pthread_mutex_lock(&dso->lock);
+
+	/* check again under the dso->lock */
+	if (dso__loaded(dso, map->type)) {
+		ret = 1;
+		goto out;
+	}
+
+	if (dso->kernel) {
+		if (dso->kernel == DSO_TYPE_KERNEL)
+			ret = dso__load_kernel_sym(dso, map, filter);
+		else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
+			ret = dso__load_guest_kernel_sym(dso, map, filter);
 
-	if (dso->kernel == DSO_TYPE_KERNEL)
-		return dso__load_kernel_sym(dso, map, filter);
-	else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
-		return dso__load_guest_kernel_sym(dso, map, filter);
+		goto out;
+	}
 
 	if (map->groups && map->groups->machine)
 		machine = map->groups->machine;
@@ -1375,18 +1385,18 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 		struct stat st;
 
 		if (lstat(dso->name, &st) < 0)
-			return -1;
+			goto out;
 
 		if (st.st_uid && (st.st_uid != geteuid())) {
 			pr_warning("File %s not owned by current user or root, "
 				"ignoring it.\n", dso->name);
-			return -1;
+			goto out;
 		}
 
 		ret = dso__load_perf_map(dso, map, filter);
 		dso->symtab_type = ret > 0 ? DSO_BINARY_TYPE__JAVA_JIT :
 					     DSO_BINARY_TYPE__NOT_FOUND;
-		return ret;
+		goto out;
 	}
 
 	if (machine)
@@ -1394,7 +1404,7 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 
 	name = malloc(PATH_MAX);
 	if (!name)
-		return -1;
+		goto out;
 
 	kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
 		dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
@@ -1475,7 +1485,11 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 out_free:
 	free(name);
 	if (ret < 0 && strstr(dso->name, " (deleted)") != NULL)
-		return 0;
+		ret = 0;
+out:
+	dso__set_loaded(dso, map->type);
+	pthread_mutex_unlock(&dso->lock);
+
 	return ret;
 }
 
-- 
2.2.2



* [PATCH 28/42] perf tools: Protect dso cache tree using dso->lock
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (26 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 29/42] perf tools: Protect dso cache fd with a mutex Namhyung Kim
                   ` (14 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The dso cache is accessed during dwarf callchain unwind and it might
be processed concurrently when multi-thread report is enabled.
Protect it under dso->lock.

Note that it doesn't protect dso_cache__find().  I think it's safe to
access the cache tree without the lock since we don't delete nodes.
If it misses an existing node due to rotation, it'll find it during
dso_cache__insert() anyway.
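
A minimal sketch of the "insert returns the existing entry when we lose
the race" idea follows.  It deliberately swaps the rbtree for a plain
linked list to stay short, and all names (chunk_sketch, cache_sketch
and the helpers) are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dso_cache. */
struct chunk_sketch {
	uint64_t offset, size;
	struct chunk_sketch *next;
};

struct cache_sketch {
	pthread_mutex_t lock;
	struct chunk_sketch *head;	/* a list here; the patch uses an rbtree */
};

static void cache_sketch__init(struct cache_sketch *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->head = NULL;
}

/*
 * Returns NULL if @new was linked in, or the already present chunk
 * covering new->offset if another inserter got there first.
 */
static struct chunk_sketch *
cache_sketch__insert(struct cache_sketch *c, struct chunk_sketch *new)
{
	struct chunk_sketch *cur, *old = NULL;

	pthread_mutex_lock(&c->lock);
	for (cur = c->head; cur != NULL; cur = cur->next) {
		if (new->offset >= cur->offset &&
		    new->offset < cur->offset + cur->size) {
			old = cur;
			goto out;
		}
	}
	new->next = c->head;
	c->head = new;
out:
	pthread_mutex_unlock(&c->lock);
	return old;
}

/* Caller side: drop our copy and use the winner's chunk on a lost race. */
static struct chunk_sketch *
cache_sketch__add(struct cache_sketch *c, uint64_t offset, uint64_t size)
{
	struct chunk_sketch *new = malloc(sizeof(*new));
	struct chunk_sketch *old;

	if (new == NULL)
		return NULL;

	new->offset = offset;
	new->size = size;
	new->next = NULL;

	old = cache_sketch__insert(c, new);
	if (old != NULL) {
		free(new);
		return old;
	}
	return new;
}
```

The duplicate check happens inside the insert, under the lock, so a
lookup that raced and missed only costs a redundant allocation.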

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 3da75816b8f8..11ece224ef50 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -443,10 +443,12 @@ bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 }
 
 static void
-dso_cache__free(struct rb_root *root)
+dso_cache__free(struct dso *dso)
 {
+	struct rb_root *root = &dso->data.cache;
 	struct rb_node *next = rb_first(root);
 
+	pthread_mutex_lock(&dso->lock);
 	while (next) {
 		struct dso_cache *cache;
 
@@ -455,10 +457,12 @@ dso_cache__free(struct rb_root *root)
 		rb_erase(&cache->rb_node, root);
 		free(cache);
 	}
+	pthread_mutex_unlock(&dso->lock);
 }
 
-static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
+static struct dso_cache *dso_cache__find(struct dso *dso, u64 offset)
 {
+	const struct rb_root *root = &dso->data.cache;
 	struct rb_node * const *p = &root->rb_node;
 	const struct rb_node *parent = NULL;
 	struct dso_cache *cache;
@@ -477,17 +481,20 @@ static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
 		else
 			return cache;
 	}
+
 	return NULL;
 }
 
-static void
-dso_cache__insert(struct rb_root *root, struct dso_cache *new)
+static struct dso_cache *
+dso_cache__insert(struct dso *dso, struct dso_cache *new)
 {
+	struct rb_root *root = &dso->data.cache;
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
 	struct dso_cache *cache;
 	u64 offset = new->offset;
 
+	pthread_mutex_lock(&dso->lock);
 	while (*p != NULL) {
 		u64 end;
 
@@ -499,10 +506,17 @@ dso_cache__insert(struct rb_root *root, struct dso_cache *new)
 			p = &(*p)->rb_left;
 		else if (offset >= end)
 			p = &(*p)->rb_right;
+		else
+			goto out;
 	}
 
 	rb_link_node(&new->rb_node, parent, p);
 	rb_insert_color(&new->rb_node, root);
+
+	cache = NULL;
+out:
+	pthread_mutex_unlock(&dso->lock);
+	return cache;
 }
 
 static ssize_t
@@ -520,6 +534,7 @@ static ssize_t
 dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
+	struct dso_cache *old;
 	ssize_t ret;
 
 	do {
@@ -543,7 +558,12 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 
 		cache->offset = cache_offset;
 		cache->size   = ret;
-		dso_cache__insert(&dso->data.cache, cache);
+		old = dso_cache__insert(dso, cache);
+		if (old) {
+			/* we lost the race */
+			free(cache);
+			cache = old;
+		}
 
 		ret = dso_cache__memcpy(cache, offset, data, size);
 
@@ -560,7 +580,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
 {
 	struct dso_cache *cache;
 
-	cache = dso_cache__find(&dso->data.cache, offset);
+	cache = dso_cache__find(dso, offset);
 	if (cache)
 		return dso_cache__memcpy(cache, offset, data, size);
 	else
@@ -915,7 +935,7 @@ void dso__delete(struct dso *dso)
 	}
 
 	dso__data_close(dso);
-	dso_cache__free(&dso->data.cache);
+	dso_cache__free(dso);
 	dso__free_a2l(dso);
 	zfree(&dso->symsrc_filename);
 	pthread_mutex_destroy(&dso->lock);
-- 
2.2.2



* [PATCH 29/42] perf tools: Protect dso cache fd with a mutex
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (27 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 28/42] perf tools: Protect dso cache tree using dso->lock Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29 12:31   ` Arnaldo Carvalho de Melo
  2015-01-29  8:07 ` [PATCH 30/42] perf session: Pass struct events stats to event processing functions Namhyung Kim
                   ` (13 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When the dso cache is accessed from multiple threads, one thread can
close another dso's data.fd while it is still in use, because opening a
new file may hit the open file limit.  Protect the file descriptors
with a separate mutex.
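
As a rough illustration of the locking scheme (not the perf code
itself), the sketch below keeps one global mutex around every use of a
descriptor and reopens the file when another thread has closed it in
the meantime.  struct file_ref, file_ref__close() and file_ref__read()
are hypothetical names made up for this example; only the idea of a
single open lock shared by all descriptors comes from the patch.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for struct dso: only what the sketch needs. */
struct file_ref {
	const char *path;
	int fd;			/* -1 while closed */
};

/* One lock for every descriptor, in the spirit of dso__data_open_lock. */
static pthread_mutex_t open_lock = PTHREAD_MUTEX_INITIALIZER;

/* Any thread may call this, e.g. to stay below RLIMIT_NOFILE. */
static void file_ref__close(struct file_ref *ref)
{
	pthread_mutex_lock(&open_lock);
	if (ref->fd >= 0) {
		close(ref->fd);
		ref->fd = -1;
	}
	pthread_mutex_unlock(&open_lock);
}

/* Read at an offset; reopen under the lock if the fd was closed. */
static ssize_t file_ref__read(struct file_ref *ref, off_t offset,
			      void *buf, size_t len)
{
	ssize_t ret = -1;

	pthread_mutex_lock(&open_lock);

	if (ref->fd < 0)
		ref->fd = open(ref->path, O_RDONLY);
	if (ref->fd < 0)
		goto out;

	/* seek and read stay under the lock so the fd cannot vanish */
	if (lseek(ref->fd, offset, SEEK_SET) == (off_t)-1)
		goto out;

	ret = read(ref->fd, buf, len);
out:
	pthread_mutex_unlock(&open_lock);
	return ret;
}
```

The point of holding the lock across both the check and the read is
that a descriptor seen as open cannot be closed by another thread
before the read completes.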

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/dso-data.c |   5 ++
 tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
 2 files changed, 94 insertions(+), 47 deletions(-)

diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
index caaf37f079b1..0276e7d2d41b 100644
--- a/tools/perf/tests/dso-data.c
+++ b/tools/perf/tests/dso-data.c
@@ -111,6 +111,9 @@ int test__dso_data(void)
 	memset(&machine, 0, sizeof(machine));
 
 	dso = dso__new((const char *)file);
+	TEST_ASSERT_VAL("failed to get dso", dso);
+
+	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
 
 	/* Basic 10 bytes tests. */
 	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
@@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
 
 		dsos[i] = dso__new(file);
 		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
+
+		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
 	}
 
 	return 0;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 11ece224ef50..ae92046ae2c8 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
  */
 static LIST_HEAD(dso__data_open);
 static long dso__data_open_cnt;
+static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static void dso__list_add(struct dso *dso)
 {
@@ -240,7 +241,7 @@ static int do_open(char *name)
 		if (fd >= 0)
 			return fd;
 
-		pr_debug("dso open failed, mmap: %s\n",
+		pr_debug("dso open failed: %s\n",
 			 strerror_r(errno, sbuf, sizeof(sbuf)));
 		if (!dso__data_open_cnt || errno != EMFILE)
 			break;
@@ -382,7 +383,9 @@ static void check_data_close(void)
  */
 void dso__data_close(struct dso *dso)
 {
+	pthread_mutex_lock(&dso__data_open_lock);
 	close_dso(dso);
+	pthread_mutex_unlock(&dso__data_open_lock);
 }
 
 /**
@@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	if (dso->data.status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
+	pthread_mutex_lock(&dso__data_open_lock);
+
 	if (dso->data.fd >= 0)
 		goto out;
 
@@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	else
 		dso->data.status = DSO_DATA_STATUS_ERROR;
 
+	pthread_mutex_unlock(&dso__data_open_lock);
 	return dso->data.fd;
 }
 
@@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
 }
 
 static ssize_t
-dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+dso_cache__read(struct dso *dso, struct machine *machine,
+		u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
 	struct dso_cache *old;
-	ssize_t ret;
-
-	do {
-		u64 cache_offset;
+	ssize_t ret = -EINVAL;
+	u64 cache_offset;
 
-		ret = -ENOMEM;
+	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
+	if (!cache)
+		return -ENOMEM;
 
-		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
-		if (!cache)
-			break;
+	cache_offset = offset & DSO__DATA_CACHE_MASK;
 
-		cache_offset = offset & DSO__DATA_CACHE_MASK;
-		ret = -EINVAL;
+	pthread_mutex_lock(&dso__data_open_lock);
 
-		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
-			break;
+	/*
+	 * dso->data.fd might be closed if other thread opened another
+	 * file (dso) due to open file limit (RLIMIT_NOFILE).
+	 */
+	if (dso->data.fd < 0) {
+		dso->data.fd = open_dso(dso, machine);
+		if (dso->data.fd < 0) {
+			ret = -errno;
+			dso->data.status = DSO_DATA_STATUS_ERROR;
+			goto err_unlock;
+		}
+	}
 
-		ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
-		if (ret <= 0)
-			break;
+	if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
+		goto err_unlock;
 
-		cache->offset = cache_offset;
-		cache->size   = ret;
-		old = dso_cache__insert(dso, cache);
-		if (old) {
-			/* we lose the race */
-			free(cache);
-			cache = old;
-		}
+	ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
+	if (ret <= 0)
+		goto err_unlock;
 
-		ret = dso_cache__memcpy(cache, offset, data, size);
+	pthread_mutex_unlock(&dso__data_open_lock);
 
-	} while (0);
+	cache->offset = cache_offset;
+	cache->size   = ret;
+	old = dso_cache__insert(dso, cache);
+	if (old) {
+		/* we lose the race */
+		free(cache);
+		cache = old;
+	}
 
+	ret = dso_cache__memcpy(cache, offset, data, size);
 	if (ret <= 0)
 		free(cache);
 
 	return ret;
+
+err_unlock:
+	pthread_mutex_unlock(&dso__data_open_lock);
+	return ret;
 }
 
-static ssize_t dso_cache_read(struct dso *dso, u64 offset,
-			      u8 *data, ssize_t size)
+static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
+			      u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
 
@@ -584,7 +604,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
 	if (cache)
 		return dso_cache__memcpy(cache, offset, data, size);
 	else
-		return dso_cache__read(dso, offset, data, size);
+		return dso_cache__read(dso, machine, offset, data, size);
 }
 
 /*
@@ -592,7 +612,8 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
  * in the rb_tree. Any read to already cached data is served
  * by cached data.
  */
-static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+static ssize_t cached_read(struct dso *dso, struct machine *machine,
+			   u64 offset, u8 *data, ssize_t size)
 {
 	ssize_t r = 0;
 	u8 *p = data;
@@ -600,7 +621,7 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	do {
 		ssize_t ret;
 
-		ret = dso_cache_read(dso, offset, p, size);
+		ret = dso_cache_read(dso, machine, offset, p, size);
 		if (ret < 0)
 			return ret;
 
@@ -620,21 +641,42 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	return r;
 }
 
-static int data_file_size(struct dso *dso)
+static int data_file_size(struct dso *dso, struct machine *machine)
 {
+	int ret = 0;
 	struct stat st;
 	char sbuf[STRERR_BUFSIZE];
 
-	if (!dso->data.file_size) {
-		if (fstat(dso->data.fd, &st)) {
-			pr_err("dso mmap failed, fstat: %s\n",
-				strerror_r(errno, sbuf, sizeof(sbuf)));
-			return -1;
+	if (dso->data.file_size)
+		return 0;
+
+	pthread_mutex_lock(&dso__data_open_lock);
+
+	/*
+	 * dso->data.fd might be closed if other thread opened another
+	 * file (dso) due to open file limit (RLIMIT_NOFILE).
+	 */
+	if (dso->data.fd < 0) {
+		dso->data.fd = open_dso(dso, machine);
+		if (dso->data.fd < 0) {
+			ret = -errno;
+			dso->data.status = DSO_DATA_STATUS_ERROR;
+			goto out;
 		}
-		dso->data.file_size = st.st_size;
 	}
 
-	return 0;
+	if (fstat(dso->data.fd, &st) < 0) {
+		ret = -errno;
+		pr_err("dso cache fstat failed: %s\n",
+		       strerror_r(errno, sbuf, sizeof(sbuf)));
+		dso->data.status = DSO_DATA_STATUS_ERROR;
+		goto out;
+	}
+	dso->data.file_size = st.st_size;
+
+out:
+	pthread_mutex_unlock(&dso__data_open_lock);
+	return ret;
 }
 
 /**
@@ -652,17 +694,17 @@ off_t dso__data_size(struct dso *dso, struct machine *machine)
 	if (fd < 0)
 		return fd;
 
-	if (data_file_size(dso))
+	if (data_file_size(dso, machine))
 		return -1;
 
 	/* For now just estimate dso data size is close to file size */
 	return dso->data.file_size;
 }
 
-static ssize_t data_read_offset(struct dso *dso, u64 offset,
-				u8 *data, ssize_t size)
+static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
+				u64 offset, u8 *data, ssize_t size)
 {
-	if (data_file_size(dso))
+	if (data_file_size(dso, machine))
 		return -1;
 
 	/* Check the offset sanity. */
@@ -672,7 +714,7 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
 	if (offset + size < offset)
 		return -1;
 
-	return cached_read(dso, offset, data, size);
+	return cached_read(dso, machine, offset, data, size);
 }
 
 /**
@@ -689,10 +731,10 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
 ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
 			      u64 offset, u8 *data, ssize_t size)
 {
-	if (dso__data_fd(dso, machine) < 0)
+	if (dso->data.status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
-	return data_read_offset(dso, offset, data, size);
+	return data_read_offset(dso, machine, offset, data, size);
 }
 
 /**
-- 
2.2.2



* [PATCH 30/42] perf session: Pass struct events stats to event processing functions
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (28 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 29/42] perf tools: Protect dso cache fd with a mutex Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 31/42] perf hists: Pass hists struct to hist_entry_iter functions Namhyung Kim
                   ` (12 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Pass the stats structure explicitly so that it can point to a separate
object when used in a multi-threaded environment.
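
A minimal sketch of the idea, assuming a trimmed-down stats struct:
each worker hands its own object to the processing function and the
results are folded into a total afterwards.  struct stats,
deliver_event() and stats__add() are hypothetical names for this
example and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down counterpart of struct events_stats. */
struct stats {
	uint64_t nr_events;
	uint64_t nr_unknown_events;
};

/*
 * The caller chooses which stats object gets updated; nothing here
 * reaches into a shared session.
 */
static int deliver_event(struct stats *stats, int type)
{
	stats->nr_events++;
	if (type < 0) {
		stats->nr_unknown_events++;
		return -1;
	}
	return 0;
}

/* Fold one worker's private counters into the overall totals. */
static void stats__add(struct stats *total, const struct stats *part)
{
	total->nr_events += part->nr_events;
	total->nr_unknown_events += part->nr_unknown_events;
}
```

Because each worker only touches its own struct stats, the counters
need no locking until the final summation.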

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/ordered-events.c |  4 +-
 tools/perf/util/session.c        | 81 ++++++++++++++++++++++------------------
 tools/perf/util/session.h        |  1 +
 3 files changed, 49 insertions(+), 37 deletions(-)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index fd4be94125fb..e933c51d7090 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -183,7 +183,9 @@ static int __ordered_events__flush(struct perf_session *s,
 		if (ret)
 			pr_err("Can't parse sample, err = %d\n", ret);
 		else {
-			ret = perf_session__deliver_event(s, iter->event, &sample, tool,
+			ret = perf_session__deliver_event(s, &s->stats,
+							  iter->event,
+							  &sample, tool,
 							  iter->file_offset);
 			if (ret)
 				return ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e7b59fbebbc4..7114427f3d0f 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -776,6 +776,7 @@ static struct machine *
 }
 
 static int deliver_sample_value(struct perf_session *session,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
@@ -792,7 +793,7 @@ static int deliver_sample_value(struct perf_session *session,
 	}
 
 	if (!sid || sid->evsel == NULL) {
-		++session->stats.nr_unknown_id;
+		++stats->nr_unknown_id;
 		return 0;
 	}
 
@@ -800,6 +801,7 @@ static int deliver_sample_value(struct perf_session *session,
 }
 
 static int deliver_sample_group(struct perf_session *session,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union  perf_event *event,
 				struct perf_sample *sample,
@@ -809,7 +811,7 @@ static int deliver_sample_group(struct perf_session *session,
 	u64 i;
 
 	for (i = 0; i < sample->read.group.nr; i++) {
-		ret = deliver_sample_value(session, tool, event, sample,
+		ret = deliver_sample_value(session, stats, tool, event, sample,
 					   &sample->read.group.values[i],
 					   machine);
 		if (ret)
@@ -821,6 +823,7 @@ static int deliver_sample_group(struct perf_session *session,
 
 static int
 perf_session__deliver_sample(struct perf_session *session,
+			     struct events_stats *stats,
 			     struct perf_tool *tool,
 			     union  perf_event *event,
 			     struct perf_sample *sample,
@@ -837,14 +840,15 @@ perf_session__deliver_sample(struct perf_session *session,
 
 	/* For PERF_SAMPLE_READ we have either single or group mode. */
 	if (read_format & PERF_FORMAT_GROUP)
-		return deliver_sample_group(session, tool, event, sample,
+		return deliver_sample_group(session, stats, tool, event, sample,
 					    machine);
 	else
-		return deliver_sample_value(session, tool, event, sample,
+		return deliver_sample_value(session, stats, tool, event, sample,
 					    &sample->read.one, machine);
 }
 
 int perf_session__deliver_event(struct perf_session *session,
+				struct events_stats *stats,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_tool *tool, u64 file_offset)
@@ -863,14 +867,14 @@ int perf_session__deliver_event(struct perf_session *session,
 	case PERF_RECORD_SAMPLE:
 		dump_sample(evsel, event, sample);
 		if (evsel == NULL) {
-			++session->stats.nr_unknown_id;
+			++stats->nr_unknown_id;
 			return 0;
 		}
 		if (machine == NULL) {
-			++session->stats.nr_unprocessable_samples;
+			++stats->nr_unprocessable_samples;
 			return 0;
 		}
-		return perf_session__deliver_sample(session, tool, event,
+		return perf_session__deliver_sample(session, stats, tool, event,
 						    sample, evsel, machine);
 	case PERF_RECORD_MMAP:
 		return tool->mmap(tool, event, sample, machine);
@@ -884,7 +888,7 @@ int perf_session__deliver_event(struct perf_session *session,
 		return tool->exit(tool, event, sample, machine);
 	case PERF_RECORD_LOST:
 		if (tool->lost == perf_event__process_lost)
-			session->stats.total_lost += event->lost.lost;
+			stats->total_lost += event->lost.lost;
 		return tool->lost(tool, event, sample, machine);
 	case PERF_RECORD_READ:
 		return tool->read(tool, event, sample, evsel, machine);
@@ -893,7 +897,7 @@ int perf_session__deliver_event(struct perf_session *session,
 	case PERF_RECORD_UNTHROTTLE:
 		return tool->unthrottle(tool, event, sample, machine);
 	default:
-		++session->stats.nr_unknown_events;
+		++stats->nr_unknown_events;
 		return -1;
 	}
 }
@@ -948,7 +952,8 @@ int perf_session__deliver_synth_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, tool, 0);
 
-	return perf_session__deliver_event(session, event, sample, tool, 0);
+	return perf_session__deliver_event(session, &session->stats,
+					   event, sample, tool, 0);
 }
 
 static void event_swap(union perf_event *event, bool sample_id_all)
@@ -1016,6 +1021,7 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 }
 
 static s64 perf_session__process_event(struct perf_session *session,
+				       struct events_stats *stats,
 				       union perf_event *event,
 				       struct perf_tool *tool,
 				       u64 file_offset)
@@ -1029,7 +1035,7 @@ static s64 perf_session__process_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_HEADER_MAX)
 		return -EINVAL;
 
-	events_stats__inc(&session->stats, event->header.type);
+	events_stats__inc(stats, event->header.type);
 
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, tool, file_offset);
@@ -1048,8 +1054,8 @@ static s64 perf_session__process_event(struct perf_session *session,
 			return ret;
 	}
 
-	return perf_session__deliver_event(session, event, &sample, tool,
-					   file_offset);
+	return perf_session__deliver_event(session, stats, event, &sample,
+					   tool, file_offset);
 }
 
 void perf_event_header__bswap(struct perf_event_header *hdr)
@@ -1077,47 +1083,49 @@ static struct thread *perf_session__register_idle_thread(struct perf_session *se
 	return thread;
 }
 
-static void perf_session__warn_about_errors(const struct perf_session *session,
+static void events_stats__warn_about_errors(const struct events_stats *stats,
 					    const struct perf_tool *tool)
 {
 	if (tool->lost == perf_event__process_lost &&
-	    session->stats.nr_events[PERF_RECORD_LOST] != 0) {
+	    stats->nr_events[PERF_RECORD_LOST] != 0) {
 		ui__warning("Processed %d events and lost %d chunks!\n\n"
 			    "Check IO/CPU overload!\n\n",
-			    session->stats.nr_events[0],
-			    session->stats.nr_events[PERF_RECORD_LOST]);
+			    stats->nr_events[0],
+			    stats->nr_events[PERF_RECORD_LOST]);
 	}
 
-	if (session->stats.nr_unknown_events != 0) {
+	if (stats->nr_unknown_events != 0) {
 		ui__warning("Found %u unknown events!\n\n"
 			    "Is this an older tool processing a perf.data "
 			    "file generated by a more recent tool?\n\n"
 			    "If that is not the case, consider "
 			    "reporting to linux-kernel@vger.kernel.org.\n\n",
-			    session->stats.nr_unknown_events);
+			    stats->nr_unknown_events);
 	}
 
-	if (session->stats.nr_unknown_id != 0) {
+	if (stats->nr_unknown_id != 0) {
 		ui__warning("%u samples with id not present in the header\n",
-			    session->stats.nr_unknown_id);
+			    stats->nr_unknown_id);
 	}
 
- 	if (session->stats.nr_invalid_chains != 0) {
+	if (stats->nr_invalid_chains != 0) {
  		ui__warning("Found invalid callchains!\n\n"
  			    "%u out of %u events were discarded for this reason.\n\n"
  			    "Consider reporting to linux-kernel@vger.kernel.org.\n\n",
- 			    session->stats.nr_invalid_chains,
- 			    session->stats.nr_events[PERF_RECORD_SAMPLE]);
+			    stats->nr_invalid_chains,
+			    stats->nr_events[PERF_RECORD_SAMPLE]);
  	}
 
-	if (session->stats.nr_unprocessable_samples != 0) {
+	if (stats->nr_unprocessable_samples != 0) {
 		ui__warning("%u unprocessable samples recorded.\n"
 			    "Do you have a KVM guest running and not using 'perf kvm'?\n",
-			    session->stats.nr_unprocessable_samples);
+			    stats->nr_unprocessable_samples);
 	}
 
-	if (session->stats.nr_unordered_events != 0)
-		ui__warning("%u out of order events recorded.\n", session->stats.nr_unordered_events);
+	if (stats->nr_unordered_events != 0) {
+		ui__warning("%u out of order events recorded.\n",
+			    stats->nr_unordered_events);
+	}
 }
 
 volatile int session_done;
@@ -1188,7 +1196,8 @@ static int __perf_session__process_pipe_events(struct perf_session *session,
 		}
 	}
 
-	if ((skip = perf_session__process_event(session, event, tool, head)) < 0) {
+	if ((skip = perf_session__process_event(session, &session->stats,
+						event, tool, head)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       head, event->header.size, event->header.type);
 		err = -EINVAL;
@@ -1207,7 +1216,7 @@ static int __perf_session__process_pipe_events(struct perf_session *session,
 	err = ordered_events__flush(session, tool, OE_FLUSH__FINAL);
 out_err:
 	free(buf);
-	perf_session__warn_about_errors(session, tool);
+	events_stats__warn_about_errors(&session->stats, tool);
 	ordered_events__free(&session->ordered_events);
 	return err;
 }
@@ -1252,7 +1261,8 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-static int __perf_session__process_events(struct perf_session *session, int fd,
+static int __perf_session__process_events(struct perf_session *session,
+					  struct events_stats *stats, int fd,
 					  u64 data_offset, u64 data_size,
 					  u64 file_size, struct perf_tool *tool)
 {
@@ -1325,8 +1335,8 @@ static int __perf_session__process_events(struct perf_session *session, int fd,
 	size = event->header.size;
 
 	if (size < sizeof(struct perf_event_header) ||
-	    (skip = perf_session__process_event(session, event, tool, file_pos))
-									< 0) {
+	    (skip = perf_session__process_event(session, stats, event,
+						tool, file_pos)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       file_offset + head, event->header.size,
 		       event->header.type);
@@ -1353,7 +1363,6 @@ static int __perf_session__process_events(struct perf_session *session, int fd,
 	err = ordered_events__flush(session, tool, OE_FLUSH__FINAL);
 out_err:
 	ui_progress__finish();
-	perf_session__warn_about_errors(session, tool);
 	ordered_events__free(&session->ordered_events);
 	session->one_mmap = false;
 	return err;
@@ -1372,7 +1381,7 @@ int perf_session__process_events(struct perf_session *session,
 	if (perf_data_file__is_pipe(file))
 		return __perf_session__process_pipe_events(session, tool);
 
-	err = __perf_session__process_events(session,
+	err = __perf_session__process_events(session, &session->stats,
 					     perf_data_file__fd(file),
 					     session->header.data_offset,
 					     session->header.data_size,
@@ -1391,7 +1400,7 @@ int perf_session__process_events(struct perf_session *session,
 		if (!session->header.index[i].size)
 			continue;
 
-		err = __perf_session__process_events(session,
+		err = __perf_session__process_events(session, &session->stats,
 						perf_data_file__fd(file),
 						session->header.index[i].offset,
 						session->header.index[i].size,
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 419976d74b51..33af571f9d08 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -59,6 +59,7 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 void perf_tool__fill_defaults(struct perf_tool *tool);
 
 int perf_session__deliver_event(struct perf_session *session,
+				struct events_stats *stats,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_tool *tool, u64 file_offset);
-- 
2.2.2



* [PATCH 31/42] perf hists: Pass hists struct to hist_entry_iter functions
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (29 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 30/42] perf session: Pass struct events stats to event processing functions Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 32/42] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
                   ` (11 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

This is a preparation for perf report multi-thread support.  When
multi-threading is enabled, each thread will have its own hists during
sample processing.
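
To show the shape of the change in isolation, here is a small sketch
in which the iterator carries the target hists itself, so the callbacks
no longer derive it from the evsel.  The types and the names
entry_iter__add() and iter_finish_entry() are simplified, hypothetical
stand-ins, not the real perf definitions.

```c
#include <stddef.h>

struct hists { unsigned long nr_samples; };
struct evsel { struct hists hists; };	/* the "original" hists */

struct entry_iter {
	struct hists *hists;	/* where entries go for this call */
	struct evsel *evsel;
};

/* Callbacks use iter->hists instead of looking it up via iter->evsel. */
static void iter_finish_entry(struct entry_iter *iter)
{
	iter->hists->nr_samples++;
}

static int entry_iter__add(struct entry_iter *iter, struct hists *hists,
			   struct evsel *evsel)
{
	if (hists == NULL || evsel == NULL)
		return -1;

	iter->hists = hists;
	iter->evsel = evsel;
	iter_finish_entry(iter);
	return 0;
}
```

A single-threaded caller can keep passing the evsel's own hists, while
a worker thread can pass a private one and leave the original untouched.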

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c       |  4 ++--
 tools/perf/builtin-top.c          |  4 ++--
 tools/perf/tests/hists_cumulate.c |  4 ++--
 tools/perf/tests/hists_filter.c   |  3 ++-
 tools/perf/tests/hists_output.c   |  4 ++--
 tools/perf/util/hist.c            | 26 +++++++++++---------------
 tools/perf/util/hist.h            |  6 ++++--
 7 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 68d06bc02266..8a40c79d9273 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -167,8 +167,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	ret = hist_entry_iter__add(&iter, &al, evsel, sample, rep->max_stack,
-				   rep);
+	ret = hist_entry_iter__add(&iter, evsel__hists(evsel), evsel, &al,
+				   sample, rep->max_stack, rep);
 	if (ret < 0)
 		pr_debug("problem adding hist entry, skipping event\n");
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 69a0badfb745..a49a34bcf791 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -784,8 +784,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
 
 		pthread_mutex_lock(&hists->lock);
 
-		err = hist_entry_iter__add(&iter, &al, evsel, sample,
-					   top->max_stack, top);
+		err = hist_entry_iter__add(&iter, evsel__hists(evsel), evsel,
+					   &al, sample, top->max_stack, top);
 		if (err < 0)
 			pr_err("Problem incrementing symbol period, skipping event\n");
 
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 60682e62d9de..71156c0d6ad5 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -104,8 +104,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 						  &sample, NULL) < 0)
 			goto out;
 
-		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-					 PERF_MAX_STACK_DEPTH, NULL) < 0)
+		if (hist_entry_iter__add(&iter, evsel__hists(evsel), evsel, &al,
+					 &sample, PERF_MAX_STACK_DEPTH, NULL) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 1c4e495d5137..408ee7e48802 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -81,7 +81,8 @@ static int add_hist_entries(struct perf_evlist *evlist,
 							  &sample, NULL) < 0)
 				goto out;
 
-			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
+			if (hist_entry_iter__add(&iter, evsel__hists(evsel),
+						 evsel, &al, &sample,
 						 PERF_MAX_STACK_DEPTH, NULL) < 0)
 				goto out;
 
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index f4e3286cd496..bffe8832d692 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -70,8 +70,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 						  &sample, NULL) < 0)
 			goto out;
 
-		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-					 PERF_MAX_STACK_DEPTH, NULL) < 0)
+		if (hist_entry_iter__add(&iter, evsel__hists(evsel), evsel, &al,
+					 &sample, PERF_MAX_STACK_DEPTH, NULL) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4badf2491fbf..c44565b382c5 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -510,7 +510,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	u64 cost;
 	struct mem_info *mi = iter->priv;
 	struct perf_sample *sample = iter->sample;
-	struct hists *hists = evsel__hists(iter->evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he;
 
 	if (mi == NULL)
@@ -540,8 +540,7 @@ static int
 iter_finish_mem_entry(struct hist_entry_iter *iter,
 		      struct addr_location *al __maybe_unused)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = iter->he;
 	int err = -EINVAL;
 
@@ -613,8 +612,7 @@ static int
 iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
 	struct branch_info *bi;
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = NULL;
 	int i = iter->curr;
 	int err = 0;
@@ -661,11 +659,10 @@ iter_prepare_normal_entry(struct hist_entry_iter *iter __maybe_unused,
 static int
 iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry *he;
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, true);
 	if (he == NULL)
@@ -680,7 +677,6 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 			 struct addr_location *al __maybe_unused)
 {
 	struct hist_entry *he = iter->he;
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 
 	if (he == NULL)
@@ -688,7 +684,7 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 
 	iter->he = NULL;
 
-	hists__inc_nr_samples(evsel__hists(evsel), he->filtered);
+	hists__inc_nr_samples(iter->hists, he->filtered);
 
 	return hist_entry__append_callchain(he, sample);
 }
@@ -720,8 +716,7 @@ static int
 iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 				 struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
@@ -766,7 +761,6 @@ static int
 iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			       struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
@@ -800,7 +794,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		}
 	}
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, false);
 	if (he == NULL)
@@ -856,8 +850,9 @@ const struct hist_iter_ops hist_iter_cumulative = {
 	.finish_entry 		= iter_finish_cumulative_entry,
 };
 
-int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
-			 struct perf_evsel *evsel, struct perf_sample *sample,
+int hist_entry_iter__add(struct hist_entry_iter *iter, struct hists *hists,
+			 struct perf_evsel *evsel, struct addr_location *al,
+			 struct perf_sample *sample,
 			 int max_stack_depth, void *arg)
 {
 	int err, err2;
@@ -867,6 +862,7 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 	if (err)
 		return err;
 
+	iter->hists = hists;
 	iter->evsel = evsel;
 	iter->sample = sample;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0eed50a5b1f0..991ca5504cbd 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -86,6 +86,7 @@ struct hist_entry_iter {
 
 	bool hide_unresolved;
 
+	struct hists *hists;
 	struct perf_evsel *evsel;
 	struct perf_sample *sample;
 	struct hist_entry *he;
@@ -110,8 +111,9 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct mem_info *mi, u64 period,
 				      u64 weight, u64 transaction,
 				      u64 timestamp, bool sample_self);
-int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
-			 struct perf_evsel *evsel, struct perf_sample *sample,
+int hist_entry_iter__add(struct hist_entry_iter *iter, struct hists *hists,
+			 struct perf_evsel *evsel, struct addr_location *al,
+			 struct perf_sample *sample,
 			 int max_stack_depth, void *arg);
 
 int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
-- 
2.2.2



* [PATCH 32/42] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (30 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 31/42] perf hists: Pass hists struct to hist_entry_iter functions Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 33/42] perf report: Parallelize perf report using multi-thread Namhyung Kim
                   ` (10 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

util/event.h includes util/build-id.h only for BUILD_ID_SIZE.  This
becomes a problem once util/tool.h includes util/event.h, because
util/build-id.h in turn includes util/tool.h: the result is a circular
dependency that ends in an incomplete type error.  Move the definition
to perf.h to break the cycle.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.h          | 1 +
 tools/perf/util/build-id.h | 2 --
 tools/perf/util/dso.h      | 1 +
 tools/perf/util/event.h    | 1 -
 4 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index b0fad99c9252..386de322f3a1 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -30,6 +30,7 @@ static inline unsigned long long rdclock(void)
 }
 
 #define MAX_NR_CPUS			256
+#define BUILD_ID_SIZE			20
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 8236319514d5..8f31545edc5b 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,8 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE 20
-
 #include "tool.h"
 #include <linux/types.h>
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ac753594a469..c18fcc0e8081 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -7,6 +7,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 27261320249a..1f86c279520e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.2.2



* [PATCH 33/42] perf report: Parallelize perf report using multi-thread
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (31 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 32/42] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 34/42] perf tools: Add missing_threads rb tree Namhyung Kim
                   ` (9 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Introduce perf_session__process_events_mt() to enable multi-thread
sample processing.  It allocates a struct perf_tool_mt for each index
entry and fills in the info the processing thread needs.

The session and hists event stats are counted per thread and summed
after the processing finishes.  Similarly, hist entries are first
added to per-thread hists and then moved to the original hists using
hists__mt_resort().  This function reuses the hists__collapse_resort()
code, so it forces sort__need_collapse to true and skips the
collapsing function.
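
As a rough illustration of the "count per thread, sum after join"
scheme (a standalone sketch, not code from this patch; struct stats,
struct worker and count_parallel() are hypothetical names), each
worker only touches its own counters, so no lock is needed until the
main thread merges them:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_TYPES 4	/* hypothetical number of event types */

struct stats {
	unsigned long total_period;
	unsigned long nr_events[NR_TYPES];
};

struct worker {
	struct stats stats;		/* private to one thread */
	const unsigned int *types;	/* this thread's slice of the input */
	size_t nr;
};

/* Same idea as events_stats__add(): field-wise accumulation. */
static void stats_add(struct stats *dst, const struct stats *src)
{
	dst->total_period += src->total_period;
	for (int i = 0; i < NR_TYPES; i++)
		dst->nr_events[i] += src->nr_events[i];
}

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	for (size_t i = 0; i < w->nr; i++) {
		w->stats.nr_events[w->types[i] % NR_TYPES]++;
		w->stats.total_period++;
	}
	return w;
}

/* Returns 0 on success, -1 if allocation or thread creation failed. */
static int count_parallel(const unsigned int *types, size_t nr,
			  int nr_threads, struct stats *total)
{
	pthread_t *tids;
	struct worker *ws;
	size_t chunk;
	int started = 0, err = -1;

	if (nr_threads <= 0)
		return -1;

	tids = calloc(nr_threads, sizeof(*tids));
	ws = calloc(nr_threads, sizeof(*ws));
	if (tids == NULL || ws == NULL)
		goto out;

	chunk = (nr + nr_threads - 1) / nr_threads;
	for (; started < nr_threads; started++) {
		size_t begin = (size_t)started * chunk;

		if (begin > nr)
			begin = nr;
		ws[started].types = types + begin;
		ws[started].nr = (nr - begin < chunk) ? nr - begin : chunk;
		if (pthread_create(&tids[started], NULL, worker_fn,
				   &ws[started]))
			break;
	}
	err = (started == nr_threads) ? 0 : -1;

	/* Merge only after each worker has finished. */
	for (int i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		stats_add(total, &ws[i].stats);
	}
out:
	free(tids);
	free(ws);
	return err;
}
```

The caller passes a zero-initialized struct stats as total.  On
failure, total may hold a partial sum from the threads that did start.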

Note that most of the preprocessing stage is already done by
processing meta events in the dummy tracking evsel first.  The
corresponding thread and map can then be found based on the sample
time, while symbol loading and dso cache access are protected by a
pthread mutex.
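
To illustrate what a lookup "based on the sample time" means, here is
a simplified sketch.  It assumes a flat table that is read-only during
sample processing, whereas perf itself keeps threads in rb trees, and
struct task_instance and find_instance() are hypothetical names.  When
a tid has been reused, the lookup picks the instance with the latest
start time that is not after the sample:

```c
#include <stddef.h>
#include <stdint.h>

struct task_instance {		/* hypothetical, for illustration only */
	int tid;
	uint64_t start_time;
	const char *comm;
};

/*
 * Return the instance of @tid that was alive at @ts, i.e. the one
 * with the greatest start_time <= ts, or NULL if there is none.
 * The table is not modified, so concurrent callers need no lock.
 */
static const struct task_instance *
find_instance(const struct task_instance *tbl, size_t nr,
	      int tid, uint64_t ts)
{
	const struct task_instance *best = NULL;

	for (size_t i = 0; i < nr; i++) {
		if (tbl[i].tid != tid || tbl[i].start_time > ts)
			continue;
		if (best == NULL || tbl[i].start_time > best->start_time)
			best = &tbl[i];
	}
	return best;
}
```

A linear scan is used only to keep the sketch short.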

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c    |  75 +++++++++++++++++++-----
 tools/perf/util/hist.h    |   3 +
 tools/perf/util/session.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/session.h |   2 +
 tools/perf/util/tool.h    |  12 ++++
 5 files changed, 220 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c44565b382c5..14d4b9358ac6 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -950,7 +950,7 @@ void hist_entry__delete(struct hist_entry *he)
  * collapse the histogram
  */
 
-static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
+static bool hists__collapse_insert_entry(struct hists *hists,
 					 struct rb_root *root,
 					 struct hist_entry *he)
 {
@@ -987,6 +987,13 @@ static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
 	}
 	hists->nr_entries++;
 
+	/*
+	 * For multi-threaded report, he->hists points to a dummy
+	 * hists in the struct perf_tool_mt.  Please see
+	 * perf_session__process_events_mt().
+	 */
+	he->hists = hists;
+
 	rb_link_node(&he->rb_node_in, parent, p);
 	rb_insert_color(&he->rb_node_in, root);
 	return true;
@@ -1014,19 +1021,12 @@ static void hists__apply_filters(struct hists *hists, struct hist_entry *he)
 	hists__filter_entry_by_symbol(hists, he);
 }
 
-void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+static void __hists__collapse_resort(struct hists *hists, struct rb_root *root,
+				     struct ui_progress *prog)
 {
-	struct rb_root *root;
 	struct rb_node *next;
 	struct hist_entry *n;
 
-	if (!sort__need_collapse)
-		return;
-
-	hists->nr_entries = 0;
-
-	root = hists__get_rotate_entries_in(hists);
-
 	next = rb_first(root);
 
 	while (next) {
@@ -1049,6 +1049,27 @@ void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 	}
 }
 
+void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+{
+	struct rb_root *root;
+
+	if (!sort__need_collapse)
+		return;
+
+	hists->nr_entries = 0;
+
+	root = hists__get_rotate_entries_in(hists);
+	__hists__collapse_resort(hists, root, prog);
+}
+
+void hists__mt_resort(struct hists *dst, struct hists *src)
+{
+	struct rb_root *root = src->entries_in;
+
+	sort__need_collapse = 1;
+	__hists__collapse_resort(dst, root, NULL);
+}
+
 static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
 {
 	struct perf_hpp_fmt *fmt;
@@ -1277,6 +1298,29 @@ void events_stats__inc(struct events_stats *stats, u32 type)
 	++stats->nr_events[type];
 }
 
+void events_stats__add(struct events_stats *dst, struct events_stats *src)
+{
+	int i;
+
+#define ADD(_field)  dst->_field += src->_field
+
+	ADD(total_period);
+	ADD(total_non_filtered_period);
+	ADD(total_lost);
+	ADD(total_invalid_chains);
+	ADD(nr_non_filtered_samples);
+	ADD(nr_lost_warned);
+	ADD(nr_unknown_events);
+	ADD(nr_invalid_chains);
+	ADD(nr_unknown_id);
+	ADD(nr_unprocessable_samples);
+
+	for (i = 0; i < PERF_RECORD_HEADER_MAX; i++)
+		ADD(nr_events[i]);
+
+#undef ADD
+}
+
 void hists__inc_nr_events(struct hists *hists, u32 type)
 {
 	events_stats__inc(&hists->stats, type);
@@ -1453,16 +1497,21 @@ int perf_hist_config(const char *var, const char *value)
 	return 0;
 }
 
-static int hists_evsel__init(struct perf_evsel *evsel)
+void __hists__init(struct hists *hists)
 {
-	struct hists *hists = evsel__hists(evsel);
-
 	memset(hists, 0, sizeof(*hists));
 	hists->entries_in_array[0] = hists->entries_in_array[1] = RB_ROOT;
 	hists->entries_in = &hists->entries_in_array[0];
 	hists->entries_collapsed = RB_ROOT;
 	hists->entries = RB_ROOT;
 	pthread_mutex_init(&hists->lock, NULL);
+}
+
+static int hists_evsel__init(struct perf_evsel *evsel)
+{
+	struct hists *hists = evsel__hists(evsel);
+
+	__hists__init(hists);
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 991ca5504cbd..2c29d70b2cfe 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -124,6 +124,7 @@ int hist_entry__sort_snprintf(struct hist_entry *he, char *bf, size_t size,
 void hist_entry__delete(struct hist_entry *he);
 
 void hists__output_resort(struct hists *hists, struct ui_progress *prog);
+void hists__mt_resort(struct hists *dst, struct hists *src);
 void hists__collapse_resort(struct hists *hists, struct ui_progress *prog);
 
 void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
@@ -136,6 +137,7 @@ void hists__inc_stats(struct hists *hists, struct hist_entry *h);
 void hists__inc_nr_events(struct hists *hists, u32 type);
 void hists__inc_nr_samples(struct hists *hists, bool filtered);
 void events_stats__inc(struct events_stats *stats, u32 type);
+void events_stats__add(struct events_stats *dst, struct events_stats *src);
 size_t events_stats__fprintf(struct events_stats *stats, FILE *fp);
 
 size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
@@ -179,6 +181,7 @@ static inline struct hists *evsel__hists(struct perf_evsel *evsel)
 }
 
 int hists__init(void);
+void __hists__init(struct hists *hists);
 
 struct perf_hpp {
 	char *buf;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7114427f3d0f..d1d5e0b3a26e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1412,6 +1412,147 @@ int perf_session__process_events(struct perf_session *session,
 	return err;
 }
 
+static void *processing_thread_idx(void *arg)
+{
+	struct perf_tool_mt *mt_tool = arg;
+	struct perf_session *session = mt_tool->session;
+	int fd = perf_data_file__fd(session->file);
+	u64 offset = session->header.index[mt_tool->idx].offset;
+	u64 size = session->header.index[mt_tool->idx].size;
+	u64 file_size = perf_data_file__size(session->file);
+
+	pr_debug("processing samples using thread [%d]\n", mt_tool->idx);
+	if (__perf_session__process_events(session, &mt_tool->stats,
+					   fd, offset, size, file_size,
+					   &mt_tool->tool) < 0) {
+		pr_err("processing samples failed (thread [%d])\n", mt_tool->idx);
+		return NULL;
+	}
+
+	pr_debug("processing samples done for thread [%d]\n", mt_tool->idx);
+	return arg;
+}
+
+int perf_session__process_events_mt(struct perf_session *session,
+				    struct perf_tool *tool, void *arg)
+{
+	struct perf_data_file *file = session->file;
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel;
+	u64 nr_entries = 0;
+	struct perf_tool_mt *mt_tools = NULL;
+	struct perf_tool_mt *mt;
+	pthread_t *th_id;
+	int err, i, k;
+	int nr_index = session->header.nr_index;
+	u64 size = perf_data_file__size(file);
+
+	if (perf_session__register_idle_thread(session) == NULL)
+		return -ENOMEM;
+
+	if (perf_data_file__is_pipe(file) || !session->header.index) {
+		pr_err("data file doesn't contain the index table\n");
+		return -EINVAL;
+	}
+
+	err = __perf_session__process_events(session, &session->stats,
+					     perf_data_file__fd(file),
+					     session->header.data_offset,
+					     session->header.data_size,
+					     size, tool);
+	if (err)
+		return err;
+
+	th_id = calloc(nr_index, sizeof(*th_id));
+	if (th_id == NULL)
+		goto out;
+
+	mt_tools = calloc(nr_index, sizeof(*mt_tools));
+	if (mt_tools == NULL)
+		goto out;
+
+	for (i = 0; i < nr_index; i++) {
+		mt = &mt_tools[i];
+
+		memcpy(&mt->tool, tool, sizeof(*tool));
+
+		mt->hists = calloc(evlist->nr_entries, sizeof(*mt->hists));
+		if (mt->hists == NULL)
+			goto err;
+
+		for (k = 0; k < evlist->nr_entries; k++)
+			__hists__init(&mt->hists[k]);
+
+		mt->session = session;
+		mt->tool.ordered_events = false;
+		mt->idx = i;
+		mt->priv = arg;
+
+		pthread_create(&th_id[i], NULL, processing_thread_idx, mt);
+	}
+
+	for (i = 0; i < nr_index; i++) {
+		pthread_join(th_id[i], (void **)&mt);
+		if (mt == NULL) {
+			err = -EINVAL;
+			continue;
+		}
+
+		events_stats__add(&session->stats, &mt->stats);
+
+		evlist__for_each(evlist, evsel) {
+			struct hists *hists = evsel__hists(evsel);
+
+			events_stats__add(&hists->stats,
+					  &mt->hists[evsel->idx].stats);
+
+			nr_entries += mt->hists[evsel->idx].nr_entries;
+		}
+	}
+
+	for (i = 0; i < nr_index; i++) {
+		mt = &mt_tools[i];
+
+		evlist__for_each(evlist, evsel) {
+			struct hists *hists = evsel__hists(evsel);
+
+			if (perf_evsel__is_dummy_tracking(evsel))
+				continue;
+
+			hists__mt_resort(hists, &mt->hists[evsel->idx]);
+
+			/* Non-group events are considered as leader */
+			if (symbol_conf.event_group &&
+			    !perf_evsel__is_group_leader(evsel)) {
+				struct hists *leader_hists;
+
+				leader_hists = evsel__hists(evsel->leader);
+				hists__match(leader_hists, hists);
+				hists__link(leader_hists, hists);
+			}
+		}
+	}
+
+out:
+	events_stats__warn_about_errors(&session->stats, tool);
+
+	if (mt_tools) {
+		for (i = 0; i < nr_index; i++)
+			free(mt_tools[i].hists);
+		free(mt_tools);
+	}
+
+	free(th_id);
+	return err;
+
+err:
+	while (i-- > 0) {
+		pthread_cancel(th_id[i]);
+		pthread_join(th_id[i], NULL);
+	}
+
+	goto out;
+}
 bool perf_session__has_traces(struct perf_session *session, const char *msg)
 {
 	struct perf_evsel *evsel;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 33af571f9d08..8027d6aa5fe4 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -51,6 +51,8 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool);
+int perf_session__process_events_mt(struct perf_session *session,
+				    struct perf_tool *tool, void *arg);
 
 int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 			     struct perf_tool *tool, struct perf_sample *sample,
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index bb2708bbfaca..a04826bbe991 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -2,6 +2,7 @@
 #define __PERF_TOOL_H
 
 #include <stdbool.h>
+#include "util/event.h"
 
 struct perf_session;
 union perf_event;
@@ -10,6 +11,7 @@ struct perf_evsel;
 struct perf_sample;
 struct perf_tool;
 struct machine;
+struct hists;
 
 typedef int (*event_sample)(struct perf_tool *tool, union perf_event *event,
 			    struct perf_sample *sample,
@@ -45,4 +47,14 @@ struct perf_tool {
 	bool		ordering_requires_timestamps;
 };
 
+struct perf_tool_mt {
+	struct perf_tool	tool;
+	struct events_stats	stats;
+	struct hists		*hists;
+	struct perf_session	*session;
+	int			idx;
+
+	void			*priv;
+};
+
 #endif /* __PERF_TOOL_H */
-- 
2.2.2



* [PATCH 34/42] perf tools: Add missing_threads rb tree
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (32 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 33/42] perf report: Parallelize perf report using multi-thread Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 35/42] perf record: Synthesize COMM event for a command line workload Namhyung Kim
                   ` (8 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Sometimes certain meta events like fork/exit are missed, and in that
case the thread cannot be found in the machine's rbtree.  Adding a
thread to that tree is dangerous now that lookups run in a
multi-thread environment, and grabbing a lock for every search would
add overhead.  So add a separate missing_threads tree and protect it
with a mutex.  It's expected to be accessed only if a thread is not
found in the normal trees.
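
The idea can be sketched outside of perf as a two-tier lookup.  This
is only an illustration under simplifying assumptions: a read-only
array stands in for the normal trees, a linked list stands in for the
missing_threads tree, and struct task, struct registry and
registry_find() are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

struct task {
	int tid;
	int missing;		/* set for entries created on a miss */
	struct task *next;
};

struct registry {
	const struct task *known;	/* read-only during processing */
	size_t nr_known;
	struct task *missing;		/* guarded by @lock */
	pthread_mutex_t lock;
};

static const struct task *registry_find(struct registry *r, int tid)
{
	struct task *t;

	/* Fast path: the known table never changes, so no lock. */
	for (size_t i = 0; i < r->nr_known; i++)
		if (r->known[i].tid == tid)
			return &r->known[i];

	/* Slow path: search or extend the shared fallback list. */
	pthread_mutex_lock(&r->lock);
	for (t = r->missing; t != NULL; t = t->next)
		if (t->tid == tid)
			break;
	if (t == NULL) {
		t = calloc(1, sizeof(*t));
		if (t != NULL) {
			t->tid = tid;
			t->missing = 1;
			t->next = r->missing;
			r->missing = t;
		}
	}
	pthread_mutex_unlock(&r->lock);
	return t;
}
```

Only lookups that miss the read-only table pay for the mutex, so the
common case stays lock-free.  Freeing the fallback entries is left out
of the sketch.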

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/thread-lookup-time.c |   8 ++-
 tools/perf/util/build-id.c            |   9 ++-
 tools/perf/util/machine.c             | 129 +++++++++++++++++++++-------------
 tools/perf/util/machine.h             |   2 +
 tools/perf/util/session.c             |   8 +--
 tools/perf/util/thread.h              |   1 +
 6 files changed, 101 insertions(+), 56 deletions(-)

diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
index 6237ecf8caae..04cdde9329d6 100644
--- a/tools/perf/tests/thread-lookup-time.c
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -7,7 +7,9 @@
 static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
 {
 	printf("thread: %d, start time: %"PRIu64" %s\n",
-	       th->tid, th->start_time, th->dead ? "(dead)" : "");
+	       th->tid, th->start_time,
+	       th->dead ? "(dead)" : th->exited ? "(exited)" :
+	       th->missing ? "(missing)" : "");
 	return 0;
 }
 
@@ -105,6 +107,8 @@ static int lookup_with_timestamp(struct machine *machine)
 			machine__findnew_thread_time(machine, 0, 0, 70000) == t3);
 
 	machine__delete_threads(machine);
+	machine__delete_dead_threads(machine);
+	machine__delete_missing_threads(machine);
 	return 0;
 }
 
@@ -146,6 +150,8 @@ static int lookup_without_timestamp(struct machine *machine)
 			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t3);
 
 	machine__delete_threads(machine);
+	machine__delete_dead_threads(machine);
+	machine__delete_missing_threads(machine);
 	return 0;
 }
 
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 0c72680a977f..1a37da34d852 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -60,7 +60,14 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
 		    event->fork.ppid, event->fork.ptid);
 
 	if (thread) {
-		rb_erase(&thread->rb_node, &machine->threads);
+		if (thread->dead)
+			rb_erase(&thread->rb_node, &machine->dead_threads);
+		else if (thread->missing)
+			rb_erase(&thread->rb_node, &machine->missing_threads);
+		else
+			rb_erase(&thread->rb_node, &machine->threads);
+
+		list_del(&thread->tid_node);
 		machine->last_match = NULL;
 		thread__delete(thread);
 	}
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7dc044b93cf8..b55454d85f60 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -29,6 +29,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 
 	machine->threads = RB_ROOT;
 	machine->dead_threads = RB_ROOT;
+	machine->missing_threads = RB_ROOT;
 	machine->last_match = NULL;
 
 	machine->vdso_info = NULL;
@@ -89,6 +90,19 @@ static void dsos__delete(struct dsos *dsos)
 	}
 }
 
+void machine__delete_missing_threads(struct machine *machine)
+{
+	struct rb_node *nd = rb_first(&machine->missing_threads);
+
+	while (nd) {
+		struct thread *t = rb_entry(nd, struct thread, rb_node);
+
+		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->missing_threads);
+		thread__delete(t);
+	}
+}
+
 void machine__delete_dead_threads(struct machine *machine)
 {
 	struct rb_node *nd = rb_first(&machine->dead_threads);
@@ -434,20 +448,14 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return __machine__findnew_thread(machine, pid, tid, false);
 }
 
-static struct thread *__machine__findnew_thread_time(struct machine *machine,
-						     pid_t pid, pid_t tid,
-						     u64 timestamp, bool create)
+static struct thread *machine__find_dead_thread_time(struct machine *machine,
+						     pid_t pid __maybe_unused,
+						     pid_t tid, u64 timestamp)
 {
-	struct thread *curr, *pos, *new;
-	struct thread *th = NULL;
-	struct rb_node **p;
+	struct thread *th, *pos;
+	struct rb_node **p = &machine->dead_threads.rb_node;
 	struct rb_node *parent = NULL;
 
-	curr = __machine__findnew_thread(machine, pid, tid, false);
-	if (curr && timestamp >= curr->start_time)
-		return curr;
-
-	p = &machine->dead_threads.rb_node;
 	while (*p != NULL) {
 		parent = *p;
 		th = rb_entry(parent, struct thread, rb_node);
@@ -461,10 +469,9 @@ static struct thread *__machine__findnew_thread_time(struct machine *machine,
 				}
 			}
 
-			if (timestamp >= th->start_time) {
-				machine__update_thread_pid(machine, th, pid);
+			if (timestamp >= th->start_time)
 				return th;
-			}
+
 			break;
 		}
 
@@ -474,50 +481,67 @@ static struct thread *__machine__findnew_thread_time(struct machine *machine,
 			p = &(*p)->rb_right;
 	}
 
-	if (!create)
-		return NULL;
+	return NULL;
+}
 
-	if (!curr && !*p)
-		return __machine__findnew_thread(machine, pid, tid, true);
+static struct thread *__machine__findnew_thread_time(struct machine *machine,
+						     pid_t pid, pid_t tid,
+						     u64 timestamp, bool create)
+{
+	struct thread *th, *new = NULL;
+	struct rb_node **p = &machine->missing_threads.rb_node;
+	struct rb_node *parent = NULL;
 
-	new = thread__new(pid, tid);
-	if (new == NULL)
-		return NULL;
+	static pthread_mutex_t missing_thread_lock = PTHREAD_MUTEX_INITIALIZER;
 
-	new->dead = true;
-	new->start_time = timestamp;
+	th = __machine__findnew_thread(machine, pid, tid, false);
+	if (th && timestamp >= th->start_time)
+		return th;
 
-	if (*p) {
-		list_for_each_entry(pos, &th->tid_node, tid_node) {
-			/* sort by time */
-			if (timestamp >= pos->start_time) {
-				th = pos;
-				break;
-			}
+	th = machine__find_dead_thread_time(machine, pid, tid, timestamp);
+	if (th)
+		return th;
+
+	pthread_mutex_lock(&missing_thread_lock);
+
+	while (*p != NULL) {
+		parent = *p;
+		th = rb_entry(parent, struct thread, rb_node);
+
+		if (th->tid == tid) {
+			pthread_mutex_unlock(&missing_thread_lock);
+			return th;
 		}
-		list_add_tail(&new->tid_node, &th->tid_node);
-	} else {
-		rb_link_node(&new->rb_node, parent, p);
-		rb_insert_color(&new->rb_node, &machine->dead_threads);
+
+		if (tid < th->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
 	}
 
+	if (!create)
+		goto out;
+
+	new = thread__new(pid, tid);
+	if (new == NULL)
+		goto out;
+
+	/* missing threads are not bothered with timestamp */
+	new->start_time = 0;
+	new->missing = true;
+
 	/*
-	 * We have to initialize map_groups separately
-	 * after rb tree is updated.
-	 *
-	 * The reason is that we call machine__findnew_thread
-	 * within thread__init_map_groups to find the thread
-	 * leader and that would screwed the rb tree.
+	 * missing threads have their own map groups regardless of
+	 * leader for the sake of simplicity.  it's okay since the map
+	 * groups has no map in it anyway.
 	 */
-	if (thread__init_map_groups(new, machine)) {
-		if (!list_empty(&new->tid_node))
-			list_del(&new->tid_node);
-		else
-			rb_erase(&new->rb_node, &machine->dead_threads);
+	new->mg = map_groups__new(machine);
 
-		thread__delete(new);
-		return NULL;
-	}
+	rb_link_node(&new->rb_node, parent, p);
+	rb_insert_color(&new->rb_node, &machine->missing_threads);
+
+out:
+	pthread_mutex_unlock(&missing_thread_lock);
 
 	return new;
 }
@@ -1356,6 +1380,7 @@ static void machine__remove_thread(struct machine *machine, struct thread *th)
 
 	machine->last_match = NULL;
 	rb_erase(&th->rb_node, &machine->threads);
+	RB_CLEAR_NODE(&th->rb_node);
 
 	th->dead = true;
 
@@ -1825,6 +1850,14 @@ int machine__for_each_thread(struct machine *machine,
 				return rc;
 		}
 	}
+
+	for (nd = rb_first(&machine->missing_threads); nd; nd = rb_next(nd)) {
+		thread = rb_entry(nd, struct thread, rb_node);
+		rc = fn(thread, priv);
+		if (rc != 0)
+			return rc;
+	}
+
 	return rc;
 }
 
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 9571b6b1c5b5..40af1f59e360 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -31,6 +31,7 @@ struct machine {
 	char		  *root_dir;
 	struct rb_root	  threads;
 	struct rb_root	  dead_threads;
+	struct rb_root	  missing_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
 	struct dsos	  user_dsos;
@@ -116,6 +117,7 @@ void machines__set_comm_exec(struct machines *machines, bool comm_exec);
 struct machine *machine__new_host(void);
 int machine__init(struct machine *machine, const char *root_dir, pid_t pid);
 void machine__exit(struct machine *machine);
+void machine__delete_missing_threads(struct machine *machine);
 void machine__delete_dead_threads(struct machine *machine);
 void machine__delete_threads(struct machine *machine);
 void machine__delete(struct machine *machine);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d1d5e0b3a26e..507db51ccfea 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -138,14 +138,11 @@ struct perf_session *perf_session__new(struct perf_data_file *file,
 	return NULL;
 }
 
-static void perf_session__delete_dead_threads(struct perf_session *session)
-{
-	machine__delete_dead_threads(&session->machines.host);
-}
-
 static void perf_session__delete_threads(struct perf_session *session)
 {
 	machine__delete_threads(&session->machines.host);
+	machine__delete_dead_threads(&session->machines.host);
+	machine__delete_missing_threads(&session->machines.host);
 }
 
 static void perf_session_env__delete(struct perf_session_env *env)
@@ -167,7 +164,6 @@ static void perf_session_env__delete(struct perf_session_env *env)
 void perf_session__delete(struct perf_session *session)
 {
 	perf_session__destroy_kernel_maps(session);
-	perf_session__delete_dead_threads(session);
 	perf_session__delete_threads(session);
 	perf_session_env__delete(&session->header.env);
 	machines__exit(&session->machines);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 5209ad5adadf..88fee3d8c0dc 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -23,6 +23,7 @@ struct thread {
 	bool			comm_set;
 	bool			exited; /* if set thread has exited */
 	bool			dead; /* thread is in dead_threads list */
+	bool			missing; /* thread is in missing_threads list */
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
-- 
2.2.2



* [PATCH 35/42] perf record: Synthesize COMM event for a command line workload
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (33 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 34/42] perf tools: Add missing_threads rb tree Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 36/42] perf tools: Fix progress ui to support multi thread Namhyung Kim
                   ` (7 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When perf creates a new child to profile, the events are enabled on
exec().  In this case, it doesn't synthesize any event for the child
since they'll be generated during exec().  But there's a window
between the enabling and the event generation.

This used to be harmless since those samples are only in the kernel
(so we always have the map) and the comm is overridden by a later
COMM event.  However it won't work anymore since those samples will
now go to a missing thread while the COMM event will create a
(current) thread.  This leads to those early samples (like
native_write_msr_safe) showing a pid (like ':15328') instead of a
comm.

So synthesize a COMM event for the child explicitly before enabling so
that it can have a correct comm.  But at this time, the comm will be
"perf" since it's not exec-ed yet.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c | 18 +++++++++++++++++-
 tools/perf/util/event.c     |  8 ++++----
 tools/perf/util/event.h     |  5 +++++
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0db47c97446b..9500e350ca95 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -607,8 +607,24 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	/*
 	 * Let the child rip
 	 */
-	if (forks)
+	if (forks) {
+		union perf_event *comm_event;
+
+		comm_event = malloc(sizeof(*comm_event) + machine->id_hdr_size);
+		if (comm_event == NULL)
+			goto out_child;
+
+		err = perf_event__synthesize_comm(tool, comm_event,
+						  rec->evlist->threads->map[0],
+						  process_synthesized_event,
+						  machine);
+		free(comm_event);
+
+		if (err < 0)
+			goto out_child;
+
 		perf_evlist__start_workload(rec->evlist);
+	}
 
 	if (opts->initial_delay) {
 		usleep(opts->initial_delay * 1000);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index e7152a6e3043..49452852b103 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -96,10 +96,10 @@ static pid_t perf_event__get_comm_tgid(pid_t pid, char *comm, size_t len)
 	return tgid;
 }
 
-static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
-					 union perf_event *event, pid_t pid,
-					 perf_event__handler_t process,
-					 struct machine *machine)
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+				  union perf_event *event, pid_t pid,
+				  perf_event__handler_t process,
+				  struct machine *machine)
 {
 	size_t size;
 	pid_t tgid;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1f86c279520e..6df23199fea0 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -386,6 +386,11 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
 				       struct machine *machine,
 				       bool mmap_data);
 
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+				  union perf_event *event, pid_t pid,
+				  perf_event__handler_t process,
+				  struct machine *machine);
+
 size_t perf_event__fprintf_comm(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp);
-- 
2.2.2



* [PATCH 36/42] perf tools: Fix progress ui to support multi thread
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (34 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 35/42] perf record: Synthesize COMM event for a command line workload Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 37/42] perf report: Add --multi-thread option and config item Namhyung Kim
                   ` (6 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Split the ui_progress struct into a global and a local one.  Each
thread updates its local struct without a lock and only updates the
global one (with the lock) when meaningful progress has been made.

To do that, pass a struct ui_progress to
__perf_session__process_events() and set it up for the total size of
multi-file storage.
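
A minimal sketch of the local/global split, assuming a byte-count
threshold decides what counts as "meaningful progress".  The names
global_progress, local_progress and the step field are hypothetical
and are not the structures used in this patch:

```c
#include <pthread.h>
#include <stdint.h>

struct global_progress {
	pthread_mutex_t lock;
	uint64_t curr, total;
};

struct local_progress {
	struct global_progress *global;
	uint64_t pending;	/* bytes not yet published */
	uint64_t step;		/* publish once this much is pending */
};

/* Publish whatever is pending; call once more when a thread ends. */
static void local_progress_flush(struct local_progress *lp)
{
	if (lp->pending == 0)
		return;

	pthread_mutex_lock(&lp->global->lock);
	lp->global->curr += lp->pending;
	pthread_mutex_unlock(&lp->global->lock);
	lp->pending = 0;
}

/* Hot path: touches only thread-local state unless the step is hit. */
static void local_progress_update(struct local_progress *lp, uint64_t n)
{
	lp->pending += n;
	if (lp->pending >= lp->step)
		local_progress_flush(lp);
}
```

With a step of, say, 1/16 of the total, each thread takes the lock
only a handful of times regardless of how many events it processes.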

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c    |  5 ++--
 tools/perf/util/hist.h    |  3 ++-
 tools/perf/util/session.c | 63 +++++++++++++++++++++++++++++++++++++----------
 tools/perf/util/tool.h    |  3 +++
 4 files changed, 58 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 14d4b9358ac6..dab3b8b3a06a 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1062,12 +1062,13 @@ void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 	__hists__collapse_resort(hists, root, prog);
 }
 
-void hists__mt_resort(struct hists *dst, struct hists *src)
+void hists__mt_resort(struct hists *dst, struct hists *src,
+		      struct ui_progress *prog)
 {
 	struct rb_root *root = src->entries_in;
 
 	sort__need_collapse = 1;
-	__hists__collapse_resort(dst, root, NULL);
+	__hists__collapse_resort(dst, root, prog);
 }
 
 static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2c29d70b2cfe..94179107906a 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -124,7 +124,8 @@ int hist_entry__sort_snprintf(struct hist_entry *he, char *bf, size_t size,
 void hist_entry__delete(struct hist_entry *he);
 
 void hists__output_resort(struct hists *hists, struct ui_progress *prog);
-void hists__mt_resort(struct hists *dst, struct hists *src);
+void hists__mt_resort(struct hists *dst, struct hists *src,
+		      struct ui_progress *prog);
 void hists__collapse_resort(struct hists *hists, struct ui_progress *prog);
 
 void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 507db51ccfea..3596bb608f3c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1260,14 +1260,14 @@ fetch_mmaped_event(struct perf_session *session,
 static int __perf_session__process_events(struct perf_session *session,
 					  struct events_stats *stats, int fd,
 					  u64 data_offset, u64 data_size,
-					  u64 file_size, struct perf_tool *tool)
+					  u64 file_size, struct perf_tool *tool,
+					  struct ui_progress *prog)
 {
 	u64 head, page_offset, file_offset, file_pos, size;
 	int err, mmap_prot, mmap_flags, map_idx = 0;
 	size_t	mmap_size;
 	char *buf, *mmaps[NUM_MMAPS];
 	union perf_event *event;
-	struct ui_progress prog;
 	s64 skip;
 
 	perf_tool__fill_defaults(tool);
@@ -1279,8 +1279,6 @@ static int __perf_session__process_events(struct perf_session *session,
 	if (data_size && (data_offset + data_size < file_size))
 		file_size = data_offset + data_size;
 
-	ui_progress__init(&prog, file_size, "Processing events...");
-
 	mmap_size = MMAP_SIZE;
 	if (mmap_size > file_size) {
 		mmap_size = file_size;
@@ -1346,7 +1344,7 @@ static int __perf_session__process_events(struct perf_session *session,
 	head += size;
 	file_pos += size;
 
-	ui_progress__update(&prog, size);
+	ui_progress__update(prog, size);
 
 	if (session_done())
 		goto out;
@@ -1358,7 +1356,6 @@ static int __perf_session__process_events(struct perf_session *session,
 	/* do the final flush for ordered samples */
 	err = ordered_events__flush(session, tool, OE_FLUSH__FINAL);
 out_err:
-	ui_progress__finish();
 	ordered_events__free(&session->ordered_events);
 	session->one_mmap = false;
 	return err;
@@ -1367,6 +1364,7 @@ static int __perf_session__process_events(struct perf_session *session,
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool)
 {
+	struct ui_progress prog;
 	struct perf_data_file *file = session->file;
 	u64 size = perf_data_file__size(file);
 	int err, i;
@@ -1377,11 +1375,13 @@ int perf_session__process_events(struct perf_session *session,
 	if (perf_data_file__is_pipe(file))
 		return __perf_session__process_pipe_events(session, tool);
 
+	ui_progress__init(&prog, size, "Processing events...");
+
 	err = __perf_session__process_events(session, &session->stats,
 					     perf_data_file__fd(file),
 					     session->header.data_offset,
 					     session->header.data_size,
-					     size, tool);
+					     size, tool, &prog);
 
 	if (err < 0 || !perf_session__has_index(session))
 		return err;
@@ -1400,7 +1400,7 @@ int perf_session__process_events(struct perf_session *session,
 						perf_data_file__fd(file),
 						session->header.index[i].offset,
 						session->header.index[i].size,
-						size, tool);
+						size, tool, &prog);
 		if (err < 0)
 			break;
 	}
@@ -1408,6 +1408,29 @@ int perf_session__process_events(struct perf_session *session,
 	return err;
 }
 
+struct ui_progress_ops *orig_progress__ops;
+
+static void mt_progress__update(struct ui_progress *p)
+{
+	struct perf_tool_mt *mt_tool = container_of(p, struct perf_tool_mt, prog);
+	struct ui_progress *gprog = mt_tool->global_prog;
+	static pthread_mutex_t prog_lock = PTHREAD_MUTEX_INITIALIZER;
+
+	pthread_mutex_lock(&prog_lock);
+
+	gprog->curr += p->step;
+	if (gprog->curr >= gprog->next) {
+		gprog->next += gprog->step;
+		orig_progress__ops->update(gprog);
+	}
+
+	pthread_mutex_unlock(&prog_lock);
+}
+
+static struct ui_progress_ops mt_progress__ops = {
+	.update = mt_progress__update,
+};
+
 static void *processing_thread_idx(void *arg)
 {
 	struct perf_tool_mt *mt_tool = arg;
@@ -1417,10 +1440,12 @@ static void *processing_thread_idx(void *arg)
 	u64 size = session->header.index[mt_tool->idx].size;
 	u64 file_size = perf_data_file__size(session->file);
 
+	ui_progress__init(&mt_tool->prog, size, "");
+
 	pr_debug("processing samples using thread [%d]\n", mt_tool->idx);
 	if (__perf_session__process_events(session, &mt_tool->stats,
 					   fd, offset, size, file_size,
-					   &mt_tool->tool) < 0) {
+					   &mt_tool->tool, &mt_tool->prog) < 0) {
 		pr_err("processing samples failed (thread [%d)\n", mt_tool->idx);
 		return NULL;
 	}
@@ -1438,7 +1463,8 @@ int perf_session__process_events_mt(struct perf_session *session,
 	u64 nr_entries = 0;
 	struct perf_tool_mt *mt_tools = NULL;
 	struct perf_tool_mt *mt;
-	pthread_t *th_id;
+	struct ui_progress prog;
+	pthread_t *th_id = NULL;
 	int err, i, k;
 	int nr_index = session->header.nr_index;
 	u64 size = perf_data_file__size(file);
@@ -1451,13 +1477,19 @@ int perf_session__process_events_mt(struct perf_session *session,
 		return -EINVAL;
 	}
 
+	ui_progress__init(&prog, size, "Processing events...");
+
 	err = __perf_session__process_events(session, &session->stats,
 					     perf_data_file__fd(file),
 					     session->header.data_offset,
 					     session->header.data_size,
-					     size, tool);
+					     size, tool, &prog);
 	if (err)
-		return err;
+		goto out;
+
+	orig_progress__ops = ui_progress__ops;
+	ui_progress__ops = &mt_progress__ops;
+	ui_progress__ops->finish = orig_progress__ops->finish;
 
 	th_id = calloc(nr_index, sizeof(*th_id));
 	if (th_id == NULL)
@@ -1483,6 +1515,7 @@ int perf_session__process_events_mt(struct perf_session *session,
 		mt->tool.ordered_events = false;
 		mt->idx = i;
 		mt->priv = arg;
+		mt->global_prog = &prog;
 
 		pthread_create(&th_id[i], NULL, processing_thread_idx, mt);
 	}
@@ -1506,6 +1539,9 @@ int perf_session__process_events_mt(struct perf_session *session,
 		}
 	}
 
+	ui_progress__ops = orig_progress__ops;
+	ui_progress__init(&prog, nr_entries, "Merging related events...");
+
 	for (i = 0; i < nr_index; i++) {
 		mt = &mt_tools[i];
 
@@ -1515,7 +1551,7 @@ int perf_session__process_events_mt(struct perf_session *session,
 			if (perf_evsel__is_dummy_tracking(evsel))
 				continue;
 
-			hists__mt_resort(hists, &mt->hists[evsel->idx]);
+			hists__mt_resort(hists, &mt->hists[evsel->idx], &prog);
 
 			/* Non-group events are considered as leader */
 			if (symbol_conf.event_group &&
@@ -1530,6 +1566,7 @@ int perf_session__process_events_mt(struct perf_session *session,
 	}
 
 out:
+	ui_progress__finish();
 	events_stats__warn_about_errors(&session->stats, tool);
 
 	if (mt_tools) {
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index a04826bbe991..aa7f110b9425 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -3,6 +3,7 @@
 
 #include <stdbool.h>
 #include "util/event.h"
+#include "ui/progress.h"
 
 struct perf_session;
 union perf_event;
@@ -52,6 +53,8 @@ struct perf_tool_mt {
 	struct events_stats	stats;
 	struct hists		*hists;
 	struct perf_session	*session;
+	struct ui_progress	prog;
+	struct ui_progress	*global_prog;
 	int			idx;
 
 	void			*priv;
-- 
2.2.2



* [PATCH 37/42] perf report: Add --multi-thread option and config item
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (35 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 36/42] perf tools: Fix progress ui to support multi thread Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 38/42] perf session: Handle index files generally Namhyung Kim
                   ` (5 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The --multi-thread option enables parallel processing, so leaving it off
forces serial processing even for an indexed data file.  It defaults to
false for now, but users can change this by setting the
"report.multi-thread" config option in the ~/.perfconfig file.
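
The decision described above can be sketched in isolation as follows.
This is only an illustration: struct report_opts, parse_bool(),
report_config() and use_multi_thread() are hypothetical names and are
not part of perf.  Only the key string "report.multi-thread" and the
fallback to serial processing for a data file without an index are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the report state; not perf's struct report. */
struct report_opts {
	bool multi_thread;	/* set by --multi-thread or the config item */
};

/* Parse a boolean config value; accepts a few common spellings only. */
static bool parse_bool(const char *value)
{
	assert(value != NULL);
	return !strcmp(value, "true") || !strcmp(value, "yes") ||
	       !strcmp(value, "on") || !strcmp(value, "1");
}

/* Returns 1 if the variable was consumed, 0 if it is not ours. */
int report_config(struct report_opts *opts, const char *var, const char *value)
{
	if (strcmp(var, "report.multi-thread"))
		return 0;
	opts->multi_thread = parse_bool(value);
	return 1;
}

/* Parallel processing needs an index table; otherwise fall back. */
bool use_multi_thread(struct report_opts *opts, bool has_index)
{
	if (opts->multi_thread && !has_index)
		opts->multi_thread = false;
	return opts->multi_thread;
}
```

A caller would run report_config() for each config entry, apply the
command line option on top, and then call use_multi_thread() once the
session is open and it is known whether the file has an index.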

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  3 ++
 tools/perf/builtin-report.c              | 66 +++++++++++++++++++++++++++-----
 tools/perf/util/session.c                |  1 +
 3 files changed, 61 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index dd7cccdde498..e00077a658c1 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -318,6 +318,9 @@ OPTIONS
 --header-only::
 	Show only perf.data header (forces --stdio).
 
+--multi-thread::
+	Speed up report by processing samples in parallel using multiple threads.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8a40c79d9273..b0539c017898 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -51,6 +51,7 @@ struct report {
 	bool			mem_mode;
 	bool			header;
 	bool			header_only;
+	bool			multi_thread;
 	int			max_stack;
 	struct perf_read_values	show_threads_values;
 	const char		*pretty_printing_style;
@@ -82,6 +83,10 @@ static int report__config(const char *var, const char *value, void *cb)
 		rep->queue_size = perf_config_u64(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "report.multi-thread")) {
+		rep->multi_thread = perf_config_bool(var, value);
+		return 0;
+	}
 
 	return perf_default_config(var, value, cb);
 }
@@ -128,13 +133,14 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 	return err;
 }
 
-static int process_sample_event(struct perf_tool *tool,
-				union perf_event *event,
-				struct perf_sample *sample,
-				struct perf_evsel *evsel,
-				struct machine *machine)
+static int __process_sample_event(struct perf_tool *tool __maybe_unused,
+				  union perf_event *event,
+				  struct perf_sample *sample,
+				  struct perf_evsel *evsel,
+				  struct machine *machine,
+				  struct hists *hists,
+				  struct report *rep)
 {
-	struct report *rep = container_of(tool, struct report, tool);
 	struct addr_location al;
 	struct hist_entry_iter iter = {
 		.hide_unresolved = rep->hide_unresolved,
@@ -167,7 +173,7 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	ret = hist_entry_iter__add(&iter, evsel__hists(evsel), evsel, &al,
+	ret = hist_entry_iter__add(&iter, hists, evsel, &al,
 				   sample, rep->max_stack, rep);
 	if (ret < 0)
 		pr_debug("problem adding hist entry, skipping event\n");
@@ -175,6 +181,31 @@ static int process_sample_event(struct perf_tool *tool,
 	return ret;
 }
 
+static int process_sample_event(struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct perf_evsel *evsel,
+				struct machine *machine)
+{
+	struct report *rep = container_of(tool, struct report, tool);
+
+	return __process_sample_event(tool, event, sample, evsel, machine,
+				      evsel__hists(evsel), rep);
+}
+
+static int process_sample_event_mt(struct perf_tool *tool,
+				   union perf_event *event,
+				   struct perf_sample *sample,
+				   struct perf_evsel *evsel,
+				   struct machine *machine)
+{
+	struct perf_tool_mt *mt = container_of(tool, struct perf_tool_mt, tool);
+	struct report *rep = mt->priv;
+
+	return __process_sample_event(tool, event, sample, evsel, machine,
+				      &mt->hists[evsel->idx], rep);
+}
+
 static int process_read_event(struct perf_tool *tool,
 			      union perf_event *event,
 			      struct perf_sample *sample __maybe_unused,
@@ -484,7 +515,12 @@ static int __cmd_report(struct report *rep)
 	if (ret)
 		return ret;
 
-	ret = perf_session__process_events(session, &rep->tool);
+	if (rep->multi_thread) {
+		rep->tool.sample = process_sample_event_mt;
+		ret = perf_session__process_events_mt(session, &rep->tool, rep);
+	} else {
+		ret = perf_session__process_events(session, &rep->tool);
+	}
 	if (ret)
 		return ret;
 
@@ -507,7 +543,12 @@ static int __cmd_report(struct report *rep)
 		}
 	}
 
-	report__collapse_hists(rep);
+	/*
+	 * For multi-thread report, it already calls hists__mt_resort()
+	 * so no need to collapse here.
+	 */
+	if (!rep->multi_thread)
+		report__collapse_hists(rep);
 
 	if (session_done())
 		return 0;
@@ -715,6 +756,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "Don't show entries under that percent", parse_percent_limit),
 	OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
 		     "how to display percentage of filtered entries", parse_filter_percentage),
+	OPT_BOOLEAN(0, "multi-thread", &report.multi_thread,
+		    "Speed up sample processing using multi-thread"),
 	OPT_END()
 	};
 	struct perf_data_file file = {
@@ -759,6 +802,11 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 					       report.queue_size);
 	}
 
+	if (report.multi_thread && !perf_session__has_index(session)) {
+		pr_debug("fallback to single thread for normal data file.\n");
+		report.multi_thread = false;
+	}
+
 	report.session = session;
 
 	has_br_stack = perf_header__has_feat(&session->header,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 3596bb608f3c..6d34c880010f 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1586,6 +1586,7 @@ int perf_session__process_events_mt(struct perf_session *session,
 
 	goto out;
 }
+
 bool perf_session__has_traces(struct perf_session *session, const char *msg)
 {
 	struct perf_evsel *evsel;
-- 
2.2.2



* [PATCH 38/42] perf session: Handle index files generally
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (36 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 37/42] perf report: Add --multi-thread option and config item Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 39/42] perf tools: Convert lseek + read to pread Namhyung Kim
                   ` (4 subsequent siblings)
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The current code assumes that the number of index items matches the
number of cpus, so it creates that many threads.  But that is not the
case for a non-system-wide session or for data that came from a
different machine.

Just create at most as many threads as there are online cpus and let
them process the data.
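
The scheme can be sketched as a small worker pool in which each thread
keeps claiming the next unprocessed index under a mutex.  This is a
minimal sketch and not the perf code: struct work, get_index(),
worker() and process_all() are hypothetical names, and the done[]
counter merely stands in for the real per-index event processing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

struct work {
	pthread_mutex_t lock;
	int next;	/* next unclaimed index */
	int nr_index;	/* total number of index entries */
	int *done;	/* done[i] counts how often entry i was processed */
};

/* Hand out each index exactly once; -1 when none are left. */
static int get_index(struct work *w)
{
	int ret = -1;

	pthread_mutex_lock(&w->lock);
	if (w->next < w->nr_index)
		ret = w->next++;
	pthread_mutex_unlock(&w->lock);
	return ret;
}

static void *worker(void *arg)
{
	struct work *w = arg;
	int idx;

	while ((idx = get_index(w)) >= 0)
		w->done[idx]++;	/* placeholder for processing entry idx */
	return NULL;
}

/* Process nr_index entries; returns threads used, or -1 on error. */
int process_all(int nr_index, int *done)
{
	struct work w = { .next = 0, .nr_index = nr_index, .done = done };
	long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
	int nr_thread = ncpu > 0 ? (int)ncpu : 1;
	pthread_t *th;
	int i, started = 0;

	assert(nr_index > 0 && done != NULL);
	if (nr_thread > nr_index)
		nr_thread = nr_index;

	th = calloc(nr_thread, sizeof(*th));
	if (th == NULL)
		return -1;
	pthread_mutex_init(&w.lock, NULL);

	for (i = 0; i < nr_thread; i++) {
		if (pthread_create(&th[i], NULL, worker, &w))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(th[i], NULL);

	pthread_mutex_destroy(&w.lock);
	free(th);

	/* with no threads, nothing was processed */
	return started ? started : -1;
}
```

Since every index is handed out once, no two threads touch the same
done[] slot, and the entries all get processed even if there are fewer
threads than index items.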

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/session.c | 68 ++++++++++++++++++++++++++++++++++-------------
 tools/perf/util/tool.h    |  1 -
 2 files changed, 50 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6d34c880010f..ccf9371ef292 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1431,26 +1431,51 @@ static struct ui_progress_ops mt_progress__ops = {
 	.update = mt_progress__update,
 };
 
+static int perf_session__get_index(struct perf_session *session)
+{
+	int ret;
+	static unsigned index;
+	static pthread_mutex_t idx_lock = PTHREAD_MUTEX_INITIALIZER;
+
+	pthread_mutex_lock(&idx_lock);
+	if (index < session->header.nr_index)
+		ret = index++;
+	else
+		ret = -1;
+	pthread_mutex_unlock(&idx_lock);
+
+	return ret;
+}
+
 static void *processing_thread_idx(void *arg)
 {
 	struct perf_tool_mt *mt_tool = arg;
 	struct perf_session *session = mt_tool->session;
 	int fd = perf_data_file__fd(session->file);
-	u64 offset = session->header.index[mt_tool->idx].offset;
-	u64 size = session->header.index[mt_tool->idx].size;
 	u64 file_size = perf_data_file__size(session->file);
+	int idx;
 
-	ui_progress__init(&mt_tool->prog, size, "");
+	while ((idx = perf_session__get_index(session)) >= 0) {
+		u64 offset = session->header.index[idx].offset;
+		u64 size = session->header.index[idx].size;
+		struct perf_tool_mt *mtt = &mt_tool[idx];
 
-	pr_debug("processing samples using thread [%d]\n", mt_tool->idx);
-	if (__perf_session__process_events(session, &mt_tool->stats,
-					   fd, offset, size, file_size,
-					   &mt_tool->tool, &mt_tool->prog) < 0) {
-		pr_err("processing samples failed (thread [%d)\n", mt_tool->idx);
-		return NULL;
+		if (size == 0)
+			continue;
+
+		pr_debug("processing samples [index %d]\n", idx);
+
+		ui_progress__init(&mtt->prog, size, "");
+
+		if (__perf_session__process_events(session, &mtt->stats,
+						   fd, offset, size, file_size,
+						   &mtt->tool, &mtt->prog) < 0) {
+			pr_err("processing samples failed [index %d]\n", idx);
+			return NULL;
+		}
+		pr_debug("processing samples done [index %d]\n", idx);
 	}
 
-	pr_debug("processing samples done for thread [%d]\n", mt_tool->idx);
 	return arg;
 }
 
@@ -1468,6 +1493,7 @@ int perf_session__process_events_mt(struct perf_session *session,
 	int err, i, k;
 	int nr_index = session->header.nr_index;
 	u64 size = perf_data_file__size(file);
+	int nr_thread = sysconf(_SC_NPROCESSORS_ONLN);
 
 	if (perf_session__register_idle_thread(session) == NULL)
 		return -ENOMEM;
@@ -1491,10 +1517,6 @@ int perf_session__process_events_mt(struct perf_session *session,
 	ui_progress__ops = &mt_progress__ops;
 	ui_progress__ops->finish = orig_progress__ops->finish;
 
-	th_id = calloc(nr_index, sizeof(*th_id));
-	if (th_id == NULL)
-		goto out;
-
 	mt_tools = calloc(nr_index, sizeof(*mt_tools));
 	if (mt_tools == NULL)
 		goto out;
@@ -1513,20 +1535,30 @@ int perf_session__process_events_mt(struct perf_session *session,
 
 		mt->session = session;
 		mt->tool.ordered_events = false;
-		mt->idx = i;
 		mt->priv = arg;
 		mt->global_prog = &prog;
-
-		pthread_create(&th_id[i], NULL, processing_thread_idx, mt);
 	}
 
-	for (i = 0; i < nr_index; i++) {
+	if (nr_thread > nr_index)
+		nr_thread = nr_index;
+
+	th_id = calloc(nr_thread, sizeof(*th_id));
+	if (th_id == NULL)
+		goto out;
+
+	for (i = 0; i < nr_thread; i++)
+		pthread_create(&th_id[i], NULL, processing_thread_idx, mt_tools);
+
+	for (i = 0; i < nr_thread; i++) {
 		pthread_join(th_id[i], (void **)&mt);
 		if (mt == NULL) {
 			err = -EINVAL;
 			continue;
 		}
+	}
 
+	for (i = 0; i < nr_index; i++) {
+		mt = &mt_tools[i];
 		events_stats__add(&session->stats, &mt->stats);
 
 		evlist__for_each(evlist, evsel) {
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index aa7f110b9425..e52c936d1b9e 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -55,7 +55,6 @@ struct perf_tool_mt {
 	struct perf_session	*session;
 	struct ui_progress	prog;
 	struct ui_progress	*global_prog;
-	int			idx;
 
 	void			*priv;
 };
-- 
2.2.2



* [PATCH 39/42] perf tools: Convert lseek + read to pread
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (37 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 38/42] perf session: Handle index files generally Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-30 18:34   ` [tip:perf/core] perf symbols: " tip-bot for Namhyung Kim
  2015-01-29  8:07 ` [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind Namhyung Kim
                   ` (3 subsequent siblings)
  42 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When dso_cache__read() is called, it reads data from the given offset
using lseek followed by a normal read syscall.  The two can be combined
into a single pread syscall.
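
For comparison, the two ways of reading at an offset look roughly like
this outside of perf.  read_at_seek() and read_at_pread() are
hypothetical helper names used only for this sketch.  Besides saving a
syscall, pread() does not move the file offset of the descriptor, which
matters when a descriptor is shared between threads.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>		/* O_* flags for callers that open the file */
#include <sys/types.h>
#include <unistd.h>

/* Two syscalls; moves the shared file offset of fd. */
ssize_t read_at_seek(int fd, void *buf, size_t len, off_t offset)
{
	assert(buf != NULL);
	if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
		return -1;
	return read(fd, buf, len);
}

/* One syscall; leaves the file offset of fd untouched. */
ssize_t read_at_pread(int fd, void *buf, size_t len, off_t offset)
{
	assert(buf != NULL);
	return pread(fd, buf, len, offset);
}
```

Both helpers return the number of bytes read, 0 at end of file, or -1
on error, so a caller checking "ret <= 0" behaves the same with either.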

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index ae92046ae2c8..b6ad22b3c6f2 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -566,10 +566,7 @@ dso_cache__read(struct dso *dso, struct machine *machine,
 		}
 	}
 
-	if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
-		goto err_unlock;
-
-	ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
+	ret = pread(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE, cache_offset);
 	if (ret <= 0)
 		goto err_unlock;
 
-- 
2.2.2



* [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (38 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 39/42] perf tools: Convert lseek + read to pread Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29 12:38   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2015-01-29  8:07 ` [PATCH 41/42] perf tools: Add new perf data command Namhyung Kim
                   ` (2 subsequent siblings)
  42 siblings, 3 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

When libunwind tries to resolve callchains, it needs to know the offset
of .eh_frame_hdr or .debug_frame to access the dso.  Since it calls
dso__data_fd(), it'll try to grab dso->lock every time for the same
information.  So save it in the dso_data struct and reuse it.

Note that there's a window between dso__data_fd() and the actual use of
the fd.  The fd could be closed by other threads to deal with the open
file limit in the dso cache code.  But I think it's ok since in that case
elf_section_offset() will return 0, so it'll be tried again on the next
access.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.h              |  1 +
 tools/perf/util/unwind-libunwind.c | 31 ++++++++++++++++++++-----------
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index c18fcc0e8081..323ee08d56fc 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -141,6 +141,7 @@ struct dso {
 		u32		 status_seen;
 		size_t		 file_size;
 		struct list_head open_entry;
+		u64		 frame_offset;
 	} data;
 
 	union { /* Tool specific area */
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 7ed6eaf232b6..3219b20837b5 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -266,14 +266,17 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
 				     u64 *fde_count)
 {
 	int ret = -EINVAL, fd;
-	u64 offset;
+	u64 offset = dso->data.frame_offset;
 
-	fd = dso__data_fd(dso, machine);
-	if (fd < 0)
-		return -EINVAL;
+	if (offset == 0) {
+		fd = dso__data_fd(dso, machine);
+		if (fd < 0)
+			return -EINVAL;
 
-	/* Check the .eh_frame section for unwinding info */
-	offset = elf_section_offset(fd, ".eh_frame_hdr");
+		/* Check the .eh_frame section for unwinding info */
+		offset = elf_section_offset(fd, ".eh_frame_hdr");
+		dso->data.frame_offset = offset;
+	}
 
 	if (offset)
 		ret = unwind_spec_ehframe(dso, machine, offset,
@@ -287,14 +290,20 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
 static int read_unwind_spec_debug_frame(struct dso *dso,
 					struct machine *machine, u64 *offset)
 {
-	int fd = dso__data_fd(dso, machine);
+	int fd;
+	u64 ofs = dso->data.frame_offset;
 
-	if (fd < 0)
-		return -EINVAL;
+	if (ofs == 0) {
+		fd = dso__data_fd(dso, machine);
+		if (fd < 0)
+			return -EINVAL;
 
-	/* Check the .debug_frame section for unwinding info */
-	*offset = elf_section_offset(fd, ".debug_frame");
+		/* Check the .debug_frame section for unwinding info */
+		ofs = elf_section_offset(fd, ".debug_frame");
+		dso->data.frame_offset = ofs;
+	}
 
+	*offset = ofs;
 	if (*offset)
 		return 0;
 
-- 
2.2.2



* [PATCH 41/42] perf tools: Add new perf data command
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (39 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29  8:07 ` [PATCH 42/42] perf data: Implement 'index' subcommand Namhyung Kim
  2015-01-29 19:56 ` [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Arnaldo Carvalho de Melo
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker,
	Jiri Olsa, Sebastian Andrzej Siewior, Jiri Olsa

From: Jiri Olsa <namhyung@kernel.org>

Add a new 'perf data' command to provide operations on data files.

The 'perf data convert' subcommand is coming in a following patch, but
there is room for other useful commands like 'perf data ls' (to
display the perf data files in a directory in ls style).
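
The subcommand handling follows the usual pattern of a table of names
and handlers terminated by a NULL name.  A stripped-down sketch of that
pattern is shown here.  The "hello" subcommand and the cmd_hello() and
dispatch() names are made up for illustration, and the handler takes
only argc/argv rather than the three arguments used in perf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*cmd_fn_t)(int argc, const char **argv);

struct subcmd {
	const char *name;
	const char *summary;
	cmd_fn_t    fn;
};

/* Hypothetical subcommand used only for illustration. */
static int cmd_hello(int argc, const char **argv)
{
	(void)argv;
	return argc;	/* report how many arguments were passed on */
}

/* Table terminated by an entry with a NULL name. */
static const struct subcmd subcmds[] = {
	{ "hello", "example subcommand", cmd_hello },
	{ NULL, NULL, NULL },
};

/* Run the subcommand named by argv[0]; -1 if there is none or no match. */
int dispatch(int argc, const char **argv)
{
	const struct subcmd *cmd;

	assert(argv != NULL);
	if (argc < 1)
		return -1;

	for (cmd = subcmds; cmd->name; cmd++) {
		if (!strcmp(cmd->name, argv[0]))
			return cmd->fn(argc, argv);
	}
	return -1;
}
```

Adding a subcommand then only means adding one line to the table, which
is what the next patch in the series does for 'index'.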

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-data.txt | 15 +++++++
 tools/perf/Makefile.perf               |  1 +
 tools/perf/builtin-data.c              | 75 ++++++++++++++++++++++++++++++++++
 tools/perf/builtin.h                   |  1 +
 tools/perf/command-list.txt            |  1 +
 tools/perf/perf.c                      |  1 +
 6 files changed, 94 insertions(+)
 create mode 100644 tools/perf/Documentation/perf-data.txt
 create mode 100644 tools/perf/builtin-data.c

diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt
new file mode 100644
index 000000000000..b8c83947715c
--- /dev/null
+++ b/tools/perf/Documentation/perf-data.txt
@@ -0,0 +1,15 @@
+perf-data(1)
+==============
+
+NAME
+----
+perf-data - Data file related processing
+
+SYNOPSIS
+--------
+[verse]
+'perf data' [<common options>] <command> [<options>]
+
+DESCRIPTION
+-----------
+Data file related processing.
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 2f8c8b918cac..16ff21fbde55 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -498,6 +498,7 @@ BUILTIN_OBJS += $(OUTPUT)builtin-kvm.o
 BUILTIN_OBJS += $(OUTPUT)builtin-inject.o
 BUILTIN_OBJS += $(OUTPUT)tests/builtin-test.o
 BUILTIN_OBJS += $(OUTPUT)builtin-mem.o
+BUILTIN_OBJS += $(OUTPUT)builtin-data.o
 
 PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT)
 
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
new file mode 100644
index 000000000000..1eee97d020fa
--- /dev/null
+++ b/tools/perf/builtin-data.c
@@ -0,0 +1,75 @@
+#include <linux/compiler.h>
+#include "builtin.h"
+#include "perf.h"
+#include "debug.h"
+#include "parse-options.h"
+
+typedef int (*data_cmd_fn_t)(int argc, const char **argv, const char *prefix);
+
+struct data_cmd {
+	const char	*name;
+	const char	*summary;
+	data_cmd_fn_t	fn;
+};
+
+static struct data_cmd data_cmds[];
+
+#define for_each_cmd(cmd) \
+	for (cmd = data_cmds; cmd && cmd->name; cmd++)
+
+static const struct option data_options[] = {
+	OPT_END()
+};
+
+static const char * const data_usage[] = {
+	"perf data [<common options>] <command> [<options>]",
+	NULL
+};
+
+static void print_usage(void)
+{
+	struct data_cmd *cmd;
+
+	printf("Usage:\n");
+	printf("\t%s\n\n", data_usage[0]);
+	printf("\tAvailable commands:\n");
+
+	for_each_cmd(cmd) {
+		printf("\t %s\t- %s\n", cmd->name, cmd->summary);
+	}
+
+	printf("\n");
+}
+
+static struct data_cmd data_cmds[] = {
+	{ NULL },
+};
+
+int cmd_data(int argc, const char **argv, const char *prefix)
+{
+	struct data_cmd *cmd;
+	const char *cmdstr;
+
+	/* No command specified. */
+	if (argc < 2)
+		goto usage;
+
+	argc = parse_options(argc, argv, data_options, data_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (argc < 1)
+		goto usage;
+
+	cmdstr = argv[0];
+
+	for_each_cmd(cmd) {
+		if (strcmp(cmd->name, cmdstr))
+			continue;
+
+		return cmd->fn(argc, argv, prefix);
+	}
+
+	pr_err("Unknown command: %s\n", cmdstr);
+usage:
+	print_usage();
+	return -1;
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index b210d62907e4..3688ad29085f 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -37,6 +37,7 @@ extern int cmd_test(int argc, const char **argv, const char *prefix);
 extern int cmd_trace(int argc, const char **argv, const char *prefix);
 extern int cmd_inject(int argc, const char **argv, const char *prefix);
 extern int cmd_mem(int argc, const char **argv, const char *prefix);
+extern int cmd_data(int argc, const char **argv, const char *prefix);
 
 extern int find_scripts(char **scripts_array, char **scripts_path_array);
 #endif
diff --git a/tools/perf/command-list.txt b/tools/perf/command-list.txt
index 0906fc401c52..00fcaf8a5b8d 100644
--- a/tools/perf/command-list.txt
+++ b/tools/perf/command-list.txt
@@ -7,6 +7,7 @@ perf-archive			mainporcelain common
 perf-bench			mainporcelain common
 perf-buildid-cache		mainporcelain common
 perf-buildid-list		mainporcelain common
+perf-data			mainporcelain common
 perf-diff			mainporcelain common
 perf-evlist			mainporcelain common
 perf-inject			mainporcelain common
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 3700a7faca6c..f3c66b81c6be 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -62,6 +62,7 @@ static struct cmd_struct commands[] = {
 #endif
 	{ "inject",	cmd_inject,	0 },
 	{ "mem",	cmd_mem,	0 },
+	{ "data",	cmd_data,	0 },
 };
 
 struct pager_config {
-- 
2.2.2



* [PATCH 42/42] perf data: Implement 'index' subcommand
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (40 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 41/42] perf tools: Add new perf data command Namhyung Kim
@ 2015-01-29  8:07 ` Namhyung Kim
  2015-01-29 19:56 ` [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Arnaldo Carvalho de Melo
  42 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29  8:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

The index command first splits a given data file into intermediate
data files and then merges them into a final data file with an index
table so that it can be processed using multiple threads.  The
HEADER_DATA_INDEX feature bit is added to distinguish a data file that
has an index table.
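
To illustrate what an index table amounts to, here is a sketch that
lays out a number of parts back to back and records an offset/size pair
for each.  The pair mirrors the .offset and .size fields the series
reads from session->header.index[].  Everything else is an assumption
made for the sketch: struct index_entry, build_index(), and in
particular the contiguous layout starting at data_offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per merged part (hypothetical layout for this sketch). */
struct index_entry {
	uint64_t offset;
	uint64_t size;
};

/*
 * Lay out nr parts back to back starting at data_offset and record where
 * each one lands.  Returns the offset just past the last part.
 */
uint64_t build_index(struct index_entry *idx, const uint64_t *sizes,
		     size_t nr, uint64_t data_offset)
{
	uint64_t pos = data_offset;
	size_t i;

	assert(idx != NULL && sizes != NULL);
	for (i = 0; i < nr; i++) {
		idx[i].offset = pos;
		idx[i].size = sizes[i];
		pos += sizes[i];
	}
	return pos;
}
```

With such a table, a reader can hand each (offset, size) range to a
different thread without scanning the file first.  Empty parts get a
size of 0 and can be skipped.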

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-data.txt |  29 +++
 tools/perf/builtin-data.c              | 353 +++++++++++++++++++++++++++++++++
 2 files changed, 382 insertions(+)

diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt
index b8c83947715c..468ef7eb53e7 100644
--- a/tools/perf/Documentation/perf-data.txt
+++ b/tools/perf/Documentation/perf-data.txt
@@ -13,3 +13,32 @@ SYNOPSIS
 DESCRIPTION
 -----------
 Data file related processing.
+
+COMMANDS
+--------
+index::
+	Build an index table for data file so that it can be processed
+	with multiple threads concurrently.
+
+
+OPTIONS for 'index'
+---------------------
+-i::
+--input::
+	Specify input perf data file path.
+
+-o::
+--output::
+	Specify output perf data directory path.
+
+-v::
+--verbose::
+        Be more verbose (show counter open errors, etc).
+
+-f::
+--force::
+        Don't complain, do it.
+
+SEE ALSO
+--------
+linkperf:perf[1], linkperf:perf-report[1]
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
index 1eee97d020fa..be44215355e6 100644
--- a/tools/perf/builtin-data.c
+++ b/tools/perf/builtin-data.c
@@ -2,10 +2,15 @@
 #include "builtin.h"
 #include "perf.h"
 #include "debug.h"
+#include "session.h"
+#include "evlist.h"
 #include "parse-options.h"
+#include <sys/mman.h>
 
 typedef int (*data_cmd_fn_t)(int argc, const char **argv, const char *prefix);
 
+static const char *output_name;
+
 struct data_cmd {
 	const char	*name;
 	const char	*summary;
@@ -41,10 +46,358 @@ static void print_usage(void)
 	printf("\n");
 }
 
+static int data_cmd_index(int argc, const char **argv, const char *prefix);
+
 static struct data_cmd data_cmds[] = {
+	{ "index", "merge data file and add index", data_cmd_index },
 	{ NULL },
 };
 
+#define FD_HASH_BITS  7
+#define FD_HASH_SIZE  (1 << FD_HASH_BITS)
+#define FD_HASH_MASK  (FD_HASH_SIZE - 1)
+
+struct data_index {
+	struct perf_tool	tool;
+	struct perf_session	*session;
+	enum {
+		PER_CPU,
+		PER_THREAD,
+	} split_mode;
+	char			*tmpdir;
+	int 			header_fd;
+	u64			header_written;
+	struct hlist_head	fd_hash[FD_HASH_SIZE];
+	int			fd_hash_nr;
+	int			output_fd;
+};
+
+struct fdhash_node {
+	int			id;
+	int			fd;
+	struct hlist_node	list;
+};
+
+static struct hlist_head *get_hash(struct data_index *index, int id)
+{
+	return &index->fd_hash[id & FD_HASH_MASK];
+}
+
+static int perf_event__rewrite_header(struct perf_tool *tool,
+				      union perf_event *event)
+{
+	struct data_index *index = container_of(tool, struct data_index, tool);
+	ssize_t size;
+
+	size = writen(index->header_fd, event, event->header.size);
+	if (size < 0)
+		return -errno;
+
+	index->header_written += size;
+	return 0;
+}
+
+static int split_other_events(struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample __maybe_unused,
+				struct machine *machine __maybe_unused)
+{
+	return perf_event__rewrite_header(tool, event);
+}
+
+static int split_sample_event(struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct perf_evsel *evsel __maybe_unused,
+				struct machine *machine __maybe_unused)
+{
+	struct data_index *index = container_of(tool, struct data_index, tool);
+	int id = index->split_mode == PER_CPU ? sample->cpu : sample->tid;
+	int fd = -1;
+	char buf[PATH_MAX];
+	struct hlist_head *head;
+	struct fdhash_node *node;
+
+	head = get_hash(index, id);
+	hlist_for_each_entry(node, head, list) {
+		if (node->id == id) {
+			fd = node->fd;
+			break;
+		}
+	}
+
+	if (fd == -1) {
+		scnprintf(buf, sizeof(buf), "%s/perf.data.%d",
+			  index->tmpdir, index->fd_hash_nr++);
+
+		fd = open(buf, O_RDWR|O_CREAT|O_TRUNC, 0600);
+		if (fd < 0) {
+			pr_err("cannot open data file: %s: %m\n", buf);
+			return -1;
+		}
+
+		node = malloc(sizeof(*node));
+		if (node == NULL) {
+			pr_err("memory allocation failed\n");
+			return -1;
+		}
+
+		node->id = id;
+		node->fd = fd;
+
+		hlist_add_head(&node->list, head);
+	}
+
+	return writen(fd, event, event->header.size) > 0 ? 0 : -errno;
+}
+
+static int split_data_file(struct data_index *index)
+{
+	struct perf_session *session = index->session;
+	char buf[PATH_MAX];
+	u64 sample_type;
+	int header_fd;
+
+	if (asprintf(&index->tmpdir, "%s.dir", output_name) < 0) {
+		pr_err("memory allocation failed\n");
+		return -1;
+	}
+
+	if (mkdir(index->tmpdir, 0700) < 0) {
+		pr_err("cannot create intermediate directory\n");
+		return -1;
+	}
+
+	/*
+	 * This is necessary to write (copy) build-id table.  After
+	 * processing header, dsos list will only contain dso which
+	 * was on the original build-id table.
+	 */
+	dsos__hit_all(session);
+
+	scnprintf(buf, sizeof(buf), "%s/perf.header", index->tmpdir);
+	header_fd = open(buf, O_RDWR|O_CREAT|O_TRUNC, 0600);
+	if (header_fd < 0) {
+		pr_err("cannot open header file: %s: %m\n", buf);
+		return -1;
+	}
+
+	lseek(header_fd, session->header.data_offset, SEEK_SET);
+
+	sample_type = perf_evlist__combined_sample_type(session->evlist);
+	if (sample_type & PERF_SAMPLE_CPU)
+		index->split_mode = PER_CPU;
+	else
+		index->split_mode = PER_THREAD;
+
+	pr_debug("splitting data file for %s\n",
+		 index->split_mode == PER_CPU ? "CPUs" : "threads");
+
+	index->header_fd = header_fd;
+	if (perf_session__process_events(session, &index->tool) < 0) {
+		pr_err("failed to process events\n");
+		return -1;
+	}
+
+	session->header.data_size = index->header_written;
+	/*
+	 * This is needed for index to determine current (header) file
+	 * size (including feature data).
+	 */
+	perf_session__write_header(session, session->evlist, header_fd, true);
+
+	return 0;
+}
+
+static int build_index_table(struct data_index *index)
+{
+	int i, n;
+	u64 offset;
+	u64 nr_index = index->fd_hash_nr;
+	struct perf_file_section *idx;
+	struct perf_session *session = index->session;
+
+	idx = calloc(nr_index, sizeof(*idx));
+	if (idx == NULL)
+		return -1;
+
+	/* index data will be placed after header file */
+	offset = lseek(index->header_fd, 0, SEEK_END);
+	if (offset == (u64)(loff_t) -1)
+		goto out;
+
+	/* increase the offset for added index data */
+	offset += sizeof(nr_index) + nr_index * sizeof(*idx);
+	offset = PERF_ALIGN(offset, page_size);
+
+	for (i = n = 0; i < FD_HASH_SIZE; i++) {
+		struct fdhash_node *node;
+
+		hlist_for_each_entry(node, &index->fd_hash[i], list) {
+			struct stat stbuf;
+
+			if (fstat(node->fd, &stbuf) < 0)
+				goto out;
+
+			idx[n].offset = offset;
+			idx[n].size   = stbuf.st_size;
+			n++;
+
+			offset += PERF_ALIGN(stbuf.st_size, page_size);
+		}
+	}
+
+	BUG_ON(n != (int)nr_index);
+
+	session->header.index = idx;
+	session->header.nr_index = nr_index;
+	perf_header__set_feat(&session->header, HEADER_DATA_INDEX);
+
+	perf_session__write_header(session, session->evlist,
+				   index->output_fd, true);
+	return 0;
+
+out:
+	free(idx);
+	return -1;
+}
+
+static int cleanup_temp_files(struct data_index *index)
+{
+	int i;
+
+	for (i = 0; i < FD_HASH_SIZE; i++) {
+		struct fdhash_node *pos;
+		struct hlist_node *tmp;
+
+		hlist_for_each_entry_safe(pos, tmp, &index->fd_hash[i], list) {
+			hlist_del(&pos->list);
+			close(pos->fd);
+			free(pos);
+		}
+	}
+	close(index->header_fd);
+
+	rm_rf(index->tmpdir);
+	zfree(&index->tmpdir);
+	return 0;
+}
+
+static int __data_cmd_index(struct data_index *index)
+{
+	struct perf_session *session = index->session;
+	char *output = NULL;
+	int ret = -1;
+	int i, n;
+
+	if (!output_name) {
+		if (asprintf(&output, "%s.out", session->file->path) < 0) {
+			pr_err("memory allocation failed\n");
+			return -1;
+		}
+
+		output_name = output;
+	}
+
+	index->output_fd = open(output_name, O_RDWR|O_CREAT|O_TRUNC, 0600);
+	if (index->output_fd < 0) {
+		pr_err("cannot create output file: %s\n", output_name);
+		goto out;
+	}
+
+	/*
+	 * This is necessary to write (copy) build-id table.  After
+	 * processing header, dsos list will contain dso which was on
+	 * the original build-id table.
+	 */
+	dsos__hit_all(session);
+
+	if (split_data_file(index) < 0)
+		goto out_clean;
+
+	if (build_index_table(index) < 0)
+		goto out_clean;
+
+	/* copy meta-events */
+	if (copyfile_offset(index->header_fd, session->header.data_offset,
+			   index->output_fd, session->header.data_offset,
+			   session->header.data_size) < 0)
+		goto out_clean;
+
+	/* copy sample events */
+	for (i = n = 0; i < FD_HASH_SIZE; i++) {
+		struct fdhash_node *node;
+
+		hlist_for_each_entry(node, &index->fd_hash[i], list) {
+			if (copyfile_offset(node->fd, 0, index->output_fd,
+					    session->header.index[n].offset,
+					    session->header.index[n].size) < 0)
+				goto out_clean;
+			n++;
+		}
+	}
+	ret = 0;
+
+out_clean:
+	cleanup_temp_files(index);
+	close(index->output_fd);
+out:
+	free(output);
+	return ret;
+}
+
+int data_cmd_index(int argc, const char **argv, const char *prefix __maybe_unused)
+{
+	bool force = false;
+	struct perf_session *session;
+	struct perf_data_file file = {
+		.mode  = PERF_DATA_MODE_READ,
+	};
+	struct data_index index = {
+		.tool = {
+			.sample		= split_sample_event,
+			.fork		= split_other_events,
+			.comm		= split_other_events,
+			.exit		= split_other_events,
+			.mmap		= split_other_events,
+			.mmap2		= split_other_events,
+			.lost		= split_other_events,
+			.throttle	= split_other_events,
+			.unthrottle	= split_other_events,
+			.ordered_events = false,
+		},
+	};
+	const char * const index_usage[] = {
+		"perf data index [<options>]",
+		NULL
+	};
+	const struct option index_options[] = {
+	OPT_STRING('i', "input", &input_name, "file", "input file name"),
+	OPT_STRING('o', "output", &output_name, "file", "output file name"),
+	OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
+	OPT_INCR('v', "verbose", &verbose, "be more verbose"),
+	OPT_END()
+	};
+
+	argc = parse_options(argc, argv, index_options, index_usage, 0);
+	if (argc)
+		usage_with_options(index_usage, index_options);
+
+	file.path = input_name;
+	file.force = force;
+	session = perf_session__new(&file, false, &index.tool);
+	if (session == NULL)
+		return -1;
+
+	index.session = session;
+	symbol__init(&session->header.env);
+
+	__data_cmd_index(&index);
+
+	perf_session__delete(session);
+	return 0;
+}
+
 int cmd_data(int argc, const char **argv, const char *prefix)
 {
 	struct data_cmd *cmd;
-- 
2.2.2
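
For readers following build_index_table() above, the offset arithmetic
can be looked at in isolation.  The following is only a minimal sketch,
not the code from the patch: struct section, align_up() and
layout_index() are hypothetical names, and it assumes the index table is
a 64-bit count followed by one offset/size pair per intermediate file,
with every region starting on a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_file_section. */
struct section {
	uint64_t offset;
	uint64_t size;
};

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Fill idx[] for nr sections whose sizes are given in sizes[].
 * header_end is the size of the header file; the index table
 * (a count followed by nr entries) is assumed to follow it.
 * Returns the offset just past the last page-aligned section.
 */
static uint64_t layout_index(struct section *idx, const uint64_t *sizes,
			     size_t nr, uint64_t header_end, uint64_t page)
{
	uint64_t offset = header_end;
	size_t i;

	/* reserve room for the table itself, then start on a new page */
	offset += sizeof(uint64_t) + nr * sizeof(*idx);
	offset = align_up(offset, page);

	for (i = 0; i < nr; i++) {
		idx[i].offset = offset;
		idx[i].size   = sizes[i];
		offset += align_up(sizes[i], page);
	}

	return offset;
}
```

Keeping each section page-aligned presumably makes it easier to mmap a
section on its own, at the cost of some padding between sections.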



* Re: [PATCH 29/42] perf tools: Protect dso cache fd with a mutex
  2015-01-29  8:07 ` [PATCH 29/42] perf tools: Protect dso cache fd with a mutex Namhyung Kim
@ 2015-01-29 12:31   ` Arnaldo Carvalho de Melo
  2015-01-29 13:19     ` Namhyung Kim
  0 siblings, 1 reply; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 12:31 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Em Thu, Jan 29, 2015 at 05:07:10PM +0900, Namhyung Kim escreveu:
> When dso cache is accessed in multi-thread environment, it's possible
> to close other dso->data.fd during operation due to open file limit.
> Protect the file descriptors using a separate mutex.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/tests/dso-data.c |   5 ++
>  tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
>  2 files changed, 94 insertions(+), 47 deletions(-)
> 
> diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
> index caaf37f079b1..0276e7d2d41b 100644
> --- a/tools/perf/tests/dso-data.c
> +++ b/tools/perf/tests/dso-data.c
> @@ -111,6 +111,9 @@ int test__dso_data(void)
>  	memset(&machine, 0, sizeof(machine));
>  
>  	dso = dso__new((const char *)file);
> +	TEST_ASSERT_VAL("failed to get dso", dso);
> +
> +	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
>  
>  	/* Basic 10 bytes tests. */
>  	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> @@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
>  
>  		dsos[i] = dso__new(file);
>  		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
> +
> +		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;

Those two are unrelated, please put them in a separate patch, one that I
can even cherrypick ahead of the other patches.

>  	}
>  
>  	return 0;
> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> index 11ece224ef50..ae92046ae2c8 100644
> --- a/tools/perf/util/dso.c
> +++ b/tools/perf/util/dso.c
> @@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
>   */
>  static LIST_HEAD(dso__data_open);
>  static long dso__data_open_cnt;
> +static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
>  
>  static void dso__list_add(struct dso *dso)
>  {
> @@ -240,7 +241,7 @@ static int do_open(char *name)
>  		if (fd >= 0)
>  			return fd;
>  
> -		pr_debug("dso open failed, mmap: %s\n",
> +		pr_debug("dso open failed: %s\n",
>  			 strerror_r(errno, sbuf, sizeof(sbuf)));
>  		if (!dso__data_open_cnt || errno != EMFILE)

Ditto, another unrelated patch, please separate.

>  			break;
> @@ -382,7 +383,9 @@ static void check_data_close(void)
>   */
>  void dso__data_close(struct dso *dso)
>  {
> +	pthread_mutex_lock(&dso__data_open_lock);
>  	close_dso(dso);
> +	pthread_mutex_unlock(&dso__data_open_lock);
>  }
>  
>  /**
> @@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
>  	if (dso->data.status == DSO_DATA_STATUS_ERROR)
>  		return -1;
>  
> +	pthread_mutex_lock(&dso__data_open_lock);
> +
>  	if (dso->data.fd >= 0)
>  		goto out;
>  
> @@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
>  	else
>  		dso->data.status = DSO_DATA_STATUS_ERROR;
>  
> +	pthread_mutex_unlock(&dso__data_open_lock);
>  	return dso->data.fd;
>  }
>  
> @@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
>  }
>  
>  static ssize_t
> -dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> +dso_cache__read(struct dso *dso, struct machine *machine,
> +		u64 offset, u8 *data, ssize_t size)
>  {
>  	struct dso_cache *cache;
>  	struct dso_cache *old;
> -	ssize_t ret;
> -
> -	do {
> -		u64 cache_offset;

While I understand that there was no need for this do { } while (0)
construct in the first place, removing it in this patch is not
interesting, as it is both unrelated to this patch and makes it
harder to review by just looking at the patch :-\ Please refrain from
doing this in this patch.

A later patch that does _just_ that could be done, if you feel like
doing it.

> +	ssize_t ret = -EINVAL;
> +	u64 cache_offset;
>  
> -		ret = -ENOMEM;
> +	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> +	if (!cache)
> +		return -ENOMEM;
>  
> -		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> -		if (!cache)
> -			break;
> +	cache_offset = offset & DSO__DATA_CACHE_MASK;
>  
> -		cache_offset = offset & DSO__DATA_CACHE_MASK;
> -		ret = -EINVAL;
> +	pthread_mutex_lock(&dso__data_open_lock);
>  
> -		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> -			break;
> +	/*
> +	 * dso->data.fd might be closed if other thread opened another
> +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> +	 */
> +	if (dso->data.fd < 0) {
> +		dso->data.fd = open_dso(dso, machine);

Also please consider adding a backpointer to machine in the dso object,
since you need to reopen it, so that we don't have to go on passing
machine around to dso_cache__read(), etc.

This probably needs to be done in the patch that makes dso->data.fd to
be closed due to limit.

> +		if (dso->data.fd < 0) {
> +			ret = -errno;
> +			dso->data.status = DSO_DATA_STATUS_ERROR;
> +			goto err_unlock;
> +		}
> +	}
>  
> -		ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
> -		if (ret <= 0)
> -			break;
> +	if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> +		goto err_unlock;
>  
> -		cache->offset = cache_offset;
> -		cache->size   = ret;
> -		old = dso_cache__insert(dso, cache);
> -		if (old) {
> -			/* we lose the race */
> -			free(cache);
> -			cache = old;
> -		}
> +	ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
> +	if (ret <= 0)
> +		goto err_unlock;
>  
> -		ret = dso_cache__memcpy(cache, offset, data, size);
> +	pthread_mutex_unlock(&dso__data_open_lock);
>  
> -	} while (0);
> +	cache->offset = cache_offset;
> +	cache->size   = ret;
> +	old = dso_cache__insert(dso, cache);
> +	if (old) {
> +		/* we lose the race */
> +		free(cache);
> +		cache = old;
> +	}
>  
> +	ret = dso_cache__memcpy(cache, offset, data, size);
>  	if (ret <= 0)
>  		free(cache);
>  
>  	return ret;
> +
> +err_unlock:
> +	pthread_mutex_unlock(&dso__data_open_lock);
> +	return ret;
>  }
>  
> -static ssize_t dso_cache_read(struct dso *dso, u64 offset,
> -			      u8 *data, ssize_t size)
> +static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
> +			      u64 offset, u8 *data, ssize_t size)
>  {
>  	struct dso_cache *cache;
>  
> @@ -584,7 +604,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
>  	if (cache)
>  		return dso_cache__memcpy(cache, offset, data, size);
>  	else
> -		return dso_cache__read(dso, offset, data, size);
> +		return dso_cache__read(dso, machine, offset, data, size);
>  }
>  
>  /*
> @@ -592,7 +612,8 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
>   * in the rb_tree. Any read to already cached data is served
>   * by cached data.
>   */
> -static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> +static ssize_t cached_read(struct dso *dso, struct machine *machine,
> +			   u64 offset, u8 *data, ssize_t size)
>  {
>  	ssize_t r = 0;
>  	u8 *p = data;
> @@ -600,7 +621,7 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
>  	do {
>  		ssize_t ret;
>  
> -		ret = dso_cache_read(dso, offset, p, size);
> +		ret = dso_cache_read(dso, machine, offset, p, size);
>  		if (ret < 0)
>  			return ret;
>  
> @@ -620,21 +641,42 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
>  	return r;
>  }
>  
> -static int data_file_size(struct dso *dso)
> +static int data_file_size(struct dso *dso, struct machine *machine)
>  {
> +	int ret = 0;
>  	struct stat st;
>  	char sbuf[STRERR_BUFSIZE];
>  
> -	if (!dso->data.file_size) {
> -		if (fstat(dso->data.fd, &st)) {
> -			pr_err("dso mmap failed, fstat: %s\n",
> -				strerror_r(errno, sbuf, sizeof(sbuf)));
> -			return -1;
> +	if (dso->data.file_size)
> +		return 0;
> +
> +	pthread_mutex_lock(&dso__data_open_lock);
> +
> +	/*
> +	 * dso->data.fd might be closed if other thread opened another
> +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> +	 */
> +	if (dso->data.fd < 0) {
> +		dso->data.fd = open_dso(dso, machine);
> +		if (dso->data.fd < 0) {
> +			ret = -errno;
> +			dso->data.status = DSO_DATA_STATUS_ERROR;
> +			goto out;
>  		}
> -		dso->data.file_size = st.st_size;
>  	}
>  
> -	return 0;
> +	if (fstat(dso->data.fd, &st) < 0) {
> +		ret = -errno;
> +		pr_err("dso cache fstat failed: %s\n",
> +		       strerror_r(errno, sbuf, sizeof(sbuf)));
> +		dso->data.status = DSO_DATA_STATUS_ERROR;
> +		goto out;
> +	}
> +	dso->data.file_size = st.st_size;
> +
> +out:
> +	pthread_mutex_unlock(&dso__data_open_lock);
> +	return ret;
>  }
>  
>  /**
> @@ -652,17 +694,17 @@ off_t dso__data_size(struct dso *dso, struct machine *machine)
>  	if (fd < 0)
>  		return fd;
>  
> -	if (data_file_size(dso))
> +	if (data_file_size(dso, machine))
>  		return -1;
>  
>  	/* For now just estimate dso data size is close to file size */
>  	return dso->data.file_size;
>  }
>  
> -static ssize_t data_read_offset(struct dso *dso, u64 offset,
> -				u8 *data, ssize_t size)
> +static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
> +				u64 offset, u8 *data, ssize_t size)
>  {
> -	if (data_file_size(dso))
> +	if (data_file_size(dso, machine))
>  		return -1;
>  
>  	/* Check the offset sanity. */
> @@ -672,7 +714,7 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
>  	if (offset + size < offset)
>  		return -1;
>  
> -	return cached_read(dso, offset, data, size);
> +	return cached_read(dso, machine, offset, data, size);
>  }
>  
>  /**
> @@ -689,10 +731,10 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
>  ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
>  			      u64 offset, u8 *data, ssize_t size)
>  {
> -	if (dso__data_fd(dso, machine) < 0)
> +	if (dso->data.status == DSO_DATA_STATUS_ERROR)
>  		return -1;
>  
> -	return data_read_offset(dso, offset, data, size);
> +	return data_read_offset(dso, machine, offset, data, size);
>  }
>  
>  /**
> -- 
> 2.2.2


* Re: [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex
  2015-01-29  8:07 ` [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex Namhyung Kim
@ 2015-01-29 12:34   ` Arnaldo Carvalho de Melo
  2015-01-29 12:48     ` Namhyung Kim
  0 siblings, 1 reply; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 12:34 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Em Thu, Jan 29, 2015 at 05:07:08PM +0900, Namhyung Kim escreveu:
> When multi-thread support for perf report is enabled, it's possible to
> access a dso concurrently.  Add a new pthread_mutex to protect it from
> concurrent dso__load().
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/dso.c    |  2 ++
>  tools/perf/util/dso.h    |  1 +
>  tools/perf/util/symbol.c | 34 ++++++++++++++++++++++++----------
>  3 files changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> index 45be944d450a..3da75816b8f8 100644
> --- a/tools/perf/util/dso.c
> +++ b/tools/perf/util/dso.c
> @@ -888,6 +888,7 @@ struct dso *dso__new(const char *name)
>  		RB_CLEAR_NODE(&dso->rb_node);
>  		INIT_LIST_HEAD(&dso->node);
>  		INIT_LIST_HEAD(&dso->data.open_entry);
> +		pthread_mutex_init(&dso->lock, NULL);
>  	}
>  
>  	return dso;
> @@ -917,6 +918,7 @@ void dso__delete(struct dso *dso)
>  	dso_cache__free(&dso->data.cache);
>  	dso__free_a2l(dso);
>  	zfree(&dso->symsrc_filename);
> +	pthread_mutex_destroy(&dso->lock);
>  	free(dso);
>  }
>  
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 3782c82c6e44..ac753594a469 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -102,6 +102,7 @@ struct dsos {
>  };
>  
>  struct dso {
> +	pthread_mutex_t	 lock;
>  	struct list_head node;
>  	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
>  	struct rb_root	 symbols[MAP__NR_TYPES];
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index a69066865a55..714e20c99354 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -1357,12 +1357,22 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
>  	struct symsrc *syms_ss = NULL, *runtime_ss = NULL;
>  	bool kmod;
>  
> -	dso__set_loaded(dso, map->type);
> +	pthread_mutex_lock(&dso->lock);
> +
> +	/* check again under the dso->lock */

Again? Where was it first checked? Perhaps we should lock there, so that
we don't have to do two checks, one unlocked, the other locked?

> +	if (dso__loaded(dso, map->type)) {
> +		ret = 1;
> +		goto out;
> +	}
> +

- Arnaldo


* Re: [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind
  2015-01-29  8:07 ` [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind Namhyung Kim
@ 2015-01-29 12:38   ` Arnaldo Carvalho de Melo
  2015-01-29 13:23     ` Namhyung Kim
  2015-01-29 19:22   ` Arnaldo Carvalho de Melo
  2015-01-30 18:32   ` [tip:perf/core] perf callchain: Cache eh/ debug " tip-bot for Namhyung Kim
  2 siblings, 1 reply; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 12:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Em Thu, Jan 29, 2015 at 05:07:21PM +0900, Namhyung Kim escreveu:
> When libunwind tries to resolve callchains it needs to know the offset
> of .eh_frame_hdr or .debug_frame to access the dso.  Since it calls
> dso__data_fd(), it'll try to grab dso->lock every time for the same
> information.  So save it in the dso_data struct and reuse it.
> 
> Note that there's a window between dso__data_fd() and actual use of
> the fd.  The fd could be closed by other threads to deal with the open
> file limit in dso cache code.  But I think it's ok since in that case
> elf_section_offset() will return 0 so it'll be retried on next access.

I know that you did this in the context of your multi threading
patchkit, but this seems useful even without that patchkit, i.e. this
can be cherry picked on the grounds that it speeds up things by caching
something that doesn't change, right?

I.e. I'll probably just rewrite the comment and apply it before
considering the other patches, so that other people can comment on the
other patches, etc.
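
The caching idea discussed here, stripped of libunwind, might look like
the sketch below.  It is only an illustration under assumptions: struct
mini_dso_data, slow_lookup() and cached_frame_offset() are hypothetical
names, slow_lookup() stands in for elf_section_offset(), and 0 is taken
to mean "not resolved yet", so a failed lookup is simply tried again on
the next call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the dso data area with a cached offset. */
struct mini_dso_data {
	uint64_t frame_offset;	/* 0: not looked up yet, or lookup failed */
	int nr_lookups;		/* how often the slow path ran */
};

/* Stand-in for elf_section_offset(); 'answer' is what the lookup finds. */
static uint64_t slow_lookup(struct mini_dso_data *data, uint64_t answer)
{
	data->nr_lookups++;
	return answer;
}

/* Return the cached offset, doing the slow lookup only while it is 0. */
static uint64_t cached_frame_offset(struct mini_dso_data *data,
				    uint64_t answer)
{
	assert(data != NULL);

	if (data->frame_offset == 0)
		data->frame_offset = slow_lookup(data, answer);

	return data->frame_offset;
}
```

Once a non-zero offset is stored, later calls return it without touching
the file, which is where the saving would come from.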

- Arnaldo
 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/dso.h              |  1 +
>  tools/perf/util/unwind-libunwind.c | 31 ++++++++++++++++++++-----------
>  2 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index c18fcc0e8081..323ee08d56fc 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -141,6 +141,7 @@ struct dso {
>  		u32		 status_seen;
>  		size_t		 file_size;
>  		struct list_head open_entry;
> +		u64		 frame_offset;
>  	} data;
>  
>  	union { /* Tool specific area */
> diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
> index 7ed6eaf232b6..3219b20837b5 100644
> --- a/tools/perf/util/unwind-libunwind.c
> +++ b/tools/perf/util/unwind-libunwind.c
> @@ -266,14 +266,17 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
>  				     u64 *fde_count)
>  {
>  	int ret = -EINVAL, fd;
> -	u64 offset;
> +	u64 offset = dso->data.frame_offset;
>  
> -	fd = dso__data_fd(dso, machine);
> -	if (fd < 0)
> -		return -EINVAL;
> +	if (offset == 0) {
> +		fd = dso__data_fd(dso, machine);
> +		if (fd < 0)
> +			return -EINVAL;
>  
> -	/* Check the .eh_frame section for unwinding info */
> -	offset = elf_section_offset(fd, ".eh_frame_hdr");
> +		/* Check the .eh_frame section for unwinding info */
> +		offset = elf_section_offset(fd, ".eh_frame_hdr");
> +		dso->data.frame_offset = offset;
> +	}
>  
>  	if (offset)
>  		ret = unwind_spec_ehframe(dso, machine, offset,
> @@ -287,14 +290,20 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
>  static int read_unwind_spec_debug_frame(struct dso *dso,
>  					struct machine *machine, u64 *offset)
>  {
> -	int fd = dso__data_fd(dso, machine);
> +	int fd;
> +	u64 ofs = dso->data.frame_offset;
>  
> -	if (fd < 0)
> -		return -EINVAL;
> +	if (ofs == 0) {
> +		fd = dso__data_fd(dso, machine);
> +		if (fd < 0)
> +			return -EINVAL;
>  
> -	/* Check the .debug_frame section for unwinding info */
> -	*offset = elf_section_offset(fd, ".debug_frame");
> +		/* Check the .debug_frame section for unwinding info */
> +		ofs = elf_section_offset(fd, ".debug_frame");
> +		dso->data.frame_offset = ofs;
> +	}
>  
> +	*offset = ofs;
>  	if (*offset)
>  		return 0;
>  
> -- 
> 2.2.2


* Re: [PATCH 27/42] perf tools: Protect dso symbol loading using a mutex
  2015-01-29 12:34   ` Arnaldo Carvalho de Melo
@ 2015-01-29 12:48     ` Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29 12:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Hi Arnaldo,

On Thu, Jan 29, 2015 at 9:34 PM, Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> Em Thu, Jan 29, 2015 at 05:07:08PM +0900, Namhyung Kim escreveu:
>> When multi-thread support for perf report is enabled, it's possible to
>> access a dso concurrently.  Add a new pthread_mutex to protect it from
>> concurrent dso__load().
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/perf/util/dso.c    |  2 ++
>>  tools/perf/util/dso.h    |  1 +
>>  tools/perf/util/symbol.c | 34 ++++++++++++++++++++++++----------
>>  3 files changed, 27 insertions(+), 10 deletions(-)
>>
>> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
>> index 45be944d450a..3da75816b8f8 100644
>> --- a/tools/perf/util/dso.c
>> +++ b/tools/perf/util/dso.c
>> @@ -888,6 +888,7 @@ struct dso *dso__new(const char *name)
>>               RB_CLEAR_NODE(&dso->rb_node);
>>               INIT_LIST_HEAD(&dso->node);
>>               INIT_LIST_HEAD(&dso->data.open_entry);
>> +             pthread_mutex_init(&dso->lock, NULL);
>>       }
>>
>>       return dso;
>> @@ -917,6 +918,7 @@ void dso__delete(struct dso *dso)
>>       dso_cache__free(&dso->data.cache);
>>       dso__free_a2l(dso);
>>       zfree(&dso->symsrc_filename);
>> +     pthread_mutex_destroy(&dso->lock);
>>       free(dso);
>>  }
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 3782c82c6e44..ac753594a469 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -102,6 +102,7 @@ struct dsos {
>>  };
>>
>>  struct dso {
>> +     pthread_mutex_t  lock;
>>       struct list_head node;
>>       struct rb_node   rb_node;       /* rbtree node sorted by long name */
>>       struct rb_root   symbols[MAP__NR_TYPES];
>> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
>> index a69066865a55..714e20c99354 100644
>> --- a/tools/perf/util/symbol.c
>> +++ b/tools/perf/util/symbol.c
>> @@ -1357,12 +1357,22 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
>>       struct symsrc *syms_ss = NULL, *runtime_ss = NULL;
>>       bool kmod;
>>
>> -     dso__set_loaded(dso, map->type);
>> +     pthread_mutex_lock(&dso->lock);
>> +
>> +     /* check again under the dso->lock */
>
> Again? Where was it first checked?

Please see map__load().


> Perhaps we should lock there, so that
> we don't have to do two checks, one unlocked, the other locked?

Hmm.. maybe.  I just kept it to avoid locking overhead, since it'll be
called whenever it searches symbols during preprocessing.  I didn't
measure the overhead but it could be huge IMHO.
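
The two checks could be sketched roughly as follows.  This is not the
perf code: struct mini_dso, mini_map__load() and mini_dso__load() are
made-up stand-ins for the map__load()/dso__load() pair, and the sketch
assumes a C11 atomic flag so that the unlocked read is well defined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a dso: a lock, a loaded flag, a load counter. */
struct mini_dso {
	pthread_mutex_t lock;
	atomic_bool loaded;
	int nr_loads;		/* how many times the expensive load ran */
};

static void mini_dso__init(struct mini_dso *dso)
{
	pthread_mutex_init(&dso->lock, NULL);
	atomic_init(&dso->loaded, false);
	dso->nr_loads = 0;
}

/* Slow path: take the lock and check again before loading. */
static int mini_dso__load(struct mini_dso *dso)
{
	int ret = 0;

	pthread_mutex_lock(&dso->lock);

	/* another thread may have finished loading while we waited */
	if (atomic_load(&dso->loaded)) {
		ret = 1;
		goto out;
	}

	dso->nr_loads++;	/* stands in for the real symbol loading */
	atomic_store(&dso->loaded, true);
out:
	pthread_mutex_unlock(&dso->lock);
	return ret;
}

/* Fast path: no lock is taken once the dso is marked loaded. */
static int mini_map__load(struct mini_dso *dso)
{
	assert(dso != NULL);

	if (atomic_load(&dso->loaded))
		return 0;

	return mini_dso__load(dso) < 0 ? -1 : 0;
}
```

With this shape the mutex is only contended around the first load of
each dso; every later lookup pays for one atomic read.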

Thanks,
Namhyung


>
>> +     if (dso__loaded(dso, map->type)) {
>> +             ret = 1;
>> +             goto out;
>> +     }
>> +
>
> - Arnaldo


* Re: [PATCH 29/42] perf tools: Protect dso cache fd with a mutex
  2015-01-29 12:31   ` Arnaldo Carvalho de Melo
@ 2015-01-29 13:19     ` Namhyung Kim
  2015-01-29 16:23       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29 13:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

On Thu, Jan 29, 2015 at 09:31:07AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jan 29, 2015 at 05:07:10PM +0900, Namhyung Kim escreveu:
> > When dso cache is accessed in multi-thread environment, it's possible
> > to close other dso->data.fd during operation due to open file limit.
> > Protect the file descriptors using a separate mutex.
> > 
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/tests/dso-data.c |   5 ++
> >  tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
> >  2 files changed, 94 insertions(+), 47 deletions(-)
> > 
> > diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
> > index caaf37f079b1..0276e7d2d41b 100644
> > --- a/tools/perf/tests/dso-data.c
> > +++ b/tools/perf/tests/dso-data.c
> > @@ -111,6 +111,9 @@ int test__dso_data(void)
> >  	memset(&machine, 0, sizeof(machine));
> >  
> >  	dso = dso__new((const char *)file);
> > +	TEST_ASSERT_VAL("failed to get dso", dso);
> > +
> > +	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> >  
> >  	/* Basic 10 bytes tests. */
> >  	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> > @@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
> >  
> >  		dsos[i] = dso__new(file);
> >  		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
> > +
> > +		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> 
> Those two are unrelated, please put them in a separate patch, one that I
> can even cherrypick ahead of the other patches.

It's a consequence of changing dso__data_read_offset() not to call
dso__data_fd() for performance reasons.  The binary_type was
determined during dso__data_fd() before, but now it needs to be
set explicitly for this test.

In the original code, it was called every time we accessed the dso
cache, just to check for an error, I guess.  But it's enough to check
the status field.


> 
> >  	}
> >  
> >  	return 0;
> > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > index 11ece224ef50..ae92046ae2c8 100644
> > --- a/tools/perf/util/dso.c
> > +++ b/tools/perf/util/dso.c
> > @@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
> >   */
> >  static LIST_HEAD(dso__data_open);
> >  static long dso__data_open_cnt;
> > +static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
> >  
> >  static void dso__list_add(struct dso *dso)
> >  {
> > @@ -240,7 +241,7 @@ static int do_open(char *name)
> >  		if (fd >= 0)
> >  			return fd;
> >  
> > -		pr_debug("dso open failed, mmap: %s\n",
> > +		pr_debug("dso open failed: %s\n",
> >  			 strerror_r(errno, sbuf, sizeof(sbuf)));
> >  		if (!dso__data_open_cnt || errno != EMFILE)
> 
> Ditto, another unrelated patch, please separate.

Ah, okay.  I kept it since it's just a small change.  But I'll split
it out if it helps reviewing.


> 
> >  			break;
> > @@ -382,7 +383,9 @@ static void check_data_close(void)
> >   */
> >  void dso__data_close(struct dso *dso)
> >  {
> > +	pthread_mutex_lock(&dso__data_open_lock);
> >  	close_dso(dso);
> > +	pthread_mutex_unlock(&dso__data_open_lock);
> >  }
> >  
> >  /**
> > @@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> >  	if (dso->data.status == DSO_DATA_STATUS_ERROR)
> >  		return -1;
> >  
> > +	pthread_mutex_lock(&dso__data_open_lock);
> > +
> >  	if (dso->data.fd >= 0)
> >  		goto out;
> >  
> > @@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> >  	else
> >  		dso->data.status = DSO_DATA_STATUS_ERROR;
> >  
> > +	pthread_mutex_unlock(&dso__data_open_lock);
> >  	return dso->data.fd;
> >  }
> >  
> > @@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
> >  }
> >  
> >  static ssize_t
> > -dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> > +dso_cache__read(struct dso *dso, struct machine *machine,
> > +		u64 offset, u8 *data, ssize_t size)
> >  {
> >  	struct dso_cache *cache;
> >  	struct dso_cache *old;
> > -	ssize_t ret;
> > -
> > -	do {
> > -		u64 cache_offset;
> 
> While I understand that there was no need for this do { } while (0)
> construct in the first place, removing it in this patch is not
> interesting, as it is both unrelated to this patch and makes the it
> harder to review by just looking at the patch :-\ Please refrain from
> doing this in this patch.

Understood, sorry for bothering! :)


> 
> A later patch that does _just_ that could be done, if you feel like
> doing it.

Okay.


> 
> > +	ssize_t ret = -EINVAL;
> > +	u64 cache_offset;
> >  
> > -		ret = -ENOMEM;
> > +	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > +	if (!cache)
> > +		return -ENOMEM;
> >  
> > -		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > -		if (!cache)
> > -			break;
> > +	cache_offset = offset & DSO__DATA_CACHE_MASK;
> >  
> > -		cache_offset = offset & DSO__DATA_CACHE_MASK;
> > -		ret = -EINVAL;
> > +	pthread_mutex_lock(&dso__data_open_lock);
> >  
> > -		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> > -			break;
> > +	/*
> > +	 * dso->data.fd might be closed if other thread opened another
> > +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> > +	 */
> > +	if (dso->data.fd < 0) {
> > +		dso->data.fd = open_dso(dso, machine);
> 
> Also please consider adding a backpointer to machine in the dso object,
> since you need to reopen it, so that we don't have to go on passing
> machine around to dso_cache__read(), etc.

Yeah, passing a machine pointer around is a pain.

> 
> This probably needs to be done in the patch that makes dso->data.fd to
> be closed due to limit.

I don't know which patch you are referring to.  The code already
closes an fd when it reaches the limit; what this patch does is
protect such concurrent open and close when multiple threads are used.

Thanks,
Namhyung


> 
> > +		if (dso->data.fd < 0) {
> > +			ret = -errno;
> > +			dso->data.status = DSO_DATA_STATUS_ERROR;
> > +			goto err_unlock;
> > +		}
> > +	}
> >  
> > -		ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
> > -		if (ret <= 0)
> > -			break;
> > +	if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> > +		goto err_unlock;
> >  
> > -		cache->offset = cache_offset;
> > -		cache->size   = ret;
> > -		old = dso_cache__insert(dso, cache);
> > -		if (old) {
> > -			/* we lose the race */
> > -			free(cache);
> > -			cache = old;
> > -		}
> > +	ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
> > +	if (ret <= 0)
> > +		goto err_unlock;
> >  
> > -		ret = dso_cache__memcpy(cache, offset, data, size);
> > +	pthread_mutex_unlock(&dso__data_open_lock);
> >  
> > -	} while (0);
> > +	cache->offset = cache_offset;
> > +	cache->size   = ret;
> > +	old = dso_cache__insert(dso, cache);
> > +	if (old) {
> > +		/* we lose the race */
> > +		free(cache);
> > +		cache = old;
> > +	}
> >  
> > +	ret = dso_cache__memcpy(cache, offset, data, size);
> >  	if (ret <= 0)
> >  		free(cache);
> >  
> >  	return ret;
> > +
> > +err_unlock:
> > +	pthread_mutex_unlock(&dso__data_open_lock);
> > +	return ret;
> >  }
> >  
> > -static ssize_t dso_cache_read(struct dso *dso, u64 offset,
> > -			      u8 *data, ssize_t size)
> > +static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
> > +			      u64 offset, u8 *data, ssize_t size)
> >  {
> >  	struct dso_cache *cache;
> >  
> > @@ -584,7 +604,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
> >  	if (cache)
> >  		return dso_cache__memcpy(cache, offset, data, size);
> >  	else
> > -		return dso_cache__read(dso, offset, data, size);
> > +		return dso_cache__read(dso, machine, offset, data, size);
> >  }
> >  
> >  /*
> > @@ -592,7 +612,8 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
> >   * in the rb_tree. Any read to already cached data is served
> >   * by cached data.
> >   */
> > -static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> > +static ssize_t cached_read(struct dso *dso, struct machine *machine,
> > +			   u64 offset, u8 *data, ssize_t size)
> >  {
> >  	ssize_t r = 0;
> >  	u8 *p = data;
> > @@ -600,7 +621,7 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> >  	do {
> >  		ssize_t ret;
> >  
> > -		ret = dso_cache_read(dso, offset, p, size);
> > +		ret = dso_cache_read(dso, machine, offset, p, size);
> >  		if (ret < 0)
> >  			return ret;
> >  
> > @@ -620,21 +641,42 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> >  	return r;
> >  }
> >  
> > -static int data_file_size(struct dso *dso)
> > +static int data_file_size(struct dso *dso, struct machine *machine)
> >  {
> > +	int ret = 0;
> >  	struct stat st;
> >  	char sbuf[STRERR_BUFSIZE];
> >  
> > -	if (!dso->data.file_size) {
> > -		if (fstat(dso->data.fd, &st)) {
> > -			pr_err("dso mmap failed, fstat: %s\n",
> > -				strerror_r(errno, sbuf, sizeof(sbuf)));
> > -			return -1;
> > +	if (dso->data.file_size)
> > +		return 0;
> > +
> > +	pthread_mutex_lock(&dso__data_open_lock);
> > +
> > +	/*
> > +	 * dso->data.fd might be closed if other thread opened another
> > +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> > +	 */
> > +	if (dso->data.fd < 0) {
> > +		dso->data.fd = open_dso(dso, machine);
> > +		if (dso->data.fd < 0) {
> > +			ret = -errno;
> > +			dso->data.status = DSO_DATA_STATUS_ERROR;
> > +			goto out;
> >  		}
> > -		dso->data.file_size = st.st_size;
> >  	}
> >  
> > -	return 0;
> > +	if (fstat(dso->data.fd, &st) < 0) {
> > +		ret = -errno;
> > +		pr_err("dso cache fstat failed: %s\n",
> > +		       strerror_r(errno, sbuf, sizeof(sbuf)));
> > +		dso->data.status = DSO_DATA_STATUS_ERROR;
> > +		goto out;
> > +	}
> > +	dso->data.file_size = st.st_size;
> > +
> > +out:
> > +	pthread_mutex_unlock(&dso__data_open_lock);
> > +	return ret;
> >  }
> >  
> >  /**
> > @@ -652,17 +694,17 @@ off_t dso__data_size(struct dso *dso, struct machine *machine)
> >  	if (fd < 0)
> >  		return fd;
> >  
> > -	if (data_file_size(dso))
> > +	if (data_file_size(dso, machine))
> >  		return -1;
> >  
> >  	/* For now just estimate dso data size is close to file size */
> >  	return dso->data.file_size;
> >  }
> >  
> > -static ssize_t data_read_offset(struct dso *dso, u64 offset,
> > -				u8 *data, ssize_t size)
> > +static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
> > +				u64 offset, u8 *data, ssize_t size)
> >  {
> > -	if (data_file_size(dso))
> > +	if (data_file_size(dso, machine))
> >  		return -1;
> >  
> >  	/* Check the offset sanity. */
> > @@ -672,7 +714,7 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
> >  	if (offset + size < offset)
> >  		return -1;
> >  
> > -	return cached_read(dso, offset, data, size);
> > +	return cached_read(dso, machine, offset, data, size);
> >  }
> >  
> >  /**
> > @@ -689,10 +731,10 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
> >  ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
> >  			      u64 offset, u8 *data, ssize_t size)
> >  {
> > -	if (dso__data_fd(dso, machine) < 0)
> > +	if (dso->data.status == DSO_DATA_STATUS_ERROR)
> >  		return -1;
> >  
> > -	return data_read_offset(dso, offset, data, size);
> > +	return data_read_offset(dso, machine, offset, data, size);
> >  }
> >  
> >  /**
> > -- 
> > 2.2.2


* Re: [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind
  2015-01-29 12:38   ` Arnaldo Carvalho de Melo
@ 2015-01-29 13:23     ` Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-29 13:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

On Thu, Jan 29, 2015 at 09:38:31AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jan 29, 2015 at 05:07:21PM +0900, Namhyung Kim escreveu:
> > When libunwind tries to resolve callchains it needs to know the offset
> > of .eh_frame_hdr or .debug_frame to access the dso.  Since it calls
> > dso__data_fd(), it'll try to grab dso->lock everytime for same
> > information.  So save it to dso_data struct and reuse it.
> > 
> > Note that there's a window between dso__data_fd() and actual use of
> > the fd.  The fd could be closed by other threads to deal with the open
> > file limit in dso cache code.  But I think it's ok since in that case
> > elf_section_offset() will return 0 so it'll be tried in next acess.
> 
> I know that you did this in the context of your multi threading
> patchkit, but this seems useful even without that patckhit, i.e. this
> can be cherry picked on the grounds that it speeds up things by caching
> something that doesn't change, right?

Right.

> 
> I.e. I'll probably just rewrite the comment and apply it before
> considering the other patches, so that other people can comment on the
> other patches, etc.

Thanks for doing that!
Namhyung


> 
> - Arnaldo
>  
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/util/dso.h              |  1 +
> >  tools/perf/util/unwind-libunwind.c | 31 ++++++++++++++++++++-----------
> >  2 files changed, 21 insertions(+), 11 deletions(-)
> > 
> > diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> > index c18fcc0e8081..323ee08d56fc 100644
> > --- a/tools/perf/util/dso.h
> > +++ b/tools/perf/util/dso.h
> > @@ -141,6 +141,7 @@ struct dso {
> >  		u32		 status_seen;
> >  		size_t		 file_size;
> >  		struct list_head open_entry;
> > +		u64		 frame_offset;
> >  	} data;
> >  
> >  	union { /* Tool specific area */
> > diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
> > index 7ed6eaf232b6..3219b20837b5 100644
> > --- a/tools/perf/util/unwind-libunwind.c
> > +++ b/tools/perf/util/unwind-libunwind.c
> > @@ -266,14 +266,17 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
> >  				     u64 *fde_count)
> >  {
> >  	int ret = -EINVAL, fd;
> > -	u64 offset;
> > +	u64 offset = dso->data.frame_offset;
> >  
> > -	fd = dso__data_fd(dso, machine);
> > -	if (fd < 0)
> > -		return -EINVAL;
> > +	if (offset == 0) {
> > +		fd = dso__data_fd(dso, machine);
> > +		if (fd < 0)
> > +			return -EINVAL;
> >  
> > -	/* Check the .eh_frame section for unwinding info */
> > -	offset = elf_section_offset(fd, ".eh_frame_hdr");
> > +		/* Check the .eh_frame section for unwinding info */
> > +		offset = elf_section_offset(fd, ".eh_frame_hdr");
> > +		dso->data.frame_offset = offset;
> > +	}
> >  
> >  	if (offset)
> >  		ret = unwind_spec_ehframe(dso, machine, offset,
> > @@ -287,14 +290,20 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
> >  static int read_unwind_spec_debug_frame(struct dso *dso,
> >  					struct machine *machine, u64 *offset)
> >  {
> > -	int fd = dso__data_fd(dso, machine);
> > +	int fd;
> > +	u64 ofs = dso->data.frame_offset;
> >  
> > -	if (fd < 0)
> > -		return -EINVAL;
> > +	if (ofs == 0) {
> > +		fd = dso__data_fd(dso, machine);
> > +		if (fd < 0)
> > +			return -EINVAL;
> >  
> > -	/* Check the .debug_frame section for unwinding info */
> > -	*offset = elf_section_offset(fd, ".debug_frame");
> > +		/* Check the .debug_frame section for unwinding info */
> > +		ofs = elf_section_offset(fd, ".debug_frame");
> > +		dso->data.frame_offset = ofs;
> > +	}
> >  
> > +	*offset = ofs;
> >  	if (*offset)
> >  		return 0;
> >  
> > -- 
> > 2.2.2


* Re: [PATCH 29/42] perf tools: Protect dso cache fd with a mutex
  2015-01-29 13:19     ` Namhyung Kim
@ 2015-01-29 16:23       ` Arnaldo Carvalho de Melo
  2015-01-30  0:51         ` Namhyung Kim
  0 siblings, 1 reply; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 16:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Em Thu, Jan 29, 2015 at 10:19:38PM +0900, Namhyung Kim escreveu:
> On Thu, Jan 29, 2015 at 09:31:07AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jan 29, 2015 at 05:07:10PM +0900, Namhyung Kim escreveu:
> > > When dso cache is accessed in multi-thread environment, it's possible
> > > to close other dso->data.fd during operation due to open file limit.
> > > Protect the file descriptors using a separate mutex.
> > > 
> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > ---
> > >  tools/perf/tests/dso-data.c |   5 ++
> > >  tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
> > >  2 files changed, 94 insertions(+), 47 deletions(-)
> > > 
> > > diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
> > > index caaf37f079b1..0276e7d2d41b 100644
> > > --- a/tools/perf/tests/dso-data.c
> > > +++ b/tools/perf/tests/dso-data.c
> > > @@ -111,6 +111,9 @@ int test__dso_data(void)
> > >  	memset(&machine, 0, sizeof(machine));
> > >  
> > >  	dso = dso__new((const char *)file);
> > > +	TEST_ASSERT_VAL("failed to get dso", dso);
> > > +
> > > +	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > >  
> > >  	/* Basic 10 bytes tests. */
> > >  	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> > > @@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
> > >  
> > >  		dsos[i] = dso__new(file);
> > >  		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
> > > +
> > > +		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > 
> > Those two are unrelated, please put them in a separate patch, one that I
> > can even cherrypick ahead of the other patches.
> 
> It's a consequence of changing dso__data_read_offset() not to call
> dso__data_fd() due to a performance reason.  The binary_type was
> determined during the dso__data_fd() before, but now it needs to be
> set explicitly for this test.
> 
> In the original code, it was called everytime we access to the dso
> cache just to check an error, I guess.  But it's enough to check the
> status field.

Are you saying that this test should not rely on some function that is
called somewhere down the functions it uses and should instead do as you
do above?

I.e. if that is the case, then this stands out as a separate patch, if
not, if this is indeed really related to this patch (at first sight it
doesn't look like) then this explanation you give should be in the
patch comment log.

> > >  	return 0;
> > > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > > index 11ece224ef50..ae92046ae2c8 100644
> > > --- a/tools/perf/util/dso.c
> > > +++ b/tools/perf/util/dso.c
> > > @@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
> > >   */
> > >  static LIST_HEAD(dso__data_open);
> > >  static long dso__data_open_cnt;
> > > +static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
> > >  
> > >  static void dso__list_add(struct dso *dso)
> > >  {
> > > @@ -240,7 +241,7 @@ static int do_open(char *name)
> > >  		if (fd >= 0)
> > >  			return fd;
> > >  
> > > -		pr_debug("dso open failed, mmap: %s\n",
> > > +		pr_debug("dso open failed: %s\n",
> > >  			 strerror_r(errno, sbuf, sizeof(sbuf)));
> > >  		if (!dso__data_open_cnt || errno != EMFILE)
> > 
> > Ditto, another unrelated patch, please separate.
> 
> Ah, okay.  I kept it since it's just a small change.  But I'd like to
> separate if it helps reviewing.

Thanks

> > >  			break;
> > > @@ -382,7 +383,9 @@ static void check_data_close(void)
> > >   */
> > >  void dso__data_close(struct dso *dso)
> > >  {
> > > +	pthread_mutex_lock(&dso__data_open_lock);
> > >  	close_dso(dso);
> > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > >  }
> > >  
> > >  /**
> > > @@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > >  	if (dso->data.status == DSO_DATA_STATUS_ERROR)
> > >  		return -1;
> > >  
> > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > +
> > >  	if (dso->data.fd >= 0)
> > >  		goto out;
> > >  
> > > @@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > >  	else
> > >  		dso->data.status = DSO_DATA_STATUS_ERROR;
> > >  
> > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > >  	return dso->data.fd;
> > >  }
> > >  
> > > @@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
> > >  }
> > >  
> > >  static ssize_t
> > > -dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> > > +dso_cache__read(struct dso *dso, struct machine *machine,
> > > +		u64 offset, u8 *data, ssize_t size)
> > >  {
> > >  	struct dso_cache *cache;
> > >  	struct dso_cache *old;
> > > -	ssize_t ret;
> > > -
> > > -	do {
> > > -		u64 cache_offset;
> > 
> > While I understand that there was no need for this do { } while (0)
> > construct in the first place, removing it in this patch is not
> > interesting, as it is both unrelated to this patch and makes the it
> > harder to review by just looking at the patch :-\ Please refrain from
> > doing this in this patch.
> 
> Understood, sorry for bothering! :)

:-)

> > A later patch that does _just_ that could be done, if you feel like
> > doing it.
> 
> Okay.
> 
> 
> > 
> > > +	ssize_t ret = -EINVAL;
> > > +	u64 cache_offset;
> > >  
> > > -		ret = -ENOMEM;
> > > +	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > +	if (!cache)
> > > +		return -ENOMEM;
> > >  
> > > -		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > -		if (!cache)
> > > -			break;
> > > +	cache_offset = offset & DSO__DATA_CACHE_MASK;
> > >  
> > > -		cache_offset = offset & DSO__DATA_CACHE_MASK;
> > > -		ret = -EINVAL;
> > > +	pthread_mutex_lock(&dso__data_open_lock);
> > >  
> > > -		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> > > -			break;
> > > +	/*
> > > +	 * dso->data.fd might be closed if other thread opened another
> > > +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> > > +	 */
> > > +	if (dso->data.fd < 0) {
> > > +		dso->data.fd = open_dso(dso, machine);
> > 
> > Also please consider adding a backpointer to machine in the dso object,
> > since you need to reopen it, so that we don't have to go on passing
> > machine around to dso_cache__read(), etc.
> 
> Yeah, it's a pain to passing a machine pointer.

Hey, so the setting of dso->data.fd is protected in this function, and
we can be sure that it will not be closed _again_ just before we do
that lseek and the other operations?  I guess so, just checking...
 
> > 
> > This probably needs to be done in the patch that makes dso->data.fd to
> > be closed due to limit.
> 
> I don't know which patch you are refering..  It already closes an fd
> if it reaches the limit - what this patch does is protecting such
> concurrent open and close when multi-thread is used.

Ok then, i.e.:

A) take the lock, close it if over the limit, drop the lock.

B) take the lock, check if it is closed, open if so, use it, drop the
lock.

Is that right?

- Arnaldo


* Re: [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind
  2015-01-29  8:07 ` [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind Namhyung Kim
  2015-01-29 12:38   ` Arnaldo Carvalho de Melo
@ 2015-01-29 19:22   ` Arnaldo Carvalho de Melo
  2015-01-30 18:32   ` [tip:perf/core] perf callchain: Cache eh/ debug " tip-bot for Namhyung Kim
  2 siblings, 0 replies; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 19:22 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

Em Thu, Jan 29, 2015 at 05:07:21PM +0900, Namhyung Kim escreveu:
> When libunwind tries to resolve callchains it needs to know the offset
> of .eh_frame_hdr or .debug_frame to access the dso.  Since it calls
> dso__data_fd(), it'll try to grab dso->lock everytime for same
> information.  So save it to dso_data struct and reuse it.
> 
> Note that there's a window between dso__data_fd() and actual use of
> the fd.  The fd could be closed by other threads to deal with the open
> file limit in dso cache code.  But I think it's ok since in that case
> elf_section_offset() will return 0 so it'll be tried in next acess.

Thanks, applied after rewriting the changelog to read as:

---
    perf callchain: Cache eh/debug frame offset for dwarf unwind
    
    When libunwind tries to resolve callchains it needs to know the
    offset of .eh_frame_hdr or .debug_frame to access the dso.
    
    Since it will always return the same result for a given DSO, just
    cache the result as an optimization.
---
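
A minimal sketch of the caching idea in that changelog, under stated
assumptions: the demo_* names are hypothetical, and the real_offset
parameter merely stands in for whatever elf_section_offset() would
find in the file.  As in the patch, 0 doubles as "not cached yet", so
a failed lookup is simply retried on the next access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dso {
	uint64_t frame_offset;	/* 0 means "not cached yet" */
	int lookups;		/* how often the expensive lookup ran */
};

/* Hypothetical stand-in for the section lookup; 0 means "not found". */
static uint64_t demo_section_offset(struct demo_dso *dso,
				    uint64_t real_offset)
{
	dso->lookups++;
	return real_offset;
}

/* Return the cached offset, doing the lookup only while it is 0. */
static uint64_t demo_frame_offset(struct demo_dso *dso,
				  uint64_t real_offset)
{
	assert(dso != NULL);

	if (dso->frame_offset == 0)
		dso->frame_offset = demo_section_offset(dso, real_offset);

	return dso->frame_offset;
}
```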

- Arnaldo


* Re: [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2)
  2015-01-29  8:06 [RFC/PATCHSET 00/42] perf tools: Speed-up perf report by using multi thread (v2) Namhyung Kim
                   ` (41 preceding siblings ...)
  2015-01-29  8:07 ` [PATCH 42/42] perf data: Implement 'index' subcommand Namhyung Kim
@ 2015-01-29 19:56 ` Arnaldo Carvalho de Melo
  42 siblings, 0 replies; 221+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-29 19:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker


Applied 1-5 and the dwarf unwind caching one; will look at the others.

- Arnaldo


* Re: [PATCH 29/42] perf tools: Protect dso cache fd with a mutex
  2015-01-29 16:23       ` Arnaldo Carvalho de Melo
@ 2015-01-30  0:51         ` Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-01-30  0:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern,
	Adrian Hunter, Andi Kleen, Stephane Eranian, Frederic Weisbecker

On Thu, Jan 29, 2015 at 01:23:33PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jan 29, 2015 at 10:19:38PM +0900, Namhyung Kim escreveu:
> > On Thu, Jan 29, 2015 at 09:31:07AM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Jan 29, 2015 at 05:07:10PM +0900, Namhyung Kim escreveu:
> > > > When dso cache is accessed in multi-thread environment, it's possible
> > > > to close other dso->data.fd during operation due to open file limit.
> > > > Protect the file descriptors using a separate mutex.
> > > > 
> > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > > > ---
> > > >  tools/perf/tests/dso-data.c |   5 ++
> > > >  tools/perf/util/dso.c       | 136 +++++++++++++++++++++++++++++---------------
> > > >  2 files changed, 94 insertions(+), 47 deletions(-)
> > > > 
> > > > diff --git a/tools/perf/tests/dso-data.c b/tools/perf/tests/dso-data.c
> > > > index caaf37f079b1..0276e7d2d41b 100644
> > > > --- a/tools/perf/tests/dso-data.c
> > > > +++ b/tools/perf/tests/dso-data.c
> > > > @@ -111,6 +111,9 @@ int test__dso_data(void)
> > > >  	memset(&machine, 0, sizeof(machine));
> > > >  
> > > >  	dso = dso__new((const char *)file);
> > > > +	TEST_ASSERT_VAL("failed to get dso", dso);
> > > > +
> > > > +	dso->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > > >  
> > > >  	/* Basic 10 bytes tests. */
> > > >  	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
> > > > @@ -199,6 +202,8 @@ static int dsos__create(int cnt, int size)
> > > >  
> > > >  		dsos[i] = dso__new(file);
> > > >  		TEST_ASSERT_VAL("failed to get dso", dsos[i]);
> > > > +
> > > > +		dsos[i]->binary_type = DSO_BINARY_TYPE__SYSTEM_PATH_DSO;
> > > 
> > > Those two are unrelated, please put them in a separate patch, one that I
> > > can even cherrypick ahead of the other patches.
> > 
> > It's a consequence of changing dso__data_read_offset() not to call
> > dso__data_fd() due to a performance reason.  The binary_type was
> > determined during the dso__data_fd() before, but now it needs to be
> > set explicitly for this test.
> > 
> > In the original code, it was called everytime we access to the dso
> > cache just to check an error, I guess.  But it's enough to check the
> > status field.
> 
> Are you saying that this test should not rely on some function that is
> called somewhere down the functions it uses and should instead do as you
> do above?
> 
> I.e. if that is the case, then this stands out as a separate patch, if
> not, if this is indeed really related to this patch (at first sight it
> doesn't look like) then this explanation you give should be in the
> patch comment log.

I think it'd be better to add a dso__data_fd() call after dso__new()
as an extra validation step.  With that, we don't need to specify the
binary type manually, and the usage pattern is more consistent.  I'll
do it as a separate patch.


> 
> > > >  	return 0;
> > > > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > > > index 11ece224ef50..ae92046ae2c8 100644
> > > > --- a/tools/perf/util/dso.c
> > > > +++ b/tools/perf/util/dso.c
> > > > @@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
> > > >   */
> > > >  static LIST_HEAD(dso__data_open);
> > > >  static long dso__data_open_cnt;
> > > > +static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
> > > >  
> > > >  static void dso__list_add(struct dso *dso)
> > > >  {
> > > > @@ -240,7 +241,7 @@ static int do_open(char *name)
> > > >  		if (fd >= 0)
> > > >  			return fd;
> > > >  
> > > > -		pr_debug("dso open failed, mmap: %s\n",
> > > > +		pr_debug("dso open failed: %s\n",
> > > >  			 strerror_r(errno, sbuf, sizeof(sbuf)));
> > > >  		if (!dso__data_open_cnt || errno != EMFILE)
> > > 
> > > Ditto, another unrelated patch, please separate.
> > 
> > Ah, okay.  I kept it since it's just a small change.  But I'd like to
> > separate if it helps reviewing.
> 
> Thanks
> 
> > > >  			break;
> > > > @@ -382,7 +383,9 @@ static void check_data_close(void)
> > > >   */
> > > >  void dso__data_close(struct dso *dso)
> > > >  {
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > >  	close_dso(dso);
> > > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > > >  }
> > > >  
> > > >  /**
> > > > @@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > >  	if (dso->data.status == DSO_DATA_STATUS_ERROR)
> > > >  		return -1;
> > > >  
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > > +
> > > >  	if (dso->data.fd >= 0)
> > > >  		goto out;
> > > >  
> > > > @@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
> > > >  	else
> > > >  		dso->data.status = DSO_DATA_STATUS_ERROR;
> > > >  
> > > > +	pthread_mutex_unlock(&dso__data_open_lock);
> > > >  	return dso->data.fd;
> > > >  }
> > > >  
> > > > @@ -531,52 +537,66 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
> > > >  }
> > > >  
> > > >  static ssize_t
> > > > -dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
> > > > +dso_cache__read(struct dso *dso, struct machine *machine,
> > > > +		u64 offset, u8 *data, ssize_t size)
> > > >  {
> > > >  	struct dso_cache *cache;
> > > >  	struct dso_cache *old;
> > > > -	ssize_t ret;
> > > > -
> > > > -	do {
> > > > -		u64 cache_offset;
> > > 
> > > While I understand that there was no need for this do { } while (0)
> > > construct in the first place, removing it in this patch is not
> > > interesting, as it is both unrelated to this patch and makes the it
> > > harder to review by just looking at the patch :-\ Please refrain from
> > > doing this in this patch.
> > 
> > Understood, sorry for bothering! :)
> 
> :-)
> 
> > > A later patch that does _just_ that could be done, if you feel like
> > > doing it.
> > 
> > Okay.
> > 
> > 
> > > 
> > > > +	ssize_t ret = -EINVAL;
> > > > +	u64 cache_offset;
> > > >  
> > > > -		ret = -ENOMEM;
> > > > +	cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > > +	if (!cache)
> > > > +		return -ENOMEM;
> > > >  
> > > > -		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
> > > > -		if (!cache)
> > > > -			break;
> > > > +	cache_offset = offset & DSO__DATA_CACHE_MASK;
> > > >  
> > > > -		cache_offset = offset & DSO__DATA_CACHE_MASK;
> > > > -		ret = -EINVAL;
> > > > +	pthread_mutex_lock(&dso__data_open_lock);
> > > >  
> > > > -		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
> > > > -			break;
> > > > +	/*
> > > > +	 * dso->data.fd might be closed if other thread opened another
> > > > +	 * file (dso) due to open file limit (RLIMIT_NOFILE).
> > > > +	 */
> > > > +	if (dso->data.fd < 0) {
> > > > +		dso->data.fd = open_dso(dso, machine);
> > > 
> > > Also please consider adding a backpointer to machine in the dso object,
> > > since you need to reopen it, so that we don't have to go on passing
> > > machine around to dso_cache__read(), etc.
> > 
> > Yeah, it's a pain to passing a machine pointer.
> 
> Hey, so setting of dso->data.fd is protected in this function and we can
> be sure that it will not be closed _again_ just before we do that lseek
> and other operations, I guess so, just checking...

Right.  Accessing dso->data.fd is safe only while holding
dso__data_open_lock.


>  
> > > 
> > > This probably needs to be done in the patch that makes dso->data.fd to
> > > be closed due to limit.
> > 
> > I don't know which patch you are refering..  It already closes an fd
> > if it reaches the limit - what this patch does is protecting such
> > concurrent open and close when multi-thread is used.
> 
> Ok then, i.e.:
> 
> A) take the lock, close it if over the limit, drop the lock.
> 
> B) take the lock, check if it is closed, open if so, use it, drop the
> lock.
> 
> Is that right?

It's like this:

A) take the lock, check if it's still open, use it and drop the lock.

B) take the lock, find it closed, reopen it and check whether the open
count has reached the limit.  If so, close the first (kind of LRU) dso
so that the next open() succeeds.  Use it and drop the lock.
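
The two cases above can be sketched roughly like this.  It is a
simplified illustration, not the code in dso.c: the demo_* names and
the fixed limit of 2 are assumptions (perf derives its limit from
RLIMIT_NOFILE), and the list is ordered by open time rather than by
last use.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_MAX_OPEN 2		/* hypothetical open-file limit */

struct demo_dso {
	const char *path;
	int fd;			/* -1 while closed */
};

static pthread_mutex_t demo_open_lock = PTHREAD_MUTEX_INITIALIZER;
/* Open entries, oldest first; one spare slot for the new entry. */
static struct demo_dso *demo_open_list[DEMO_MAX_OPEN + 1];
static int demo_open_cnt;

/* Caller must hold demo_open_lock.  Closes the oldest open entry. */
static void demo_close_oldest(void)
{
	struct demo_dso *victim = demo_open_list[0];
	int i;

	assert(demo_open_cnt > 0);
	close(victim->fd);
	victim->fd = -1;
	for (i = 1; i < demo_open_cnt; i++)
		demo_open_list[i - 1] = demo_open_list[i];
	demo_open_cnt--;
}

/* Read len bytes at offset off; returns bytes read or -1 on error. */
static ssize_t demo_read_at(struct demo_dso *dso, off_t off,
			    void *buf, size_t len)
{
	ssize_t ret = -1;

	pthread_mutex_lock(&demo_open_lock);

	if (dso->fd < 0) {
		/* Case B: another thread's open made us lose our fd. */
		dso->fd = open(dso->path, O_RDONLY);
		if (dso->fd < 0)
			goto out;
		demo_open_list[demo_open_cnt++] = dso;
		if (demo_open_cnt > DEMO_MAX_OPEN)
			demo_close_oldest();
	}

	/* Cases A and B: the fd cannot be closed while we hold the lock. */
	if (lseek(dso->fd, off, SEEK_SET) != (off_t)-1)
		ret = read(dso->fd, buf, len);
out:
	pthread_mutex_unlock(&demo_open_lock);
	return ret;
}
```

The point is that the check, the reopen and the lseek()/read() pair
all happen inside one critical section, so no other thread can close
the fd in between.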

Thanks,
Namhyung


* Re: [PATCH 01/42] perf tools: Support to read compressed module from build-id cache
  2015-01-29  8:06 ` [PATCH 01/42] perf tools: Support to read compressed module from build-id cache Namhyung Kim
@ 2015-01-30 14:32   ` Jiri Olsa
  2015-02-02 15:03     ` Namhyung Kim
  2015-01-30 18:33   ` [tip:perf/core] perf symbols: " tip-bot for Namhyung Kim
  1 sibling, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-01-30 14:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Adrian Hunter, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Thu, Jan 29, 2015 at 05:06:42PM +0900, Namhyung Kim wrote:
> The commit c00c48fc6e6e ("perf symbols: Preparation for compressed
> kernel module support") added support for compressed kernel modules
> but it only supports system path DSOs.  When a dso is read from
> build-id cache, its filename doesn't end with ".gz" but has build-id.
> In this case, we should fall back to the original dso->name.
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/symbol-elf.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 06fcd1bf98b6..b24f9d8727a8 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -574,13 +574,16 @@ static int decompress_kmodule(struct dso *dso, const char *name,
>  	const char *ext = strrchr(name, '.');
>  	char tmpbuf[] = "/tmp/perf-kmod-XXXXXX";
>  
> -	if ((type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
> -	     type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP) ||
> -	    type != dso->symtab_type)
> +	if (type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
> +	    type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP &&
> +	    type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
>  		return -1;

hum, is it possible the type == DSO_BINARY_TYPE__BUILD_ID_CACHE could get in here?


---
        for (i = 0; i < DSO_BINARY_TYPE__SYMTAB_CNT; i++) {
                struct symsrc *ss = &ss_[ss_pos];
                bool next_slot = false;

                enum dso_binary_type symtab_type = binary_type_symtab[i];

                if (!dso__is_compatible_symtab_type(dso, kmod, symtab_type))
                        continue;

---		^^^ this check should rule out buildid symtab_type for kmod dso?

		symsrc__init(
			

I wonder whether we should set a special type for compressed binaries (as now),
or instead try to decompress anything that looks like it's compressed ;-)
that seems more generic and could simplify the code..
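
For gzip, "looks like it's compressed" could mean checking the two
magic bytes at the start of the file (0x1f 0x8b, per RFC 1952) rather
than the file name.  A rough sketch only: buf_is_gzip() and
looks_gzipped() are made-up helpers, not existing perf functions, and
other compression formats would need their own checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: true if the buffer starts with the gzip magic. */
static bool buf_is_gzip(const unsigned char *buf, size_t len)
{
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Hypothetical helper: sniff the first two bytes of a file. */
bool looks_gzipped(const char *path)
{
	unsigned char magic[2];
	FILE *fp = fopen(path, "rb");
	size_t n;

	if (!fp)
		return false;	/* unreadable: treat as not compressed */
	n = fread(magic, 1, sizeof(magic), fp);
	fclose(fp);
	return buf_is_gzip(magic, n);
}
```

With something like this the build-id cache case would not depend on
the file name ending in ".gz" at all.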

jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 08/42] perf tools: Add rm_rf() utility function
  2015-01-29  8:06 ` [PATCH 08/42] perf tools: Add rm_rf() utility function Namhyung Kim
@ 2015-01-30 15:02   ` Jiri Olsa
  2015-05-20 12:24     ` [tip:perf/core] " tip-bot for Namhyung Kim
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-01-30 15:02 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Adrian Hunter, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Thu, Jan 29, 2015 at 05:06:49PM +0900, Namhyung Kim wrote:
> The rm_rf() function does the same as the shell command 'rm -rf', which
> removes all directory entries recursively.

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf callchain: Cache eh/debug frame offset for dwarf unwind
  2015-01-29  8:07 ` [PATCH 40/42] perf callchain: Save eh/debug frame offset for dwarf unwind Namhyung Kim
  2015-01-29 12:38   ` Arnaldo Carvalho de Melo
  2015-01-29 19:22   ` Arnaldo Carvalho de Melo
@ 2015-01-30 18:32   ` tip-bot for Namhyung Kim
  2 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: a.p.zijlstra, fweisbec, hpa, andi, dsahern, eranian, jolsa,
	adrian.hunter, mingo, linux-kernel, tglx, acme, namhyung

Commit-ID:  f1f13af99a903ae873f5373e965508e0486c1c29
Gitweb:     http://git.kernel.org/tip/f1f13af99a903ae873f5373e965508e0486c1c29
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:07:21 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:20:42 -0300

perf callchain: Cache eh/debug frame offset for dwarf unwind

When libunwind tries to resolve callchains it needs to know the offset
of .eh_frame_hdr or .debug_frame to access the dso.

Since it will always return the same result for a given DSO, just cache
the result as an optimization.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-41-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/dso.h              |  1 +
 tools/perf/util/unwind-libunwind.c | 31 ++++++++++++++++++++-----------
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3782c82..ced9284 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -139,6 +139,7 @@ struct dso {
 		u32		 status_seen;
 		size_t		 file_size;
 		struct list_head open_entry;
+		u64		 frame_offset;
 	} data;
 
 	union { /* Tool specific area */
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 6edf535..e3c40a5 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -266,14 +266,17 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
 				     u64 *fde_count)
 {
 	int ret = -EINVAL, fd;
-	u64 offset;
+	u64 offset = dso->data.frame_offset;
 
-	fd = dso__data_fd(dso, machine);
-	if (fd < 0)
-		return -EINVAL;
+	if (offset == 0) {
+		fd = dso__data_fd(dso, machine);
+		if (fd < 0)
+			return -EINVAL;
 
-	/* Check the .eh_frame section for unwinding info */
-	offset = elf_section_offset(fd, ".eh_frame_hdr");
+		/* Check the .eh_frame section for unwinding info */
+		offset = elf_section_offset(fd, ".eh_frame_hdr");
+		dso->data.frame_offset = offset;
+	}
 
 	if (offset)
 		ret = unwind_spec_ehframe(dso, machine, offset,
@@ -287,14 +290,20 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
 static int read_unwind_spec_debug_frame(struct dso *dso,
 					struct machine *machine, u64 *offset)
 {
-	int fd = dso__data_fd(dso, machine);
+	int fd;
+	u64 ofs = dso->data.frame_offset;
 
-	if (fd < 0)
-		return -EINVAL;
+	if (ofs == 0) {
+		fd = dso__data_fd(dso, machine);
+		if (fd < 0)
+			return -EINVAL;
 
-	/* Check the .debug_frame section for unwinding info */
-	*offset = elf_section_offset(fd, ".debug_frame");
+		/* Check the .debug_frame section for unwinding info */
+		ofs = elf_section_offset(fd, ".debug_frame");
+		dso->data.frame_offset = ofs;
+	}
 
+	*offset = ofs;
 	if (*offset)
 		return 0;
 

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf tools: Do not use __perf_session__process_events() directly
  2015-01-29  8:06 ` [PATCH 02/42] perf tools: Do not use __perf_session__process_events() directly Namhyung Kim
@ 2015-01-30 18:32   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, namhyung, eranian, adrian.hunter, mingo, hpa, andi,
	dsahern, a.p.zijlstra, acme, fweisbec, tglx, jolsa

Commit-ID:  4ac30cf74b308fb01338e660d3471cd490a7958a
Gitweb:     http://git.kernel.org/tip/4ac30cf74b308fb01338e660d3471cd490a7958a
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:43 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:36:32 -0300

perf tools: Do not use __perf_session__process_events() directly

It's only used by perf record to process build-ids, because the file size
is not fixed at that point due to the remaining header features.

However the data offset and size are available, so we can use
perf_session__process_events() once we set the file size to the current
offset, as is done now.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 7 +++----
 tools/perf/util/session.c   | 6 +++---
 tools/perf/util/session.h   | 3 ---
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8648c6d..1134de2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -194,12 +194,13 @@ static int process_buildids(struct record *rec)
 {
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
-	u64 start = session->header.data_offset;
 
 	u64 size = lseek(file->fd, 0, SEEK_CUR);
 	if (size == 0)
 		return 0;
 
+	file->size = size;
+
 	/*
 	 * During this process, it'll load kernel map and replace the
 	 * dso->long_name to a real pathname it found.  In this case
@@ -211,9 +212,7 @@ static int process_buildids(struct record *rec)
 	 */
 	symbol_conf.ignore_vmlinux_buildid = true;
 
-	return __perf_session__process_events(session, start,
-					      size - start,
-					      size, &build_id__mark_dso_hit_ops);
+	return perf_session__process_events(session, &build_id__mark_dso_hit_ops);
 }
 
 static void perf_event__synthesize_guest_os(struct machine *machine, void *data)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b0ce3d6..0baf75f 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1251,9 +1251,9 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-int __perf_session__process_events(struct perf_session *session,
-				   u64 data_offset, u64 data_size,
-				   u64 file_size, struct perf_tool *tool)
+static int __perf_session__process_events(struct perf_session *session,
+					  u64 data_offset, u64 data_size,
+					  u64 file_size, struct perf_tool *tool)
 {
 	int fd = perf_data_file__fd(session->file);
 	u64 head, page_offset, file_offset, file_pos, size;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index dc26ebf..6d663dc 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -49,9 +49,6 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 			     union perf_event **event_ptr,
 			     struct perf_sample *sample);
 
-int __perf_session__process_events(struct perf_session *session,
-				   u64 data_offset, u64 data_size, u64 size,
-				   struct perf_tool *tool);
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool);
 

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf record: Show precise number of samples
  2015-01-29  8:06 ` [PATCH 03/42] perf record: Show precise number of samples Namhyung Kim
@ 2015-01-30 18:32   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, andi, mail, eranian, adrian.hunter, namhyung, mingo,
	a.p.zijlstra, fweisbec, jolsa, tglx, dsahern, hpa, linux-kernel

Commit-ID:  e3d5911221f5cf71e1f0306256d4e42d34a365d2
Gitweb:     http://git.kernel.org/tip/e3d5911221f5cf71e1f0306256d4e42d34a365d2
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:44 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:37:20 -0300

perf record: Show precise number of samples

After perf record finishes, it prints file size and number of samples in
the file but this info is wrong since it assumes typical sample size of
24 bytes and divides file size by the value.

However as we post-process recorded samples for build-id, it can show
correct number like below.  If build-id post-processing is not requested
just omit the wrong number of samples.

  $ perf record noploop 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.159 MB perf.data (3989 samples) ]

  $ perf report --stdio -n
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Samples: 3K of event 'cycles'
  # Event count (approx.): 3771330663
  #
  # Overhead       Samples  Command  Shared Object     Symbol
  # ........  ............  .......  ................  ..........................
  #
      99.90%          3982  noploop  noploop           [.] main
       0.09%             1  noploop  ld-2.17.so        [.] _dl_check_map_versions
       0.01%             1  noploop  [kernel.vmlinux]  [k] setup_arg_pages
       0.00%             5  noploop  [kernel.vmlinux]  [k] intel_pmu_enable_all

Reported-by: Milian Wolff <mail@milianw.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 51 ++++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1134de2..9900b43 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -190,6 +190,19 @@ out:
 	return rc;
 }
 
+static int process_sample_event(struct perf_tool *tool,
+				union perf_event *event,
+				struct perf_sample *sample,
+				struct perf_evsel *evsel,
+				struct machine *machine)
+{
+	struct record *rec = container_of(tool, struct record, tool);
+
+	rec->samples++;
+
+	return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
+}
+
 static int process_buildids(struct record *rec)
 {
 	struct perf_data_file *file  = &rec->file;
@@ -212,7 +225,7 @@ static int process_buildids(struct record *rec)
 	 */
 	symbol_conf.ignore_vmlinux_buildid = true;
 
-	return perf_session__process_events(session, &build_id__mark_dso_hit_ops);
+	return perf_session__process_events(session, &rec->tool);
 }
 
 static void perf_event__synthesize_guest_os(struct machine *machine, void *data)
@@ -503,19 +516,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
-	if (!quiet) {
+	if (!quiet)
 		fprintf(stderr, "[ perf record: Woken up %ld times to write data ]\n", waking);
 
-		/*
-		 * Approximate RIP event size: 24 bytes.
-		 */
-		fprintf(stderr,
-			"[ perf record: Captured and wrote %.3f MB %s (~%" PRIu64 " samples) ]\n",
-			(double)rec->bytes_written / 1024.0 / 1024.0,
-			file->path,
-			rec->bytes_written / 24);
-	}
-
 out_child:
 	if (forks) {
 		int exit_status;
@@ -534,6 +537,9 @@ out_child:
 	} else
 		status = err;
 
+	/* this will be recalculated during process_buildids() */
+	rec->samples = 0;
+
 	if (!err && !file->is_pipe) {
 		rec->session->header.data_size += rec->bytes_written;
 
@@ -543,6 +549,20 @@ out_child:
 					   file->fd, true);
 	}
 
+	if (!err && !quiet) {
+		char samples[128];
+
+		if (rec->samples)
+			scnprintf(samples, sizeof(samples),
+				  " (%" PRIu64 " samples)", rec->samples);
+		else
+			samples[0] = '\0';
+
+		fprintf(stderr,	"[ perf record: Captured and wrote %.3f MB %s%s ]\n",
+			perf_data_file__size(file) / 1024.0 / 1024.0,
+			file->path, samples);
+	}
+
 out_delete_session:
 	perf_session__delete(session);
 	return status;
@@ -719,6 +739,13 @@ static struct record record = {
 			.default_per_cpu = true,
 		},
 	},
+	.tool = {
+		.sample		= process_sample_event,
+		.fork		= perf_event__process_fork,
+		.comm		= perf_event__process_comm,
+		.mmap		= perf_event__process_mmap,
+		.mmap2		= perf_event__process_mmap2,
+	},
 };
 
 #define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf header: Set header version correctly
  2015-01-29  8:06 ` [PATCH 04/42] perf header: Set header version correctly Namhyung Kim
@ 2015-01-30 18:33   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dsahern, linux-kernel, namhyung, a.p.zijlstra, acme, mingo,
	eranian, fweisbec, andi, adrian.hunter, jolsa, hpa, tglx

Commit-ID:  f7913971bdad1a72c6158074786babed477d61e2
Gitweb:     http://git.kernel.org/tip/f7913971bdad1a72c6158074786babed477d61e2
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:45 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:53:11 -0300

perf header: Set header version correctly

When check_magic_endian() is called, it checks the magic number in the
perf data file to determine version and endianness.  But if the file has
the same endianness as the host, the version number wasn't updated, which
caused confusion.
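
A simplified sketch of the fixed control flow, for illustration only:
struct hdr and check_magic() are made-up names rather than the real
header.c code, and the two constants are the "PERFILE2" magic as a u64
on a little-endian host and its byte-swapped form.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAGIC2		0x32454c4946524550ULL	/* "PERFILE2", little-endian host */
#define MAGIC2_SW	0x50455246494c4532ULL	/* same bytes, swapped */

struct hdr {			/* hypothetical stand-in for struct perf_header */
	int version;
	bool needs_swap;
};

int check_magic(uint64_t magic, struct hdr *h)
{
	/* set the version up front so the same-endian early return keeps it */
	h->version = 2;

	if (magic == MAGIC2)
		return 0;	/* same endianness: before the fix, version was left unset here */

	if (magic != MAGIC2_SW)
		return -1;	/* not a perf.data v2 magic at all */

	h->needs_swap = true;
	return 0;
}
```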

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index b20e40c..1f407f7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2237,6 +2237,7 @@ static int check_magic_endian(u64 magic, uint64_t hdr_sz,
 	 * - unique number to identify actual perf.data files
 	 * - encode endianness of file
 	 */
+	ph->version = PERF_HEADER_VERSION_2;
 
 	/* check magic number with one endianness */
 	if (magic == __perf_magic2)
@@ -2247,7 +2248,6 @@ static int check_magic_endian(u64 magic, uint64_t hdr_sz,
 		return -1;
 
 	ph->needs_swap = true;
-	ph->version = PERF_HEADER_VERSION_2;
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf evsel: Set attr.task bit for a tracking event
  2015-01-29  8:06 ` [PATCH 05/42] perf tools: Set attr.task bit for a tracking event Namhyung Kim
@ 2015-01-30 18:33   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, hpa, tglx, acme, namhyung, dsahern, andi, jolsa,
	fweisbec, mingo, a.p.zijlstra, eranian, linux-kernel

Commit-ID:  62e503b7ed98fcdf16308cda0b5378e7840f4339
Gitweb:     http://git.kernel.org/tip/62e503b7ed98fcdf16308cda0b5378e7840f4339
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:46 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:54:59 -0300

perf evsel: Set attr.task bit for a tracking event

The perf_event_attr.task bit is to track task (fork and exit) events but
it was not set by perf_evsel__config().  While it was not a problem
in practice since setting other bits (comm/mmap) ended up with the same
result, it'd be good to set it explicitly anyway.

The attr->task is to track task related events (fork/exit) only but
other meta events like comm and mmap[2] also need the task events.  So
setting attr->comm and/or attr->mmap causes the kernel to emit the task
events anyway.  So the attr->task is only meaningful when other bits are
off but I'd like to set it for completeness.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1d826d6..ea51a90 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -709,6 +709,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 	if (opts->sample_weight)
 		perf_evsel__set_sample_bit(evsel, WEIGHT);
 
+	attr->task  = track;
 	attr->mmap  = track;
 	attr->mmap2 = track && !perf_missing_features.mmap2;
 	attr->comm  = track;

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf symbols: Support to read compressed module from build-id cache
  2015-01-29  8:06 ` [PATCH 01/42] perf tools: Support to read compressed module from build-id cache Namhyung Kim
  2015-01-30 14:32   ` Jiri Olsa
@ 2015-01-30 18:33   ` tip-bot for Namhyung Kim
  1 sibling, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, namhyung, mingo, dsahern, hpa, eranian, acme,
	fweisbec, jolsa, linux-kernel, tglx, a.p.zijlstra, andi

Commit-ID:  0b064f43001fc2627ff8c3020647b85db040235f
Gitweb:     http://git.kernel.org/tip/0b064f43001fc2627ff8c3020647b85db040235f
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:42 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:56:54 -0300

perf symbols: Support to read compressed module from build-id cache

The commit c00c48fc6e6e ("perf symbols: Preparation for compressed
kernel module support") added support for compressed kernel modules but
it only supports system path DSOs.  When a dso is read from build-id
cache, its filename doesn't end with ".gz" but has build-id.  In this
case, we should fall back to the original dso->name.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/symbol-elf.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 06fcd1b..b24f9d8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -574,13 +574,16 @@ static int decompress_kmodule(struct dso *dso, const char *name,
 	const char *ext = strrchr(name, '.');
 	char tmpbuf[] = "/tmp/perf-kmod-XXXXXX";
 
-	if ((type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
-	     type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP) ||
-	    type != dso->symtab_type)
+	if (type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
+	    type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP &&
+	    type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
 		return -1;
 
-	if (!ext || !is_supported_compression(ext + 1))
-		return -1;
+	if (!ext || !is_supported_compression(ext + 1)) {
+		ext = strrchr(dso->name, '.');
+		if (!ext || !is_supported_compression(ext + 1))
+			return -1;
+	}
 
 	fd = mkstemp(tmpbuf);
 	if (fd < 0)

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf tools: Use perf_data_file__fd() consistently
  2015-01-29  8:06 ` [PATCH 07/42] perf tools: Use perf_data_file__fd() consistently Namhyung Kim
@ 2015-01-30 18:33   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, dsahern, acme, linux-kernel, eranian, fweisbec,
	a.p.zijlstra, andi, mingo, adrian.hunter, tglx, namhyung, jolsa

Commit-ID:  42aa276f40730211383e9a9923416f1fb9841d68
Gitweb:     http://git.kernel.org/tip/42aa276f40730211383e9a9923416f1fb9841d68
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:06:48 +0900
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 16:58:24 -0300

perf tools: Use perf_data_file__fd() consistently

Do not reference file->fd directly since we want to hide the
implementation details from outside for possible future changes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-inject.c |  5 +++--
 tools/perf/builtin-record.c | 14 +++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 84df2de..a13641e 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -343,6 +343,7 @@ static int __cmd_inject(struct perf_inject *inject)
 	int ret = -EINVAL;
 	struct perf_session *session = inject->session;
 	struct perf_data_file *file_out = &inject->output;
+	int fd = perf_data_file__fd(file_out);
 
 	signal(SIGINT, sig_handler);
 
@@ -376,7 +377,7 @@ static int __cmd_inject(struct perf_inject *inject)
 	}
 
 	if (!file_out->is_pipe)
-		lseek(file_out->fd, session->header.data_offset, SEEK_SET);
+		lseek(fd, session->header.data_offset, SEEK_SET);
 
 	ret = perf_session__process_events(session, &inject->tool);
 
@@ -385,7 +386,7 @@ static int __cmd_inject(struct perf_inject *inject)
 			perf_header__set_feat(&session->header,
 					      HEADER_BUILD_ID);
 		session->header.data_size = inject->bytes_written;
-		perf_session__write_header(session, session->evlist, file_out->fd, true);
+		perf_session__write_header(session, session->evlist, fd, true);
 	}
 
 	return ret;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 9900b43..404ab34 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -208,7 +208,7 @@ static int process_buildids(struct record *rec)
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
 
-	u64 size = lseek(file->fd, 0, SEEK_CUR);
+	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
 	if (size == 0)
 		return 0;
 
@@ -334,6 +334,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	struct perf_data_file *file = &rec->file;
 	struct perf_session *session;
 	bool disabled = false, draining = false;
+	int fd;
 
 	rec->progname = argv[0];
 
@@ -348,6 +349,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		return -1;
 	}
 
+	fd = perf_data_file__fd(file);
 	rec->session = session;
 
 	record__init_features(rec);
@@ -372,12 +374,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		perf_header__clear_feat(&session->header, HEADER_GROUP_DESC);
 
 	if (file->is_pipe) {
-		err = perf_header__write_pipe(file->fd);
+		err = perf_header__write_pipe(fd);
 		if (err < 0)
 			goto out_child;
 	} else {
-		err = perf_session__write_header(session, rec->evlist,
-						 file->fd, false);
+		err = perf_session__write_header(session, rec->evlist, fd, false);
 		if (err < 0)
 			goto out_child;
 	}
@@ -409,7 +410,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			 * return this more properly and also
 			 * propagate errors that now are calling die()
 			 */
-			err = perf_event__synthesize_tracing_data(tool, file->fd, rec->evlist,
+			err = perf_event__synthesize_tracing_data(tool,	fd, rec->evlist,
 								  process_synthesized_event);
 			if (err <= 0) {
 				pr_err("Couldn't record tracing data.\n");
@@ -545,8 +546,7 @@ out_child:
 
 		if (!rec->no_buildid)
 			process_buildids(rec);
-		perf_session__write_header(rec->session, rec->evlist,
-					   file->fd, true);
+		perf_session__write_header(rec->session, rec->evlist, fd, true);
 	}
 
 	if (!err && !quiet) {

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [tip:perf/core] perf symbols: Convert lseek + read to pread
  2015-01-29  8:07 ` [PATCH 39/42] perf tools: Convert lseek + read to pread Namhyung Kim
@ 2015-01-30 18:34   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-01-30 18:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: a.p.zijlstra, fweisbec, linux-kernel, eranian, mingo, tglx, acme,
	dsahern, jolsa, hpa, andi, adrian.hunter, namhyung

Commit-ID:  c52686f9f888d23ca72f1309e86af8e91d075697
Gitweb:     http://git.kernel.org/tip/c52686f9f888d23ca72f1309e86af8e91d075697
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Thu, 29 Jan 2015 17:02:01 -0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 29 Jan 2015 17:02:01 -0300

perf symbols: Convert lseek + read to pread

When dso_cache__read() is called, it reads data from the given offset
using lseek + normal read syscall.  These can be combined into a single
pread syscall.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1422518843-25818-40-git-send-email-namhyung@kernel.org
[ Fixed it up when cherry picking it from the multi threaded patchkit ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/dso.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 45be944..c2f7d3b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -532,12 +532,8 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 			break;
 
 		cache_offset = offset & DSO__DATA_CACHE_MASK;
-		ret = -EINVAL;
 
-		if (-1 == lseek(dso->data.fd, cache_offset, SEEK_SET))
-			break;
-
-		ret = read(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE);
+		ret = pread(dso->data.fd, cache->data, DSO__DATA_CACHE_SIZE, cache_offset);
 		if (ret <= 0)
 			break;
 

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-01-29  8:06 ` [PATCH 14/42] perf record: Add --index option for building index table Namhyung Kim
@ 2015-02-01 18:06   ` Jiri Olsa
  2015-02-02  8:34     ` Adrian Hunter
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-02-01 18:06 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Adrian Hunter, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Thu, Jan 29, 2015 at 05:06:55PM +0900, Namhyung Kim wrote:
> The new --index option will create an indexed data file which can be
> processed by multiple threads in parallel.  It saves meta event and
> sample data in separate files and merges them with an index table.
> 
> To build an index table, it needs to know exact offsets and sizes for
> each sample data.  However the offset only can be calculated after the
> feature data is fixed, and to save feature data it needs to access to
> the sample data because it needs to mark used DSOs for build-id table.
> 
> So I ended up with reserving 1MB hole for the feature data area and then
> put sample data and calculated offsets.  Now an indexed perf data file
> will look like below:
> 
>         +---------------------+
>         |     file header     |
>         |---------------------|
>         |                     |
>         |     meta events     |
>         |                     |
>         |---------------------|
>         |     feature data    |
>         |   (contains index) -+--+
>         |---------------------|  |
>         |      ~1MB hole      |  |
>         |---------------------|  |
>         |                     |  |
>         |    sample data[1] <-+--+
>         |                     |  |
>         |---------------------|  |
>         |                     |  |
>         |    sample data[2] <-|--+
>         |                     |  |
>         |---------------------|  |
>         |         ...         | ...
>         +---------------------+

I also don't see how to store it in a nice way under the current header
layout, but how about bumping up the header version for this feature? ;-)

currently it's:

struct perf_file_header {
        u64                             magic;
        u64                             size;
        u64                             attr_size;
        struct perf_file_section        attrs;
        struct perf_file_section        data;
        /* event_types is ignored */
        struct perf_file_section        event_types;
        DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
};


- we already store attrs as a FEATURE so we could omit that
- your patch stores only synthesized data into 'data' section (-1 idx)
  this could be stored into a separate file and get merged with the rest
- new header version would have a 'features' section, so the features
  position wouldn't depend on the 'data' end as it does now and we could
  easily store it after all data is merged:

struct perf_file_header {
        u64                             magic;
        u64                             size;
        u64                             attr_size;
        struct perf_file_section        features;
        DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
};


thoughts?
jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-01 18:06   ` Jiri Olsa
@ 2015-02-02  8:34     ` Adrian Hunter
  2015-02-02  9:15       ` Jiri Olsa
  0 siblings, 1 reply; 221+ messages in thread
From: Adrian Hunter @ 2015-02-02  8:34 UTC (permalink / raw)
  To: Jiri Olsa, Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Andi Kleen, Stephane Eranian, Frederic Weisbecker

On 01/02/15 20:06, Jiri Olsa wrote:
> On Thu, Jan 29, 2015 at 05:06:55PM +0900, Namhyung Kim wrote:
>> The new --index option will create indexed data file which can be
>> processed by multiple threads parallelly.  It saves meta event and
>> sample data in separate files and merges them with an index table.
>>
>> To build an index table, it needs to know exact offsets and sizes for
>> each sample data.  However the offset only can be calculated after the
>> feature data is fixed, and to save feature data it needs to access to
>> the sample data because it needs to mark used DSOs for build-id table.
>>
>> So I ended up with reserving 1MB hole for the feature data area and then
>> put sample data and calculated offsets.  Now an indexed perf data file
>> will look like below:
>>
>>         +---------------------+
>>         |     file header     |
>>         |---------------------|
>>         |                     |
>>         |     meta events     |
>>         |                     |
>>         |---------------------|
>>         |     feature data    |
>>         |   (contains index) -+--+
>>         |---------------------|  |
>>         |      ~1MB hole      |  |
>>         |---------------------|  |
>>         |                     |  |
>>         |    sample data[1] <-+--+
>>         |                     |  |
>>         |---------------------|  |
>>         |                     |  |
>>         |    sample data[2] <-|--+
>>         |                     |  |
>>         |---------------------|  |
>>         |         ...         | ...
>>         +---------------------+
> 
> I also dont see how to store it in a nice way under current header layout,
> but how about bump up the header version for this feature? ;-)
> 
> currently it's:
> 
> struct perf_file_header {
>         u64                             magic;
>         u64                             size;
>         u64                             attr_size;
>         struct perf_file_section        attrs;
>         struct perf_file_section        data;
>         /* event_types is ignored */
>         struct perf_file_section        event_types;
>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> };
> 
> 
> - we already store attrs as a FEATURE so we could omit that
> - your patch stores only synthesized data into 'data' section (-1 idx)
>   this could be stored into separate file and get merged with the rest
> - new header version would have 'features' section, so the features
>   position wouldnt depend on the 'data' end as of now and we could
>   easily store after all data is merged:
> 
> struct perf_file_header {
>         u64                             magic;
>         u64                             size;
>         u64                             attr_size;
>         struct perf_file_section        features;
>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> };
> 
> 
> thoughts?

How come the features are being written before the sample data anyway?
I would have expected:
	- write the data (update the index in memory)
	- write the features (including index)


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02  8:34     ` Adrian Hunter
@ 2015-02-02  9:15       ` Jiri Olsa
  2015-02-02  9:52         ` Adrian Hunter
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-02-02  9:15 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Mon, Feb 02, 2015 at 10:34:50AM +0200, Adrian Hunter wrote:

SNIP

> > but how about bump up the header version for this feature? ;-)
> > 
> > currently it's:
> > 
> > struct perf_file_header {
> >         u64                             magic;
> >         u64                             size;
> >         u64                             attr_size;
> >         struct perf_file_section        attrs;
> >         struct perf_file_section        data;
> >         /* event_types is ignored */
> >         struct perf_file_section        event_types;
> >         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> > };
> > 
> > 
> > - we already store attrs as a FEATURE so we could omit that
> > - your patch stores only synthesized data into 'data' section (-1 idx)
> >   this could be stored into separate file and get merged with the rest
> > - new header version would have 'features' section, so the features
> >   position wouldnt depend on the 'data' end as of now and we could
> >   easily store after all data is merged:
> > 
> > struct perf_file_header {
> >         u64                             magic;
> >         u64                             size;
> >         u64                             attr_size;
> >         struct perf_file_section        features;
> >         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> > };
> > 
> > 
> > thoughts?
> 
> How come the features are being written before the sample data anyway?
> I would have expected:
> 	- write the data (update the index in memory)
> 	- write the features (including index)
>

I think the problem is that the only way to get the features offset
right now is via perf_file_header::data.offset + perf_file_header::data.size,
and we still use this section to carry 'synthesized' data, so it needs
to have the correct size.

I guess we could work around that by storing the 'perf_file_header::data'
as the last data section. That would require treating it the same way as
all other data sections, but we could keep the current header layout.

jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02  9:15       ` Jiri Olsa
@ 2015-02-02  9:52         ` Adrian Hunter
  2015-02-02 10:05           ` Jiri Olsa
  0 siblings, 1 reply; 221+ messages in thread
From: Adrian Hunter @ 2015-02-02  9:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On 02/02/15 11:15, Jiri Olsa wrote:
> On Mon, Feb 02, 2015 at 10:34:50AM +0200, Adrian Hunter wrote:
> 
> SNIP
> 
>>> but how about bump up the header version for this feature? ;-)
>>>
>>> currently it's:
>>>
>>> struct perf_file_header {
>>>         u64                             magic;
>>>         u64                             size;
>>>         u64                             attr_size;
>>>         struct perf_file_section        attrs;
>>>         struct perf_file_section        data;
>>>         /* event_types is ignored */
>>>         struct perf_file_section        event_types;
>>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
>>> };
>>>
>>>
>>> - we already store attrs as a FEATURE so we could omit that
>>> - your patch stores only synthesized data into 'data' section (-1 idx)
>>>   this could be stored into separate file and get merged with the rest
>>> - new header version would have 'features' section, so the features
>>>   position wouldnt depend on the 'data' end as of now and we could
>>>   easily store after all data is merged:
>>>
>>> struct perf_file_header {
>>>         u64                             magic;
>>>         u64                             size;
>>>         u64                             attr_size;
>>>         struct perf_file_section        features;
>>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
>>> };
>>>
>>>
>>> thoughts?
>>
>> How come the features are being written before the sample data anyway?
>> I would have expected:
>> 	- write the data (update the index in memory)
>> 	- write the features (including index)
>>
> 
> I think the problem is that the only way how to get features offset
> right now is via perf_file_header::data.offset + perf_file_header::data.size,
> and we still use this section to carry 'synthesized' data, so it needs
> to have correct size.

Why not make it the same as all the other data. i.e. find the start and size
via the index? And then just lump all the data together?

> I guess we could workaround that by storing the 'perf_file_header::data'
> as the last data section. That would require to treat it the same way as
> all other data sections, but we could keep current header layout.

Would it need to be last? Logically it should precede the data that depends
on it.


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02  9:52         ` Adrian Hunter
@ 2015-02-02 10:05           ` Jiri Olsa
  2015-02-02 12:07             ` Adrian Hunter
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-02-02 10:05 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Mon, Feb 02, 2015 at 11:52:26AM +0200, Adrian Hunter wrote:
> On 02/02/15 11:15, Jiri Olsa wrote:
> > On Mon, Feb 02, 2015 at 10:34:50AM +0200, Adrian Hunter wrote:
> > 
> > SNIP
> > 
> >>> but how about bump up the header version for this feature? ;-)
> >>>
> >>> currently it's:
> >>>
> >>> struct perf_file_header {
> >>>         u64                             magic;
> >>>         u64                             size;
> >>>         u64                             attr_size;
> >>>         struct perf_file_section        attrs;
> >>>         struct perf_file_section        data;
> >>>         /* event_types is ignored */
> >>>         struct perf_file_section        event_types;
> >>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> >>> };
> >>>
> >>>
> >>> - we already store attrs as a FEATURE so we could omit that
> >>> - your patch stores only synthesized data into 'data' section (-1 idx)
> >>>   this could be stored into separate file and get merged with the rest
> >>> - new header version would have 'features' section, so the features
> >>>   position wouldnt depend on the 'data' end as of now and we could
> >>>   easily store after all data is merged:
> >>>
> >>> struct perf_file_header {
> >>>         u64                             magic;
> >>>         u64                             size;
> >>>         u64                             attr_size;
> >>>         struct perf_file_section        features;
> >>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
> >>> };
> >>>
> >>>
> >>> thoughts?
> >>
> >> How come the features are being written before the sample data anyway?
> >> I would have expected:
> >> 	- write the data (update the index in memory)
> >> 	- write the features (including index)
> >>
> > 
> > I think the problem is that the only way how to get features offset
> > right now is via perf_file_header::data.offset + perf_file_header::data.size,
> > and we still use this section to carry 'synthesized' data, so it needs
> > to have correct size.
> 
> Why not make it the same as all the other data. i.e. find the start and size
> via the index? And then just lump all the data together?

thats what I suggested

> 
> > I guess we could workaround that by storing the 'perf_file_header::data'
> > as the last data section. That would require to treat it the same way as
> > all other data sections, but we could keep current header layout.
> 
> Would it need to be last? Logically it should precede the data that depends
> on it.

i suggested this as a workaround for having features at the end of the file
while keeping the current perf data header

jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02 10:05           ` Jiri Olsa
@ 2015-02-02 12:07             ` Adrian Hunter
  2015-02-02 12:13               ` Jiri Olsa
  0 siblings, 1 reply; 221+ messages in thread
From: Adrian Hunter @ 2015-02-02 12:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On 02/02/15 12:05, Jiri Olsa wrote:
> On Mon, Feb 02, 2015 at 11:52:26AM +0200, Adrian Hunter wrote:
>> On 02/02/15 11:15, Jiri Olsa wrote:
>>> On Mon, Feb 02, 2015 at 10:34:50AM +0200, Adrian Hunter wrote:
>>>
>>> SNIP
>>>
>>>>> but how about bump up the header version for this feature? ;-)
>>>>>
>>>>> currently it's:
>>>>>
>>>>> struct perf_file_header {
>>>>>         u64                             magic;
>>>>>         u64                             size;
>>>>>         u64                             attr_size;
>>>>>         struct perf_file_section        attrs;
>>>>>         struct perf_file_section        data;
>>>>>         /* event_types is ignored */
>>>>>         struct perf_file_section        event_types;
>>>>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
>>>>> };
>>>>>
>>>>>
>>>>> - we already store attrs as a FEATURE so we could omit that
>>>>> - your patch stores only synthesized data into 'data' section (-1 idx)
>>>>>   this could be stored into separate file and get merged with the rest
>>>>> - new header version would have 'features' section, so the features
>>>>>   position wouldnt depend on the 'data' end as of now and we could
>>>>>   easily store after all data is merged:
>>>>>
>>>>> struct perf_file_header {
>>>>>         u64                             magic;
>>>>>         u64                             size;
>>>>>         u64                             attr_size;
>>>>>         struct perf_file_section        features;
>>>>>         DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
>>>>> };
>>>>>
>>>>>
>>>>> thoughts?
>>>>
>>>> How come the features are being written before the sample data anyway?
>>>> I would have expected:
>>>> 	- write the data (update the index in memory)
>>>> 	- write the features (including index)
>>>>
>>>
>>> I think the problem is that the only way how to get features offset
>>> right now is via perf_file_header::data.offset + perf_file_header::data.size,
>>> and we still use this section to carry 'synthesized' data, so it needs
>>> to have correct size.
>>
>> Why not make it the same as all the other data. i.e. find the start and size
>> via the index? And then just lump all the data together?
> 
> thats what I suggested

No, I meant really lump it all together. i.e. perf_file_header.data.size =
total data size

> 
>>
>>> I guess we could workaround that by storing the 'perf_file_header::data'
>>> as the last data section. That would require to treat it the same way as
>>> all other data sections, but we could keep current header layout.
>>
>> Would it need to be last? Logically it should precede the data that depends
>> on it.
> 
> i suggested this as a workaround for having features at the end of the file
> while keeping the current perf data header

Which wouldn't be necessary if you lump it all together?


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02 12:07             ` Adrian Hunter
@ 2015-02-02 12:13               ` Jiri Olsa
  2015-02-02 14:56                 ` Namhyung Kim
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-02-02 12:13 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Mon, Feb 02, 2015 at 02:07:27PM +0200, Adrian Hunter wrote:

SNIP

> >>
> >> Why not make it the same as all the other data. i.e. find the start and size
> >> via the index? And then just lump all the data together?
> > 
> > thats what I suggested
> 
> No, I meant really lump it all together. i.e. perf_file_header.data.size =
> total data size
> 
> > 
> >>
> >>> I guess we could workaround that by storing the 'perf_file_header::data'
> >>> as the last data section. That would require to treat it the same way as
> >>> all other data sections, but we could keep current header layout.
> >>
> >> Would it need to be last? Logically it should precede the data that depends
> >> on it.
> > 
> > i suggested this as a workaround for having features at the end of the file
> > while keeping the current perf data header
> 
> Which wouldn't be necessary if you lump it all together?

yep, that's also an option

jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02 12:13               ` Jiri Olsa
@ 2015-02-02 14:56                 ` Namhyung Kim
  2015-02-02 17:30                   ` Jiri Olsa
  0 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-02-02 14:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

Hi Jiri and Adrian,

On Mon, Feb 2, 2015 at 9:13 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> On Mon, Feb 02, 2015 at 02:07:27PM +0200, Adrian Hunter wrote:
>
> SNIP
>
>> >>
>> >> Why not make it the same as all the other data. i.e. find the start and size
>> >> via the index? And then just lump all the data together?
>> >
>> > thats what I suggested
>>
>> No, I meant really lump it all together. i.e. perf_file_header.data.size =
>> total data size
>>
>> >
>> >>
>> >>> I guess we could workaround that by storing the 'perf_file_header::data'
>> >>> as the last data section. That would require to treat it the same way as
>> >>> all other data sections, but we could keep current header layout.
>> >>
>> >> Would it need to be last? Logically it should precede the data that depends
>> >> on it.
>> >
>> > i suggested this as a workaround for having features at the end of the file
>> > while keeping the current perf data header
>>
>> Which wouldn't be necessary if you lump it all together?
>
> yep, that's also an option

So we want a single section for the entire data area, right?

I also thought about it.  My concern was the holes between the data
sections due to page alignment.  If an old tool that doesn't know about
the index reads the data file, it'd just see an event type of 0 and
stop processing.

Maybe the page alignment is not necessary?
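
The concern can be sketched with a simplified reader.  The struct below
uses the same field layout as the kernel's struct perf_event_header, but
the walker itself (count_events_until_hole) is a hypothetical stand-in
for an index-unaware tool, not perf's actual session code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as struct perf_event_header. */
struct event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/*
 * Count records in buf[0..len) by walking header to header and giving
 * up at the first zeroed header, as a zero-filled hole would present.
 */
static size_t count_events_until_hole(const unsigned char *buf, size_t len)
{
	size_t pos = 0, n = 0;

	assert(buf != NULL);
	while (pos + sizeof(struct event_header) <= len) {
		struct event_header hdr;

		memcpy(&hdr, buf + pos, sizeof(hdr));
		if (hdr.type == 0 || hdr.size < sizeof(hdr))
			break;	/* zero-filled hole (or corrupt record) */
		if (hdr.size > len - pos)
			break;	/* truncated record */
		pos += hdr.size;
		n++;
	}
	return n;
}
```

With two 8-byte records followed by 16 zero bytes and one more record,
this walker reports 2 and never reaches the last record.  A gap covered
by a record with a nonzero type and a size spanning the gap would be
stepped over instead.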

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 01/42] perf tools: Support to read compressed module from build-id cache
  2015-01-30 14:32   ` Jiri Olsa
@ 2015-02-02 15:03     ` Namhyung Kim
  0 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-02-02 15:03 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Adrian Hunter, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Fri, Jan 30, 2015 at 11:32 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> On Thu, Jan 29, 2015 at 05:06:42PM +0900, Namhyung Kim wrote:
>> The commit c00c48fc6e6e ("perf symbols: Preparation for compressed
>> kernel module support") added support for compressed kernel modules
>> but it only supports system path DSOs.  When a dso is read from
>> build-id cache, its filename doesn't end with ".gz" but has build-id.
>> In this case, we should fallback to the original dso->name.
>>
>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
>> ---
>>  tools/perf/util/symbol-elf.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 06fcd1bf98b6..b24f9d8727a8 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -574,13 +574,16 @@ static int decompress_kmodule(struct dso *dso, const char *name,
>>       const char *ext = strrchr(name, '.');
>>       char tmpbuf[] = "/tmp/perf-kmod-XXXXXX";
>>
>> -     if ((type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
>> -          type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP) ||
>> -         type != dso->symtab_type)
>> +     if (type != DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP &&
>> +         type != DSO_BINARY_TYPE__GUEST_KMODULE_COMP &&
>> +         type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
>>               return -1;
>
> hum, is it possible the type == DSO_BINARY_TYPE__BUILD_ID_CACHE could get in here?
>
>
> ---
>         for (i = 0; i < DSO_BINARY_TYPE__SYMTAB_CNT; i++) {
>                 struct symsrc *ss = &ss_[ss_pos];
>                 bool next_slot = false;
>
>                 enum dso_binary_type symtab_type = binary_type_symtab[i];
>
>                 if (!dso__is_compatible_symtab_type(dso, kmod, symtab_type))
>                         continue;
>
> ---             ^^^ this check should rule out buildid symtab_type for kmod dso?

AFAICS this function always returns true for a symtab_type of BUILD_ID_CACHE.


>
>                 symsrc__init(
>
>
> I wonder wether we should set special type from compressed binaries (as of now),
> or instead try to decompress anything that looks like it's compressed ;-)
> it seems more to be more generic and could simplify the code..

I don't know.  But it seems only kernel modules are compressed now.
If user-level dso also supports compression, we need to think about it
again..

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02 14:56                 ` Namhyung Kim
@ 2015-02-02 17:30                   ` Jiri Olsa
  2015-02-03  8:42                     ` Adrian Hunter
  0 siblings, 1 reply; 221+ messages in thread
From: Jiri Olsa @ 2015-02-02 17:30 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra, LKML, David Ahern, Andi Kleen, Stephane Eranian,
	Frederic Weisbecker

On Mon, Feb 02, 2015 at 11:56:09PM +0900, Namhyung Kim wrote:
> Hi Jiri and Adrian,
> 
> On Mon, Feb 2, 2015 at 9:13 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> > On Mon, Feb 02, 2015 at 02:07:27PM +0200, Adrian Hunter wrote:
> >
> > SNIP
> >
> >> >>
> >> >> Why not make it the same as all the other data. i.e. find the start and size
> >> >> via the index? And then just lump all the data together?
> >> >
> >> > thats what I suggested
> >>
> >> No, I meant really lump it all together. i.e. perf_file_header.data.size =
> >> total data size
> >>
> >> >
> >> >>
> >> >>> I guess we could workaround that by storing the 'perf_file_header::data'
> >> >>> as the last data section. That would require to treat it the same way as
> >> >>> all other data sections, but we could keep current header layout.
> >> >>
> >> >> Would it need to be last? Logically it should precede the data that depends
> >> >> on it.
> >> >
> >> > i suggested this as a workaround for having features at the end of the file
> >> > while keeping the current perf data header
> >>
> >> Which wouldn't be necessary if you lump it all together?
> >
> > yep, that's also an option
> 
> So we want a single section for the entire data area, right?
> 
> I also thought about it.  My concern was the holes between each data
> due to page alignment.  If an old tool which doesn't know about the
> index accesses to the data file, it'd just see a event type of 0 and
> stop processing.
> 
> Maybe the page alignment is not necessary?

seems ok, but how about time ordering? every time you reach new
file data you'll hit an 'out of order event', right?

hum, maybe it's not a big deal now that it's just incrementing a counter ;-)

jirka

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH 14/42] perf record: Add --index option for building index table
  2015-02-02 17:30                   ` Jiri Olsa
@ 2015-02-03  8:42                     ` Adrian Hunter
  0 siblings, 0 replies; 221+ messages in thread
From: Adrian Hunter @ 2015-02-03  8:42 UTC (permalink / raw)
  To: Jiri Olsa, Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra, LKML,
	David Ahern, Andi Kleen, Stephane Eranian, Frederic Weisbecker

On 02/02/15 19:30, Jiri Olsa wrote:
> On Mon, Feb 02, 2015 at 11:56:09PM +0900, Namhyung Kim wrote:
>> Hi Jiri and Adrian,
>>
>> On Mon, Feb 2, 2015 at 9:13 PM, Jiri Olsa <jolsa@redhat.com> wrote:
>>> On Mon, Feb 02, 2015 at 02:07:27PM +0200, Adrian Hunter wrote:
>>>
>>> SNIP
>>>
>>>>>>
>>>>>> Why not make it the same as all the other data. i.e. find the start and size
>>>>>> via the index? And then just lump all the data together?
>>>>>
>>>>> thats what I suggested
>>>>
>>>> No, I meant really lump it all together. i.e. perf_file_header.data.size =
>>>> total data size
>>>>
>>>>>
>>>>>>
>>>>>>> I guess we could workaround that by storing the 'perf_file_header::data'
>>>>>>> as the last data section. That would require to treat it the same way as
>>>>>>> all other data sections, but we could keep current header layout.
>>>>>>
>>>>>> Would it need to be last? Logically it should precede the data that depends
>>>>>> on it.
>>>>>
>>>>> i suggested this as a workaround for having features at the end of the file
>>>>> while keeping the current perf data header
>>>>
>>>> Which wouldn't be necessary if you lump it all together?
>>>
>>> yep, that's also an option
>>
>> So we want a single section for the entire data area, right?
>>
>> I also thought about it.  My concern was the holes between each data
>> due to page alignment.  If an old tool which doesn't know about the
>> index accesses to the data file, it'd just see a event type of 0 and
>> stop processing.

Please don't leave holes. Either fill them with a padding event or put the
data end-to-end.

>>
>> Maybe the page alignment is not necessary?
> 
> seems ok,  but how about time ordering.. every time you reach new
> file data you'll hit 'out of order event' right?
> 
> hum, maybe it's not a big deal now when it's just incrementing counter ;-)
> 
> jirka
> 
> 


^ permalink raw reply	[flat|nested] 221+ messages in thread

* [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3)
@ 2015-03-03  3:07 Namhyung Kim
  2015-03-03  3:07 ` [PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
                   ` (37 more replies)
  0 siblings, 38 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Hello,

This patchset converts perf report to use multiple threads in order to
speed up the processing on large data files.  I see a speedup of at
least ~30% with this change.  The code is still experimental and
contains many rough edges, but I'd like to share it and get some
feedback.

 * changes in v3)
  - handle header (metadata) same as sample data (at index 0)
  - maintain libunwind address space in map_groups instead of thread
  - use *_time API only for indexed data file
  - resolve callchain with the *_time API
  - use dso__data_get/put_fd() to protect access to fd
  - synthesize COMM event for command line workload

 * changes in v2)
  - rework with single indexed data file rather than multiple files in
    a directory

The perf report processes (sample) events like below:

  1. preprocess sample to get matching thread/dso/symbol info
  2. insert it to hists rbtree (with callchain tree) based on the info
  3. optionally collapse hist entries that match given sort key(s)
  4. resort hist entries (by overhead) for output
  5. display the hist entries

Stage 1 is preprocessing and mostly acts like a read-only operation,
in that it doesn't change the machine state during sample processing.
Only meta events like fork, comm and mmap change the machine/thread
state, although symbols can still be loaded during the processing
(stage 2).

Stage 2 consumes most of the time, especially when callchains and the
--children option are enabled.  This work can be easily partitioned
since each sample is independent of the others, but the resulting hists
must be combined/collapsed into a single global hists before going on
to further steps.
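
The partition-then-combine shape of stage 2 can be sketched as below.
This is a toy model under loud assumptions: "hists" is reduced to one
counter per symbol id, the names process_region() and merge_hists() are
invented for this example, and the regions run one after another here
rather than in threads.

```c
#include <assert.h>
#include <stddef.h>

#define NSYMS 4

/* Toy "hists": one hit counter per symbol id. */
struct hists {
	unsigned long hits[NSYMS];
};

/* Stage 2 for one region; it only touches its own hists. */
static void process_region(struct hists *h, const int *samples, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(samples[i] >= 0 && samples[i] < NSYMS);
		h->hits[samples[i]]++;
	}
}

/* Fold one per-region result into the global hists. */
static void merge_hists(struct hists *global, const struct hists *part)
{
	for (size_t i = 0; i < NSYMS; i++)
		global->hits[i] += part->hits[i];
}
```

Since each process_region() call writes only to its own hists, the calls
could be handed to separate threads without locking; only the merge has
to see all of the partial results.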

Stage 3 is optional and only needed by certain sort keys, but with
stage 2 parallelized it always needs to be done.

Stages 4 and 5 work on the whole hists, so they must be done serially.

So my approach is like this:

Partially do stage 1 first - but only for meta events that changes
machine state.  To do this I add a dummy tracking event to perf record
and make it collect such meta events only.  They are saved as normal
data and processed before sample events at perf report time.

This also requires handling multiple sample data concurrently and
finding the corresponding machine state when processing samples.  On a
large profiling session, many tasks are created and exit, so a pid
might be recycled (even more than once!).  To deal with it, I keep
thread, map_groups and comm sorted by time.  The only remaining thing
is symbol loading, as it's done lazily when a sample requires it.
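
The time-based lookup can be sketched roughly as follows.  This is only
an illustration under my own assumptions: toy_comm and toy_comm_at()
are hypothetical names, and a sorted array stands in for whatever list
or tree the real code keeps.  The idea is to pick the latest entry
whose start time is not after the sample time.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_comm {
	uint64_t start;		/* timestamp at which this comm took effect */
	const char *str;
};

/*
 * entries[] is sorted by ascending start time.  Return the comm that was
 * in effect at @time, or NULL if @time precedes the first entry.
 */
static const char *toy_comm_at(const struct toy_comm *entries, size_t nr,
			       uint64_t time)
{
	const char *found = NULL;
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (entries[mid].start <= time) {
			found = entries[mid].str;	/* candidate, look later */
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}
```

The same shape of search works for a recycled pid: each incarnation of
the task gets its own start time, and a sample only matches the
incarnation that was alive at its timestamp.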

With that done, stage 2 can be run by multiple threads.  I also save
each sample data (per-cpu or per-thread) in separate files during
record and then merge them into a single data file with an index
table.  At perf report time, each region of sample data is processed
by its own thread.  Symbol loading is protected by a mutex lock.

For DWARF post-unwinding, dso cache data also needs to be protected by
a lock and this caused huge contention.  I made it search the rbtree
speculatively first and then, if the entry isn't found, search it
again under the dso lock.  Please take a look and see whether it's
acceptable.
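
The "look up without the lock first, then look up again under the lock"
pattern can be sketched as below.  Note the swap: this sketch uses an
insert-only singly linked list published with C11 atomics instead of
the rbtree used in the patches, because an unlocked read of such a list
is easy to reason about.  All toy_* names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_cache {
	uint64_t offset;		/* lookup key */
	struct toy_cache *next;		/* never changes once published */
};

static _Atomic(struct toy_cache *) toy_head;
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;

/* safe without the lock: entries are only ever added at the head */
static struct toy_cache *toy_find(uint64_t offset)
{
	struct toy_cache *c;

	c = atomic_load_explicit(&toy_head, memory_order_acquire);
	for (; c != NULL; c = c->next)
		if (c->offset == offset)
			return c;
	return NULL;
}

static struct toy_cache *toy_find_or_add(uint64_t offset)
{
	struct toy_cache *c = toy_find(offset);	/* speculative, no lock */

	if (c != NULL)
		return c;

	pthread_mutex_lock(&toy_lock);
	c = toy_find(offset);	/* another thread may have added it meanwhile */
	if (c == NULL) {
		c = malloc(sizeof(*c));
		if (c != NULL) {
			c->offset = offset;
			c->next = atomic_load_explicit(&toy_head,
						       memory_order_relaxed);
			/* publish only after the entry is fully initialized */
			atomic_store_explicit(&toy_head, c,
					      memory_order_release);
		}
	}
	pthread_mutex_unlock(&toy_lock);
	return c;
}
```

The common case (entry already cached) never takes the mutex, which is
where the contention relief would come from.  Whether the same holds
for an unlocked rbtree walk depends on how insertions rebalance the
tree, so that part needs review on its own.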

Patches 1-9 add support for indexing the data file.  With the --index
option, perf record will create an intermediate directory and then
save meta events and sample data to separate files.  Finally it'll
build an index table and concatenate the data files (and also remove
the intermediate directory).
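
As a rough picture of what building the index table involves, here is
a minimal sketch.  The struct layout and the toy_* names are my
assumptions, not the on-disk format of the patches.  Each intermediate
file becomes one region, and its offset is the running total of the
sizes before it.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_index {
	uint64_t offset;	/* where the region starts in the merged file */
	uint64_t size;		/* region length in bytes */
};

/*
 * Lay out @nr regions back to back, starting at @data_start.
 * Returns the end offset of the merged data.
 */
static uint64_t toy_build_index(struct toy_index *idx, const uint64_t *sizes,
				size_t nr, uint64_t data_start)
{
	uint64_t pos = data_start;
	size_t i;

	for (i = 0; i < nr; i++) {
		idx[i].offset = pos;
		idx[i].size = sizes[i];
		pos += sizes[i];
	}
	return pos;
}
```

With a table like this, a report thread can be handed one (offset,
size) pair and read only its own region of the file.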

Patches 10-23 manage machine and thread state using timestamps so that
it can be searched when processing samples.  Patches 24-37 implement
parallel report.  And finally I implemented the 'perf data index'
command to build an index table for a given data file.

This patchset didn't change perf record to use multi-thread.  But I
think it can be easily done later if needed.

Note that the output differs slightly from the original version when
an indexed data file is used.  But the differences are mostly
unresolved symbols in callchains.

Here is the result:

This is just elapsed time (in seconds) measured by 'perf stat -r 5'.

The data file was recorded during a kernel build with fp callchains
and its size is 2.1GB.  The machine has 6 cores with hyper-threading
enabled, and I got a similar result on my laptop too.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           286.213349446   93.753958745      36.860880945  
 with index        270.158361549   87.963067415      32.896841653
 + --multi-thread  166.039011492   43.209152911       8.434560193


This result is with a 7.7GB data file using libunwind for callchains.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           150.714039134  111.420099831       5.035423803
 with index        152.438739157  112.691612534       3.642109876
 + --multi-thread   45.966048256   29.844907087       1.829218561

I guess the speedup with the indexed data file comes from skipping the
ordered events layer.

This result is with the same file but using libdw for callchain unwind.

 perf report          --children  --no-children  + --call-graph none
 		   -------------  -------------  -------------------
 current           457.507820205  491.520096816       4.831840810
 with index        441.140769287  461.993666236       3.767947395
 + --multi-thread  219.289176894  171.935294339       1.785351793

On my archlinux system, callchain unwind using libdw is much slower
than libunwind.  I'm using elfutils version 0.160.  Also I don't know
why --children takes less time than --no-children.  Anyway we can see
the --multi-thread performance is much better for each case.


You can get it from 'perf/threaded-v3' branch on my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Please take a look and play with it.  Any comments are welcome! :)

Thanks,
Namhyung


Namhyung Kim (38):
  perf tools: Use a software dummy event to track task/mmap events
  perf tools: Add rm_rf() utility function
  perf tools: Introduce copyfile_offset() function
  perf tools: Create separate mmap for dummy tracking event
  perf tools: Introduce perf_evlist__mmap_track()
  perf tools: Add HEADER_DATA_INDEX feature
  perf tools: Handle indexed data file properly
  perf record: Add --index option for building index table
  perf report: Skip dummy tracking event
  perf tools: Pass session arg to perf_event__preprocess_sample()
  perf script: Pass session arg to ->process_event callback
  perf tools: Introduce thread__comm_time() helpers
  perf tools: Add a test case for thread comm handling
  perf tools: Use thread__comm_time() when adding hist entries
  perf tools: Convert dead thread list into rbtree
  perf tools: Introduce machine__find*_thread_time()
  perf tools: Add a test case for timed thread handling
  perf tools: Reducing arguments of hist_entry_iter__add()
  perf tools: Pass session to hist_entry_iter struct
  perf tools: Maintain map groups list in a leader thread
  perf tools: Introduce session__find_addr_location() and friends
  perf callchain: Use session__find_addr_location() and friends
  perf tools: Add a test case for timed map groups handling
  perf tools: Protect dso symbol loading using a mutex
  perf tools: Protect dso cache tree using dso->lock
  perf tools: Protect dso cache fd with a mutex
  perf callchain: Maintain libunwind's address space in map_groups
  perf tools: Add dso__data_get/put_fd()
  perf session: Pass struct events stats to event processing functions
  perf hists: Pass hists struct to hist_entry_iter struct
  perf tools: Move BUILD_ID_SIZE definition to perf.h
  perf report: Parallelize perf report using multi-thread
  perf tools: Add missing_threads rb tree
  perf record: Synthesize COMM event for a command line workload
  perf tools: Fix progress ui to support multi thread
  perf report: Add --multi-thread option and config item
  perf session: Handle index files generally
  perf data: Implement 'index' subcommand

 tools/perf/Documentation/perf-data.txt             |  25 +-
 tools/perf/Documentation/perf-record.txt           |   4 +
 tools/perf/Documentation/perf-report.txt           |   3 +
 tools/perf/builtin-annotate.c                      |   8 +-
 tools/perf/builtin-data.c                          | 349 ++++++++++++++++++++-
 tools/perf/builtin-diff.c                          |  21 +-
 tools/perf/builtin-mem.c                           |   6 +-
 tools/perf/builtin-record.c                        | 189 ++++++++++-
 tools/perf/builtin-report.c                        |  81 ++++-
 tools/perf/builtin-script.c                        |  58 ++--
 tools/perf/builtin-timechart.c                     |  10 +-
 tools/perf/builtin-top.c                           |  12 +-
 tools/perf/perf.h                                  |   2 +
 tools/perf/tests/Build                             |   3 +
 tools/perf/tests/builtin-test.c                    |  12 +
 tools/perf/tests/dwarf-unwind.c                    |  14 +-
 tools/perf/tests/hists_common.c                    |   3 +-
 tools/perf/tests/hists_cumulate.c                  |   9 +-
 tools/perf/tests/hists_filter.c                    |   7 +-
 tools/perf/tests/hists_link.c                      |  10 +-
 tools/perf/tests/hists_output.c                    |   9 +-
 tools/perf/tests/tests.h                           |   3 +
 tools/perf/tests/thread-comm.c                     |  47 +++
 tools/perf/tests/thread-lookup-time.c              | 180 +++++++++++
 tools/perf/tests/thread-mg-share.c                 |   7 +-
 tools/perf/tests/thread-mg-time.c                  |  88 ++++++
 tools/perf/ui/browsers/hists.c                     |  30 +-
 tools/perf/ui/gtk/hists.c                          |   3 +
 tools/perf/util/build-id.c                         |   9 +-
 tools/perf/util/build-id.h                         |   2 -
 tools/perf/util/callchain.c                        |   6 +-
 tools/perf/util/callchain.h                        |   4 +-
 tools/perf/util/db-export.c                        |   6 +-
 tools/perf/util/db-export.h                        |   4 +-
 tools/perf/util/dso.c                              | 172 +++++++---
 tools/perf/util/dso.h                              |  11 +-
 tools/perf/util/event.c                            |  77 ++++-
 tools/perf/util/event.h                            |  13 +-
 tools/perf/util/evlist.c                           | 161 ++++++++--
 tools/perf/util/evlist.h                           |  22 +-
 tools/perf/util/evsel.h                            |  15 +
 tools/perf/util/header.c                           |  61 ++++
 tools/perf/util/header.h                           |   3 +
 tools/perf/util/hist.c                             | 126 +++++---
 tools/perf/util/hist.h                             |   9 +-
 tools/perf/util/machine.c                          | 287 ++++++++++++++---
 tools/perf/util/machine.h                          |  15 +-
 tools/perf/util/map.c                              |   8 +
 tools/perf/util/map.h                              |   3 +
 tools/perf/util/ordered-events.c                   |   4 +-
 .../perf/util/scripting-engines/trace-event-perl.c |   3 +-
 .../util/scripting-engines/trace-event-python.c    |  32 +-
 tools/perf/util/session.c                          | 345 +++++++++++++++++---
 tools/perf/util/session.h                          |  48 ++-
 tools/perf/util/symbol.c                           |  34 +-
 tools/perf/util/thread.c                           | 173 +++++++++-
 tools/perf/util/thread.h                           |  28 +-
 tools/perf/util/tool.h                             |  14 +
 tools/perf/util/trace-event-scripting.c            |   3 +-
 tools/perf/util/trace-event.h                      |   3 +-
 tools/perf/util/unwind-libdw.c                     |  14 +-
 tools/perf/util/unwind-libdw.h                     |   1 +
 tools/perf/util/unwind-libunwind.c                 |  98 +++---
 tools/perf/util/unwind.h                           |  18 +-
 tools/perf/util/util.c                             |  81 ++++-
 tools/perf/util/util.h                             |   2 +
 66 files changed, 2670 insertions(+), 438 deletions(-)
 create mode 100644 tools/perf/tests/thread-comm.c
 create mode 100644 tools/perf/tests/thread-lookup-time.c
 create mode 100644 tools/perf/tests/thread-mg-time.c

-- 
2.2.2



* [PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 02/38] perf tools: Add rm_rf() utility function Namhyung Kim
                   ` (36 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Add APIs for a software dummy event to track task/comm/mmap events
separately.  perf record will use them to save such events in a
separate mmap buffer to make them easy to index.  This is in
preparation for multi-thread support, which will come later.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/evlist.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h |  1 +
 tools/perf/util/evsel.h  | 15 +++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8d0b62361129..928a5750648d 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -194,6 +194,36 @@ int perf_evlist__add_default(struct perf_evlist *evlist)
 	return -ENOMEM;
 }
 
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist)
+{
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_DUMMY,
+		.exclude_kernel = 1,
+	};
+	struct perf_evsel *evsel;
+
+	event_attr_init(&attr);
+
+	evsel = perf_evsel__new(&attr);
+	if (evsel == NULL)
+		goto error;
+
+	/* use strdup() because free(evsel) assumes name is allocated */
+	evsel->name = strdup("dummy");
+	if (!evsel->name)
+		goto error_free;
+
+	perf_evlist__add(evlist, evsel);
+	perf_evlist__set_tracking_event(evlist, evsel);
+
+	return 0;
+error_free:
+	perf_evsel__delete(evsel);
+error:
+	return -ENOMEM;
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index f07c984465f0..a278df8fbed3 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -68,6 +68,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
 
 void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
 int perf_evlist__add_default(struct perf_evlist *evlist);
+int perf_evlist__add_dummy_tracking(struct perf_evlist *evlist);
 int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 				     struct perf_event_attr *attrs, size_t nr_attrs);
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dcf202aebe9f..80aeb3d84593 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -331,6 +331,21 @@ static inline bool perf_evsel__is_function_event(struct perf_evsel *evsel)
 #undef FUNCTION_EVENT
 }
 
+/**
+ * perf_evsel__is_dummy_tracking - Return whether given evsel is a dummy
+ * event for tracking meta events only
+ *
+ * @evsel - evsel selector to be tested
+ *
+ * Return %true if event is a dummy tracking event
+ */
+static inline bool perf_evsel__is_dummy_tracking(struct perf_evsel *evsel)
+{
+	return evsel->attr.type == PERF_TYPE_SOFTWARE &&
+		evsel->attr.config == PERF_COUNT_SW_DUMMY &&
+		evsel->attr.task == 1 && evsel->attr.mmap == 1;
+}
+
 struct perf_attr_details {
 	bool freq;
 	bool verbose;
-- 
2.2.2



* [PATCH 02/38] perf tools: Add rm_rf() utility function
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
  2015-03-03  3:07 ` [PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 03/38] perf tools: Introduce copyfile_offset() function Namhyung Kim
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The rm_rf() function does the same as the shell command 'rm -rf',
which removes all directory entries recursively.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/util.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/util.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 4ee6d0d4c993..6104afb7e1ef 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -72,6 +72,49 @@ int mkdir_p(char *path, mode_t mode)
 	return (stat(path, &st) && mkdir(path, mode)) ? -1 : 0;
 }
 
+int rm_rf(char *path)
+{
+	DIR *dir;
+	int ret = 0;
+	struct dirent *d;
+	char namebuf[PATH_MAX];
+
+	dir = opendir(path);
+	if (dir == NULL)
+		return 0;
+
+	while ((d = readdir(dir)) != NULL && !ret) {
+		struct stat statbuf;
+
+		if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
+			continue;
+
+		scnprintf(namebuf, sizeof(namebuf), "%s/%s",
+			  path, d->d_name);
+
+		ret = stat(namebuf, &statbuf);
+		if (ret < 0) {
+			pr_debug("stat failed: %s\n", namebuf);
+			break;
+		}
+
+		if (S_ISREG(statbuf.st_mode))
+			ret = unlink(namebuf);
+		else if (S_ISDIR(statbuf.st_mode))
+			ret = rm_rf(namebuf);
+		else {
+			pr_debug("unknown file: %s\n", namebuf);
+			ret = -1;
+		}
+	}
+	closedir(dir);
+
+	if (ret < 0)
+		return ret;
+
+	return rmdir(path);
+}
+
 static int slow_copyfile(const char *from, const char *to, mode_t mode)
 {
 	int err = -1;
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index fbd598afc606..ba31979fcdcc 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -249,6 +249,7 @@ static inline int sane_case(int x, int high)
 }
 
 int mkdir_p(char *path, mode_t mode);
+int rm_rf(char *path);
 int copyfile(const char *from, const char *to);
 int copyfile_mode(const char *from, const char *to, mode_t mode);
 
-- 
2.2.2



* [PATCH 03/38] perf tools: Introduce copyfile_offset() function
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
  2015-03-03  3:07 ` [PATCH 01/38] perf tools: Use a software dummy event to track task/mmap events Namhyung Kim
  2015-03-03  3:07 ` [PATCH 02/38] perf tools: Add rm_rf() utility function Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-04 14:58   ` Jiri Olsa
  2015-03-03  3:07 ` [PATCH 04/38] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
                   ` (34 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The copyfile_offset() function copies source data starting at a given
offset to a destination file at a given offset.  It'll be used to
build an indexed data file.
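
Because mmap() needs a page-aligned file offset, the source offset has
to be split into an aligned part and a remainder.  The sketch below
shows only that arithmetic; toy_span and toy_align() are hypothetical
names used for illustration, and page_size is assumed to be a power of
two.

```c
#include <stdint.h>

struct toy_span {
	uint64_t pgoff;		/* page-aligned offset to pass to mmap() */
	uint64_t delta;		/* distance from pgoff to the first wanted byte */
	uint64_t len;		/* mapping length covering delta + size bytes */
};

static struct toy_span toy_align(uint64_t off_in, uint64_t size,
				 uint64_t page_size)
{
	struct toy_span s;

	s.pgoff = off_in & ~(page_size - 1);	/* round down to a page */
	s.delta = off_in - s.pgoff;
	s.len = s.delta + size;
	return s;
}
```

For example, with 4096-byte pages, copying 500 bytes from offset 10000
maps from offset 8192 and starts reading 1808 bytes into the mapping.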

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/util.c | 38 +++++++++++++++++++++++++++++---------
 tools/perf/util/util.h |  1 +
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 6104afb7e1ef..0c264bc685ac 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -145,11 +145,38 @@ static int slow_copyfile(const char *from, const char *to, mode_t mode)
 	return err;
 }
 
+int copyfile_offset(int ifd, loff_t off_in, int ofd, loff_t off_out, u64 size)
+{
+	void *ptr;
+	loff_t pgoff;
+
+	pgoff = off_in & ~(page_size - 1);
+	off_in -= pgoff;
+
+	ptr = mmap(NULL, off_in + size, PROT_READ, MAP_PRIVATE, ifd, pgoff);
+	if (ptr == MAP_FAILED)
+		return -1;
+
+	while (size) {
+		ssize_t ret = pwrite(ofd, ptr + off_in, size, off_out);
+		if (ret < 0 && errno == EINTR)
+			continue;
+		if (ret <= 0)
+			break;
+
+		size -= ret;
+		off_in += ret;
+		off_out += ret;
+	}
+	munmap(ptr, off_in + size);
+
+	return size ? -1 : 0;
+}
+
 int copyfile_mode(const char *from, const char *to, mode_t mode)
 {
 	int fromfd, tofd;
 	struct stat st;
-	void *addr;
 	int err = -1;
 
 	if (stat(from, &st))
@@ -166,15 +193,8 @@ int copyfile_mode(const char *from, const char *to, mode_t mode)
 	if (tofd < 0)
 		goto out_close_from;
 
-	addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fromfd, 0);
-	if (addr == MAP_FAILED)
-		goto out_close_to;
-
-	if (write(tofd, addr, st.st_size) == st.st_size)
-		err = 0;
+	err = copyfile_offset(fromfd, 0, tofd, 0, st.st_size);
 
-	munmap(addr, st.st_size);
-out_close_to:
 	close(tofd);
 	if (err)
 		unlink(to);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index ba31979fcdcc..91535bceb1bf 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -252,6 +252,7 @@ int mkdir_p(char *path, mode_t mode);
 int rm_rf(char *path);
 int copyfile(const char *from, const char *to);
 int copyfile_mode(const char *from, const char *to, mode_t mode);
+int copyfile_offset(int fromfd, loff_t from_ofs, int tofd, loff_t to_ofs, u64 size);
 
 s64 perf_atoll(const char *str);
 char **argv_split(const char *str, int *argcp);
-- 
2.2.2



* [PATCH 04/38] perf tools: Create separate mmap for dummy tracking event
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (2 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 03/38] perf tools: Introduce copyfile_offset() function Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 05/38] perf tools: Introduce perf_evlist__mmap_track() Namhyung Kim
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When indexed data file support is enabled, a dummy tracking event will
be used to track metadata (like task, comm and mmap events) for a
session and actual samples will be recorded in separate (intermediate)
files and then merged (with index table).

Provide separate mmap to the dummy tracking event.  The size is fixed
to 128KiB (+ 1 page) as the event rate will be lower than samples.  I
originally wanted to use a single mmap for this but cross-cpu sharing
is prohibited so it's per-cpu (or per-task) like normal mmaps.
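
For reference, the sizes and the index encoding work out as in the
sketch below.  The toy_* helpers are hypothetical stand-ins that repeat
the arithmetic of TRACK_MMAP_SIZE, track_mmap_idx() and the mask
computation from the diff, with page_size passed as a parameter instead
of being a global.

```c
#include <stddef.h>

/* maps 0, 1, 2, ... to -1, -2, -3, ... and back again */
static int toy_track_idx(int idx)
{
	return -idx - 1;
}

/* 128KiB of data pages plus the one control page at the start */
static size_t toy_track_mmap_size(size_t page_size)
{
	return ((128 * 1024 / page_size) + 1) * page_size;
}

/* ring buffer mask: data area length (len minus control page) minus 1 */
static size_t toy_mmap_mask(size_t len, size_t page_size)
{
	return len - page_size - 1;
}
```

With 4096-byte pages the mapping is 135168 bytes and the mask is
0x1ffff.  Applying toy_track_idx() twice gives back the original index,
which is why the same helper converts in both directions.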

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |   9 +++-
 tools/perf/util/evlist.c    | 122 +++++++++++++++++++++++++++++++++++---------
 tools/perf/util/evlist.h    |  11 +++-
 3 files changed, 117 insertions(+), 25 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4fdad06d37db..2bd724763e1d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -69,7 +69,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 
 static int record__mmap_read(struct record *rec, int idx)
 {
-	struct perf_mmap *md = &rec->evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(rec->evlist, idx);
 	unsigned int head = perf_mmap__read_head(md);
 	unsigned int old = md->prev;
 	unsigned char *data = md->base + page_size;
@@ -105,6 +105,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	}
 
 	md->prev = old;
+
 	perf_evlist__mmap_consume(rec->evlist, idx);
 out:
 	return rc;
@@ -275,6 +276,12 @@ static int record__mmap_read_all(struct record *rec)
 				goto out;
 			}
 		}
+		if (rec->evlist->track_mmap) {
+			if (record__mmap_read(rec, track_mmap_idx(i)) != 0) {
+				rc = -1;
+				goto out;
+			}
+		}
 	}
 
 	/*
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 928a5750648d..ebbec07843a2 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -28,6 +28,7 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -728,22 +729,39 @@ static bool perf_mmap__empty(struct perf_mmap *md)
 	return perf_mmap__read_head(md) != md->prev;
 }
 
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
+{
+	if (idx >= 0)
+		return &evlist->mmap[idx];
+	else
+		return &evlist->track_mmap[track_mmap_idx(idx)];
+}
+
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-	++evlist->mmap[idx].refcnt;
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	++md->refcnt;
 }
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
-	BUG_ON(evlist->mmap[idx].refcnt == 0);
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+	BUG_ON(md->refcnt == 0);
+
+	if (--md->refcnt != 0)
+		return;
 
-	if (--evlist->mmap[idx].refcnt == 0)
+	if (idx >= 0)
 		__perf_evlist__munmap(evlist, idx);
+	else
+		__perf_evlist__munmap_track(evlist, track_mmap_idx(idx));
 }
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	struct perf_mmap *md = &evlist->mmap[idx];
+	struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
 	if (!evlist->overwrite) {
 		unsigned int old = md->prev;
@@ -764,6 +782,15 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
 	}
 }
 
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx)
+{
+	if (evlist->track_mmap[idx].base != NULL) {
+		munmap(evlist->track_mmap[idx].base, TRACK_MMAP_SIZE);
+		evlist->track_mmap[idx].base = NULL;
+		evlist->track_mmap[idx].refcnt = 0;
+	}
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	int i;
@@ -775,23 +802,43 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
 		__perf_evlist__munmap(evlist, i);
 
 	zfree(&evlist->mmap);
+
+	if (evlist->track_mmap == NULL)
+		return;
+
+	for (i = 0; i < evlist->nr_mmaps; i++)
+		__perf_evlist__munmap_track(evlist, i);
+
+	zfree(&evlist->track_mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static int perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool track_mmap)
 {
 	evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
 	if (cpu_map__empty(evlist->cpus))
 		evlist->nr_mmaps = thread_map__nr(evlist->threads);
 	evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
-	return evlist->mmap != NULL ? 0 : -ENOMEM;
+	if (evlist->mmap == NULL)
+		return -ENOMEM;
+
+	if (track_mmap) {
+		evlist->track_mmap = calloc(evlist->nr_mmaps,
+					    sizeof(struct perf_mmap));
+		if (evlist->track_mmap == NULL) {
+			zfree(&evlist->mmap);
+			return -ENOMEM;
+		}
+	}
+	return 0;
 }
 
 struct mmap_params {
-	int prot;
-	int mask;
+	int	prot;
+	size_t	len;
 };
 
-static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
+static int __perf_evlist__mmap(struct perf_evlist *evlist __maybe_unused,
+			       struct perf_mmap *pmmap,
 			       struct mmap_params *mp, int fd)
 {
 	/*
@@ -807,15 +854,14 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	 * evlist layer can't just drop it when filtering events in
 	 * perf_evlist__filter_pollfd().
 	 */
-	evlist->mmap[idx].refcnt = 2;
-	evlist->mmap[idx].prev = 0;
-	evlist->mmap[idx].mask = mp->mask;
-	evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,
-				      MAP_SHARED, fd, 0);
-	if (evlist->mmap[idx].base == MAP_FAILED) {
+	pmmap->refcnt = 2;
+	pmmap->prev = 0;
+	pmmap->mask = mp->len - page_size - 1;
+	pmmap->base = mmap(NULL, mp->len, mp->prot, MAP_SHARED, fd, 0);
+	if (pmmap->base == MAP_FAILED) {
 		pr_debug2("failed to mmap perf event ring buffer, error %d\n",
 			  errno);
-		evlist->mmap[idx].base = NULL;
+		pmmap->base = NULL;
 		return -1;
 	}
 
@@ -824,7 +870,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+				       int thread, int *output,
+				       int *track_output)
 {
 	struct perf_evsel *evsel;
 
@@ -836,9 +883,30 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
+		if (perf_evsel__is_dummy_tracking(evsel)) {
+			struct mmap_params track_mp = {
+				.prot	= mp->prot,
+				.len	= TRACK_MMAP_SIZE,
+			};
+
+			if (*track_output == -1) {
+				*track_output = fd;
+				if (__perf_evlist__mmap(evlist,
+							&evlist->track_mmap[idx],
+							&track_mp, fd) < 0)
+					return -1;
+			} else {
+				if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT,
+					  *track_output) != 0)
+					return -1;
+			}
+
+			/* mark idx as track mmap idx (negative) */
+			idx = track_mmap_idx(idx);
+		} else if (*output == -1) {
 			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+			if (__perf_evlist__mmap(evlist, &evlist->mmap[idx],
+						mp, *output) < 0)
 				return -1;
 		} else {
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
@@ -867,6 +935,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
 						 thread);
 		}
+
+		if (perf_evsel__is_dummy_tracking(evsel)) {
+			/* restore idx as normal idx (positive) */
+			idx = track_mmap_idx(idx);
+		}
 	}
 
 	return 0;
@@ -882,10 +955,12 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
+		int track_output = -1;
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+							thread, &output,
+							&track_output))
 				goto out_unmap;
 		}
 	}
@@ -907,9 +982,10 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
+		int track_output = -1;
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
+						&output, &track_output))
 			goto out_unmap;
 	}
 
@@ -1032,7 +1108,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist) < 0)
+	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, true) < 0)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
@@ -1041,7 +1117,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
-	mp.mask = evlist->mmap_len - page_size - 1;
+	mp.len = evlist->mmap_len;
 
 	evlist__for_each(evlist, evsel) {
 		if ((evsel->attr.read_format & PERF_FORMAT_ID) &&
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a278df8fbed3..3bd9747bb9aa 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -48,12 +48,15 @@ struct perf_evlist {
 	bool		 overwrite;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *track_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
 	struct events_stats stats;
 };
 
+#define TRACK_MMAP_SIZE  (((128 * 1024 / page_size) + 1) * page_size)
+
 struct perf_evsel_str_handler {
 	const char *name;
 	void	   *handler;
@@ -103,8 +106,8 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
 struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
 
 union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx);
-
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx);
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx);
 
 int perf_evlist__open(struct perf_evlist *evlist);
 void perf_evlist__close(struct perf_evlist *evlist);
@@ -214,6 +217,12 @@ bool perf_evlist__can_select_event(struct perf_evlist *evlist, const char *str);
 void perf_evlist__to_front(struct perf_evlist *evlist,
 			   struct perf_evsel *move_evsel);
 
+/* convert from/to negative idx for track mmaps */
+static inline int track_mmap_idx(int idx)
+{
+	return -idx - 1;
+}
+
 /**
  * __evlist__for_each - iterate thru all the evsels
  * @list: list_head instance to iterate
-- 
2.2.2



* [PATCH 05/38] perf tools: Introduce perf_evlist__mmap_track()
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (3 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 04/38] perf tools: Create separate mmap for dummy tracking event Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 06/38] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The perf_evlist__mmap_track() function creates data mmaps and, optionally,
tracking mmaps for events.  perf record will use it to save events in
separate files and build an index table.  Checking for the dummy tracking
event in perf_evlist__mmap() alone is not enough, as users can specify a
dummy event (as in the keep tracking testcase) without the index option.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |  3 ++-
 tools/perf/util/evlist.c    | 15 +++++++++------
 tools/perf/util/evlist.h    | 10 ++++++++--
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 2bd724763e1d..4568bc4117a1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -169,7 +169,8 @@ static int record__open(struct record *rec)
 		goto out;
 	}
 
-	if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+	if (perf_evlist__mmap_track(evlist, opts->mmap_pages, false,
+				    false) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ebbec07843a2..d264ba3602b1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -834,6 +834,7 @@ static int perf_evlist__alloc_mmap(struct perf_evlist *evlist, bool track_mmap)
 
 struct mmap_params {
 	int	prot;
+	bool	track;
 	size_t	len;
 };
 
@@ -883,7 +884,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (perf_evsel__is_dummy_tracking(evsel)) {
+		if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
 			struct mmap_params track_mp = {
 				.prot	= mp->prot,
 				.len	= TRACK_MMAP_SIZE,
@@ -936,7 +937,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 						 thread);
 		}
 
-		if (perf_evsel__is_dummy_tracking(evsel)) {
+		if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
 			/* restore idx as normal idx (positive) */
 			idx = track_mmap_idx(idx);
 		}
@@ -1087,10 +1088,11 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
 }
 
 /**
- * perf_evlist__mmap - Create mmaps to receive events.
+ * perf_evlist__mmap_track - Create mmaps to receive events.
  * @evlist: list of events
  * @pages: map length in pages
  * @overwrite: overwrite older events?
+ * @use_track_mmap: use another mmaps to track meta events
  *
  * If @overwrite is %false the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
@@ -1098,17 +1100,18 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  *
  * Return: %0 on success, negative error code otherwise.
  */
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap_track(struct perf_evlist *evlist, unsigned int pages,
+			    bool overwrite, bool use_track_mmap)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
 	struct mmap_params mp = {
 		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
+		.track = use_track_mmap,
 	};
 
-	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, true) < 0)
+	if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, mp.track) < 0)
 		return -ENOMEM;
 
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 3bd9747bb9aa..da45074ee500 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -130,10 +130,16 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 				  const char *str,
 				  int unset);
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite);
+int perf_evlist__mmap_track(struct perf_evlist *evlist, unsigned int pages,
+			    bool overwrite, bool use_track_mmap);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
+static inline int perf_evlist__mmap(struct perf_evlist *evlist,
+				    unsigned int pages, bool overwrite)
+{
+	return perf_evlist__mmap_track(evlist, pages, overwrite, false);
+}
+
 void perf_evlist__disable(struct perf_evlist *evlist);
 void perf_evlist__enable(struct perf_evlist *evlist);
 
-- 
2.2.2



* [PATCH 06/38] perf tools: Add HEADER_DATA_INDEX feature
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (4 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 05/38] perf tools: Introduce perf_evlist__mmap_track() Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 07/38] perf tools: Handle indexed data file properly Namhyung Kim
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The HEADER_DATA_INDEX feature records an index table for the sample data
so that it can be processed by multiple threads concurrently.  Each
item is a struct perf_file_section, which consists of an offset and a size.
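
As an illustration only (not code from this patch), the on-disk form of
the index can be sketched as a 64-bit entry count followed by
offset/size pairs.  The names file_section, swap64, index_encode and
index_decode are hypothetical stand-ins, and the sketch works on a
memory buffer instead of a file descriptor.  It also assumes that a
reader should validate the count and release the table on failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct perf_file_section: two 64-bit fields. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/* Reverse the byte order of a 64-bit value. */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/*
 * Serialize a 64-bit entry count followed by the entries, in host byte
 * order.  Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t index_encode(const struct file_section *idx, uint64_t nr,
			   unsigned char *buf, size_t buflen)
{
	size_t need;

	if (nr > (SIZE_MAX - sizeof(nr)) / sizeof(*idx))
		return 0;
	need = sizeof(nr) + (size_t)nr * sizeof(*idx);
	if (need > buflen)
		return 0;

	memcpy(buf, &nr, sizeof(nr));
	if (nr)
		memcpy(buf + sizeof(nr), idx, (size_t)nr * sizeof(*idx));
	return need;
}

/*
 * Parse the table back.  needs_swap is set when the data was written on
 * a machine with the opposite endianness.  Returns a malloc'ed array
 * (to be released with free()) or NULL on malformed input.
 */
static struct file_section *index_decode(const unsigned char *buf, size_t len,
					 int needs_swap, uint64_t *nr_out)
{
	struct file_section *idx;
	uint64_t nr, i;

	assert(nr_out != NULL);

	if (len < sizeof(nr))
		return NULL;
	memcpy(&nr, buf, sizeof(nr));
	if (needs_swap)
		nr = swap64(nr);

	/* Reject counts that the remaining bytes cannot hold. */
	if (nr > (len - sizeof(nr)) / sizeof(*idx))
		return NULL;

	idx = calloc(nr ? (size_t)nr : 1, sizeof(*idx));
	if (idx == NULL)
		return NULL;

	for (i = 0; i < nr; i++) {
		memcpy(&idx[i], buf + sizeof(nr) + (size_t)i * sizeof(*idx),
		       sizeof(*idx));
		if (needs_swap) {
			idx[i].offset = swap64(idx[i].offset);
			idx[i].size   = swap64(idx[i].size);
		}
	}

	*nr_out = nr;
	return idx;
}
```

Bounding the count by the remaining length before allocating keeps a
corrupted header from requesting a huge table.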

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-record.c |  2 ++
 tools/perf/util/header.c    | 61 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/header.h    |  3 +++
 3 files changed, 66 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4568bc4117a1..1bdf7e4a0a6f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -312,6 +312,8 @@ static void record__init_features(struct record *rec)
 
 	if (!rec->opts.branch_stack)
 		perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
+
+	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 1f407f7352a7..77206a6cbf65 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -869,6 +869,24 @@ static int write_branch_stack(int fd __maybe_unused,
 	return 0;
 }
 
+static int write_data_index(int fd, struct perf_header *h,
+			    struct perf_evlist *evlist __maybe_unused)
+{
+	int ret;
+	unsigned i;
+
+	ret = do_write(fd, &h->nr_index, sizeof(h->nr_index));
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < h->nr_index; i++) {
+		ret = do_write(fd, &h->index[i], sizeof(*h->index));
+		if (ret < 0)
+			return ret;
+	}
+	return 0;
+}
+
 static void print_hostname(struct perf_header *ph, int fd __maybe_unused,
 			   FILE *fp)
 {
@@ -1225,6 +1243,12 @@ static void print_group_desc(struct perf_header *ph, int fd __maybe_unused,
 	}
 }
 
+static void print_data_index(struct perf_header *ph __maybe_unused,
+			     int fd __maybe_unused, FILE *fp)
+{
+	fprintf(fp, "# contains data index for parallel processing\n");
+}
+
 static int __event_process_build_id(struct build_id_event *bev,
 				    char *filename,
 				    struct perf_session *session)
@@ -1833,6 +1857,42 @@ static int process_group_desc(struct perf_file_section *section __maybe_unused,
 	return ret;
 }
 
+static int process_data_index(struct perf_file_section *section __maybe_unused,
+			      struct perf_header *ph, int fd,
+			      void *data __maybe_unused)
+{
+	ssize_t ret;
+	u64 nr_index;
+	unsigned i;
+	struct perf_file_section *index;
+
+	ret = readn(fd, &nr_index, sizeof(nr_index));
+	if (ret != sizeof(nr_index))
+		return -1;
+
+	if (ph->needs_swap)
+		nr_index = bswap_64(nr_index);
+
+	index = calloc(nr_index, sizeof(*index));
+	if (index == NULL)
+		return -1;
+
+	for (i = 0; i < nr_index; i++) {
+		ret = readn(fd, &index[i], sizeof(*index));
+		if (ret != sizeof(*index))
+			return ret;
+
+		if (ph->needs_swap) {
+			index[i].offset = bswap_64(index[i].offset);
+			index[i].size   = bswap_64(index[i].size);
+		}
+	}
+
+	ph->index = index;
+	ph->nr_index = nr_index;
+	return 0;
+}
+
 struct feature_ops {
 	int (*write)(int fd, struct perf_header *h, struct perf_evlist *evlist);
 	void (*print)(struct perf_header *h, int fd, FILE *fp);
@@ -1873,6 +1933,7 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
 	FEAT_OPA(HEADER_BRANCH_STACK,	branch_stack),
 	FEAT_OPP(HEADER_PMU_MAPPINGS,	pmu_mappings),
 	FEAT_OPP(HEADER_GROUP_DESC,	group_desc),
+	FEAT_OPP(HEADER_DATA_INDEX,	data_index),
 };
 
 struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 3bb90ac172a1..e5594f0d6dcd 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -30,6 +30,7 @@ enum {
 	HEADER_BRANCH_STACK,
 	HEADER_PMU_MAPPINGS,
 	HEADER_GROUP_DESC,
+	HEADER_DATA_INDEX,
 	HEADER_LAST_FEATURE,
 	HEADER_FEAT_BITS	= 256,
 };
@@ -94,6 +95,8 @@ struct perf_header {
 	bool				needs_swap;
 	u64				data_offset;
 	u64				data_size;
+	struct perf_file_section	*index;
+	u64				nr_index;
 	u64				feat_offset;
 	DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
 	struct perf_session_env 	env;
-- 
2.2.2



* [PATCH 07/38] perf tools: Handle indexed data file properly
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (5 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 06/38] perf tools: Add HEADER_DATA_INDEX feature Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-04 16:19   ` Jiri Olsa
  2015-03-03  3:07 ` [PATCH 08/38] perf record: Add --index option for building index table Namhyung Kim
                   ` (30 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When perf detects that a data file has an index table, it processes the
header part (the meta events at index 0) first and then the remaining
data sections one after another.  Note that the indexed sample data is
recorded for each cpu/thread separately, so each section is already
ordered by itself and there is no need to use the ordered event queue
interface for it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/session.c | 62 ++++++++++++++++++++++++++++++++++++++---------
 tools/perf/util/session.h |  5 ++++
 2 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e4f166981ff0..00cd1ad427be 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1300,11 +1300,10 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-static int __perf_session__process_events(struct perf_session *session,
+static int __perf_session__process_events(struct perf_session *session, int fd,
 					  u64 data_offset, u64 data_size,
 					  u64 file_size, struct perf_tool *tool)
 {
-	int fd = perf_data_file__fd(session->file);
 	u64 head, page_offset, file_offset, file_pos, size;
 	int err, mmap_prot, mmap_flags, map_idx = 0;
 	size_t	mmap_size;
@@ -1327,7 +1326,9 @@ static int __perf_session__process_events(struct perf_session *session,
 	mmap_size = MMAP_SIZE;
 	if (mmap_size > file_size) {
 		mmap_size = file_size;
-		session->one_mmap = true;
+
+		if (!perf_session__has_index(session))
+			session->one_mmap = true;
 	}
 
 	memset(mmaps, 0, sizeof(mmaps));
@@ -1400,29 +1401,66 @@ static int __perf_session__process_events(struct perf_session *session,
 	err = ordered_events__flush(session, tool, OE_FLUSH__FINAL);
 out_err:
 	ui_progress__finish();
-	perf_tool__warn_about_errors(tool, &session->evlist->stats);
 	ordered_events__free(&session->ordered_events);
 	session->one_mmap = false;
 	return err;
 }
 
+static int __perf_session__process_indexed_events(struct perf_session *session,
+						  struct perf_tool *tool)
+{
+	struct perf_data_file *file = session->file;
+	u64 size = perf_data_file__size(file);
+	int err = 0, i;
+
+	for (i = 0; i < (int)session->header.nr_index; i++) {
+		struct perf_file_section *index = &session->header.index[i];
+
+		if (!index->size)
+			continue;
+
+		/*
+		 * For indexed data file, samples are processed for
+		 * each cpu/thread so it's already ordered.  However
+		 * meta-events at index 0 should be processed in order.
+		 */
+		if (i > 0)
+			tool->ordered_events = false;
+
+		err = __perf_session__process_events(session,
+						     perf_data_file__fd(file),
+						     index->offset, index->size,
+						     size, tool);
+		if (err < 0)
+			break;
+	}
+
+	perf_tool__warn_about_errors(tool, &session->evlist->stats);
+	return err;
+}
+
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool)
 {
-	u64 size = perf_data_file__size(session->file);
+	struct perf_data_file *file = session->file;
+	u64 size = perf_data_file__size(file);
 	int err;
 
 	if (perf_session__register_idle_thread(session) == NULL)
 		return -ENOMEM;
 
-	if (!perf_data_file__is_pipe(session->file))
-		err = __perf_session__process_events(session,
-						     session->header.data_offset,
-						     session->header.data_size,
-						     size, tool);
-	else
-		err = __perf_session__process_pipe_events(session, tool);
+	if (perf_data_file__is_pipe(file))
+		return __perf_session__process_pipe_events(session, tool);
+	if (perf_session__has_index(session))
+		return __perf_session__process_indexed_events(session, tool);
+
+	err = __perf_session__process_events(session,
+					     perf_data_file__fd(file),
+					     session->header.data_offset,
+					     session->header.data_size,
+					     size, tool);
 
+	perf_tool__warn_about_errors(tool, &session->evlist->stats);
 	return err;
 }
 
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index fe859f379ca7..aff0d2b4cc0b 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -137,4 +137,9 @@ int perf_event__synthesize_id_index(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    struct machine *machine);
 
+static inline bool perf_session__has_index(struct perf_session *session)
+{
+	return perf_header__has_feat(&session->header, HEADER_DATA_INDEX);
+}
+
 #endif /* __PERF_SESSION_H */
-- 
2.2.2



* [PATCH 08/38] perf record: Add --index option for building index table
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (6 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 07/38] perf tools: Handle indexed data file properly Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-05  7:56   ` Jiri Olsa
  2015-03-03  3:07 ` [PATCH 09/38] perf report: Skip dummy tracking event Namhyung Kim
                   ` (29 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The new --index option creates an indexed data file which can be
processed by multiple threads in parallel.  It saves meta events and
sample data in separate files and merges them with an index table.

If there's an index table in the data file, the HEADER_DATA_INDEX
feature bit is set, session->header.index[0] points to the meta
event area, and the rest point to sample data.  It looks like below:

        +---------------------+
        |     file header     |
        |---------------------|
        |                     |
        |    meta events[0] <-+--+
        |                     |  |
        |---------------------|  |
        |                     |  |
        |    sample data[1] <-+--+
        |                     |  |
        |---------------------|  |
        |                     |  |
        |    sample data[2] <-+--+
        |                     |  |
        |---------------------|  |
        |         ...         | ...
        |---------------------|  |
        |     feature data    |  |
        |   (contains index) -+--+
        +---------------------+
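
To make the layout concrete, here is a minimal sketch of how the
offsets in such an index could be derived when the per-mmap files are
appended after the meta event area.  build_index and file_section are
hypothetical names used for illustration; the arithmetic follows the
description above, with entry 0 covering the meta events and each
later entry starting where the previous one ends.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct perf_file_section. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Fill idx[0..nr_files] for a merged file.
 *
 * data_offset: where the meta event area starts (after the file header)
 * meta_end:    end of the meta event area, i.e. where the first
 *              sample section will be placed
 * sizes:       sizes of the nr_files separate sample files
 *
 * idx must have room for nr_files + 1 entries.  Returns the end offset
 * of the last sample section.
 */
static uint64_t build_index(uint64_t data_offset, uint64_t meta_end,
			    const uint64_t *sizes, int nr_files,
			    struct file_section *idx)
{
	uint64_t offset = meta_end;
	int i;

	assert(meta_end >= data_offset);

	/* Entry 0 describes the meta event area. */
	idx[0].offset = data_offset;
	idx[0].size   = meta_end - data_offset;

	/* Each sample section is laid out right after the previous one. */
	for (i = 0; i < nr_files; i++) {
		idx[i + 1].offset = offset;
		idx[i + 1].size   = sizes[i];
		offset += sizes[i];
	}
	return offset;
}
```

An empty sample file still gets an entry, with a size of zero, so the
entry number keeps matching the mmap number.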

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-record.txt |   4 +
 tools/perf/builtin-record.c              | 161 +++++++++++++++++++++++++++++--
 tools/perf/perf.h                        |   1 +
 tools/perf/util/session.c                |   1 +
 4 files changed, 158 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 355c4f5569b5..5476432c045f 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -250,6 +250,10 @@ is off by default.
 --running-time::
 Record running and enabled time for read events (:S)
 
+--index::
+Build an index table for sample data.  This will speed up perf report by
+parallel processing.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1bdf7e4a0a6f..ecf8e7293015 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -38,6 +38,7 @@ struct record {
 	struct record_opts	opts;
 	u64			bytes_written;
 	struct perf_data_file	file;
+	int			*fds;
 	struct perf_evlist	*evlist;
 	struct perf_session	*session;
 	const char		*progname;
@@ -47,9 +48,16 @@ struct record {
 	long			samples;
 };
 
-static int record__write(struct record *rec, void *bf, size_t size)
+static int record__write(struct record *rec, void *bf, size_t size, int idx)
 {
-	if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+	int fd;
+
+	if (rec->fds && idx >= 0)
+		fd = rec->fds[idx];
+	else
+		fd = perf_data_file__fd(rec->session->file);
+
+	if (writen(fd, bf, size) < 0) {
 		pr_err("failed to write perf data, error: %m\n");
 		return -1;
 	}
@@ -64,7 +72,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 				     struct machine *machine __maybe_unused)
 {
 	struct record *rec = container_of(tool, struct record, tool);
-	return record__write(rec, event, event->header.size);
+	return record__write(rec, event, event->header.size, -1);
 }
 
 static int record__mmap_read(struct record *rec, int idx)
@@ -89,7 +97,7 @@ static int record__mmap_read(struct record *rec, int idx)
 		size = md->mask + 1 - (old & md->mask);
 		old += size;
 
-		if (record__write(rec, buf, size) < 0) {
+		if (record__write(rec, buf, size, idx) < 0) {
 			rc = -1;
 			goto out;
 		}
@@ -99,7 +107,7 @@ static int record__mmap_read(struct record *rec, int idx)
 	size = head - old;
 	old += size;
 
-	if (record__write(rec, buf, size) < 0) {
+	if (record__write(rec, buf, size, idx) < 0) {
 		rc = -1;
 		goto out;
 	}
@@ -111,6 +119,113 @@ static int record__mmap_read(struct record *rec, int idx)
 	return rc;
 }
 
+#define INDEX_FILE_FMT  "%s.dir/perf.data.%d"
+
+static int record__create_index_files(struct record *rec, int nr_index)
+{
+	int i = 0;
+	int ret = -1;
+	char path[PATH_MAX];
+	struct perf_data_file *file = &rec->file;
+
+	/* +1 for header file itself */
+	nr_index++;
+
+	rec->fds = malloc(nr_index * sizeof(int));
+	if (rec->fds == NULL)
+		return -ENOMEM;
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	if (mkdir(path, S_IRWXU) < 0)
+		goto out_err;
+
+	rec->fds[0] = perf_data_file__fd(file);
+
+	for (i = 1; i < nr_index; i++) {
+		scnprintf(path, sizeof(path), INDEX_FILE_FMT, file->path, i);
+		ret = open(path, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR);
+		if (ret < 0)
+			goto out_err;
+
+		rec->fds[i] = ret;
+	}
+	return 0;
+
+out_err:
+	while (--i >= 1)
+		close(rec->fds[i]);
+	zfree(&rec->fds);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	return ret;
+}
+
+static int record__merge_index_files(struct record *rec, int nr_index)
+{
+	int i;
+	int ret = -1;
+	u64 offset;
+	char path[PATH_MAX];
+	struct perf_file_section *idx;
+	struct perf_data_file *file = &rec->file;
+	struct perf_session *session = rec->session;
+	int output_fd = perf_data_file__fd(file);
+
+	/* +1 for header file itself */
+	nr_index++;
+
+	idx = calloc(nr_index, sizeof(*idx));
+	if (idx == NULL)
+		goto out_close;
+
+	offset = lseek(output_fd, 0, SEEK_END);
+
+	idx[0].offset = session->header.data_offset;
+	idx[0].size   = offset - idx[0].offset;
+
+	for (i = 1; i < nr_index; i++) {
+		struct stat stbuf;
+		int fd = rec->fds[i];
+
+		if (fstat(fd, &stbuf) < 0)
+			goto out_close;
+
+		idx[i].offset = offset;
+		idx[i].size   = stbuf.st_size;
+
+		offset += stbuf.st_size;
+	}
+
+	/* copy sample events */
+	for (i = 1; i < nr_index; i++) {
+		int fd = rec->fds[i];
+
+		if (idx[i].size == 0)
+			continue;
+
+		if (copyfile_offset(fd, 0, output_fd, idx[i].offset,
+				    idx[i].size) < 0)
+			goto out_close;
+	}
+
+	session->header.index = idx;
+	session->header.nr_index = nr_index;
+
+	ret = 0;
+
+out_close:
+	for (i = 1; i < nr_index; i++)
+		close(rec->fds[i]);
+
+	scnprintf(path, sizeof(path), "%s.dir", file->path);
+	rm_rf(path);
+
+	zfree(&rec->fds);
+	return ret;
+}
+
 static volatile int done = 0;
 static volatile int signr = -1;
 static volatile int child_finished = 0;
@@ -170,7 +285,7 @@ static int record__open(struct record *rec)
 	}
 
 	if (perf_evlist__mmap_track(evlist, opts->mmap_pages, false,
-				    false) < 0) {
+				    opts->index) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
@@ -186,6 +301,12 @@ static int record__open(struct record *rec)
 		goto out;
 	}
 
+	if (opts->index) {
+		rc = record__create_index_files(rec, evlist->nr_mmaps);
+		if (rc < 0)
+			goto out;
+	}
+
 	session->evlist = evlist;
 	perf_session__set_id_hdr_size(session);
 out:
@@ -210,7 +331,8 @@ static int process_buildids(struct record *rec)
 	struct perf_data_file *file  = &rec->file;
 	struct perf_session *session = rec->session;
 
-	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_CUR);
+	/* update file size after merging sample files with index */
+	u64 size = lseek(perf_data_file__fd(file), 0, SEEK_END);
 	if (size == 0)
 		return 0;
 
@@ -290,7 +412,8 @@ static int record__mmap_read_all(struct record *rec)
 	 * at least one event.
 	 */
 	if (bytes_written != rec->bytes_written)
-		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
+		rc = record__write(rec, &finished_round_event,
+				   sizeof(finished_round_event), -1);
 
 out:
 	return rc;
@@ -313,7 +436,8 @@ static void record__init_features(struct record *rec)
 	if (!rec->opts.branch_stack)
 		perf_header__clear_feat(&session->header, HEADER_BRANCH_STACK);
 
-	perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
+	if (!rec->opts.index)
+		perf_header__clear_feat(&session->header, HEADER_DATA_INDEX);
 }
 
 static volatile int workload_exec_errno;
@@ -375,6 +499,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 	}
 
+	if (file->is_pipe && opts->index) {
+		pr_warning("Indexing is disabled for pipe output\n");
+		opts->index = false;
+	}
+
 	if (record__open(rec) != 0) {
 		err = -1;
 		goto out_child;
@@ -554,6 +683,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	if (!err && !file->is_pipe) {
 		rec->session->header.data_size += rec->bytes_written;
 
+		if (rec->opts.index)
+			record__merge_index_files(rec, rec->evlist->nr_mmaps);
+
 		if (!rec->no_buildid)
 			process_buildids(rec);
 		perf_session__write_header(rec->session, rec->evlist, fd, true);
@@ -851,6 +983,8 @@ struct option __record_options[] = {
 		    "Sample machine registers on interrupt"),
 	OPT_BOOLEAN(0, "running-time", &record.opts.running_time,
 		    "Record running/enabled time of read (:S) events"),
+	OPT_BOOLEAN(0, "index", &record.opts.index,
+		    "make index for sample data to speed-up processing"),
 	OPT_END()
 };
 
@@ -900,6 +1034,15 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
 		goto out_symbol_exit;
 	}
 
+	if (rec->opts.index) {
+		if (!rec->opts.sample_time) {
+			pr_err("Sample timestamp is required for indexing\n");
+			goto out_symbol_exit;
+		}
+
+		perf_evlist__add_dummy_tracking(rec->evlist);
+	}
+
 	if (rec->opts.target.tid && !rec->opts.no_inherit_set)
 		rec->opts.no_inherit = true;
 
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 1caa70a4a9e1..a03552849399 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -54,6 +54,7 @@ struct record_opts {
 	bool	     period;
 	bool	     sample_intr_regs;
 	bool	     running_time;
+	bool	     index;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int user_freq;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 00cd1ad427be..46761a39fbae 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -173,6 +173,7 @@ void perf_session__delete(struct perf_session *session)
 	machines__exit(&session->machines);
 	if (session->file)
 		perf_data_file__close(session->file);
+	free(session->header.index);
 	free(session);
 }
 
-- 
2.2.2



* [PATCH 09/38] perf report: Skip dummy tracking event
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (7 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 08/38] perf record: Add --index option for building index table Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 10/38] perf tools: Pass session arg to perf_event__preprocess_sample() Namhyung Kim
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The dummy tracking event is only for tracking task/comm/mmap events
and has no sample data of its own.  So there is no need to report it;
just skip it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c    |  3 +++
 tools/perf/ui/browsers/hists.c | 30 ++++++++++++++++++++++++------
 tools/perf/ui/gtk/hists.c      |  3 +++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fb350343b1d7..7d132e1e2af9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -320,6 +320,9 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 		struct hists *hists = evsel__hists(pos);
 		const char *evname = perf_evsel__name(pos);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group &&
 		    !perf_evsel__is_group_leader(pos))
 			continue;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 788506eef567..7d33d7dc0824 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1947,14 +1947,17 @@ static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
 	return key;
 }
 
-static bool filter_group_entries(struct ui_browser *browser __maybe_unused,
-				 void *entry)
+static bool filter_entries(struct ui_browser *browser __maybe_unused,
+			   void *entry)
 {
 	struct perf_evsel *evsel = list_entry(entry, struct perf_evsel, node);
 
 	if (symbol_conf.event_group && !perf_evsel__is_group_leader(evsel))
 		return true;
 
+	if (perf_evsel__is_dummy_tracking(evsel))
+		return true;
+
 	return false;
 }
 
@@ -1971,7 +1974,7 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
 			.refresh    = ui_browser__list_head_refresh,
 			.seek	    = ui_browser__list_head_seek,
 			.write	    = perf_evsel_menu__write,
-			.filter	    = filter_group_entries,
+			.filter	    = filter_entries,
 			.nr_entries = nr_entries,
 			.priv	    = evlist,
 		},
@@ -1998,21 +2001,22 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 				  struct perf_session_env *env)
 {
 	int nr_entries = evlist->nr_entries;
+	struct perf_evsel *first = perf_evlist__first(evlist);
+	struct perf_evsel *pos;
 
 single_entry:
 	if (nr_entries == 1) {
-		struct perf_evsel *first = perf_evlist__first(evlist);
-
 		return perf_evsel__hists_browse(first, nr_entries, help,
 						false, hbt, min_pcnt,
 						env);
 	}
 
 	if (symbol_conf.event_group) {
-		struct perf_evsel *pos;
 
 		nr_entries = 0;
 		evlist__for_each(evlist, pos) {
+			if (perf_evsel__is_dummy_tracking(pos))
+				continue;
 			if (perf_evsel__is_group_leader(pos))
 				nr_entries++;
 		}
@@ -2021,6 +2025,20 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
 			goto single_entry;
 	}
 
+	evlist__for_each(evlist, pos) {
+		if (perf_evsel__is_dummy_tracking(pos))
+			nr_entries--;
+	}
+
+	if (nr_entries == 1) {
+		evlist__for_each(evlist, pos) {
+			if (!perf_evsel__is_dummy_tracking(pos)) {
+				first = pos;
+				goto single_entry;
+			}
+		}
+	}
+
 	return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
 					       hbt, min_pcnt, env);
 }
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..83a7ecd5cda8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -317,6 +317,9 @@ int perf_evlist__gtk_browse_hists(struct perf_evlist *evlist,
 		char buf[512];
 		size_t size = sizeof(buf);
 
+		if (perf_evsel__is_dummy_tracking(pos))
+			continue;
+
 		if (symbol_conf.event_group) {
 			if (!perf_evsel__is_group_leader(pos))
 				continue;
-- 
2.2.2



* [PATCH 10/38] perf tools: Pass session arg to perf_event__preprocess_sample()
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (8 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 09/38] perf report: Skip dummy tracking event Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03 13:59   ` Arnaldo Carvalho de Melo
  2015-03-03  3:07 ` [PATCH 11/38] perf script: Pass session arg to ->process_event callback Namhyung Kim
                   ` (27 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

perf_event__preprocess_sample() translates a given ip into a
matching symbol.  To do that, it first finds the corresponding thread
and map in the current thread tree.  But for indexed data files, it
needs to find the thread (and map) with slightly different APIs that
use the timestamp.  So it needs a way to know whether this session
deals with an indexed data file or not.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c     |  3 ++-
 tools/perf/builtin-diff.c         | 13 +++++++++----
 tools/perf/builtin-mem.c          |  6 +++++-
 tools/perf/builtin-report.c       |  3 ++-
 tools/perf/builtin-script.c       | 20 +++++++++++---------
 tools/perf/builtin-timechart.c    | 10 +++++++---
 tools/perf/builtin-top.c          |  3 ++-
 tools/perf/tests/hists_cumulate.c |  2 +-
 tools/perf/tests/hists_filter.c   |  2 +-
 tools/perf/tests/hists_link.c     |  4 ++--
 tools/perf/tests/hists_output.c   |  2 +-
 tools/perf/util/event.c           |  3 ++-
 tools/perf/util/event.h           |  4 +++-
 13 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 747f86103599..b89e4c6ed488 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -85,7 +85,8 @@ static int process_sample_event(struct perf_tool *tool,
 	struct perf_annotate *ann = container_of(tool, struct perf_annotate, tool);
 	struct addr_location al;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  ann->session) < 0) {
 		pr_warning("problem processing %d event, skipping it.\n",
 			   event->header.type);
 		return -1;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 74aada554b12..3e2229227062 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -42,6 +42,7 @@ struct diff_hpp_fmt {
 };
 
 struct data__file {
+	struct perf_tool	tool;
 	struct perf_session	*session;
 	struct perf_data_file	file;
 	int			 idx;
@@ -320,16 +321,18 @@ static int hists__add_entry(struct hists *hists,
 	return -ENOMEM;
 }
 
-static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
+static int diff__process_sample_event(struct perf_tool *tool,
 				      union perf_event *event,
 				      struct perf_sample *sample,
 				      struct perf_evsel *evsel,
 				      struct machine *machine)
 {
 	struct addr_location al;
+	struct data__file *d = container_of(tool, struct data__file, tool);
 	struct hists *hists = evsel__hists(evsel);
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  d->session) < 0) {
 		pr_warning("problem processing %d event, skipping it.\n",
 			   event->header.type);
 		return -1;
@@ -740,14 +743,16 @@ static int __cmd_diff(void)
 	int ret = -EINVAL, i;
 
 	data__for_each_file(i, d) {
-		d->session = perf_session__new(&d->file, false, &tool);
+		memcpy(&d->tool, &tool, sizeof(tool));
+
+		d->session = perf_session__new(&d->file, false, &d->tool);
 		if (!d->session) {
 			pr_err("Failed to open %s\n", d->file.path);
 			ret = -1;
 			goto out_delete;
 		}
 
-		ret = perf_session__process_events(d->session, &tool);
+		ret = perf_session__process_events(d->session, &d->tool);
 		if (ret) {
 			pr_err("Failed to process %s\n", d->file.path);
 			goto out_delete;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9b5663950a4d..21d46918860e 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -12,6 +12,7 @@
 
 struct perf_mem {
 	struct perf_tool	tool;
+	struct perf_session	*session;
 	char const		*input_name;
 	bool			hide_unresolved;
 	bool			dump_raw;
@@ -66,7 +67,8 @@ dump_raw_samples(struct perf_tool *tool,
 	struct addr_location al;
 	const char *fmt;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  mem->session) < 0) {
 		fprintf(stderr, "problem processing %d event, skipping it.\n",
 				event->header.type);
 		return -1;
@@ -129,6 +131,8 @@ static int report_raw_events(struct perf_mem *mem)
 	if (session == NULL)
 		return -1;
 
+	mem->session = session;
+
 	if (mem->cpu_list) {
 		ret = perf_session__cpu_bitmap(session, mem->cpu_list,
 					       mem->cpu_bitmap);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7d132e1e2af9..fe1f34c00c58 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -142,7 +142,8 @@ static int process_sample_event(struct perf_tool *tool,
 	};
 	int ret;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  rep->session) < 0) {
 		pr_debug("problem processing %d event, skipping it.\n",
 			 event->header.type);
 		return -1;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ce304dfd962a..ab920f8cded6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -542,13 +542,21 @@ static int cleanup_scripting(void)
 	return scripting_ops->stop_script();
 }
 
-static int process_sample_event(struct perf_tool *tool __maybe_unused,
+struct perf_script {
+	struct perf_tool	tool;
+	struct perf_session	*session;
+	bool			show_task_events;
+	bool			show_mmap_events;
+};
+
+static int process_sample_event(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_evsel *evsel,
 				struct machine *machine)
 {
 	struct addr_location al;
+	struct perf_script *script = container_of(tool, struct perf_script, tool);
 	struct thread *thread = machine__findnew_thread(machine, sample->pid,
 							sample->tid);
 
@@ -569,7 +577,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 		return 0;
 	}
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  script->session) < 0) {
 		pr_err("problem processing %d event, skipping it.\n",
 		       event->header.type);
 		return -1;
@@ -586,13 +595,6 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 	return 0;
 }
 
-struct perf_script {
-	struct perf_tool	tool;
-	struct perf_session	*session;
-	bool			show_task_events;
-	bool			show_mmap_events;
-};
-
 static int process_attr(struct perf_tool *tool, union perf_event *event,
 			struct perf_evlist **pevlist)
 {
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index f3bb1a4bf060..4178727be12c 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -48,6 +48,7 @@ struct wake_event;
 
 struct timechart {
 	struct perf_tool	tool;
+	struct perf_session	*session;
 	struct per_pid		*all_data;
 	struct power_event	*power_events;
 	struct wake_event	*wake_events;
@@ -469,7 +470,8 @@ static void sched_switch(struct timechart *tchart, int cpu, u64 timestamp,
 
 static const char *cat_backtrace(union perf_event *event,
 				 struct perf_sample *sample,
-				 struct machine *machine)
+				 struct machine *machine,
+				 struct perf_session *session)
 {
 	struct addr_location al;
 	unsigned int i;
@@ -488,7 +490,8 @@ static const char *cat_backtrace(union perf_event *event,
 	if (!chain)
 		goto exit;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  session) < 0) {
 		fprintf(stderr, "problem processing %d event, skipping it.\n",
 			event->header.type);
 		goto exit;
@@ -567,7 +570,7 @@ static int process_sample_event(struct perf_tool *tool,
 	if (evsel->handler != NULL) {
 		tracepoint_handler f = evsel->handler;
 		return f(tchart, evsel, sample,
-			 cat_backtrace(event, sample, machine));
+			 cat_backtrace(event, sample, machine, tchart->session));
 	}
 
 	return 0;
@@ -1623,6 +1626,7 @@ static int __cmd_timechart(struct timechart *tchart, const char *output_name)
 		goto out_delete;
 	}
 
+	tchart->session = session;
 	ret = perf_session__process_events(session, &tchart->tool);
 	if (ret)
 		goto out_delete;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5fb8723c7128..054c56206481 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -723,7 +723,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	if (event->header.misc & PERF_RECORD_MISC_EXACT_IP)
 		top->exact_samples++;
 
-	if (perf_event__preprocess_sample(event, machine, &al, sample) < 0)
+	if (perf_event__preprocess_sample(event, machine, &al, sample,
+					  top->session) < 0)
 		return;
 
 	if (!top->kptr_restrict_warned &&
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 18619966454c..60682e62d9de 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -101,7 +101,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		sample.callchain = (struct ip_callchain *)fake_callchains[i];
 
 		if (perf_event__preprocess_sample(&event, machine, &al,
-						  &sample) < 0)
+						  &sample, NULL) < 0)
 			goto out;
 
 		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 59e53db7914c..1c4e495d5137 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -78,7 +78,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 			sample.ip = fake_samples[i].ip;
 
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 278ba8344c23..a731a531a3e2 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -86,7 +86,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			sample.tid = fake_common_samples[k].pid;
 			sample.ip = fake_common_samples[k].ip;
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
@@ -110,7 +110,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 			sample.tid = fake_samples[i][k].pid;
 			sample.ip = fake_samples[i][k].ip;
 			if (perf_event__preprocess_sample(&event, machine, &al,
-							  &sample) < 0)
+							  &sample, NULL) < 0)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index b52c9faea224..f4e3286cd496 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -67,7 +67,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		sample.ip = fake_samples[i].ip;
 
 		if (perf_event__preprocess_sample(&event, machine, &al,
-						  &sample) < 0)
+						  &sample, NULL) < 0)
 			goto out;
 
 		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index d5efa5092ce6..704ef27cc7c8 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -836,7 +836,8 @@ void thread__find_addr_location(struct thread *thread,
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
-				  struct perf_sample *sample)
+				  struct perf_sample *sample,
+				  struct perf_session *session __maybe_unused)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread = machine__findnew_thread(machine, sample->pid,
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c4ffe2bd0738..19814f70292b 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -353,11 +353,13 @@ int perf_event__process(struct perf_tool *tool,
 			struct machine *machine);
 
 struct addr_location;
+struct perf_session;
 
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
-				  struct perf_sample *sample);
+				  struct perf_sample *sample,
+				  struct perf_session *session);
 
 struct thread;
 
-- 
2.2.2



* [PATCH 11/38] perf script: Pass session arg to ->process_event callback
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (9 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 10/38] perf tools: Pass session arg to perf_event__preprocess_sample() Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 12/38] perf tools: Introduce thread__comm_time() helpers Namhyung Kim
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

A script engine sometimes needs to retrieve symbol info, so pass the
session pointer to the ->process_event callback.  This lets the engine
find the symbol correctly, as in the previous patch.
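
The shape of the change can be sketched as follows: the callback in the
ops table gains a session argument, engines that resolve addresses
forward it to the lookup helper, and engines that do not simply ignore
it.  All names here (struct session, struct script_ops, resolve_addr,
export_event, plain_event) are hypothetical stand-ins for illustration,
and the return values are arbitrary markers rather than perf's.

```c
#include <stddef.h>

struct session { int indexed; };	/* hypothetical stand-in */

/* Leaf helper, standing in for perf_event__preprocess_sample_addr(). */
static int resolve_addr(unsigned long addr, struct session *session)
{
	(void)addr;
	return (session && session->indexed) ? 2 : 1;
}

/* Every engine receives the session, whether it needs it or not. */
struct script_ops {
	int (*process_event)(unsigned long addr, struct session *session);
};

/* An engine that resolves addresses forwards the session. */
static int export_event(unsigned long addr, struct session *session)
{
	return resolve_addr(addr, session);
}

/* An engine that never resolves addresses ignores it. */
static int plain_event(unsigned long addr, struct session *session)
{
	(void)addr;
	(void)session;
	return 0;
}
```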

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c                        | 23 ++++++++++++----------
 tools/perf/util/db-export.c                        |  6 ++++--
 tools/perf/util/db-export.h                        |  4 +++-
 tools/perf/util/event.c                            |  3 ++-
 tools/perf/util/event.h                            |  3 ++-
 .../perf/util/scripting-engines/trace-event-perl.c |  3 ++-
 .../util/scripting-engines/trace-event-python.c    |  5 +++--
 tools/perf/util/trace-event-scripting.c            |  3 ++-
 tools/perf/util/trace-event.h                      |  3 ++-
 9 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ab920f8cded6..4a007110d2f7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -377,9 +377,10 @@ static void print_sample_start(struct perf_sample *sample,
 }
 
 static void print_sample_addr(union perf_event *event,
-			  struct perf_sample *sample,
-			  struct thread *thread,
-			  struct perf_event_attr *attr)
+			      struct perf_sample *sample,
+			      struct thread *thread,
+			      struct perf_event_attr *attr,
+			      struct perf_session *session)
 {
 	struct addr_location al;
 
@@ -388,7 +389,7 @@ static void print_sample_addr(union perf_event *event,
 	if (!sample_addr_correlates_sym(attr))
 		return;
 
-	perf_event__preprocess_sample_addr(event, sample, thread, &al);
+	perf_event__preprocess_sample_addr(event, sample, thread, &al, session);
 
 	if (PRINT_FIELD(SYM)) {
 		printf(" ");
@@ -409,7 +410,8 @@ static void print_sample_bts(union perf_event *event,
 			     struct perf_sample *sample,
 			     struct perf_evsel *evsel,
 			     struct thread *thread,
-			     struct addr_location *al)
+			     struct addr_location *al,
+			     struct perf_session *session)
 {
 	struct perf_event_attr *attr = &evsel->attr;
 	bool print_srcline_last = false;
@@ -436,7 +438,7 @@ static void print_sample_bts(union perf_event *event,
 	    ((evsel->attr.sample_type & PERF_SAMPLE_ADDR) &&
 	     !output[attr->type].user_set)) {
 		printf(" => ");
-		print_sample_addr(event, sample, thread, attr);
+		print_sample_addr(event, sample, thread, attr, session);
 	}
 
 	if (print_srcline_last)
@@ -447,7 +449,7 @@ static void print_sample_bts(union perf_event *event,
 
 static void process_event(union perf_event *event, struct perf_sample *sample,
 			  struct perf_evsel *evsel, struct thread *thread,
-			  struct addr_location *al)
+			  struct addr_location *al, struct perf_session *session)
 {
 	struct perf_event_attr *attr = &evsel->attr;
 
@@ -465,7 +467,7 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 	}
 
 	if (is_bts_event(attr)) {
-		print_sample_bts(event, sample, evsel, thread, al);
+		print_sample_bts(event, sample, evsel, thread, al, session);
 		return;
 	}
 
@@ -473,7 +475,7 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 		event_format__print(evsel->tp_format, sample->cpu,
 				    sample->raw_data, sample->raw_size);
 	if (PRINT_FIELD(ADDR))
-		print_sample_addr(event, sample, thread, attr);
+		print_sample_addr(event, sample, thread, attr, session);
 
 	if (PRINT_FIELD(IP)) {
 		if (!symbol_conf.use_callchain)
@@ -590,7 +592,8 @@ static int process_sample_event(struct perf_tool *tool,
 	if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
 		return 0;
 
-	scripting_ops->process_event(event, sample, evsel, thread, &al);
+	scripting_ops->process_event(event, sample, evsel, thread, &al,
+				     script->session);
 
 	return 0;
 }
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index c81dae399763..e9ad11fe2e16 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -282,7 +282,8 @@ int db_export__branch_type(struct db_export *dbe, u32 branch_type,
 
 int db_export__sample(struct db_export *dbe, union perf_event *event,
 		      struct perf_sample *sample, struct perf_evsel *evsel,
-		      struct thread *thread, struct addr_location *al)
+		      struct thread *thread, struct addr_location *al,
+		      struct perf_session *session)
 {
 	struct export_sample es = {
 		.event = event,
@@ -328,7 +329,8 @@ int db_export__sample(struct db_export *dbe, union perf_event *event,
 	    sample_addr_correlates_sym(&evsel->attr)) {
 		struct addr_location addr_al;
 
-		perf_event__preprocess_sample_addr(event, sample, thread, &addr_al);
+		perf_event__preprocess_sample_addr(event, sample, thread,
+						   &addr_al, session);
 		err = db_ids_from_al(dbe, &addr_al, &es.addr_dso_db_id,
 				     &es.addr_sym_db_id, &es.addr_offset);
 		if (err)
diff --git a/tools/perf/util/db-export.h b/tools/perf/util/db-export.h
index adbd22d66798..b994f1041d19 100644
--- a/tools/perf/util/db-export.h
+++ b/tools/perf/util/db-export.h
@@ -29,6 +29,7 @@ struct addr_location;
 struct call_return_processor;
 struct call_path;
 struct call_return;
+struct perf_session;
 
 struct export_sample {
 	union perf_event	*event;
@@ -97,7 +98,8 @@ int db_export__branch_type(struct db_export *dbe, u32 branch_type,
 			   const char *name);
 int db_export__sample(struct db_export *dbe, union perf_event *event,
 		      struct perf_sample *sample, struct perf_evsel *evsel,
-		      struct thread *thread, struct addr_location *al);
+		      struct thread *thread, struct addr_location *al,
+		      struct perf_session *session);
 
 int db_export__branch_types(struct db_export *dbe);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 704ef27cc7c8..510a308c2158 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -918,7 +918,8 @@ bool sample_addr_correlates_sym(struct perf_event_attr *attr)
 void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
-					struct addr_location *al)
+					struct addr_location *al,
+					struct perf_session *session __maybe_unused)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 19814f70292b..27261320249a 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -368,7 +368,8 @@ bool sample_addr_correlates_sym(struct perf_event_attr *attr);
 void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
-					struct addr_location *al);
+					struct addr_location *al,
+					struct perf_session *session);
 
 const char *perf_event__name(unsigned int id);
 
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 22ebc46226e7..dd69fbaf03b8 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -356,7 +356,8 @@ static void perl_process_event(union perf_event *event,
 			       struct perf_sample *sample,
 			       struct perf_evsel *evsel,
 			       struct thread *thread,
-			       struct addr_location *al __maybe_unused)
+			       struct addr_location *al __maybe_unused,
+			       struct perf_session *session __maybe_unused)
 {
 	perl_process_tracepoint(sample, evsel, thread);
 	perl_process_event_generic(event, sample, evsel);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 0c815a40a6e8..802def46af7b 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -839,7 +839,8 @@ static void python_process_event(union perf_event *event,
 				 struct perf_sample *sample,
 				 struct perf_evsel *evsel,
 				 struct thread *thread,
-				 struct addr_location *al)
+				 struct addr_location *al,
+				 struct perf_session *session)
 {
 	struct tables *tables = &tables_global;
 
@@ -851,7 +852,7 @@ static void python_process_event(union perf_event *event,
 	default:
 		if (tables->db_export_mode)
 			db_export__sample(&tables->dbe, event, sample, evsel,
-					  thread, al);
+					  thread, al, session);
 		else
 			python_process_general_event(sample, evsel, thread, al);
 	}
diff --git a/tools/perf/util/trace-event-scripting.c b/tools/perf/util/trace-event-scripting.c
index 5c9bdd1591a9..36ed50d71171 100644
--- a/tools/perf/util/trace-event-scripting.c
+++ b/tools/perf/util/trace-event-scripting.c
@@ -44,7 +44,8 @@ static void process_event_unsupported(union perf_event *event __maybe_unused,
 				      struct perf_sample *sample __maybe_unused,
 				      struct perf_evsel *evsel __maybe_unused,
 				      struct thread *thread __maybe_unused,
-				      struct addr_location *al __maybe_unused)
+				      struct addr_location *al __maybe_unused,
+				      struct perf_session *session __maybe_unused)
 {
 }
 
diff --git a/tools/perf/util/trace-event.h b/tools/perf/util/trace-event.h
index 356629a30ca9..40e19c2af606 100644
--- a/tools/perf/util/trace-event.h
+++ b/tools/perf/util/trace-event.h
@@ -73,7 +73,8 @@ struct scripting_ops {
 			       struct perf_sample *sample,
 			       struct perf_evsel *evsel,
 			       struct thread *thread,
-				   struct addr_location *al);
+			       struct addr_location *al,
+			       struct perf_session *session);
 	int (*generate_script) (struct pevent *pevent, const char *outfile);
 };
 
-- 
2.2.2



* [PATCH 12/38] perf tools: Introduce thread__comm_time() helpers
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (10 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 11/38] perf script: Pass session arg to ->process_event callback Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03 16:28   ` Frederic Weisbecker
  2015-03-03  3:07 ` [PATCH 13/38] perf tools: Add a test case for thread comm handling Namhyung Kim
                   ` (25 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When data file indexing is enabled, perf processes all task, comm and
mmap events first and only then goes on to the sample events.  So a
sample only sees the last comm of its thread, even though the comm in
effect at the time of the sample was recorded.

Keep a thread's comms sorted by time so that the appropriate comm can
be found for the sample time.  thread__comm_time() mostly works even if
the PERF_SAMPLE_TIME bit is off: in that case sample->time is -1, so it
takes the last comm anyway.
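
The idea can be sketched with a plain singly linked list kept newest
first.  This is a simplified illustration, not the patch itself: the
real code uses the kernel-style list_head API, and struct comm,
comm_insert() and comm_at() below are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for perf's struct comm. */
struct comm {
	const char	*str;
	uint64_t	start;	/* time at which this name took effect */
	struct comm	*next;	/* next older entry */
};

/* Insert while keeping the list sorted by start time, newest first. */
static int comm_insert(struct comm **head, const char *str, uint64_t start)
{
	struct comm *new = malloc(sizeof(*new));
	struct comm **pos = head;

	if (!new)
		return -1;

	/* skip the entries that started after the new one */
	while (*pos && (*pos)->start > start)
		pos = &(*pos)->next;

	new->str = str;
	new->start = start;
	new->next = *pos;
	*pos = new;
	return 0;
}

/* Return the comm in effect at timestamp, or the oldest one if none is. */
static const struct comm *comm_at(const struct comm *head, uint64_t timestamp)
{
	const struct comm *c, *oldest = NULL;

	for (c = head; c; c = c->next) {
		if (timestamp >= c->start)
			return c;
		oldest = c;
	}
	return oldest;	/* NULL only when the list is empty */
}
```

With a timestamp of (uint64_t)-1, the first (newest) entry always
matches, which is the "takes the last comm anyway" behavior described
above.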

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/thread.c | 33 ++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h |  2 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 9ebc8b1f9be5..ad96725105c2 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -103,6 +103,21 @@ struct comm *thread__exec_comm(const struct thread *thread)
 	return last;
 }
 
+struct comm *thread__comm_time(const struct thread *thread, u64 timestamp)
+{
+	struct comm *comm;
+
+	list_for_each_entry(comm, &thread->comm_list, list) {
+		if (timestamp >= comm->start)
+			return comm;
+	}
+
+	if (list_empty(&thread->comm_list))
+		return NULL;
+
+	return list_last_entry(&thread->comm_list, struct comm, list);
+}
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		       bool exec)
 {
@@ -118,7 +133,13 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		new = comm__new(str, timestamp, exec);
 		if (!new)
 			return -ENOMEM;
-		list_add(&new->list, &thread->comm_list);
+
+		/* sort by time */
+		list_for_each_entry(curr, &thread->comm_list, list) {
+			if (timestamp >= curr->start)
+				break;
+		}
+		list_add_tail(&new->list, &curr->list);
 
 		if (exec)
 			unwind__flush_access(thread);
@@ -139,6 +160,16 @@ const char *thread__comm_str(const struct thread *thread)
 	return comm__str(comm);
 }
 
+const char *thread__comm_str_time(const struct thread *thread, u64 timestamp)
+{
+	const struct comm *comm = thread__comm_time(thread, timestamp);
+
+	if (!comm)
+		return NULL;
+
+	return comm__str(comm);
+}
+
 /* CHECKME: it should probably better return the max comm len from its comm list */
 int thread__comm_len(struct thread *thread)
 {
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 160fd066a7d1..be67c3bad5e7 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -53,7 +53,9 @@ static inline int thread__set_comm(struct thread *thread, const char *comm,
 int thread__comm_len(struct thread *thread);
 struct comm *thread__comm(const struct thread *thread);
 struct comm *thread__exec_comm(const struct thread *thread);
+struct comm *thread__comm_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
+const char *thread__comm_str_time(const struct thread *thread, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.2.2



* [PATCH 13/38] perf tools: Add a test case for thread comm handling
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (11 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 12/38] perf tools: Introduce thread__comm_time() helpers Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 14/38] perf tools: Use thread__comm_time() when adding hist entries Namhyung Kim
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The new test case checks thread comm handling, such as overriding and
time sorting.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/thread-comm.c  | 47 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)
 create mode 100644 tools/perf/tests/thread-comm.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 2de01a4b4084..af8f31a3b678 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -24,6 +24,7 @@ perf-y += bp_signal_overflow.o
 perf-y += task-exit.o
 perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
+perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 4b7d9ab0f049..1b463d82a71a 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -167,6 +167,10 @@ static struct test {
 		.func = test__fdarray__add,
 	},
 	{
+		.desc = "Test thread comm handling",
+		.func = test__thread_comm,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 00e776a87a9c..43ac17780629 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -51,6 +51,7 @@ int test__hists_cumulate(void);
 int test__switch_tracking(void);
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
+int test__thread_comm(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-comm.c b/tools/perf/tests/thread-comm.c
new file mode 100644
index 000000000000..44ee85d71581
--- /dev/null
+++ b/tools/perf/tests/thread-comm.c
@@ -0,0 +1,47 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "debug.h"
+
+int test__thread_comm(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * comm for a given time.  When data file indexing is
+	 * enabled, those task/comm events are processed first so the
+	 * later sample should find a matching comm properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 100, 100);
+	TEST_ASSERT_VAL("wrong init thread comm",
+			!strcmp(thread__comm_str(t), ":100"));
+
+	thread__set_comm(t, "perf-test1", 10000);
+	TEST_ASSERT_VAL("failed to override thread comm",
+			!strcmp(thread__comm_str(t), "perf-test1"));
+
+	thread__set_comm(t, "perf-test2", 20000);
+	thread__set_comm(t, "perf-test3", 30000);
+	thread__set_comm(t, "perf-test4", 40000);
+
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 20000), "perf-test2"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 35000), "perf-test3"));
+	TEST_ASSERT_VAL("failed to find timed comm",
+			!strcmp(thread__comm_str_time(t, 50000), "perf-test4"));
+
+	thread__set_comm(t, "perf-test1.5", 15000);
+	TEST_ASSERT_VAL("failed to sort timed comm",
+			!strcmp(thread__comm_str_time(t, 15000), "perf-test1.5"));
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2



* [PATCH 14/38] perf tools: Use thread__comm_time() when adding hist entries
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (12 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 13/38] perf tools: Add a test case for thread comm handling Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 15/38] perf tools: Convert dead thread list into rbtree Namhyung Kim
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Now that thread->comm can be looked up by time, use the sample time to
find the correct comm when adding hist entries.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c |  5 +++--
 tools/perf/builtin-diff.c     |  8 ++++----
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c        | 19 ++++++++++---------
 tools/perf/util/hist.h        |  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b89e4c6ed488..50628900f9fa 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -47,7 +47,7 @@ struct perf_annotate {
 };
 
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
-				  struct perf_sample *sample __maybe_unused,
+				  struct perf_sample *sample,
 				  struct addr_location *al,
 				  struct perf_annotate *ann)
 {
@@ -67,7 +67,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 		return 0;
 	}
 
-	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+	he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0,
+				sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 3e2229227062..ddf6f0999838 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -313,10 +313,10 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
 
 static int hists__add_entry(struct hists *hists,
 			    struct addr_location *al, u64 period,
-			    u64 weight, u64 transaction)
+			    u64 weight, u64 transaction, u64 timestamp)
 {
 	if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-			       transaction, true) != NULL)
+			       transaction, timestamp, true) != NULL)
 		return 0;
 	return -ENOMEM;
 }
@@ -338,8 +338,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 		return -1;
 	}
 
-	if (hists__add_entry(hists, &al, sample->period,
-			     sample->weight, sample->transaction)) {
+	if (hists__add_entry(hists, &al, sample->period, sample->weight,
+			     sample->transaction, sample->time)) {
 		pr_warning("problem incrementing symbol period, skipping event\n");
 		return -1;
 	}
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index a731a531a3e2..4f3d45692acb 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL)
 				goto out;
 
@@ -114,7 +114,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 				goto out;
 
 			he = __hists__add_entry(hists, &al, NULL,
-						NULL, NULL, 1, 1, 0, true);
+						NULL, NULL, 1, 1, 0, -1, true);
 			if (he == NULL)
 				goto out;
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 70b48a65064c..4badf2491fbf 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -447,11 +447,11 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi,
 				      u64 period, u64 weight, u64 transaction,
-				      bool sample_self)
+				      u64 timestamp, bool sample_self)
 {
 	struct hist_entry entry = {
 		.thread	= al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_time(al->thread, timestamp),
 		.ms = {
 			.map	= al->map,
 			.sym	= al->sym,
@@ -509,13 +509,14 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 {
 	u64 cost;
 	struct mem_info *mi = iter->priv;
+	struct perf_sample *sample = iter->sample;
 	struct hists *hists = evsel__hists(iter->evsel);
 	struct hist_entry *he;
 
 	if (mi == NULL)
 		return -EINVAL;
 
-	cost = iter->sample->weight;
+	cost = sample->weight;
 	if (!cost)
 		cost = 1;
 
@@ -527,7 +528,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	 * and the he_stat__add_period() function.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, NULL, mi,
-				cost, cost, 0, true);
+				cost, cost, 0, sample->time, true);
 	if (!he)
 		return -ENOMEM;
 
@@ -628,7 +629,7 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
 	 * and not events sampled. Thus we use a pseudo period of 1.
 	 */
 	he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
-				1, 1, 0, true);
+				1, 1, 0, iter->sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -666,7 +667,7 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -728,7 +729,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, true);
+				sample->transaction, sample->time, true);
 	if (he == NULL)
 		return -ENOMEM;
 
@@ -772,7 +773,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	struct hist_entry he_tmp = {
 		.cpu = al->cpu,
 		.thread = al->thread,
-		.comm = thread__comm(al->thread),
+		.comm = thread__comm_time(al->thread, sample->time),
 		.ip = al->addr,
 		.ms = {
 			.map = al->map,
@@ -801,7 +802,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 
 	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
-				sample->transaction, false);
+				sample->transaction, sample->time, false);
 	if (he == NULL)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2b690d028907..0eed50a5b1f0 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -109,7 +109,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      struct branch_info *bi,
 				      struct mem_info *mi, u64 period,
 				      u64 weight, u64 transaction,
-				      bool sample_self);
+				      u64 timestamp, bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth, void *arg);
-- 
2.2.2



* [PATCH 15/38] perf tools: Convert dead thread list into rbtree
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (13 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 14/38] perf tools: Use thread__comm_time() when adding hist entries Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 16/38] perf tools: Introduce machine__find*_thread_time() Namhyung Kim
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Currently perf keeps dead threads in a linked list, which makes lookups
slow in a large session that might have many dead threads.  Convert it
to an rbtree like the one used for live threads; it'll be used later
with the multi-file changes.

The list node is now used for chaining dead threads that share the same
tid, since it's easier to handle such threads in time order.
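
The layout can be pictured with the simplified sketch below.  It is not
the code in this patch: a plain unbalanced binary search tree stands in
for the rbtree, a singly linked chain stands in for the tid_node list,
and the names struct dead_thread and dead_insert() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a dead thread. */
struct dead_thread {
	int tid;
	uint64_t start_time;
	struct dead_thread *left, *right;	/* tree links, keyed by tid */
	struct dead_thread *next;		/* other dead threads with the same tid */
};

/*
 * Insert 'th' into the tree rooted at '*root'.  The first dead thread
 * of a tid becomes the tree node; later ones with the same tid are
 * chained off that node, newest start time first.
 */
static void dead_insert(struct dead_thread **root, struct dead_thread *th)
{
	struct dead_thread **p = root;

	assert(root != NULL && th != NULL);

	th->left = th->right = th->next = NULL;

	while (*p != NULL) {
		struct dead_thread *pos = *p;

		if (pos->tid == th->tid) {
			struct dead_thread **link = &pos->next;

			/* keep the chain sorted by time, newest first */
			while (*link && (*link)->start_time > th->start_time)
				link = &(*link)->next;

			th->next = *link;
			*link = th;
			return;
		}

		p = (th->tid < pos->tid) ? &pos->left : &pos->right;
	}

	*p = th;
}
```

Keying the tree by tid keeps the per-tid search short even when a
session has many dead threads, while the chain only grows when a tid is
actually recycled.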

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/machine.c | 70 +++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/machine.h |  2 +-
 tools/perf/util/thread.c  |  1 +
 tools/perf/util/thread.h  | 11 ++++----
 4 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9e0f60a7e7b3..6b8236dc4367 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -28,7 +28,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 	dsos__init(&machine->kernel_dsos);
 
 	machine->threads = RB_ROOT;
-	INIT_LIST_HEAD(&machine->dead_threads);
+	machine->dead_threads = RB_ROOT;
 	machine->last_match = NULL;
 
 	machine->vdso_info = NULL;
@@ -91,10 +91,22 @@ static void dsos__delete(struct dsos *dsos)
 
 void machine__delete_dead_threads(struct machine *machine)
 {
-	struct thread *n, *t;
+	struct rb_node *nd = rb_first(&machine->dead_threads);
+
+	while (nd) {
+		struct thread *t = rb_entry(nd, struct thread, rb_node);
+		struct thread *pos;
+
+		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->dead_threads);
+
+		while (!list_empty(&t->tid_node)) {
+			pos = list_first_entry(&t->tid_node,
+					       struct thread, tid_node);
+			list_del(&pos->tid_node);
+			thread__delete(pos);
+		}
 
-	list_for_each_entry_safe(t, n, &machine->dead_threads, node) {
-		list_del(&t->node);
 		thread__delete(t);
 	}
 }
@@ -106,8 +118,8 @@ void machine__delete_threads(struct machine *machine)
 	while (nd) {
 		struct thread *t = rb_entry(nd, struct thread, rb_node);
 
-		rb_erase(&t->rb_node, &machine->threads);
 		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->threads);
 		thread__delete(t);
 	}
 }
@@ -1238,13 +1250,46 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 
 static void machine__remove_thread(struct machine *machine, struct thread *th)
 {
+	struct rb_node **p = &machine->dead_threads.rb_node;
+	struct rb_node *parent = NULL;
+	struct thread *pos;
+
 	machine->last_match = NULL;
 	rb_erase(&th->rb_node, &machine->threads);
+
+	th->dead = true;
+
 	/*
 	 * We may have references to this thread, for instance in some hist_entry
-	 * instances, so just move them to a separate list.
+	 * instances, so just move them to a separate list in rbtree.
 	 */
-	list_add_tail(&th->node, &machine->dead_threads);
+	while (*p != NULL) {
+		parent = *p;
+		pos = rb_entry(parent, struct thread, rb_node);
+
+		if (pos->tid == th->tid) {
+			struct thread *old;
+
+			/* sort by time */
+			list_for_each_entry(old, &pos->tid_node, tid_node) {
+				if (th->start_time >= old->start_time) {
+					pos = old;
+					break;
+				}
+			}
+
+			list_add_tail(&th->tid_node, &pos->tid_node);
+			return;
+		}
+
+		if (th->tid < pos->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	rb_link_node(&th->rb_node, parent, p);
+	rb_insert_color(&th->rb_node, &machine->dead_threads);
 }
 
 int machine__process_fork_event(struct machine *machine, union perf_event *event,
@@ -1729,7 +1774,7 @@ int machine__for_each_thread(struct machine *machine,
 			     void *priv)
 {
 	struct rb_node *nd;
-	struct thread *thread;
+	struct thread *thread, *pos;
 	int rc = 0;
 
 	for (nd = rb_first(&machine->threads); nd; nd = rb_next(nd)) {
@@ -1739,10 +1784,17 @@ int machine__for_each_thread(struct machine *machine,
 			return rc;
 	}
 
-	list_for_each_entry(thread, &machine->dead_threads, node) {
+	for (nd = rb_first(&machine->dead_threads); nd; nd = rb_next(nd)) {
+		thread = rb_entry(nd, struct thread, rb_node);
 		rc = fn(thread, priv);
 		if (rc != 0)
 			return rc;
+
+		list_for_each_entry(pos, &thread->tid_node, tid_node) {
+			rc = fn(pos, priv);
+			if (rc != 0)
+				return rc;
+		}
 	}
 	return rc;
 }
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e8b7779a0a3f..4349946a38ff 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -30,7 +30,7 @@ struct machine {
 	bool		  comm_exec;
 	char		  *root_dir;
 	struct rb_root	  threads;
-	struct list_head  dead_threads;
+	struct rb_root	  dead_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
 	struct dsos	  user_dsos;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index ad96725105c2..c9ae0e1599da 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -38,6 +38,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		thread->ppid = -1;
 		thread->cpu = -1;
 		INIT_LIST_HEAD(&thread->comm_list);
+		INIT_LIST_HEAD(&thread->tid_node);
 
 		if (unwind__prepare_access(thread) < 0)
 			goto err_thread;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index be67c3bad5e7..21268e66b2ad 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -11,10 +11,8 @@
 struct thread_stack;
 
 struct thread {
-	union {
-		struct rb_node	 rb_node;
-		struct list_head node;
-	};
+	struct rb_node	 	rb_node;
+	struct list_head 	tid_node;
 	struct map_groups	*mg;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
@@ -22,7 +20,8 @@ struct thread {
 	int			cpu;
 	char			shortname[3];
 	bool			comm_set;
-	bool			dead; /* if set thread has exited */
+	bool			exited; /* if set thread has exited */
+	bool			dead; /* thread is in dead_threads list */
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
@@ -39,7 +38,7 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine);
 void thread__delete(struct thread *thread);
 static inline void thread__exited(struct thread *thread)
 {
-	thread->dead = true;
+	thread->exited = true;
 }
 
 int __thread__set_comm(struct thread *thread, const char *comm, u64 timestamp,
-- 
2.2.2



* [PATCH 16/38] perf tools: Introduce machine__find*_thread_time()
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (14 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 15/38] perf tools: Convert dead thread list into rbtree Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 17/38] perf tools: Add a test case for timed thread handling Namhyung Kim
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When data file indexing is enabled, threads need to be looked up based
on sample time, since sample processing is done after the other (task,
comm and mmap) events are processed.  This can be a problem if a session
is very long and a pid is recycled - in that case a lookup would only
see the last thread that used it.

So keep the thread start time in struct thread, and search for threads
based on that time.  This patch introduces the
machine__find{,new}_thread_time() functions for this.  They first search
the current thread rbtree and then the dead thread tree and list.  If no
match is found, machine__findnew_thread_time() creates a new thread.

A sample timestamp of 0 means that the call comes from a synthesized
event, so just use the current rbtree.  The timestamp will be -1 if the
sample didn't record a timestamp, so it will see the current threads
automatically.
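
The lookup order described above can be sketched as follows.  This is a
flattened model for illustration, not the function added by this patch:
struct thr and find_thread_time() are hypothetical names, and all dead
threads of one tid are assumed to sit in a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one thread in a tid's history. */
struct thr {
	uint64_t start_time;
	int dead;
};

/*
 * Pick the thread that owned a tid at 'timestamp'.
 *
 * 'curr' is the live thread (may be NULL) and 'dead' holds 'n' exited
 * threads that used the same tid, in any order.  Returns NULL when
 * nothing matches, leaving it to the caller to create a new thread.
 */
static const struct thr *find_thread_time(const struct thr *curr,
					  const struct thr *dead, size_t n,
					  uint64_t timestamp)
{
	const struct thr *best = NULL;
	size_t i;

	assert(dead != NULL || n == 0);

	/* the live thread wins if the sample is not older than it */
	if (curr && timestamp >= curr->start_time)
		return curr;

	/* otherwise take the latest dead thread started at or before it */
	for (i = 0; i < n; i++) {
		if (timestamp >= dead[i].start_time &&
		    (!best || dead[i].start_time > best->start_time))
			best = &dead[i];
	}

	return best;
}
```

Because the timestamp is unsigned, a value of -1 is never older than the
live thread, so such samples always resolve to the current thread in
this sketch.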

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c     |  11 ++++-
 tools/perf/tests/dwarf-unwind.c |   8 ++--
 tools/perf/tests/hists_common.c |   3 +-
 tools/perf/tests/hists_link.c   |   2 +-
 tools/perf/util/event.c         |  14 ++++--
 tools/perf/util/machine.c       | 102 +++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h       |   8 +++-
 tools/perf/util/thread.c        |   4 ++
 tools/perf/util/thread.h        |   1 +
 9 files changed, 138 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4a007110d2f7..65b3a07be2bf 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -559,8 +559,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct addr_location al;
 	struct perf_script *script = container_of(tool, struct perf_script, tool);
-	struct thread *thread = machine__findnew_thread(machine, sample->pid,
-							sample->tid);
+	struct thread *thread;
+
+	if (perf_session__has_index(script->session))
+		thread = machine__findnew_thread_time(machine, sample->pid,
+						      sample->tid,
+						      sample->time);
+	else
+		thread = machine__findnew_thread(machine, sample->pid,
+						 sample->tid);
 
 	if (thread == NULL) {
 		pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 0bf06bec68c7..7e04feb431cb 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -16,10 +16,10 @@
 
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
 			union perf_event *event,
-			struct perf_sample *sample __maybe_unused,
+			struct perf_sample *sample,
 			struct machine *machine)
 {
-	return machine__process_mmap2_event(machine, event, NULL);
+	return machine__process_mmap2_event(machine, event, sample);
 }
 
 static int init_live_machine(struct machine *machine)
@@ -66,12 +66,10 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 __attribute__ ((noinline))
 static int unwind_thread(struct thread *thread)
 {
-	struct perf_sample sample;
+	struct perf_sample sample = { .time = -1ULL, };
 	unsigned long cnt = 0;
 	int err = -1;
 
-	memset(&sample, 0, sizeof(sample));
-
 	if (test__arch_unwind_sample(&sample, thread)) {
 		pr_debug("failed to get unwind sample\n");
 		goto out;
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index a62c09134516..86a8fdb41804 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -80,6 +80,7 @@ static struct {
 struct machine *setup_fake_machine(struct machines *machines)
 {
 	struct machine *machine = machines__find(machines, HOST_KERNEL_ID);
+	struct perf_sample sample = { .time = -1ULL, };
 	size_t i;
 
 	if (machine == NULL) {
@@ -113,7 +114,7 @@ struct machine *setup_fake_machine(struct machines *machines)
 		strcpy(fake_mmap_event.mmap.filename,
 		       fake_mmap_info[i].filename);
 
-		machine__process_mmap_event(machine, &fake_mmap_event, NULL);
+		machine__process_mmap_event(machine, &fake_mmap_event, &sample);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(fake_symbols); i++) {
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 4f3d45692acb..1237cc87e8d5 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
 	struct perf_evsel *evsel;
 	struct addr_location al;
 	struct hist_entry *he;
-	struct perf_sample sample = { .period = 1, };
+	struct perf_sample sample = { .period = 1, .time = -1ULL, };
 	size_t i = 0, k;
 
 	/*
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 510a308c2158..3bfe10fe0c69 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -9,6 +9,7 @@
 #include "strlist.h"
 #include "thread.h"
 #include "thread_map.h"
+#include "session.h"
 #include "symbol/kallsyms.h"
 
 static const char *perf_event__names[] = {
@@ -837,11 +838,18 @@ int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
 				  struct perf_sample *sample,
-				  struct perf_session *session __maybe_unused)
+				  struct perf_session *session)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
-	struct thread *thread = machine__findnew_thread(machine, sample->pid,
-							sample->tid);
+	struct thread *thread;
+
+	if (session && perf_session__has_index(session))
+		thread = machine__findnew_thread_time(machine, sample->pid,
+						      sample->tid,
+						      sample->time);
+	else
+		thread = machine__findnew_thread(machine, sample->pid,
+						 sample->tid);
 
 	if (thread == NULL)
 		return -1;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 6b8236dc4367..b4b97b5e1f1c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -434,6 +434,106 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return __machine__findnew_thread(machine, pid, tid, false);
 }
 
+static struct thread *__machine__findnew_thread_time(struct machine *machine,
+						     pid_t pid, pid_t tid,
+						     u64 timestamp, bool create)
+{
+	struct thread *curr, *pos, *new;
+	struct thread *th = NULL;
+	struct rb_node **p;
+	struct rb_node *parent = NULL;
+
+	curr = __machine__findnew_thread(machine, pid, tid, false);
+	if (curr && timestamp >= curr->start_time)
+		return curr;
+
+	p = &machine->dead_threads.rb_node;
+	while (*p != NULL) {
+		parent = *p;
+		th = rb_entry(parent, struct thread, rb_node);
+
+		if (th->tid == tid) {
+			list_for_each_entry(pos, &th->tid_node, tid_node) {
+				if (timestamp >= pos->start_time &&
+				    pos->start_time > th->start_time) {
+					th = pos;
+					break;
+				}
+			}
+
+			if (timestamp >= th->start_time) {
+				machine__update_thread_pid(machine, th, pid);
+				return th;
+			}
+			break;
+		}
+
+		if (tid < th->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	if (!create)
+		return NULL;
+
+	if (!curr && !*p)
+		return __machine__findnew_thread(machine, pid, tid, true);
+
+	new = thread__new(pid, tid);
+	if (new == NULL)
+		return NULL;
+
+	new->dead = true;
+	new->start_time = timestamp;
+
+	if (*p) {
+		list_for_each_entry(pos, &th->tid_node, tid_node) {
+			/* sort by time */
+			if (timestamp >= pos->start_time) {
+				th = pos;
+				break;
+			}
+		}
+		list_add_tail(&new->tid_node, &th->tid_node);
+	} else {
+		rb_link_node(&new->rb_node, parent, p);
+		rb_insert_color(&new->rb_node, &machine->dead_threads);
+	}
+
+	/*
+	 * We have to initialize map_groups separately
+	 * after rb tree is updated.
+	 *
+	 * The reason is that we call machine__findnew_thread
+	 * within thread__init_map_groups to find the thread
+	 * leader and that would screw up the rb tree.
+	 */
+	if (thread__init_map_groups(new, machine)) {
+		if (!list_empty(&new->tid_node))
+			list_del(&new->tid_node);
+		else
+			rb_erase(&new->rb_node, &machine->dead_threads);
+
+		thread__delete(new);
+		return NULL;
+	}
+
+	return new;
+}
+
+struct thread *machine__find_thread_time(struct machine *machine, pid_t pid,
+					 pid_t tid, u64 timestamp)
+{
+	return __machine__findnew_thread_time(machine, pid, tid, timestamp, false);
+}
+
+struct thread *machine__findnew_thread_time(struct machine *machine, pid_t pid,
+					    pid_t tid, u64 timestamp)
+{
+	return __machine__findnew_thread_time(machine, pid, tid, timestamp, true);
+}
+
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread)
 {
@@ -1172,7 +1272,7 @@ int machine__process_mmap2_event(struct machine *machine,
 	}
 
 	thread = machine__findnew_thread(machine, event->mmap2.pid,
-					event->mmap2.tid);
+					 event->mmap2.tid);
 	if (thread == NULL)
 		goto out_problem;
 
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 4349946a38ff..9571b6b1c5b5 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -68,8 +68,6 @@ static inline bool machine__kernel_ip(struct machine *machine, u64 ip)
 	return ip >= kernel_start;
 }
 
-struct thread *machine__find_thread(struct machine *machine, pid_t pid,
-				    pid_t tid);
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread);
 
@@ -149,6 +147,12 @@ static inline bool machine__is_host(struct machine *machine)
 
 struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
 				       pid_t tid);
+struct thread *machine__find_thread(struct machine *machine, pid_t pid,
+				    pid_t tid);
+struct thread *machine__findnew_thread_time(struct machine *machine, pid_t pid,
+					    pid_t tid, u64 timestamp);
+struct thread *machine__find_thread_time(struct machine *machine, pid_t pid,
+					 pid_t tid, u64 timestamp);
 
 size_t machine__fprintf(struct machine *machine, FILE *fp);
 
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index c9ae0e1599da..306bdaede019 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -127,6 +127,9 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 
 	/* Override the default :tid entry */
 	if (!thread->comm_set) {
+		if (!thread->start_time)
+			thread->start_time = timestamp;
+
 		err = comm__override(curr, str, timestamp, exec);
 		if (err)
 			return err;
@@ -228,6 +231,7 @@ int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp)
 	}
 
 	thread->ppid = parent->tid;
+	thread->start_time = timestamp;
 	return thread__clone_map_groups(thread, parent);
 }
 
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 21268e66b2ad..e5d7abd255ea 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -25,6 +25,7 @@ struct thread {
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
+	u64			start_time;
 
 	void			*priv;
 	struct thread_stack	*ts;
-- 
2.2.2



* [PATCH 17/38] perf tools: Add a test case for timed thread handling
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (15 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 16/38] perf tools: Introduce machine__find*_thread_time() Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 18/38] perf tools: Reducing arguments of hist_entry_iter__add() Namhyung Kim
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Add a test case that verifies live and dead thread tree management as
time changes, using the new machine__find{,new}_thread_time()
functions.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build                |   1 +
 tools/perf/tests/builtin-test.c       |   4 +
 tools/perf/tests/tests.h              |   1 +
 tools/perf/tests/thread-lookup-time.c | 174 ++++++++++++++++++++++++++++++++++
 4 files changed, 180 insertions(+)
 create mode 100644 tools/perf/tests/thread-lookup-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index af8f31a3b678..bfa0aa35761f 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -26,6 +26,7 @@ perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
+perf-y += thread-lookup-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 1b463d82a71a..e4d335de19ea 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -171,6 +171,10 @@ static struct test {
 		.func = test__thread_comm,
 	},
 	{
+		.desc = "Test thread lookup with time",
+		.func = test__thread_lookup_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 43ac17780629..1090337f63e5 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -52,6 +52,7 @@ int test__switch_tracking(void);
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__thread_comm(void);
+int test__thread_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
new file mode 100644
index 000000000000..6237ecf8caae
--- /dev/null
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -0,0 +1,174 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
+{
+	printf("thread: %d, start time: %"PRIu64" %s\n",
+	       th->tid, th->start_time, th->dead ? "(dead)" : "");
+	return 0;
+}
+
+static int lookup_with_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = 50000,
+	};
+
+	/* start_time is set to 0 */
+	t1 = machine__findnew_thread(machine, 0, 0);
+
+	if (verbose > 1) {
+		printf("========= after t1 created ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("wrong start time of old thread", t1->start_time == 0);
+
+	TEST_ASSERT_VAL("cannot find current thread",
+			machine__find_thread(machine, 0, 0) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread with time",
+			machine__findnew_thread_time(machine, 0, 0, 10000) == t1);
+
+	/* start_time is overwritten to new value */
+	thread__set_comm(t1, "/usr/bin/perf", 20000);
+
+	if (verbose > 1) {
+		printf("========= after t1 set comm ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	TEST_ASSERT_VAL("failed to update start time", t1->start_time == 20000);
+
+	TEST_ASSERT_VAL("should not find passed thread",
+			/* this will create yet another dead thread */
+			machine__findnew_thread_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find overwritten thread with time",
+			machine__find_thread_time(machine, 0, 0, 20000) == t1);
+
+	/* now t1 goes to dead thread tree, and create t2 */
+	machine__process_fork_event(machine, &fork, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t2 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t2 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t2->start_time == 50000);
+
+	TEST_ASSERT_VAL("dead thread cannot be found",
+			machine__find_thread_time(machine, 0, 0, 10000) != t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__find_thread_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__find_thread_time(machine, 0, 0, 50000) == t2);
+
+	/* now t2 goes to dead thread tree, and create t3 */
+	sample.time = 60000;
+	machine__process_fork_event(machine, &fork, &sample);
+
+	if (verbose > 1) {
+		printf("========= after t3 forked ==========\n");
+		machine__for_each_thread(machine, thread__print_cb, NULL);
+	}
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("wrong start time of new thread", t3->start_time == 60000);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 30000) == t1);
+
+	TEST_ASSERT_VAL("cannot find dead thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 50000) == t2);
+
+	TEST_ASSERT_VAL("cannot find current thread after new thread",
+			machine__findnew_thread_time(machine, 0, 0, 70000) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+static int lookup_without_timestamp(struct machine *machine)
+{
+	struct thread *t1, *t2, *t3;
+	union perf_event fork = {
+		.fork = {
+			.pid = 0,
+			.tid = 0,
+			.ppid = 1,
+			.ptid = 1,
+		},
+	};
+	struct perf_sample sample = {
+		.time = -1ULL,
+	};
+
+	t1 = machine__findnew_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t1 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t1);
+
+	machine__process_fork_event(machine, &fork, &sample);
+
+	t2 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__find_thread_time(machine, 0, 0, -1ULL) == t2);
+
+	machine__process_fork_event(machine, &fork, &sample);
+
+	t3 = machine__find_thread(machine, 0, 0);
+	TEST_ASSERT_VAL("cannot find current thread", t3 != NULL);
+
+	TEST_ASSERT_VAL("cannot find new thread with time",
+			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t3);
+
+	machine__delete_threads(machine);
+	return 0;
+}
+
+int test__thread_lookup_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+
+	/*
+	 * This test is to check whether it can retrieve a correct
+	 * thread for a given time.  When multi-file data storage is
+	 * enabled, those task/comm/mmap events are processed first so
+	 * the later sample should find a matching thread properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	if (lookup_with_timestamp(machine) < 0)
+		return -1;
+
+	if (lookup_without_timestamp(machine) < 0)
+		return -1;
+
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2



* [PATCH 18/38] perf tools: Reducing arguments of hist_entry_iter__add()
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (16 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 17/38] perf tools: Add a test case for timed thread handling Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 19/38] perf tools: Pass session to hist_entry_iter struct Namhyung Kim
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The evsel and sample arguments are only used to fill in the iter for
later use.  Since the function already receives the iter as an
argument, set those fields in the iter before calling it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c       | 9 +++++----
 tools/perf/builtin-top.c          | 7 ++++---
 tools/perf/tests/hists_cumulate.c | 6 ++++--
 tools/perf/tests/hists_filter.c   | 4 +++-
 tools/perf/tests/hists_output.c   | 6 ++++--
 tools/perf/util/hist.c            | 8 ++------
 tools/perf/util/hist.h            | 1 -
 7 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fe1f34c00c58..cff357522358 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -137,8 +137,10 @@ static int process_sample_event(struct perf_tool *tool,
 	struct report *rep = container_of(tool, struct report, tool);
 	struct addr_location al;
 	struct hist_entry_iter iter = {
-		.hide_unresolved = rep->hide_unresolved,
-		.add_entry_cb = hist_iter__report_callback,
+		.evsel 			= evsel,
+		.sample 		= sample,
+		.hide_unresolved 	= rep->hide_unresolved,
+		.add_entry_cb 		= hist_iter__report_callback,
 	};
 	int ret;
 
@@ -167,8 +169,7 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		al.map->dso->hit = 1;
 
-	ret = hist_entry_iter__add(&iter, &al, evsel, sample, rep->max_stack,
-				   rep);
+	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
 	if (ret < 0)
 		pr_debug("problem adding hist entry, skipping event\n");
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 054c56206481..2c37bff901ba 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -774,7 +774,9 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	if (al.sym == NULL || !al.sym->ignore) {
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
-			.add_entry_cb = hist_iter__top_callback,
+			.evsel		= evsel,
+			.sample 	= sample,
+			.add_entry_cb 	= hist_iter__top_callback,
 		};
 
 		if (symbol_conf.cumulate_callchain)
@@ -784,8 +786,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 
 		pthread_mutex_lock(&hists->lock);
 
-		err = hist_entry_iter__add(&iter, &al, evsel, sample,
-					   top->max_stack, top);
+		err = hist_entry_iter__add(&iter, &al, top->max_stack, top);
 		if (err < 0)
 			pr_err("Problem incrementing symbol period, skipping event\n");
 
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 60682e62d9de..da64acbd35b7 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -87,6 +87,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 			},
 		};
 		struct hist_entry_iter iter = {
+			.evsel = evsel,
+			.sample	= &sample,
 			.hide_unresolved = false,
 		};
 
@@ -104,8 +106,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 						  &sample, NULL) < 0)
 			goto out;
 
-		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-					 PERF_MAX_STACK_DEPTH, NULL) < 0)
+		if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
+					 NULL) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 1c4e495d5137..f5c0c69383dc 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -63,6 +63,8 @@ static int add_hist_entries(struct perf_evlist *evlist,
 				},
 			};
 			struct hist_entry_iter iter = {
+				.evsel = evsel,
+				.sample = &sample,
 				.ops = &hist_iter_normal,
 				.hide_unresolved = false,
 			};
@@ -81,7 +83,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 							  &sample, NULL) < 0)
 				goto out;
 
-			if (hist_entry_iter__add(&iter, &al, evsel, &sample,
+			if (hist_entry_iter__add(&iter, &al,
 						 PERF_MAX_STACK_DEPTH, NULL) < 0)
 				goto out;
 
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index f4e3286cd496..4e3cff568eaa 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -57,6 +57,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 			},
 		};
 		struct hist_entry_iter iter = {
+			.evsel = evsel,
+			.sample = &sample,
 			.ops = &hist_iter_normal,
 			.hide_unresolved = false,
 		};
@@ -70,8 +72,8 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 						  &sample, NULL) < 0)
 			goto out;
 
-		if (hist_entry_iter__add(&iter, &al, evsel, &sample,
-					 PERF_MAX_STACK_DEPTH, NULL) < 0)
+		if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
+					 NULL) < 0)
 			goto out;
 
 		fake_samples[i].thread = al.thread;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4badf2491fbf..0553a14a80a4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -857,19 +857,15 @@ const struct hist_iter_ops hist_iter_cumulative = {
 };
 
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
-			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth, void *arg)
 {
 	int err, err2;
 
-	err = sample__resolve_callchain(sample, &iter->parent, evsel, al,
-					max_stack_depth);
+	err = sample__resolve_callchain(iter->sample, &iter->parent,
+					iter->evsel, al, max_stack_depth);
 	if (err)
 		return err;
 
-	iter->evsel = evsel;
-	iter->sample = sample;
-
 	err = iter->ops->prepare_entry(iter, al);
 	if (err)
 		goto out;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0eed50a5b1f0..0098aad4a23c 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -111,7 +111,6 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 				      u64 weight, u64 transaction,
 				      u64 timestamp, bool sample_self);
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
-			 struct perf_evsel *evsel, struct perf_sample *sample,
 			 int max_stack_depth, void *arg);
 
 int64_t hist_entry__cmp(struct hist_entry *left, struct hist_entry *right);
-- 
2.2.2



* [PATCH 19/38] perf tools: Pass session to hist_entry_iter struct
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (17 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 18/38] perf tools: Reducing arguments of hist_entry_iter__add() Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 20/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The session is needed to determine whether the data file is indexed,
in which case timestamps must be used when searching for threads and
symbols.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 1 +
 tools/perf/builtin-top.c    | 1 +
 tools/perf/util/hist.h      | 1 +
 3 files changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index cff357522358..0d6e6bff7994 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -139,6 +139,7 @@ static int process_sample_event(struct perf_tool *tool,
 	struct hist_entry_iter iter = {
 		.evsel 			= evsel,
 		.sample 		= sample,
+		.session 		= rep->session,
 		.hide_unresolved 	= rep->hide_unresolved,
 		.add_entry_cb 		= hist_iter__report_callback,
 	};
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2c37bff901ba..f33cb0e2aa0d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -776,6 +776,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
 			.sample 	= sample,
+			.session 	= top->session,
 			.add_entry_cb 	= hist_iter__top_callback,
 		};
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0098aad4a23c..0afe15ba0277 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -88,6 +88,7 @@ struct hist_entry_iter {
 
 	struct perf_evsel *evsel;
 	struct perf_sample *sample;
+	struct perf_session *session;
 	struct hist_entry *he;
 	struct symbol *parent;
 	void *priv;
-- 
2.2.2



* [PATCH 20/38] perf tools: Maintain map groups list in a leader thread
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (18 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 19/38] perf tools: Pass session to hist_entry_iter struct Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 21/38] perf tools: Introduce session__find_addr_location() and friends Namhyung Kim
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

To support multi-threaded perf report, we need to maintain time-sorted
map groups.  Add an ->mg_list member to struct thread and keep the list
sorted by time.  A leader thread now holds one extra refcnt on its
current map groups, so update the thread-mg-share test case as well.

Currently a new map groups is only added when an exec (comm) event is
received.
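
The ordering rule described above can be sketched in isolation.  This
is a simplified illustration only, not the code in this patch: the
names mg_sketch, mg_list_insert() and mg_list_find() are hypothetical,
and a plain singly linked list stands in for the kernel-style
list_head.  It assumes the list is kept newest first, so the head plays
the role of the current map groups and a lookup returns the first
entry whose timestamp is not later than the requested time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct map_groups: only what the sketch needs. */
struct mg_sketch {
	uint64_t timestamp;      /* time from which this set of maps is valid */
	struct mg_sketch *next;  /* next (older) entry */
};

/*
 * Insert 'mg' so that the list stays sorted newest first.
 * Returns the new head, which acts as the "current" map groups.
 */
static struct mg_sketch *mg_list_insert(struct mg_sketch *head,
					struct mg_sketch *mg,
					uint64_t timestamp)
{
	struct mg_sketch **pos = &head;

	mg->timestamp = timestamp;

	/* skip every entry that is not older than the new one */
	while (*pos && timestamp <= (*pos)->timestamp)
		pos = &(*pos)->next;

	mg->next = *pos;
	*pos = mg;
	return head;
}

/* Return the most recent entry that was already valid at 'timestamp'. */
static struct mg_sketch *mg_list_find(struct mg_sketch *head,
				      uint64_t timestamp)
{
	struct mg_sketch *mg;

	for (mg = head; mg; mg = mg->next)
		if (timestamp >= mg->timestamp)
			return mg;

	/* nothing old enough: fall back to the current entry */
	return head;
}
```

With entries inserted at times 0, 200 and 100, the list ends up ordered
200, 100, 0, and a lookup at time 150 returns the entry created at 100.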

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/thread-mg-share.c |   7 ++-
 tools/perf/util/event.c            |   2 +
 tools/perf/util/machine.c          |   4 +-
 tools/perf/util/map.c              |   3 ++
 tools/perf/util/map.h              |   2 +
 tools/perf/util/thread.c           | 108 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/thread.h           |   3 ++
 7 files changed, 124 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/thread-mg-share.c b/tools/perf/tests/thread-mg-share.c
index b028499dd3cf..8933e01d0549 100644
--- a/tools/perf/tests/thread-mg-share.c
+++ b/tools/perf/tests/thread-mg-share.c
@@ -23,6 +23,9 @@ int test__thread_mg_share(void)
 	 * with several threads and checks they properly share and
 	 * maintain map groups info (struct map_groups).
 	 *
+	 * Note that a leader thread has one more refcnt for its
+	 * (current) map groups.
+	 *
 	 * thread group (pid: 0, tids: 0, 1, 2, 3)
 	 * other  group (pid: 4, tids: 4, 5)
 	*/
@@ -43,7 +46,7 @@ int test__thread_mg_share(void)
 			leader && t1 && t2 && t3 && other);
 
 	mg = leader->mg;
-	TEST_ASSERT_VAL("wrong refcnt", mg->refcnt == 4);
+	TEST_ASSERT_VAL("wrong refcnt", mg->refcnt == 5);
 
 	/* test the map groups pointer is shared */
 	TEST_ASSERT_VAL("map groups don't match", mg == t1->mg);
@@ -59,7 +62,7 @@ int test__thread_mg_share(void)
 	TEST_ASSERT_VAL("failed to find other leader", other_leader);
 
 	other_mg = other->mg;
-	TEST_ASSERT_VAL("wrong refcnt", other_mg->refcnt == 2);
+	TEST_ASSERT_VAL("wrong refcnt", other_mg->refcnt == 3);
 
 	TEST_ASSERT_VAL("map groups don't match", other_mg == other_leader->mg);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 3bfe10fe0c69..8d4a5cb829d0 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -765,6 +765,8 @@ void thread__find_addr_map(struct thread *thread, u8 cpumode,
 		return;
 	}
 
+	BUG_ON(mg == NULL);
+
 	if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
 		al->level = 'k';
 		mg = &machine->kmaps;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b4b97b5e1f1c..b14e4dc5261d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -331,7 +331,7 @@ static void machine__update_thread_pid(struct machine *machine,
 		goto out_err;
 
 	if (!leader->mg)
-		leader->mg = map_groups__new(machine);
+		thread__set_map_groups(leader, map_groups__new(machine), 0);
 
 	if (!leader->mg)
 		goto out_err;
@@ -348,7 +348,7 @@ static void machine__update_thread_pid(struct machine *machine,
 		if (!map_groups__empty(th->mg))
 			pr_err("Discarding thread maps for %d:%d\n",
 			       th->pid_, th->tid);
-		map_groups__delete(th->mg);
+		map_groups__put(th->mg);
 	}
 
 	th->mg = map_groups__get(leader->mg);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 62ca9f2607d5..85fbb1b3e69f 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -422,6 +422,8 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
 	}
 	mg->machine = machine;
 	mg->refcnt = 1;
+	mg->timestamp = 0;
+	INIT_LIST_HEAD(&mg->list);
 }
 
 static void maps__delete(struct rb_root *maps)
@@ -484,6 +486,7 @@ struct map_groups *map_groups__new(struct machine *machine)
 void map_groups__delete(struct map_groups *mg)
 {
 	map_groups__exit(mg);
+	list_del(&mg->list);
 	free(mg);
 }
 
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e42438b1e59..f33d49029ac0 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -61,7 +61,9 @@ struct map_groups {
 	struct rb_root	 maps[MAP__NR_TYPES];
 	struct list_head removed_maps[MAP__NR_TYPES];
 	struct machine	 *machine;
+	u64		 timestamp;
 	int		 refcnt;
+	struct list_head list;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 306bdaede019..7b01c171dcfa 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -10,13 +10,76 @@
 #include "comm.h"
 #include "unwind.h"
 
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp)
+{
+	struct map_groups *mg;
+	struct thread *leader = thread;
+
+	BUG_ON(thread->mg == NULL);
+
+	if (thread->tid != thread->pid_) {
+		leader = machine__find_thread_time(thread->mg->machine,
+						   thread->pid_, thread->pid_,
+						   timestamp);
+		if (leader == NULL)
+			goto out;
+	}
+
+	list_for_each_entry(mg, &leader->mg_list, list)
+		if (timestamp >= mg->timestamp)
+			return mg;
+
+out:
+	return thread->mg;
+}
+
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg,
+			   u64 timestamp)
+{
+	struct list_head *pos;
+	struct map_groups *old;
+
+	if (mg == NULL)
+		return -ENOMEM;
+
+	/*
+	 * Only a leader thread can have map groups list - others
+	 * reference it through map_groups__get.  This means the
+	 * leader thread will have one more refcnt than others.
+	 */
+	if (thread->tid != thread->pid_)
+		return -EINVAL;
+
+	if (thread->mg) {
+		BUG_ON(thread->mg->refcnt <= 1);
+		map_groups__put(thread->mg);
+	}
+
+	/* sort by time */
+	list_for_each(pos, &thread->mg_list) {
+		old = list_entry(pos, struct map_groups, list);
+		if (timestamp > old->timestamp)
+			break;
+	}
+
+	list_add_tail(&mg->list, pos);
+	mg->timestamp = timestamp;
+
+	/* set current ->mg to most recent one */
+	thread->mg = list_first_entry(&thread->mg_list, struct map_groups, list);
+	/* increase one more refcnt for current */
+	map_groups__get(thread->mg);
+
+	return 0;
+}
+
 int thread__init_map_groups(struct thread *thread, struct machine *machine)
 {
 	struct thread *leader;
 	pid_t pid = thread->pid_;
 
 	if (pid == thread->tid || pid == -1) {
-		thread->mg = map_groups__new(machine);
+		thread__set_map_groups(thread, map_groups__new(machine), 0);
 	} else {
 		leader = machine__findnew_thread(machine, pid, pid);
 		if (leader)
@@ -39,6 +102,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		thread->cpu = -1;
 		INIT_LIST_HEAD(&thread->comm_list);
 		INIT_LIST_HEAD(&thread->tid_node);
+		INIT_LIST_HEAD(&thread->mg_list);
 
 		if (unwind__prepare_access(thread) < 0)
 			goto err_thread;
@@ -67,6 +131,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 void thread__delete(struct thread *thread)
 {
 	struct comm *comm, *tmp;
+	struct map_groups *mg, *tmp_mg;
 
 	thread_stack__free(thread);
 
@@ -74,6 +139,10 @@ void thread__delete(struct thread *thread)
 		map_groups__put(thread->mg);
 		thread->mg = NULL;
 	}
+	/* only leader threads have mg list */
+	list_for_each_entry_safe(mg, tmp_mg, &thread->mg_list, list)
+		map_groups__put(mg);
+
 	list_for_each_entry_safe(comm, tmp, &thread->comm_list, list) {
 		list_del(&comm->list);
 		comm__free(comm);
@@ -119,6 +188,9 @@ struct comm *thread__comm_time(const struct thread *thread, u64 timestamp)
 	return list_last_entry(&thread->comm_list, struct comm, list);
 }
 
+static int thread__clone_map_groups(struct thread *thread,
+				    struct thread *parent);
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 		       bool exec)
 {
@@ -149,6 +221,40 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 			unwind__flush_access(thread);
 	}
 
+	if (exec) {
+		struct machine *machine;
+
+		BUG_ON(thread->mg == NULL || thread->mg->machine == NULL);
+
+		machine = thread->mg->machine;
+
+		if (thread->tid != thread->pid_) {
+			struct map_groups *old = thread->mg;
+			struct thread *leader;
+
+			leader = machine__findnew_thread(machine, thread->pid_,
+							 thread->pid_);
+
+			/* now it'll be a new leader */
+			thread->pid_ = thread->tid;
+
+			thread->mg = map_groups__new(old->machine);
+			if (thread->mg == NULL)
+				return -ENOMEM;
+
+			/* save current mg in the new leader */
+			thread__clone_map_groups(thread, leader);
+
+			/* current mg of leader thread needs one more refcnt */
+			map_groups__get(thread->mg);
+
+			thread__set_map_groups(thread, thread->mg, old->timestamp);
+		}
+
+		/* create a new mg for newly executed binary */
+		thread__set_map_groups(thread, map_groups__new(machine), timestamp);
+	}
+
 	thread->comm_set = true;
 
 	return 0;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index e5d7abd255ea..08cafa2d97f9 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -14,6 +14,7 @@ struct thread {
 	struct rb_node	 	rb_node;
 	struct list_head 	tid_node;
 	struct map_groups	*mg;
+	struct list_head	mg_list;
 	pid_t			pid_; /* Not all tools update this */
 	pid_t			tid;
 	pid_t			ppid;
@@ -56,6 +57,8 @@ struct comm *thread__exec_comm(const struct thread *thread);
 struct comm *thread__comm_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
 const char *thread__comm_str_time(const struct thread *thread, u64 timestamp);
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp);
+int thread__set_map_groups(struct thread *thread, struct map_groups *mg, u64 timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.2.2



* [PATCH 21/38] perf tools: Introduce session__find_addr_location() and friends
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (19 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 20/38] perf tools: Maintain map groups list in a leader thread Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 22/38] perf callchain: Use " Namhyung Kim
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

These new functions find the appropriate map (and symbol) at a given
time when used with an indexed data file.  They rely on the map_groups
list being sorted by time, as set up in the previous patch.
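
The dispatch idea can be sketched on its own as follows.  This is a
simplified illustration, not the code in this patch: session_sketch,
thread_sketch, snapshot_sketch and the three lookup helpers are
hypothetical names, and a DSO name string stands in for the real
map/symbol result.  It assumes each thread keeps its snapshots newest
first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real structures carry far more state. */
struct session_sketch {
	bool has_index;              /* true for an indexed data file */
};

struct snapshot_sketch {
	uint64_t timestamp;          /* time from which this mapping is valid */
	const char *dso;             /* what an address would resolve to */
};

struct thread_sketch {
	const struct snapshot_sketch *snaps;  /* sorted newest first */
	size_t nr;
};

/* Non-indexed path: always use the current (newest) mapping. */
static const char *lookup_current(const struct thread_sketch *t)
{
	return t->nr ? t->snaps[0].dso : NULL;
}

/* Indexed path: use the newest mapping that already existed at 'ts'. */
static const char *lookup_time(const struct thread_sketch *t, uint64_t ts)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (ts >= t->snaps[i].timestamp)
			return t->snaps[i].dso;

	return lookup_current(t);
}

/* Pick the time-based lookup only when the session has an index. */
static const char *session_lookup(const struct session_sketch *session,
				  const struct thread_sketch *t, uint64_t ts)
{
	if (session && session->has_index)
		return lookup_time(t, ts);

	return lookup_current(t);
}
```

A NULL session takes the non-indexed path, so callers that have no
session at hand keep the old behaviour.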

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/event.c   | 57 ++++++++++++++++++++++++++++++++++++++---------
 tools/perf/util/hist.c    |  4 ++--
 tools/perf/util/machine.c | 37 ++++++++++++++++++++----------
 tools/perf/util/machine.h |  2 ++
 tools/perf/util/session.h | 38 +++++++++++++++++++++++++++++++
 tools/perf/util/thread.c  | 21 +++++++++++++++++
 tools/perf/util/thread.h  | 10 +++++++++
 7 files changed, 145 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 8d4a5cb829d0..5abf7086c97c 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -746,16 +746,14 @@ int perf_event__process(struct perf_tool *tool __maybe_unused,
 	return machine__process_event(machine, event, sample);
 }
 
-void thread__find_addr_map(struct thread *thread, u8 cpumode,
-			   enum map_type type, u64 addr,
-			   struct addr_location *al)
+static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
+				      enum map_type type, u64 addr,
+				      struct addr_location *al)
 {
-	struct map_groups *mg = thread->mg;
 	struct machine *machine = mg->machine;
 	bool load_map = false;
 
 	al->machine = machine;
-	al->thread = thread;
 	al->addr = addr;
 	al->cpumode = cpumode;
 	al->filtered = 0;
@@ -824,6 +822,26 @@ void thread__find_addr_map(struct thread *thread, u8 cpumode,
 	}
 }
 
+void thread__find_addr_map(struct thread *thread, u8 cpumode,
+			   enum map_type type, u64 addr,
+			   struct addr_location *al)
+{
+	al->thread = thread;
+	map_groups__find_addr_map(thread->mg, cpumode, type, addr, al);
+}
+
+void thread__find_addr_map_time(struct thread *thread, u8 cpumode,
+				enum map_type type, u64 addr,
+				struct addr_location *al, u64 timestamp)
+{
+	struct map_groups *mg;
+
+	mg = thread__get_map_groups(thread, timestamp);
+
+	al->thread = thread;
+	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+}
+
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al)
@@ -836,6 +854,21 @@ void thread__find_addr_location(struct thread *thread,
 		al->sym = NULL;
 }
 
+void thread__find_addr_location_time(struct thread *thread, u8 cpumode,
+				     enum map_type type, u64 addr,
+				     struct addr_location *al, u64 timestamp)
+{
+	struct map_groups *mg;
+
+	mg = thread__get_map_groups(thread, timestamp);
+	map_groups__find_addr_map(mg, cpumode, type, addr, al);
+	if (al->map != NULL)
+		al->sym = map__find_symbol(al->map, al->addr,
+					   mg->machine->symbol_filter);
+	else
+		al->sym = NULL;
+}
+
 int perf_event__preprocess_sample(const union perf_event *event,
 				  struct machine *machine,
 				  struct addr_location *al,
@@ -868,7 +901,9 @@ int perf_event__preprocess_sample(const union perf_event *event,
 	    machine->vmlinux_maps[MAP__FUNCTION] == NULL)
 		machine__create_kernel_maps(machine);
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->ip, al);
+	session__find_addr_map(session, thread, cpumode, MAP__FUNCTION,
+			       sample->ip, al, sample->time);
+
 	dump_printf(" ...... dso: %s\n",
 		    al->map ? al->map->dso->long_name :
 			al->level == 'H' ? "[hypervisor]" : "<not found>");
@@ -929,14 +964,16 @@ void perf_event__preprocess_sample_addr(union perf_event *event,
 					struct perf_sample *sample,
 					struct thread *thread,
 					struct addr_location *al,
-					struct perf_session *session __maybe_unused)
+					struct perf_session *session)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
-	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, sample->addr, al);
+	session__find_addr_map(session, thread, cpumode, MAP__FUNCTION,
+			       sample->addr, al, sample->time);
+
 	if (!al->map)
-		thread__find_addr_map(thread, cpumode, MAP__VARIABLE,
-				      sample->addr, al);
+		session__find_addr_map(session, thread, cpumode, MAP__VARIABLE,
+				       sample->addr, al, sample->time);
 
 	al->cpu = sample->cpu;
 	al->sym = NULL;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0553a14a80a4..0d189ae76922 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -496,7 +496,7 @@ iter_prepare_mem_entry(struct hist_entry_iter *iter, struct addr_location *al)
 	struct perf_sample *sample = iter->sample;
 	struct mem_info *mi;
 
-	mi = sample__resolve_mem(sample, al);
+	mi = sample__resolve_mem(sample, iter->session, al);
 	if (mi == NULL)
 		return -ENOMEM;
 
@@ -570,7 +570,7 @@ iter_prepare_branch_entry(struct hist_entry_iter *iter, struct addr_location *al
 	struct branch_info *bi;
 	struct perf_sample *sample = iter->sample;
 
-	bi = sample__resolve_bstack(sample, al);
+	bi = sample__resolve_bstack(sample, iter->session, al);
 	if (!bi)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b14e4dc5261d..4743718d4bf1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -8,6 +8,7 @@
 #include "sort.h"
 #include "strlist.h"
 #include "thread.h"
+#include "session.h"
 #include "vdso.h"
 #include <stdbool.h>
 #include <symbol/kallsyms.h>
@@ -1469,9 +1470,10 @@ static bool symbol__match_regex(struct symbol *sym, regex_t *regex)
 	return 0;
 }
 
-static void ip__resolve_ams(struct thread *thread,
+static void ip__resolve_ams(struct perf_session *session,
+			    struct thread *thread,
 			    struct addr_map_symbol *ams,
-			    u64 ip)
+			    u64 ip, u64 timestamp)
 {
 	struct addr_location al;
 
@@ -1483,7 +1485,8 @@ static void ip__resolve_ams(struct thread *thread,
 	 * Thus, we have to try consecutively until we find a match
 	 * or else, the symbol is unknown
 	 */
-	thread__find_cpumode_addr_location(thread, MAP__FUNCTION, ip, &al);
+	session__find_cpumode_addr_location(session, thread, MAP__FUNCTION,
+					    ip, &al, timestamp);
 
 	ams->addr = ip;
 	ams->al_addr = al.addr;
@@ -1491,21 +1494,25 @@ static void ip__resolve_ams(struct thread *thread,
 	ams->map = al.map;
 }
 
-static void ip__resolve_data(struct thread *thread,
-			     u8 m, struct addr_map_symbol *ams, u64 addr)
+static void ip__resolve_data(struct perf_session *session,
+			     struct thread *thread, u8 m,
+			     struct addr_map_symbol *ams,
+			     u64 addr, u64 timestamp)
 {
 	struct addr_location al;
 
 	memset(&al, 0, sizeof(al));
 
-	thread__find_addr_location(thread, m, MAP__VARIABLE, addr, &al);
+	session__find_addr_location(session, thread, m, MAP__VARIABLE, addr,
+				    &al, timestamp);
 	if (al.map == NULL) {
 		/*
 		 * some shared data regions have execute bit set which puts
 		 * their mapping in the MAP__FUNCTION type array.
 		 * Check there as a fallback option before dropping the sample.
 		 */
-		thread__find_addr_location(thread, m, MAP__FUNCTION, addr, &al);
+		session__find_addr_location(session, thread, m, MAP__FUNCTION,
+					    addr, &al, timestamp);
 	}
 
 	ams->addr = addr;
@@ -1515,6 +1522,7 @@ static void ip__resolve_data(struct thread *thread,
 }
 
 struct mem_info *sample__resolve_mem(struct perf_sample *sample,
+				     struct perf_session *session,
 				     struct addr_location *al)
 {
 	struct mem_info *mi = zalloc(sizeof(*mi));
@@ -1522,8 +1530,10 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 	if (!mi)
 		return NULL;
 
-	ip__resolve_ams(al->thread, &mi->iaddr, sample->ip);
-	ip__resolve_data(al->thread, al->cpumode, &mi->daddr, sample->addr);
+	ip__resolve_ams(session, al->thread, &mi->iaddr, sample->ip,
+			sample->time);
+	ip__resolve_data(session, al->thread, al->cpumode, &mi->daddr,
+			 sample->addr, sample->time);
 	mi->data_src.val = sample->data_src;
 
 	return mi;
@@ -1539,6 +1549,7 @@ static int add_callchain_ip(struct thread *thread,
 
 	al.filtered = 0;
 	al.sym = NULL;
+
 	if (branch_history)
 		thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
 						   ip, &al);
@@ -1589,6 +1600,7 @@ static int add_callchain_ip(struct thread *thread,
 }
 
 struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
+					   struct perf_session *session,
 					   struct addr_location *al)
 {
 	unsigned int i;
@@ -1599,8 +1611,10 @@ struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
 		return NULL;
 
 	for (i = 0; i < bs->nr; i++) {
-		ip__resolve_ams(al->thread, &bi[i].to, bs->entries[i].to);
-		ip__resolve_ams(al->thread, &bi[i].from, bs->entries[i].from);
+		ip__resolve_ams(session, al->thread, &bi[i].to,
+				bs->entries[i].to, sample->time);
+		ip__resolve_ams(session, al->thread, &bi[i].from,
+				bs->entries[i].from, sample->time);
 		bi[i].flags = bs->entries[i].flags;
 	}
 	return bi;
@@ -1826,7 +1840,6 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 		ip = chain->ips[j];
 
 		err = add_callchain_ip(thread, parent, root_al, false, ip);
-
 		if (err)
 			return (err < 0) ? err : 0;
 	}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 9571b6b1c5b5..45aee0c329ef 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -121,8 +121,10 @@ void machine__delete_threads(struct machine *machine);
 void machine__delete(struct machine *machine);
 
 struct branch_info *sample__resolve_bstack(struct perf_sample *sample,
+					   struct perf_session *session,
 					   struct addr_location *al);
 struct mem_info *sample__resolve_mem(struct perf_sample *sample,
+				     struct perf_session *session,
 				     struct addr_location *al);
 int thread__resolve_callchain(struct thread *thread,
 			      struct perf_evsel *evsel,
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index aff0d2b4cc0b..dbd21f8e7cf1 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -142,4 +142,42 @@ static inline bool perf_session__has_index(struct perf_session *session)
 	return perf_header__has_feat(&session->header, HEADER_DATA_INDEX);
 }
 
+static inline void
+session__find_addr_map(struct perf_session *session, struct thread *thread,
+		       u8 cpumode, enum map_type type, u64 addr,
+		       struct addr_location *al, u64 timestamp)
+{
+	if (session && perf_session__has_index(session))
+		thread__find_addr_map_time(thread, cpumode, type, addr, al,
+					   timestamp);
+	else
+		thread__find_addr_map(thread, cpumode, type, addr, al);
+}
+
+static inline void
+session__find_addr_location(struct perf_session *session, struct thread *thread,
+			    u8 cpumode, enum map_type type, u64 addr,
+			    struct addr_location *al, u64 timestamp)
+{
+	if (session && perf_session__has_index(session))
+		thread__find_addr_location_time(thread, cpumode, type, addr, al,
+						timestamp);
+	else
+		thread__find_addr_location(thread, cpumode, type, addr, al);
+}
+
+static inline void
+session__find_cpumode_addr_location(struct perf_session *session,
+				    struct thread *thread, enum map_type type,
+				    u64 addr, struct addr_location *al,
+				    u64 timestamp)
+{
+	if (session && perf_session__has_index(session))
+		thread__find_cpumode_addr_location_time(thread, type, addr, al,
+							timestamp);
+	else
+		thread__find_cpumode_addr_location(thread, type, addr, al);
+}
+
+
 #endif /* __PERF_SESSION_H */
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 7b01c171dcfa..9ae1ce8606af 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -359,3 +359,24 @@ void thread__find_cpumode_addr_location(struct thread *thread,
 			break;
 	}
 }
+
+void thread__find_cpumode_addr_location_time(struct thread *thread,
+					     enum map_type type, u64 addr,
+					     struct addr_location *al,
+					     u64 timestamp)
+{
+	size_t i;
+	const u8 cpumodes[] = {
+		PERF_RECORD_MISC_USER,
+		PERF_RECORD_MISC_KERNEL,
+		PERF_RECORD_MISC_GUEST_USER,
+		PERF_RECORD_MISC_GUEST_KERNEL
+	};
+
+	for (i = 0; i < ARRAY_SIZE(cpumodes); i++) {
+		thread__find_addr_location_time(thread, cpumodes[i], type,
+						addr, al, timestamp);
+		if (al->map)
+			break;
+	}
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 08cafa2d97f9..5209ad5adadf 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -66,14 +66,24 @@ size_t thread__fprintf(struct thread *thread, FILE *fp);
 void thread__find_addr_map(struct thread *thread,
 			   u8 cpumode, enum map_type type, u64 addr,
 			   struct addr_location *al);
+void thread__find_addr_map_time(struct thread *thread, u8 cpumode,
+				enum map_type type, u64 addr,
+				struct addr_location *al, u64 timestamp);
 
 void thread__find_addr_location(struct thread *thread,
 				u8 cpumode, enum map_type type, u64 addr,
 				struct addr_location *al);
+void thread__find_addr_location_time(struct thread *thread, u8 cpumode,
+				     enum map_type type, u64 addr,
+				     struct addr_location *al, u64 timestamp);
 
 void thread__find_cpumode_addr_location(struct thread *thread,
 					enum map_type type, u64 addr,
 					struct addr_location *al);
+void thread__find_cpumode_addr_location_time(struct thread *thread,
+					     enum map_type type, u64 addr,
+					     struct addr_location *al,
+					     u64 timestamp);
 
 static inline void *thread__priv(struct thread *thread)
 {
-- 
2.2.2



* [PATCH 22/38] perf callchain: Use session__find_addr_location() and friends
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (20 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 21/38] perf tools: Introduce session__find_addr_location() and friends Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03 14:01   ` Arnaldo Carvalho de Melo
  2015-03-03  3:07 ` [PATCH 23/38] perf tools: Add a test case for timed map groups handling Namhyung Kim
                   ` (15 subsequent siblings)
  37 siblings, 1 reply; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Pass the session struct to the callchain resolve routines so that they
can find the correct thread/map/symbol with
session__find_addr_location() and friends.
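
For readers skimming the series, the dispatch done by
session__find_addr_location() and friends can be sketched roughly as
below.  This is only an illustration under stated assumptions: every
sketch_* name and the simplified structs are hypothetical stand-ins,
not the perf definitions.  The real helpers are the static inlines
added to util/session.h by the previous patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real perf structs carry far more state. */
struct sketch_session {
	bool has_index;		/* mirrors what perf_session__has_index() reports */
};

enum sketch_lookup {
	SKETCH_LOOKUP_NONE,
	SKETCH_LOOKUP_CURRENT,	/* untimed: use the thread's current maps */
	SKETCH_LOOKUP_TIMED,	/* timed: use the maps valid at the sample time */
};

struct sketch_location {
	enum sketch_lookup how;
	uint64_t addr;
	uint64_t timestamp;
};

/* Stand-in for the untimed lookup (thread__find_addr_location() in perf). */
static void sketch_find_current(uint64_t addr, struct sketch_location *al)
{
	al->how = SKETCH_LOOKUP_CURRENT;
	al->addr = addr;
	al->timestamp = 0;
}

/* Stand-in for the timed lookup (thread__find_addr_location_time()). */
static void sketch_find_at_time(uint64_t addr, struct sketch_location *al,
				uint64_t timestamp)
{
	al->how = SKETCH_LOOKUP_TIMED;
	al->addr = addr;
	al->timestamp = timestamp;
}

/* Use the timed lookup only when a session exists and has an index. */
static void sketch_session_find(struct sketch_session *session, uint64_t addr,
				struct sketch_location *al, uint64_t timestamp)
{
	if (session && session->has_index)
		sketch_find_at_time(addr, al, timestamp);
	else
		sketch_find_current(addr, al);
}
```

Note that a NULL session, as passed by tests/dwarf-unwind.c in this
patch, takes the untimed path.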

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-script.c                        |  4 +--
 tools/perf/tests/dwarf-unwind.c                    |  2 +-
 tools/perf/util/callchain.c                        |  6 ++--
 tools/perf/util/callchain.h                        |  4 ++-
 tools/perf/util/hist.c                             |  5 +--
 tools/perf/util/machine.c                          | 41 +++++++++++++---------
 tools/perf/util/machine.h                          |  1 +
 .../util/scripting-engines/trace-event-python.c    | 27 +++++++-------
 tools/perf/util/session.c                          |  6 ++--
 tools/perf/util/session.h                          |  2 +-
 tools/perf/util/unwind-libdw.c                     | 14 ++++----
 tools/perf/util/unwind-libdw.h                     |  1 +
 tools/perf/util/unwind-libunwind.c                 | 32 +++++++++--------
 tools/perf/util/unwind.h                           |  3 +-
 14 files changed, 86 insertions(+), 62 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 65b3a07be2bf..90a401a52868 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -429,7 +429,7 @@ static void print_sample_bts(union perf_event *event,
 				print_opts &= ~PRINT_IP_OPT_SRCLINE;
 			}
 		}
-		perf_evsel__print_ip(evsel, sample, al, print_opts,
+		perf_evsel__print_ip(evsel, sample, session, al, print_opts,
 				     PERF_MAX_STACK_DEPTH);
 	}
 
@@ -483,7 +483,7 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 		else
 			printf("\n");
 
-		perf_evsel__print_ip(evsel, sample, al,
+		perf_evsel__print_ip(evsel, sample, session, al,
 				     output[attr->type].print_ip_opts,
 				     PERF_MAX_STACK_DEPTH);
 	}
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 7e04feb431cb..241270374e93 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -75,7 +75,7 @@ static int unwind_thread(struct thread *thread)
 		goto out;
 	}
 
-	err = unwind__get_entries(unwind_entry, &cnt, thread,
+	err = unwind__get_entries(unwind_entry, &cnt, thread, NULL,
 				  &sample, MAX_STACK);
 	if (err)
 		pr_debug("unwind failed\n");
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 9f643ee77001..f95b27037dc8 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -757,7 +757,9 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 	return 0;
 }
 
-int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent,
+int sample__resolve_callchain(struct perf_sample *sample,
+			      struct perf_session *session,
+			      struct symbol **parent,
 			      struct perf_evsel *evsel, struct addr_location *al,
 			      int max_stack)
 {
@@ -767,7 +769,7 @@ int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain ||
 	    sort__has_parent) {
 		return thread__resolve_callchain(al->thread, evsel, sample,
-						 parent, al, max_stack);
+						 session, parent, al, max_stack);
 	}
 	return 0;
 }
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 6033a0a212ca..ca9048f84cb5 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -165,7 +165,9 @@ struct hist_entry;
 int record_parse_callchain_opt(const struct option *opt, const char *arg, int unset);
 int record_callchain_opt(const struct option *opt, const char *arg, int unset);
 
-int sample__resolve_callchain(struct perf_sample *sample, struct symbol **parent,
+int sample__resolve_callchain(struct perf_sample *sample,
+			      struct perf_session *session,
+			      struct symbol **parent,
 			      struct perf_evsel *evsel, struct addr_location *al,
 			      int max_stack);
 int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *sample);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0d189ae76922..dbe7f3744bf1 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -861,8 +861,9 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 {
 	int err, err2;
 
-	err = sample__resolve_callchain(iter->sample, &iter->parent,
-					iter->evsel, al, max_stack_depth);
+	err = sample__resolve_callchain(iter->sample, iter->session,
+					&iter->parent, iter->evsel, al,
+					max_stack_depth);
 	if (err)
 		return err;
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 4743718d4bf1..63d860dca74b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1540,10 +1540,11 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 }
 
 static int add_callchain_ip(struct thread *thread,
+			    struct perf_session *session,
 			    struct symbol **parent,
 			    struct addr_location *root_al,
 			    bool branch_history,
-			    u64 ip)
+			    u64 ip, u64 timestamp)
 {
 	struct addr_location al;
 
@@ -1551,8 +1552,9 @@ static int add_callchain_ip(struct thread *thread,
 	al.sym = NULL;
 
 	if (branch_history)
-		thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
-						   ip, &al);
+		session__find_cpumode_addr_location(session, thread,
+						    MAP__FUNCTION, ip, &al,
+						    timestamp);
 	else {
 		u8 cpumode = PERF_RECORD_MISC_USER;
 
@@ -1579,8 +1581,8 @@ static int add_callchain_ip(struct thread *thread,
 			}
 			return 0;
 		}
-		thread__find_addr_location(thread, cpumode, MAP__FUNCTION,
-				   ip, &al);
+		session__find_addr_location(session, thread, cpumode,
+					    MAP__FUNCTION, ip, &al, timestamp);
 	}
 
 	if (al.sym != NULL) {
@@ -1670,6 +1672,7 @@ static int remove_loops(struct branch_entry *l, int nr)
  */
 static int resolve_lbr_callchain_sample(struct thread *thread,
 					struct perf_sample *sample,
+					struct perf_session *session,
 					struct symbol **parent,
 					struct addr_location *root_al,
 					int max_stack)
@@ -1722,7 +1725,8 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 					ip = lbr_stack->entries[0].to;
 			}
 
-			err = add_callchain_ip(thread, parent, root_al, false, ip);
+			err = add_callchain_ip(thread, session, parent, root_al,
+					       false, ip, sample->time);
 			if (err)
 				return (err < 0) ? err : 0;
 		}
@@ -1735,6 +1739,7 @@ static int resolve_lbr_callchain_sample(struct thread *thread,
 static int thread__resolve_callchain_sample(struct thread *thread,
 					    struct perf_evsel *evsel,
 					    struct perf_sample *sample,
+					    struct perf_session *session,
 					    struct symbol **parent,
 					    struct addr_location *root_al,
 					    int max_stack)
@@ -1742,6 +1747,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 	struct branch_stack *branch = sample->branch_stack;
 	struct ip_callchain *chain = sample->callchain;
 	int chain_nr = min(max_stack, (int)chain->nr);
+	u64 timestamp = sample->time;
 	int i, j, err;
 	int skip_idx = -1;
 	int first_call = 0;
@@ -1749,8 +1755,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 	callchain_cursor_reset(&callchain_cursor);
 
 	if (has_branch_callstack(evsel)) {
-		err = resolve_lbr_callchain_sample(thread, sample, parent,
-						   root_al, max_stack);
+		err = resolve_lbr_callchain_sample(thread, sample, session,
+						   parent, root_al, max_stack);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -1806,11 +1812,12 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 		nr = remove_loops(be, nr);
 
 		for (i = 0; i < nr; i++) {
-			err = add_callchain_ip(thread, parent, root_al,
-					       true, be[i].to);
+			err = add_callchain_ip(thread, session, parent, root_al,
+					       true, be[i].to, timestamp);
 			if (!err)
-				err = add_callchain_ip(thread, parent, root_al,
-						       true, be[i].from);
+				err = add_callchain_ip(thread, session, parent,
+						       root_al, true,
+						       be[i].from, timestamp);
 			if (err == -EINVAL)
 				break;
 			if (err)
@@ -1839,7 +1846,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
 #endif
 		ip = chain->ips[j];
 
-		err = add_callchain_ip(thread, parent, root_al, false, ip);
+		err = add_callchain_ip(thread, session, parent, root_al, false,
+				       ip, timestamp);
 		if (err)
 			return (err < 0) ? err : 0;
 	}
@@ -1857,12 +1865,13 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 int thread__resolve_callchain(struct thread *thread,
 			      struct perf_evsel *evsel,
 			      struct perf_sample *sample,
+			      struct perf_session *session,
 			      struct symbol **parent,
 			      struct addr_location *root_al,
 			      int max_stack)
 {
-	int ret = thread__resolve_callchain_sample(thread, evsel,
-						   sample, parent,
+	int ret = thread__resolve_callchain_sample(thread, evsel, sample,
+						   session, parent,
 						   root_al, max_stack);
 	if (ret)
 		return ret;
@@ -1878,7 +1887,7 @@ int thread__resolve_callchain(struct thread *thread,
 		return 0;
 
 	return unwind__get_entries(unwind_entry, &callchain_cursor,
-				   thread, sample, max_stack);
+				   thread, session, sample, max_stack);
 
 }
 
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 45aee0c329ef..38ead24f0f47 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -129,6 +129,7 @@ struct mem_info *sample__resolve_mem(struct perf_sample *sample,
 int thread__resolve_callchain(struct thread *thread,
 			      struct perf_evsel *evsel,
 			      struct perf_sample *sample,
+			      struct perf_session *session,
 			      struct symbol **parent,
 			      struct addr_location *root_al,
 			      int max_stack);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 802def46af7b..e8c2896055c5 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -298,9 +298,10 @@ static PyObject *get_field_numeric_entry(struct event_format *event,
 }
 
 
-static PyObject *python_process_callchain(struct perf_sample *sample,
-					 struct perf_evsel *evsel,
-					 struct addr_location *al)
+static PyObject *python_process_callchain(struct perf_session *session,
+					  struct perf_sample *sample,
+					  struct perf_evsel *evsel,
+					  struct addr_location *al)
 {
 	PyObject *pylist;
 
@@ -311,9 +312,8 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
 	if (!symbol_conf.use_callchain || !sample->callchain)
 		goto exit;
 
-	if (thread__resolve_callchain(al->thread, evsel,
-				      sample, NULL, NULL,
-				      PERF_MAX_STACK_DEPTH) != 0) {
+	if (thread__resolve_callchain(al->thread, evsel, sample, session,
+				      NULL, NULL, PERF_MAX_STACK_DEPTH) != 0) {
 		pr_err("Failed to resolve callchain. Skipping\n");
 		goto exit;
 	}
@@ -374,7 +374,8 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
 }
 
 
-static void python_process_tracepoint(struct perf_sample *sample,
+static void python_process_tracepoint(struct perf_session *session,
+				      struct perf_sample *sample,
 				      struct perf_evsel *evsel,
 				      struct thread *thread,
 				      struct addr_location *al)
@@ -424,7 +425,7 @@ static void python_process_tracepoint(struct perf_sample *sample,
 	PyTuple_SetItem(t, n++, context);
 
 	/* ip unwinding */
-	callchain = python_process_callchain(sample, evsel, al);
+	callchain = python_process_callchain(session, sample, evsel, al);
 
 	if (handler) {
 		PyTuple_SetItem(t, n++, PyInt_FromLong(cpu));
@@ -759,7 +760,8 @@ static int python_process_call_return(struct call_return *cr, void *data)
 	return db_export__call_return(dbe, cr);
 }
 
-static void python_process_general_event(struct perf_sample *sample,
+static void python_process_general_event(struct perf_session *session,
+					 struct perf_sample *sample,
 					 struct perf_evsel *evsel,
 					 struct thread *thread,
 					 struct addr_location *al)
@@ -822,7 +824,7 @@ static void python_process_general_event(struct perf_sample *sample,
 	}
 
 	/* ip unwinding */
-	callchain = python_process_callchain(sample, evsel, al);
+	callchain = python_process_callchain(session, sample, evsel, al);
 	pydict_set_item_string_decref(dict, "callchain", callchain);
 
 	PyTuple_SetItem(t, n++, dict);
@@ -846,7 +848,7 @@ static void python_process_event(union perf_event *event,
 
 	switch (evsel->attr.type) {
 	case PERF_TYPE_TRACEPOINT:
-		python_process_tracepoint(sample, evsel, thread, al);
+		python_process_tracepoint(session, sample, evsel, thread, al);
 		break;
 	/* Reserve for future process_hw/sw/raw APIs */
 	default:
@@ -854,7 +856,8 @@ static void python_process_event(union perf_event *event,
 			db_export__sample(&tables->dbe, event, sample, evsel,
 					  thread, al, session);
 		else
-			python_process_general_event(sample, evsel, thread, al);
+			python_process_general_event(session, sample, evsel,
+						     thread, al);
 	}
 }
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 46761a39fbae..d89dfa8592a9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1550,7 +1550,7 @@ struct perf_evsel *perf_session__find_first_evtype(struct perf_session *session,
 }
 
 void perf_evsel__print_ip(struct perf_evsel *evsel, struct perf_sample *sample,
-			  struct addr_location *al,
+			  struct perf_session *session, struct addr_location *al,
 			  unsigned int print_opts, unsigned int stack_depth)
 {
 	struct callchain_cursor_node *node;
@@ -1565,8 +1565,8 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, struct perf_sample *sample,
 	if (symbol_conf.use_callchain && sample->callchain) {
 		struct addr_location node_al;
 
-		if (thread__resolve_callchain(al->thread, evsel,
-					      sample, NULL, NULL,
+		if (thread__resolve_callchain(al->thread, evsel, sample,
+					      session, NULL, NULL,
 					      PERF_MAX_STACK_DEPTH) != 0) {
 			if (verbose)
 				error("Failed to resolve callchain. Skipping\n");
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index dbd21f8e7cf1..4d264fef8968 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -102,7 +102,7 @@ struct perf_evsel *perf_session__find_first_evtype(struct perf_session *session,
 					    unsigned int type);
 
 void perf_evsel__print_ip(struct perf_evsel *evsel, struct perf_sample *sample,
-			  struct addr_location *al,
+			  struct perf_session *session, struct addr_location *al,
 			  unsigned int print_opts, unsigned int stack_depth);
 
 int perf_session__cpu_bitmap(struct perf_session *session,
diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index 2dcfe9a7c8d0..ebaf51b58c92 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -8,6 +8,7 @@
 #include "unwind-libdw.h"
 #include "machine.h"
 #include "thread.h"
+#include "session.h"
 #include <linux/types.h>
 #include "event.h"
 #include "perf_regs.h"
@@ -26,10 +27,9 @@ static int __report_module(struct addr_location *al, u64 ip,
 	Dwfl_Module *mod;
 	struct dso *dso = NULL;
 
-	thread__find_addr_location(ui->thread,
-				   PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, al);
-
+	session__find_addr_location(ui->session, ui->thread,
+				    PERF_RECORD_MISC_USER, MAP__FUNCTION,
+				    ip, al, ui->sample->time);
 	if (al->map)
 		dso = al->map->dso;
 
@@ -89,8 +89,8 @@ static int access_dso_mem(struct unwind_info *ui, Dwarf_Addr addr,
 	struct addr_location al;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
+	session__find_addr_map(ui->session, ui->thread, PERF_RECORD_MISC_USER,
+			       MAP__FUNCTION, addr, &al, ui->sample->time);
 	if (!al.map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
@@ -165,12 +165,14 @@ frame_callback(Dwfl_Frame *state, void *arg)
 
 int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 			struct thread *thread,
+			struct perf_session *session,
 			struct perf_sample *data,
 			int max_stack)
 {
 	struct unwind_info ui = {
 		.sample		= data,
 		.thread		= thread,
+		.session	= session,
 		.machine	= thread->mg->machine,
 		.cb		= cb,
 		.arg		= arg,
diff --git a/tools/perf/util/unwind-libdw.h b/tools/perf/util/unwind-libdw.h
index 417a1426f3ad..806e522713a2 100644
--- a/tools/perf/util/unwind-libdw.h
+++ b/tools/perf/util/unwind-libdw.h
@@ -11,6 +11,7 @@ bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg);
 struct unwind_info {
 	Dwfl			*dwfl;
 	struct perf_sample      *sample;
+	struct perf_session     *session;
 	struct machine          *machine;
 	struct thread           *thread;
 	unwind_entry_cb_t	cb;
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index e3c40a520a25..9ee63179383e 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -86,6 +86,7 @@ UNW_OBJ(dwarf_find_debug_frame) (int found, unw_dyn_info_t *di_debug,
 
 struct unwind_info {
 	struct perf_sample	*sample;
+	struct perf_session	*session;
 	struct machine		*machine;
 	struct thread		*thread;
 };
@@ -315,8 +316,8 @@ static struct map *find_map(unw_word_t ip, struct unwind_info *ui)
 {
 	struct addr_location al;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, ip, &al);
+	session__find_addr_map(ui->session, ui->thread, PERF_RECORD_MISC_USER,
+			       MAP__FUNCTION, ip, &al, ui->sample->time);
 	return al.map;
 }
 
@@ -406,20 +407,19 @@ get_proc_name(unw_addr_space_t __maybe_unused as,
 static int access_dso_mem(struct unwind_info *ui, unw_word_t addr,
 			  unw_word_t *data)
 {
-	struct addr_location al;
+	struct map *map;
 	ssize_t size;
 
-	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
-			      MAP__FUNCTION, addr, &al);
-	if (!al.map) {
+	map = find_map(addr, ui);
+	if (!map) {
 		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
 		return -1;
 	}
 
-	if (!al.map->dso)
+	if (!map->dso)
 		return -1;
 
-	size = dso__data_read_addr(al.map->dso, al.map, ui->machine,
+	size = dso__data_read_addr(map->dso, map, ui->machine,
 				   addr, (u8 *) data, sizeof(*data));
 
 	return !(size == sizeof(*data));
@@ -511,14 +511,14 @@ static void put_unwind_info(unw_addr_space_t __maybe_unused as,
 	pr_debug("unwind: put_unwind_info called\n");
 }
 
-static int entry(u64 ip, struct thread *thread,
-		 unwind_entry_cb_t cb, void *arg)
+static int entry(u64 ip, struct thread *thread, struct perf_session *session,
+		 u64 timestamp, unwind_entry_cb_t cb, void *arg)
 {
 	struct unwind_entry e;
 	struct addr_location al;
 
-	thread__find_addr_location(thread, PERF_RECORD_MISC_USER,
-				   MAP__FUNCTION, ip, &al);
+	session__find_addr_location(session, thread, PERF_RECORD_MISC_USER,
+				    MAP__FUNCTION, ip, &al, timestamp);
 
 	e.ip = ip;
 	e.map = al.map;
@@ -620,20 +620,22 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 		unw_word_t ip;
 
 		unw_get_reg(&c, UNW_REG_IP, &ip);
-		ret = ip ? entry(ip, ui->thread, cb, arg) : 0;
+		ret = ip ? entry(ip, ui->thread, ui->session,
+				 ui->sample->time, cb, arg) : 0;
 	}
 
 	return ret;
 }
 
 int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
-			struct thread *thread,
+			struct thread *thread, struct perf_session *session,
 			struct perf_sample *data, int max_stack)
 {
 	u64 ip;
 	struct unwind_info ui = {
 		.sample       = data,
 		.thread       = thread,
+		.session      = session,
 		.machine      = thread->mg->machine,
 	};
 	int ret;
@@ -645,7 +647,7 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 	if (ret)
 		return ret;
 
-	ret = entry(ip, thread, cb, arg);
+	ret = entry(ip, thread, session, data->time, cb, arg);
 	if (ret)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/unwind.h b/tools/perf/util/unwind.h
index 12790cf94618..c619890e60ad 100644
--- a/tools/perf/util/unwind.h
+++ b/tools/perf/util/unwind.h
@@ -16,7 +16,7 @@ typedef int (*unwind_entry_cb_t)(struct unwind_entry *entry, void *arg);
 
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
-			struct thread *thread,
+			struct thread *thread, struct perf_session *session,
 			struct perf_sample *data, int max_stack);
 /* libunwind specific */
 #ifdef HAVE_LIBUNWIND_SUPPORT
@@ -38,6 +38,7 @@ static inline int
 unwind__get_entries(unwind_entry_cb_t cb __maybe_unused,
 		    void *arg __maybe_unused,
 		    struct thread *thread __maybe_unused,
+		    struct perf_session *session __maybe_unused,
 		    struct perf_sample *data __maybe_unused,
 		    int max_stack __maybe_unused)
 {
-- 
2.2.2



* [PATCH 23/38] perf tools: Add a test case for timed map groups handling
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (21 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 22/38] perf callchain: Use " Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 24/38] perf tools: Protect dso symbol loading using a mutex Namhyung Kim
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Add a test case that verifies thread->mg and ->mg_list handling across
time changes, using the new thread__find_addr_map_time() and friends.
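
The idea being tested can be sketched as a time-indexed lookup over a
thread's map groups.  The sketch below is a guess at the shape of the
problem, not the perf implementation: struct sketch_mg, its fields and
sketch_find_mg() are hypothetical, and a plain array sorted newest
first stands in for whatever ->mg_list really is.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map group record: valid from start_time onwards. */
struct sketch_mg {
	uint64_t start_time;
	int id;
};

/*
 * 'groups' is assumed to be sorted newest first.  Return the newest
 * group that was already in effect at 'timestamp', or NULL if the
 * timestamp predates every group.
 */
static struct sketch_mg *sketch_find_mg(struct sketch_mg *groups, size_t nr,
					uint64_t timestamp)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (timestamp >= groups[i].start_time)
			return &groups[i];
	}
	return NULL;
}
```

With one group created at time 0 and another at time 10000 (the
simulated exec in the test), a lookup at 5000 should land in the old
group, while a lookup at -1ULL should land in the current one.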

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/Build            |  1 +
 tools/perf/tests/builtin-test.c   |  4 ++
 tools/perf/tests/tests.h          |  1 +
 tools/perf/tests/thread-mg-time.c | 88 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 94 insertions(+)
 create mode 100644 tools/perf/tests/thread-mg-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index bfa0aa35761f..b6f50e3e301f 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -27,6 +27,7 @@ perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += thread-lookup-time.o
+perf-y += thread-mg-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index e4d335de19ea..8f61a7e291ee 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -175,6 +175,10 @@ static struct test {
 		.func = test__thread_lookup_time,
 	},
 	{
+		.desc = "Test thread map group handling with time",
+		.func = test__thread_mg_time,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 1090337f63e5..03557563f31d 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -53,6 +53,7 @@ int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
+int test__thread_mg_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-mg-time.c b/tools/perf/tests/thread-mg-time.c
new file mode 100644
index 000000000000..69fd13752c1d
--- /dev/null
+++ b/tools/perf/tests/thread-mg-time.c
@@ -0,0 +1,88 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+#define PERF_MAP_START  0x40000
+
+int test__thread_mg_time(void)
+{
+	struct machines machines;
+	struct machine *machine;
+	struct thread *t;
+	struct map_groups *mg;
+	struct map *map;
+	struct addr_location al = { .map = NULL, };
+
+	/*
+	 * This test checks whether the correct map can be retrieved
+	 * for a given time.  When an indexed data file is used,
+	 * those task/comm/mmap events are processed first so the
+	 * later sample should find a matching map properly.
+	 */
+	machines__init(&machines);
+	machine = &machines.host;
+
+	t = machine__findnew_thread(machine, 0, 0);
+	mg = t->mg;
+
+	map = dso__new_map("/usr/bin/perf");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x1000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	thread__find_addr_map_time(t, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, PERF_MAP_START, &al, -1ULL);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map == map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+
+	pr_debug("simulate EXEC event (generate new mg)\n");
+	__thread__set_comm(t, "perf-test", 10000, true);
+
+	map = dso__new_map("/usr/bin/perf-test");
+	map->start = PERF_MAP_START;
+	map->end = PERF_MAP_START + 0x2000;
+
+	thread__insert_map(t, map);
+
+	if (verbose > 1)
+		map_groups__fprintf(t->mg, stderr);
+
+	thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+			      PERF_MAP_START + 4, &al);
+
+	TEST_ASSERT_VAL("cannot find mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("invalid mapping found", al.map == map);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+	pr_debug("searching map in the old map groups\n");
+	thread__find_addr_map_time(t, PERF_RECORD_MISC_USER,
+				   MAP__FUNCTION, PERF_MAP_START, &al, 5000);
+
+	TEST_ASSERT_VAL("cannot find timed mapping for perf-test", al.map != NULL);
+	TEST_ASSERT_VAL("non matched timed mapping", al.map != map);
+	TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+	TEST_ASSERT_VAL("incorrect map groups", al.map->groups != t->mg);
+
+	machine__delete_threads(machine);
+	machines__exit(&machines);
+	return 0;
+}
-- 
2.2.2



* [PATCH 24/38] perf tools: Protect dso symbol loading using a mutex
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (22 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 23/38] perf tools: Add a test case for timed map groups handling Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 25/38] perf tools: Protect dso cache tree using dso->lock Namhyung Kim
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When multi-thread support for perf report is enabled, a dso can be
accessed concurrently.  Add a new pthread_mutex to protect it from
concurrent dso__load() calls.
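
The locking pattern is "take the lock, then check the loaded flag
again".  A minimal sketch, assuming a much simplified object: struct
sketch_dso, its fields and both functions are hypothetical, and the
actual symbol loading is reduced to bumping a counter.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dso with only what the sketch needs. */
struct sketch_dso {
	pthread_mutex_t lock;
	bool loaded;
	int load_count;		/* how many times the expensive load ran */
};

static void sketch_dso_init(struct sketch_dso *dso)
{
	pthread_mutex_init(&dso->lock, NULL);
	dso->loaded = false;
	dso->load_count = 0;
}

/*
 * Return 0 if this call did the load, 1 if it was already loaded.
 * Every exit path goes through 'out' so the mutex is always released.
 */
static int sketch_dso_load(struct sketch_dso *dso)
{
	int ret;

	pthread_mutex_lock(&dso->lock);

	/* check again under the lock: another thread may have won the race */
	if (dso->loaded) {
		ret = 1;
		goto out;
	}

	dso->load_count++;	/* stands in for the real symbol loading */
	dso->loaded = true;
	ret = 0;
out:
	pthread_mutex_unlock(&dso->lock);
	return ret;
}

static void sketch_dso_exit(struct sketch_dso *dso)
{
	pthread_mutex_destroy(&dso->lock);
}
```

This is why the early returns in dso__load() are turned into "goto
out" below: with a lock held, every exit has to go through the unlock.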

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c    |  2 ++
 tools/perf/util/dso.h    |  1 +
 tools/perf/util/symbol.c | 34 ++++++++++++++++++++++++----------
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 814554d1b857..f10269c3fe2f 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -884,6 +884,7 @@ struct dso *dso__new(const char *name)
 		RB_CLEAR_NODE(&dso->rb_node);
 		INIT_LIST_HEAD(&dso->node);
 		INIT_LIST_HEAD(&dso->data.open_entry);
+		pthread_mutex_init(&dso->lock, NULL);
 	}
 
 	return dso;
@@ -913,6 +914,7 @@ void dso__delete(struct dso *dso)
 	dso_cache__free(&dso->data.cache);
 	dso__free_a2l(dso);
 	zfree(&dso->symsrc_filename);
+	pthread_mutex_destroy(&dso->lock);
 	free(dso);
 }
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index ced92841ff97..da188a73d034 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -102,6 +102,7 @@ struct dsos {
 };
 
 struct dso {
+	pthread_mutex_t	 lock;
 	struct list_head node;
 	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
 	struct rb_root	 symbols[MAP__NR_TYPES];
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a69066865a55..714e20c99354 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1357,12 +1357,22 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 	struct symsrc *syms_ss = NULL, *runtime_ss = NULL;
 	bool kmod;
 
-	dso__set_loaded(dso, map->type);
+	pthread_mutex_lock(&dso->lock);
+
+	/* check again under the dso->lock */
+	if (dso__loaded(dso, map->type)) {
+		ret = 1;
+		goto out;
+	}
+
+	if (dso->kernel) {
+		if (dso->kernel == DSO_TYPE_KERNEL)
+			ret = dso__load_kernel_sym(dso, map, filter);
+		else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
+			ret = dso__load_guest_kernel_sym(dso, map, filter);
 
-	if (dso->kernel == DSO_TYPE_KERNEL)
-		return dso__load_kernel_sym(dso, map, filter);
-	else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
-		return dso__load_guest_kernel_sym(dso, map, filter);
+		goto out;
+	}
 
 	if (map->groups && map->groups->machine)
 		machine = map->groups->machine;
@@ -1375,18 +1385,18 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 		struct stat st;
 
 		if (lstat(dso->name, &st) < 0)
-			return -1;
+			goto out;
 
 		if (st.st_uid && (st.st_uid != geteuid())) {
 			pr_warning("File %s not owned by current user or root, "
 				"ignoring it.\n", dso->name);
-			return -1;
+			goto out;
 		}
 
 		ret = dso__load_perf_map(dso, map, filter);
 		dso->symtab_type = ret > 0 ? DSO_BINARY_TYPE__JAVA_JIT :
 					     DSO_BINARY_TYPE__NOT_FOUND;
-		return ret;
+		goto out;
 	}
 
 	if (machine)
@@ -1394,7 +1404,7 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 
 	name = malloc(PATH_MAX);
 	if (!name)
-		return -1;
+		goto out;
 
 	kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
 		dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
@@ -1475,7 +1485,11 @@ int dso__load(struct dso *dso, struct map *map, symbol_filter_t filter)
 out_free:
 	free(name);
 	if (ret < 0 && strstr(dso->name, " (deleted)") != NULL)
-		return 0;
+		ret = 0;
+out:
+	dso__set_loaded(dso, map->type);
+	pthread_mutex_unlock(&dso->lock);
+
 	return ret;
 }
 
-- 
2.2.2



* [PATCH 25/38] perf tools: Protect dso cache tree using dso->lock
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (23 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 24/38] perf tools: Protect dso symbol loading using a mutex Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 26/38] perf tools: Protect dso cache fd with a mutex Namhyung Kim
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

The dso cache is accessed during dwarf callchain unwind and it might
be accessed concurrently when multi-thread report is enabled.
Protect it with dso->lock.

Note that this doesn't protect dso_cache__find().  I think it's safe to
access the cache tree without the lock since we don't delete nodes.
If it misses an existing node due to a rotation, dso_cache__insert()
will find it anyway.
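
To make the "insert finds it anyway" part concrete, here is a small
sketch under simplifying assumptions: a plain unbalanced binary tree
instead of the rbtree, and hypothetical toy_* names throughout.  The
insert returns the existing chunk when another thread got there first,
and the caller drops its own copy:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_CACHE_SIZE 4096

struct toy_cache {
	struct toy_cache *left, *right;
	uint64_t offset;	/* start offset of the chunk */
	uint64_t size;
};

struct toy_tree {
	pthread_mutex_t lock;
	struct toy_cache *root;
};

/*
 * Insert 'new' unless a chunk covering new->offset exists already.
 * Returns NULL if 'new' was linked, otherwise the existing chunk.
 */
static struct toy_cache *
toy_cache__insert(struct toy_tree *tree, struct toy_cache *new)
{
	struct toy_cache **p = &tree->root;
	struct toy_cache *cache = NULL;

	assert(new->size > 0);

	pthread_mutex_lock(&tree->lock);
	while (*p != NULL) {
		cache = *p;
		if (new->offset < cache->offset)
			p = &cache->left;
		else if (new->offset >= cache->offset + cache->size)
			p = &cache->right;
		else
			goto out;	/* someone else inserted it first */
	}
	new->left = new->right = NULL;
	*p = new;
	cache = NULL;
out:
	pthread_mutex_unlock(&tree->lock);
	return cache;
}

/* Allocate a chunk for 'offset'; on a lost race reuse the old one. */
static struct toy_cache *toy_cache__get(struct toy_tree *tree, uint64_t offset)
{
	struct toy_cache *cache, *old;

	cache = calloc(1, sizeof(*cache));
	if (cache == NULL)
		return NULL;
	cache->offset = offset & ~(uint64_t)(TOY_CACHE_SIZE - 1);
	cache->size = TOY_CACHE_SIZE;

	old = toy_cache__insert(tree, cache);
	if (old) {
		free(cache);	/* we lost the race */
		cache = old;
	}
	return cache;
}
```

The sketch only covers the insert side.  Whether a lockless lookup is
really safe against a concurrent rebalance is a separate question that
this toy tree does not answer.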

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index f10269c3fe2f..3bfbe0e76e96 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -443,10 +443,12 @@ bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 }
 
 static void
-dso_cache__free(struct rb_root *root)
+dso_cache__free(struct dso *dso)
 {
+	struct rb_root *root = &dso->data.cache;
 	struct rb_node *next = rb_first(root);
 
+	pthread_mutex_lock(&dso->lock);
 	while (next) {
 		struct dso_cache *cache;
 
@@ -455,10 +457,12 @@ dso_cache__free(struct rb_root *root)
 		rb_erase(&cache->rb_node, root);
 		free(cache);
 	}
+	pthread_mutex_unlock(&dso->lock);
 }
 
-static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
+static struct dso_cache *dso_cache__find(struct dso *dso, u64 offset)
 {
+	const struct rb_root *root = &dso->data.cache;
 	struct rb_node * const *p = &root->rb_node;
 	const struct rb_node *parent = NULL;
 	struct dso_cache *cache;
@@ -477,17 +481,20 @@ static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
 		else
 			return cache;
 	}
+
 	return NULL;
 }
 
-static void
-dso_cache__insert(struct rb_root *root, struct dso_cache *new)
+static struct dso_cache *
+dso_cache__insert(struct dso *dso, struct dso_cache *new)
 {
+	struct rb_root *root = &dso->data.cache;
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
 	struct dso_cache *cache;
 	u64 offset = new->offset;
 
+	pthread_mutex_lock(&dso->lock);
 	while (*p != NULL) {
 		u64 end;
 
@@ -499,10 +506,17 @@ dso_cache__insert(struct rb_root *root, struct dso_cache *new)
 			p = &(*p)->rb_left;
 		else if (offset >= end)
 			p = &(*p)->rb_right;
+		else
+			goto out;
 	}
 
 	rb_link_node(&new->rb_node, parent, p);
 	rb_insert_color(&new->rb_node, root);
+
+	cache = NULL;
+out:
+	pthread_mutex_unlock(&dso->lock);
+	return cache;
 }
 
 static ssize_t
@@ -520,6 +534,7 @@ static ssize_t
 dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
+	struct dso_cache *old;
 	ssize_t ret;
 
 	do {
@@ -539,7 +554,12 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 
 		cache->offset = cache_offset;
 		cache->size   = ret;
-		dso_cache__insert(&dso->data.cache, cache);
+		old = dso_cache__insert(dso, cache);
+		if (old) {
+			/* we lose the race */
+			free(cache);
+			cache = old;
+		}
 
 		ret = dso_cache__memcpy(cache, offset, data, size);
 
@@ -556,7 +576,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
 {
 	struct dso_cache *cache;
 
-	cache = dso_cache__find(&dso->data.cache, offset);
+	cache = dso_cache__find(dso, offset);
 	if (cache)
 		return dso_cache__memcpy(cache, offset, data, size);
 	else
@@ -911,7 +931,7 @@ void dso__delete(struct dso *dso)
 	}
 
 	dso__data_close(dso);
-	dso_cache__free(&dso->data.cache);
+	dso_cache__free(dso);
 	dso__free_a2l(dso);
 	zfree(&dso->symsrc_filename);
 	pthread_mutex_destroy(&dso->lock);
-- 
2.2.2



* [PATCH 26/38] perf tools: Protect dso cache fd with a mutex
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (24 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 25/38] perf tools: Protect dso cache tree using dso->lock Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 27/38] perf callchain: Maintain libunwind's address space in map_groups Namhyung Kim
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When the dso cache is accessed in a multi-thread environment, a thread
that opens a dso can close another dso's data.fd in the middle of an
operation because of the open file limit.  Protect the file descriptors
using a separate mutex.
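
The scenario can be sketched as follows.  This is only an illustration:
the descriptors are fake integers, TOY_MAX_OPEN stands in for the limit
derived from RLIMIT_NOFILE, and all toy_* names are hypothetical.  The
point is that a reader revalidates (and reopens if needed) the fd while
holding the global lock:

```c
#include <assert.h>
#include <pthread.h>

#define TOY_MAX_OPEN 2	/* stand-in for the open file limit */

struct toy_dso {
	int fd;		/* -1 when closed */
	int nr_opens;	/* how many times it was (re)opened */
};

static pthread_mutex_t toy_open_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_dso *toy_open_list[TOY_MAX_OPEN];
static int toy_open_cnt;
static int toy_next_fd = 3;

/* Caller holds toy_open_lock.  Evicts the oldest open dso at the limit. */
static int toy_open(struct toy_dso *dso)
{
	int i;

	assert(dso->fd < 0);

	if (toy_open_cnt == TOY_MAX_OPEN) {
		toy_open_list[0]->fd = -1;	/* "close" the oldest one */
		for (i = 1; i < TOY_MAX_OPEN; i++)
			toy_open_list[i - 1] = toy_open_list[i];
		toy_open_cnt--;
	}
	toy_open_list[toy_open_cnt++] = dso;
	dso->nr_opens++;
	return toy_next_fd++;	/* fake descriptor number */
}

/* Read-like operation: revalidate the fd under the lock before use. */
static int toy_read(struct toy_dso *dso)
{
	int fd;

	pthread_mutex_lock(&toy_open_lock);
	if (dso->fd < 0)
		dso->fd = toy_open(dso);
	fd = dso->fd;	/* a real reader would do its I/O here, still locked */
	pthread_mutex_unlock(&toy_open_lock);
	return fd;
}
```

With a limit of two, opening a third dso silently closes the first one,
so the next read of the first dso has to reopen it under the lock.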

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c | 98 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 72 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 3bfbe0e76e96..64aaa45dcdd7 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -213,6 +213,7 @@ bool dso__needs_decompress(struct dso *dso)
  */
 static LIST_HEAD(dso__data_open);
 static long dso__data_open_cnt;
+static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static void dso__list_add(struct dso *dso)
 {
@@ -382,7 +383,9 @@ static void check_data_close(void)
  */
 void dso__data_close(struct dso *dso)
 {
+	pthread_mutex_lock(&dso__data_open_lock);
 	close_dso(dso);
+	pthread_mutex_unlock(&dso__data_open_lock);
 }
 
 /**
@@ -405,6 +408,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	if (dso->data.status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
+	pthread_mutex_lock(&dso__data_open_lock);
+
 	if (dso->data.fd >= 0)
 		goto out;
 
@@ -427,6 +432,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	else
 		dso->data.status = DSO_DATA_STATUS_ERROR;
 
+	pthread_mutex_unlock(&dso__data_open_lock);
 	return dso->data.fd;
 }
 
@@ -531,7 +537,8 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
 }
 
 static ssize_t
-dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+dso_cache__read(struct dso *dso, struct machine *machine,
+		u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
 	struct dso_cache *old;
@@ -540,11 +547,24 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	do {
 		u64 cache_offset;
 
-		ret = -ENOMEM;
-
 		cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
 		if (!cache)
-			break;
+			return -ENOMEM;
+
+		pthread_mutex_lock(&dso__data_open_lock);
+
+		/*
+		 * dso->data.fd might be closed if other thread opened another
+		 * file (dso) due to open file limit (RLIMIT_NOFILE).
+		 */
+		if (dso->data.fd < 0) {
+			dso->data.fd = open_dso(dso, machine);
+			if (dso->data.fd < 0) {
+				ret = -errno;
+				dso->data.status = DSO_DATA_STATUS_ERROR;
+				break;
+			}
+		}
 
 		cache_offset = offset & DSO__DATA_CACHE_MASK;
 
@@ -554,6 +574,11 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 
 		cache->offset = cache_offset;
 		cache->size   = ret;
+	} while (0);
+
+	pthread_mutex_unlock(&dso__data_open_lock);
+
+	if (ret > 0) {
 		old = dso_cache__insert(dso, cache);
 		if (old) {
 			/* we lose the race */
@@ -562,8 +587,7 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 		}
 
 		ret = dso_cache__memcpy(cache, offset, data, size);
-
-	} while (0);
+	}
 
 	if (ret <= 0)
 		free(cache);
@@ -571,8 +595,8 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	return ret;
 }
 
-static ssize_t dso_cache_read(struct dso *dso, u64 offset,
-			      u8 *data, ssize_t size)
+static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
+			      u64 offset, u8 *data, ssize_t size)
 {
 	struct dso_cache *cache;
 
@@ -580,7 +604,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
 	if (cache)
 		return dso_cache__memcpy(cache, offset, data, size);
 	else
-		return dso_cache__read(dso, offset, data, size);
+		return dso_cache__read(dso, machine, offset, data, size);
 }
 
 /*
@@ -588,7 +612,8 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
  * in the rb_tree. Any read to already cached data is served
  * by cached data.
  */
-static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+static ssize_t cached_read(struct dso *dso, struct machine *machine,
+			   u64 offset, u8 *data, ssize_t size)
 {
 	ssize_t r = 0;
 	u8 *p = data;
@@ -596,7 +621,7 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	do {
 		ssize_t ret;
 
-		ret = dso_cache_read(dso, offset, p, size);
+		ret = dso_cache_read(dso, machine, offset, p, size);
 		if (ret < 0)
 			return ret;
 
@@ -616,21 +641,42 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 	return r;
 }
 
-static int data_file_size(struct dso *dso)
+static int data_file_size(struct dso *dso, struct machine *machine)
 {
+	int ret = 0;
 	struct stat st;
 	char sbuf[STRERR_BUFSIZE];
 
-	if (!dso->data.file_size) {
-		if (fstat(dso->data.fd, &st)) {
-			pr_err("dso mmap failed, fstat: %s\n",
-				strerror_r(errno, sbuf, sizeof(sbuf)));
-			return -1;
+	if (dso->data.file_size)
+		return 0;
+
+	pthread_mutex_lock(&dso__data_open_lock);
+
+	/*
+	 * dso->data.fd might be closed if other thread opened another
+	 * file (dso) due to open file limit (RLIMIT_NOFILE).
+	 */
+	if (dso->data.fd < 0) {
+		dso->data.fd = open_dso(dso, machine);
+		if (dso->data.fd < 0) {
+			ret = -errno;
+			dso->data.status = DSO_DATA_STATUS_ERROR;
+			goto out;
 		}
-		dso->data.file_size = st.st_size;
 	}
 
-	return 0;
+	if (fstat(dso->data.fd, &st) < 0) {
+		ret = -errno;
+		pr_err("dso cache fstat failed: %s\n",
+		       strerror_r(errno, sbuf, sizeof(sbuf)));
+		dso->data.status = DSO_DATA_STATUS_ERROR;
+		goto out;
+	}
+	dso->data.file_size = st.st_size;
+
+out:
+	pthread_mutex_unlock(&dso__data_open_lock);
+	return ret;
 }
 
 /**
@@ -648,17 +694,17 @@ off_t dso__data_size(struct dso *dso, struct machine *machine)
 	if (fd < 0)
 		return fd;
 
-	if (data_file_size(dso))
+	if (data_file_size(dso, machine))
 		return -1;
 
 	/* For now just estimate dso data size is close to file size */
 	return dso->data.file_size;
 }
 
-static ssize_t data_read_offset(struct dso *dso, u64 offset,
-				u8 *data, ssize_t size)
+static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
+				u64 offset, u8 *data, ssize_t size)
 {
-	if (data_file_size(dso))
+	if (data_file_size(dso, machine))
 		return -1;
 
 	/* Check the offset sanity. */
@@ -668,7 +714,7 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
 	if (offset + size < offset)
 		return -1;
 
-	return cached_read(dso, offset, data, size);
+	return cached_read(dso, machine, offset, data, size);
 }
 
 /**
@@ -685,10 +731,10 @@ static ssize_t data_read_offset(struct dso *dso, u64 offset,
 ssize_t dso__data_read_offset(struct dso *dso, struct machine *machine,
 			      u64 offset, u8 *data, ssize_t size)
 {
-	if (dso__data_fd(dso, machine) < 0)
+	if (dso->data.status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
-	return data_read_offset(dso, offset, data, size);
+	return data_read_offset(dso, machine, offset, data, size);
 }
 
 /**
-- 
2.2.2



* [PATCH 27/38] perf callchain: Maintain libunwind's address space in map_groups
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (25 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 26/38] perf tools: Protect dso cache fd with a mutex Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 28/38] perf tools: Add dso__data_get/put_fd() Namhyung Kim
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Currently the address_space is kept in the thread struct, but it's more
appropriate to keep it in map_groups as map_groups is maintained with
time.  Also we don't need to flush it after exec since it can still be
accessed when used with an indexed data file.
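
A simplified picture of the idea, assuming (this is my reading, not
something the sketch proves) that a thread has a time-ordered list of
map groups and a sample picks the group valid at its timestamp.  The
toy_* names are hypothetical and priv is just an opaque pointer here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_map_groups {
	uint64_t timestamp;	/* time from which this group is valid */
	void *priv;		/* opaque per-group unwinder state */
};

/* groups[] is sorted by ascending timestamp; pick the newest <= time. */
static struct toy_map_groups *
toy_get_map_groups(struct toy_map_groups *groups, size_t nr, uint64_t time)
{
	struct toy_map_groups *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (groups[i].timestamp > time)
			break;
		found = &groups[i];
	}
	return found;
}

/* Unwinder state for a sample taken at 'time', or NULL if none. */
static void *
toy_addr_space_for(struct toy_map_groups *groups, size_t nr, uint64_t time)
{
	struct toy_map_groups *mg = toy_get_map_groups(groups, nr, time);

	return mg ? mg->priv : NULL;
}
```

Because each group carries its own state, a sample from before an exec
still resolves to the pre-exec state, which is why no flush is needed
in this model.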

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/dwarf-unwind.c    |  4 ++--
 tools/perf/util/map.c              |  5 +++++
 tools/perf/util/map.h              |  1 +
 tools/perf/util/thread.c           |  8 --------
 tools/perf/util/unwind-libunwind.c | 28 +++++++++++++---------------
 tools/perf/util/unwind.h           | 15 ++++++---------
 6 files changed, 27 insertions(+), 34 deletions(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 241270374e93..0fa26e77e28a 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -143,6 +143,8 @@ int test__dwarf_unwind(void)
 	struct thread *thread;
 	int err = -1;
 
+	callchain_param.record_mode = CALLCHAIN_DWARF;
+
 	machines__init(&machines);
 
 	machine = machines__find(&machines, HOST_KERNEL_ID);
@@ -151,8 +153,6 @@ int test__dwarf_unwind(void)
 		return -1;
 	}
 
-	callchain_param.record_mode = CALLCHAIN_DWARF;
-
 	if (init_live_machine(machine)) {
 		pr_err("Could not init machine\n");
 		goto out;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 85fbb1b3e69f..c7eeabafa6c9 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -14,6 +14,7 @@
 #include "util.h"
 #include "debug.h"
 #include "machine.h"
+#include "unwind.h"
 #include <linux/string.h>
 
 const char *map_type__name[MAP__NR_TYPES] = {
@@ -424,6 +425,8 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
 	mg->refcnt = 1;
 	mg->timestamp = 0;
 	INIT_LIST_HEAD(&mg->list);
+
+	unwind__prepare_access(mg);
 }
 
 static void maps__delete(struct rb_root *maps)
@@ -457,6 +460,8 @@ void map_groups__exit(struct map_groups *mg)
 		maps__delete(&mg->maps[i]);
 		maps__delete_removed(&mg->removed_maps[i]);
 	}
+
+	unwind__finish_access(mg);
 }
 
 bool map_groups__empty(struct map_groups *mg)
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index f33d49029ac0..f7db4a010dc8 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -64,6 +64,7 @@ struct map_groups {
 	u64		 timestamp;
 	int		 refcnt;
 	struct list_head list;
+	void		 *priv;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 9ae1ce8606af..552e1a56af6a 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -8,7 +8,6 @@
 #include "util.h"
 #include "debug.h"
 #include "comm.h"
-#include "unwind.h"
 
 struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp)
 {
@@ -104,9 +103,6 @@ struct thread *thread__new(pid_t pid, pid_t tid)
 		INIT_LIST_HEAD(&thread->tid_node);
 		INIT_LIST_HEAD(&thread->mg_list);
 
-		if (unwind__prepare_access(thread) < 0)
-			goto err_thread;
-
 		comm_str = malloc(32);
 		if (!comm_str)
 			goto err_thread;
@@ -147,7 +143,6 @@ void thread__delete(struct thread *thread)
 		list_del(&comm->list);
 		comm__free(comm);
 	}
-	unwind__finish_access(thread);
 
 	free(thread);
 }
@@ -216,9 +211,6 @@ int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
 				break;
 		}
 		list_add_tail(&new->list, &curr->list);
-
-		if (exec)
-			unwind__flush_access(thread);
 	}
 
 	if (exec) {
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 9ee63179383e..9d7ecb26fde9 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -32,6 +32,7 @@
 #include "symbol.h"
 #include "util.h"
 #include "debug.h"
+#include "map.h"
 
 extern int
 UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as,
@@ -560,7 +561,7 @@ static unw_accessors_t accessors = {
 	.get_proc_name		= get_proc_name,
 };
 
-int unwind__prepare_access(struct thread *thread)
+int unwind__prepare_access(struct map_groups *mg)
 {
 	unw_addr_space_t addr_space;
 
@@ -574,41 +575,38 @@ int unwind__prepare_access(struct thread *thread)
 	}
 
 	unw_set_caching_policy(addr_space, UNW_CACHE_GLOBAL);
-	thread__set_priv(thread, addr_space);
+	mg->priv = addr_space;
 
 	return 0;
 }
 
-void unwind__flush_access(struct thread *thread)
+void unwind__finish_access(struct map_groups *mg)
 {
-	unw_addr_space_t addr_space;
+	unw_addr_space_t addr_space = mg->priv;
 
 	if (callchain_param.record_mode != CALLCHAIN_DWARF)
 		return;
 
-	addr_space = thread__priv(thread);
-	unw_flush_cache(addr_space, 0, 0);
-}
-
-void unwind__finish_access(struct thread *thread)
-{
-	unw_addr_space_t addr_space;
-
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (addr_space == NULL)
 		return;
 
-	addr_space = thread__priv(thread);
 	unw_destroy_addr_space(addr_space);
+	mg->priv = NULL;
 }
 
 static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 		       void *arg, int max_stack)
 {
+	struct map_groups *mg;
 	unw_addr_space_t addr_space;
 	unw_cursor_t c;
 	int ret;
 
-	addr_space = thread__priv(ui->thread);
+	mg = thread__get_map_groups(ui->thread, ui->sample->time);
+	if (mg == NULL)
+		return -1;
+
+	addr_space = mg->priv;
 	if (addr_space == NULL)
 		return -1;
 
diff --git a/tools/perf/util/unwind.h b/tools/perf/util/unwind.h
index c619890e60ad..b3833eaf4c3b 100644
--- a/tools/perf/util/unwind.h
+++ b/tools/perf/util/unwind.h
@@ -21,17 +21,15 @@ int unwind__get_entries(unwind_entry_cb_t cb, void *arg,
 /* libunwind specific */
 #ifdef HAVE_LIBUNWIND_SUPPORT
 int libunwind__arch_reg_id(int regnum);
-int unwind__prepare_access(struct thread *thread);
-void unwind__flush_access(struct thread *thread);
-void unwind__finish_access(struct thread *thread);
+int unwind__prepare_access(struct map_groups *mg);
+void unwind__finish_access(struct map_groups *mg);
 #else
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused)
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif
 #else
 static inline int
@@ -45,12 +43,11 @@ unwind__get_entries(unwind_entry_cb_t cb __maybe_unused,
 	return 0;
 }
 
-static inline int unwind__prepare_access(struct thread *thread __maybe_unused)
+static inline int unwind__prepare_access(struct map_groups *mg __maybe_unused)
 {
 	return 0;
 }
 
-static inline void unwind__flush_access(struct thread *thread __maybe_unused) {}
-static inline void unwind__finish_access(struct thread *thread __maybe_unused) {}
+static inline void unwind__finish_access(struct map_groups *mg __maybe_unused) {}
 #endif /* HAVE_DWARF_UNWIND_SUPPORT */
 #endif /* __UNWIND_H */
-- 
2.2.2



* [PATCH 28/38] perf tools: Add dso__data_get/put_fd()
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (26 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 27/38] perf callchain: Maintain libunwind's address space in map_groups Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 29/38] perf session: Pass struct events stats to event processing functions Namhyung Kim
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Using dso__data_fd() in a multi-thread environment is not safe since
the returned fd can be closed and/or reused at any time.  So convert it
to the dso__data_get/put_fd() pair to protect the access with a lock.

The original dso__data_fd() is deprecated and kept only for testing.
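
The calling convention can be sketched like this, with hypothetical
toy_* names and a placeholder in place of real work on the fd.  Note
that the get side returns with the lock held even on failure, so the
put side must be called unconditionally:

```c
#include <assert.h>
#include <pthread.h>

struct toy_dso {
	int fd;		/* -1 means "could not be opened" */
};

static pthread_mutex_t toy_fd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns the fd (or -1) with toy_fd_lock HELD in both cases. */
static int toy_get_fd(struct toy_dso *dso)
{
	pthread_mutex_lock(&toy_fd_lock);
	return dso->fd;
}

/* Releases the lock taken by toy_get_fd(). */
static void toy_put_fd(struct toy_dso *dso)
{
	(void)dso;
	pthread_mutex_unlock(&toy_fd_lock);
}

/* Caller pattern: touch the fd only between get and put. */
static int toy_fd_parity(struct toy_dso *dso)
{
	int ret = -1;
	int fd = toy_get_fd(dso);

	if (fd >= 0)
		ret = fd & 1;	/* placeholder for real work on fd */
	toy_put_fd(dso);	/* always, even when get failed */

	return ret;
}
```

Returning early between get and put would leave the lock held, so
callers are easier to get right when they have a single exit path.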

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dso.c              | 44 +++++++++++++++++++++++++++++---------
 tools/perf/util/dso.h              |  9 ++++++--
 tools/perf/util/unwind-libunwind.c | 38 +++++++++++++++++++-------------
 3 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 64aaa45dcdd7..50bb2f93b7e9 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -389,14 +389,15 @@ void dso__data_close(struct dso *dso)
 }
 
 /**
- * dso__data_fd - Get dso's data file descriptor
+ * dso__data_get_fd - Get dso's data file descriptor
  * @dso: dso object
  * @machine: machine object
  *
  * External interface to find dso's file, open it and
- * returns file descriptor.
+ * returns file descriptor.  Should be paired with
+ * dso__data_put_fd().
  */
-int dso__data_fd(struct dso *dso, struct machine *machine)
+int dso__data_get_fd(struct dso *dso, struct machine *machine)
 {
 	enum dso_binary_type binary_type_data[] = {
 		DSO_BINARY_TYPE__BUILD_ID_CACHE,
@@ -405,11 +406,11 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	};
 	int i = 0;
 
+	pthread_mutex_lock(&dso__data_open_lock);
+
 	if (dso->data.status == DSO_DATA_STATUS_ERROR)
 		return -1;
 
-	pthread_mutex_lock(&dso__data_open_lock);
-
 	if (dso->data.fd >= 0)
 		goto out;
 
@@ -432,10 +433,31 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
 	else
 		dso->data.status = DSO_DATA_STATUS_ERROR;
 
-	pthread_mutex_unlock(&dso__data_open_lock);
 	return dso->data.fd;
 }
 
+void dso__data_put_fd(struct dso *dso __maybe_unused)
+{
+	pthread_mutex_unlock(&dso__data_open_lock);
+}
+
+/**
+ * dso__data_fd - Get dso's data file descriptor
+ * @dso: dso object
+ * @machine: machine object
+ *
+ * Obsolete interface to find dso's file, open it and
+ * returns file descriptor.  It's not thread-safe in that
+ * the returned fd may be reused for other file.
+ */
+int dso__data_fd(struct dso *dso, struct machine *machine)
+{
+	int fd = dso__data_get_fd(dso, machine);
+
+	dso__data_put_fd(dso);
+	return fd;
+}
+
 bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 {
 	u32 flag = 1 << by;
@@ -1144,10 +1166,12 @@ size_t dso__fprintf(struct dso *dso, enum map_type type, FILE *fp)
 enum dso_type dso__type(struct dso *dso, struct machine *machine)
 {
 	int fd;
+	enum dso_type type = DSO__TYPE_UNKNOWN;
 
-	fd = dso__data_fd(dso, machine);
-	if (fd < 0)
-		return DSO__TYPE_UNKNOWN;
+	fd = dso__data_get_fd(dso, machine);
+	if (fd >= 0)
+		type = dso__type_fd(fd);
+	dso__data_put_fd(dso);
 
-	return dso__type_fd(fd);
+	return type;
 }
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index da188a73d034..9f1e67da9c01 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -197,7 +197,9 @@ bool dso__needs_decompress(struct dso *dso);
 
 /*
  * The dso__data_* external interface provides following functions:
- *   dso__data_fd
+ *   dso__data_fd (obsolete)
+ *   dso__data_get_fd
+ *   dso__data_put_fd
  *   dso__data_close
  *   dso__data_size
  *   dso__data_read_offset
@@ -214,8 +216,9 @@ bool dso__needs_decompress(struct dso *dso);
  * The current usage of the dso__data_* interface is as follows:
  *
  * Get DSO's fd:
- *   int fd = dso__data_fd(dso, machine);
+ *   int fd = dso__data_get_fd(dso, machine);
  *   USE 'fd' SOMEHOW
+ *   dso__data_put_fd(dso)
  *
  * Read DSO's data:
  *   n = dso__data_read_offset(dso_0, &machine, 0, buf, BUFSIZE);
@@ -235,6 +238,8 @@ bool dso__needs_decompress(struct dso *dso);
  * TODO
 */
 int dso__data_fd(struct dso *dso, struct machine *machine);
+int dso__data_get_fd(struct dso *dso, struct machine *machine);
+void dso__data_put_fd(struct dso *dso);
 void dso__data_close(struct dso *dso);
 
 off_t dso__data_size(struct dso *dso, struct machine *machine);
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 9d7ecb26fde9..e807ba9d375a 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -271,13 +271,13 @@ static int read_unwind_spec_eh_frame(struct dso *dso, struct machine *machine,
 	u64 offset = dso->data.frame_offset;
 
 	if (offset == 0) {
-		fd = dso__data_fd(dso, machine);
-		if (fd < 0)
-			return -EINVAL;
-
-		/* Check the .eh_frame section for unwinding info */
-		offset = elf_section_offset(fd, ".eh_frame_hdr");
-		dso->data.frame_offset = offset;
+		fd = dso__data_get_fd(dso, machine);
+		if (fd >= 0) {
+			/* Check the .eh_frame section for unwinding info */
+			offset = elf_section_offset(fd, ".eh_frame_hdr");
+			dso->data.frame_offset = offset;
+		}
+		dso__data_put_fd(dso);
 	}
 
 	if (offset)
@@ -296,13 +296,19 @@ static int read_unwind_spec_debug_frame(struct dso *dso,
 	u64 ofs = dso->data.frame_offset;
 
 	if (ofs == 0) {
-		fd = dso__data_fd(dso, machine);
-		if (fd < 0)
-			return -EINVAL;
-
-		/* Check the .debug_frame section for unwinding info */
-		ofs = elf_section_offset(fd, ".debug_frame");
-		dso->data.frame_offset = ofs;
+		int ret = 0;
+
+		fd = dso__data_get_fd(dso, machine);
+		if (fd >= 0) {
+			/* Check the .debug_frame section for unwinding info */
+			ofs = elf_section_offset(fd, ".debug_frame");
+			dso->data.frame_offset = ofs;
+		} else
+			ret = -EINVAL;
+
+		dso__data_put_fd(dso);
+		if (ret)
+			return ret;
 	}
 
 	*offset = ofs;
@@ -355,10 +361,12 @@ find_proc_info(unw_addr_space_t as, unw_word_t ip, unw_proc_info_t *pi,
 #ifndef NO_LIBUNWIND_DEBUG_FRAME
 	/* Check the .debug_frame section for unwinding info */
 	if (!read_unwind_spec_debug_frame(map->dso, ui->machine, &segbase)) {
-		int fd = dso__data_fd(map->dso, ui->machine);
+		int fd = dso__data_get_fd(map->dso, ui->machine);
 		int is_exec = elf_is_exec(fd, map->dso->name);
 		unw_word_t base = is_exec ? 0 : map->start;
 
+		dso__data_put_fd(map->dso);
+
 		memset(&di, 0, sizeof(di));
 		if (dwarf_find_debug_frame(0, &di, ip, base, map->dso->name,
 					   map->start, map->end))
-- 
2.2.2



* [PATCH 29/38] perf session: Pass struct events stats to event processing functions
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (27 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 28/38] perf tools: Add dso__data_get/put_fd() Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 30/38] perf hists: Pass hists struct to hist_entry_iter struct Namhyung Kim
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Pass the stats structure explicitly so that it can point to a separate
object when used in a multi-thread environment.
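
The motivation can be shown with a small sketch.  Everything here is
hypothetical (struct toy_stats, the two-worker split, treating a
negative id as "unknown"); it only illustrates that counting into a
caller-supplied object lets each worker keep private counters that are
merged after joining:

```c
#include <assert.h>
#include <pthread.h>

struct toy_stats {
	unsigned long nr_unknown_id;
	unsigned long nr_samples;
};

/* Counts into the caller-supplied stats instead of a shared object. */
static void toy_deliver(struct toy_stats *stats, int id)
{
	if (id < 0)
		++stats->nr_unknown_id;
	else
		++stats->nr_samples;
}

struct toy_worker {
	struct toy_stats stats;	/* private to this worker, no locking */
	const int *ids;
	int nr;
};

static void *toy_worker_fn(void *arg)
{
	struct toy_worker *w = arg;
	int i;

	for (i = 0; i < w->nr; i++)
		toy_deliver(&w->stats, w->ids[i]);
	return NULL;
}

/* Split ids between two workers, then merge their stats after join. */
static struct toy_stats toy_process(const int *ids, int nr)
{
	struct toy_worker w[2] = {
		{ .ids = ids,          .nr = nr / 2 },
		{ .ids = ids + nr / 2, .nr = nr - nr / 2 },
	};
	struct toy_stats total = { 0, 0 };
	pthread_t tid[2];
	int i;

	for (i = 0; i < 2; i++)
		pthread_create(&tid[i], NULL, toy_worker_fn, &w[i]);
	for (i = 0; i < 2; i++) {
		pthread_join(tid[i], NULL);
		total.nr_unknown_id += w[i].stats.nr_unknown_id;
		total.nr_samples += w[i].stats.nr_samples;
	}
	return total;
}
```

The single-threaded path can keep passing the address of the one
existing stats object, so its behaviour does not change.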

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/ordered-events.c |  4 ++-
 tools/perf/util/session.c        | 53 ++++++++++++++++++++++++----------------
 tools/perf/util/session.h        |  1 +
 3 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index 077ddd25189f..35b7c0fd103b 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -183,7 +183,9 @@ static int __ordered_events__flush(struct perf_session *s,
 		if (ret)
 			pr_err("Can't parse sample, err = %d\n", ret);
 		else {
-			ret = perf_session__deliver_event(s, iter->event, &sample, tool,
+			ret = perf_session__deliver_event(s, &s->evlist->stats,
+							  iter->event,
+							  &sample, tool,
 							  iter->file_offset);
 			if (ret)
 				return ret;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d89dfa8592a9..0090eb8c6974 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -826,6 +826,7 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
 }
 
 static int deliver_sample_value(struct perf_evlist *evlist,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
@@ -841,7 +842,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
 	}
 
 	if (!sid || sid->evsel == NULL) {
-		++evlist->stats.nr_unknown_id;
+		++stats->nr_unknown_id;
 		return 0;
 	}
 
@@ -849,6 +850,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
 }
 
 static int deliver_sample_group(struct perf_evlist *evlist,
+				struct events_stats *stats,
 				struct perf_tool *tool,
 				union  perf_event *event,
 				struct perf_sample *sample,
@@ -858,7 +860,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
 	u64 i;
 
 	for (i = 0; i < sample->read.group.nr; i++) {
-		ret = deliver_sample_value(evlist, tool, event, sample,
+		ret = deliver_sample_value(evlist, stats, tool, event, sample,
 					   &sample->read.group.values[i],
 					   machine);
 		if (ret)
@@ -870,6 +872,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
 
 static int
  perf_evlist__deliver_sample(struct perf_evlist *evlist,
+			     struct events_stats *stats,
 			     struct perf_tool *tool,
 			     union  perf_event *event,
 			     struct perf_sample *sample,
@@ -886,14 +889,15 @@ static int
 
 	/* For PERF_SAMPLE_READ we have either single or group mode. */
 	if (read_format & PERF_FORMAT_GROUP)
-		return deliver_sample_group(evlist, tool, event, sample,
+		return deliver_sample_group(evlist, stats, tool, event, sample,
 					    machine);
 	else
-		return deliver_sample_value(evlist, tool, event, sample,
+		return deliver_sample_value(evlist, stats, tool, event, sample,
 					    &sample->read.one, machine);
 }
 
 int perf_session__deliver_event(struct perf_session *session,
+				struct events_stats *stats,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_tool *tool, u64 file_offset)
@@ -912,14 +916,15 @@ int perf_session__deliver_event(struct perf_session *session,
 	case PERF_RECORD_SAMPLE:
 		dump_sample(evsel, event, sample);
 		if (evsel == NULL) {
-			++evlist->stats.nr_unknown_id;
+			++stats->nr_unknown_id;
 			return 0;
 		}
 		if (machine == NULL) {
-			++evlist->stats.nr_unprocessable_samples;
+			++stats->nr_unprocessable_samples;
 			return 0;
 		}
-		return perf_evlist__deliver_sample(evlist, tool, event, sample, evsel, machine);
+		return perf_evlist__deliver_sample(evlist, stats, tool, event,
+						   sample, evsel, machine);
 	case PERF_RECORD_MMAP:
 		return tool->mmap(tool, event, sample, machine);
 	case PERF_RECORD_MMAP2:
@@ -932,7 +937,7 @@ int perf_session__deliver_event(struct perf_session *session,
 		return tool->exit(tool, event, sample, machine);
 	case PERF_RECORD_LOST:
 		if (tool->lost == perf_event__process_lost)
-			evlist->stats.total_lost += event->lost.lost;
+			stats->total_lost += event->lost.lost;
 		return tool->lost(tool, event, sample, machine);
 	case PERF_RECORD_READ:
 		return tool->read(tool, event, sample, evsel, machine);
@@ -941,7 +946,7 @@ int perf_session__deliver_event(struct perf_session *session,
 	case PERF_RECORD_UNTHROTTLE:
 		return tool->unthrottle(tool, event, sample, machine);
 	default:
-		++evlist->stats.nr_unknown_events;
+		++stats->nr_unknown_events;
 		return -1;
 	}
 }
@@ -996,7 +1001,8 @@ int perf_session__deliver_synth_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, tool, 0);
 
-	return perf_session__deliver_event(session, event, sample, tool, 0);
+	return perf_session__deliver_event(session, &session->evlist->stats,
+					   event, sample, tool, 0);
 }
 
 static void event_swap(union perf_event *event, bool sample_id_all)
@@ -1064,6 +1070,7 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 }
 
 static s64 perf_session__process_event(struct perf_session *session,
+				       struct events_stats *stats,
 				       union perf_event *event,
 				       struct perf_tool *tool,
 				       u64 file_offset)
@@ -1078,7 +1085,7 @@ static s64 perf_session__process_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_HEADER_MAX)
 		return -EINVAL;
 
-	events_stats__inc(&evlist->stats, event->header.type);
+	events_stats__inc(stats, event->header.type);
 
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, tool, file_offset);
@@ -1097,8 +1104,8 @@ static s64 perf_session__process_event(struct perf_session *session,
 			return ret;
 	}
 
-	return perf_session__deliver_event(session, event, &sample, tool,
-					   file_offset);
+	return perf_session__deliver_event(session, stats, event, &sample,
+					   tool, file_offset);
 }
 
 void perf_event_header__bswap(struct perf_event_header *hdr)
@@ -1237,7 +1244,8 @@ static int __perf_session__process_pipe_events(struct perf_session *session,
 		}
 	}
 
-	if ((skip = perf_session__process_event(session, event, tool, head)) < 0) {
+	if ((skip = perf_session__process_event(session, &session->evlist->stats,
+						event, tool, head)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       head, event->header.size, event->header.type);
 		err = -EINVAL;
@@ -1301,7 +1309,8 @@ fetch_mmaped_event(struct perf_session *session,
 #define NUM_MMAPS 128
 #endif
 
-static int __perf_session__process_events(struct perf_session *session, int fd,
+static int __perf_session__process_events(struct perf_session *session,
+					  struct events_stats *stats, int fd,
 					  u64 data_offset, u64 data_size,
 					  u64 file_size, struct perf_tool *tool)
 {
@@ -1374,8 +1383,8 @@ static int __perf_session__process_events(struct perf_session *session, int fd,
 	size = event->header.size;
 
 	if (size < sizeof(struct perf_event_header) ||
-	    (skip = perf_session__process_event(session, event, tool, file_pos))
-									< 0) {
+	    (skip = perf_session__process_event(session, stats, event,
+						tool, file_pos)) < 0) {
 		pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
 		       file_offset + head, event->header.size,
 		       event->header.type);
@@ -1412,6 +1421,7 @@ static int __perf_session__process_indexed_events(struct perf_session *session,
 {
 	struct perf_data_file *file = session->file;
 	u64 size = perf_data_file__size(file);
+	struct events_stats *stats = &session->evlist->stats;
 	int err = 0, i;
 
 	for (i = 0; i < (int)session->header.nr_index; i++) {
@@ -1428,7 +1438,7 @@ static int __perf_session__process_indexed_events(struct perf_session *session,
 		if (i > 0)
 			tool->ordered_events = false;
 
-		err = __perf_session__process_events(session,
+		err = __perf_session__process_events(session, stats,
 						     perf_data_file__fd(file),
 						     index->offset, index->size,
 						     size, tool);
@@ -1436,7 +1446,7 @@ static int __perf_session__process_indexed_events(struct perf_session *session,
 			break;
 	}
 
-	perf_tool__warn_about_errors(tool, &session->evlist->stats);
+	perf_tool__warn_about_errors(tool, stats);
 	return err;
 }
 
@@ -1445,6 +1455,7 @@ int perf_session__process_events(struct perf_session *session,
 {
 	struct perf_data_file *file = session->file;
 	u64 size = perf_data_file__size(file);
+	struct events_stats *stats = &session->evlist->stats;
 	int err;
 
 	if (perf_session__register_idle_thread(session) == NULL)
@@ -1455,13 +1466,13 @@ int perf_session__process_events(struct perf_session *session,
 	if (perf_session__has_index(session))
 		return __perf_session__process_indexed_events(session, tool);
 
-	err = __perf_session__process_events(session,
+	err = __perf_session__process_events(session, stats,
 					     perf_data_file__fd(file),
 					     session->header.data_offset,
 					     session->header.data_size,
 					     size, tool);
 
-	perf_tool__warn_about_errors(tool, &session->evlist->stats);
+	perf_tool__warn_about_errors(tool, stats);
 	return err;
 }
 
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 4d264fef8968..c9a53ecf658d 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -58,6 +58,7 @@ int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 void perf_tool__fill_defaults(struct perf_tool *tool);
 
 int perf_session__deliver_event(struct perf_session *session,
+				struct events_stats *stats,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct perf_tool *tool, u64 file_offset);
-- 
2.2.2



* [PATCH 30/38] perf hists: Pass hists struct to hist_entry_iter struct
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (28 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 29/38] perf session: Pass struct events stats to event processing functions Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 31/38] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

This is a preparation for perf report multi-thread support.  When
multi-threading is enabled, each thread will have its own hists during
sample processing.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c       |  1 +
 tools/perf/builtin-top.c          |  1 +
 tools/perf/tests/hists_cumulate.c |  1 +
 tools/perf/tests/hists_filter.c   |  1 +
 tools/perf/tests/hists_output.c   |  1 +
 tools/perf/util/hist.c            | 20 +++++++-------------
 tools/perf/util/hist.h            |  1 +
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0d6e6bff7994..5adf269b84a9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -138,6 +138,7 @@ static int process_sample_event(struct perf_tool *tool,
 	struct addr_location al;
 	struct hist_entry_iter iter = {
 		.evsel 			= evsel,
+		.hists 			= evsel__hists(evsel),
 		.sample 		= sample,
 		.session 		= rep->session,
 		.hide_unresolved 	= rep->hide_unresolved,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index f33cb0e2aa0d..52c6d5d16ecb 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -775,6 +775,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 		struct hists *hists = evsel__hists(evsel);
 		struct hist_entry_iter iter = {
 			.evsel		= evsel,
+			.hists 		= evsel__hists(evsel),
 			.sample 	= sample,
 			.session 	= top->session,
 			.add_entry_cb 	= hist_iter__top_callback,
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index da64acbd35b7..273182c7cc12 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -88,6 +88,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		};
 		struct hist_entry_iter iter = {
 			.evsel = evsel,
+			.hists = evsel__hists(evsel),
 			.sample	= &sample,
 			.hide_unresolved = false,
 		};
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index f5c0c69383dc..67b9d498d731 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -64,6 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
 			};
 			struct hist_entry_iter iter = {
 				.evsel = evsel,
+				.hists = evsel__hists(evsel),
 				.sample = &sample,
 				.ops = &hist_iter_normal,
 				.hide_unresolved = false,
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index 4e3cff568eaa..541cf09280c1 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -58,6 +58,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
 		};
 		struct hist_entry_iter iter = {
 			.evsel = evsel,
+			.hists = evsel__hists(evsel),
 			.sample = &sample,
 			.ops = &hist_iter_normal,
 			.hide_unresolved = false,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index dbe7f3744bf1..cbcfda5f1eac 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -510,7 +510,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
 	u64 cost;
 	struct mem_info *mi = iter->priv;
 	struct perf_sample *sample = iter->sample;
-	struct hists *hists = evsel__hists(iter->evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he;
 
 	if (mi == NULL)
@@ -540,8 +540,7 @@ static int
 iter_finish_mem_entry(struct hist_entry_iter *iter,
 		      struct addr_location *al __maybe_unused)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = iter->he;
 	int err = -EINVAL;
 
@@ -613,8 +612,7 @@ static int
 iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
 	struct branch_info *bi;
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct hist_entry *he = NULL;
 	int i = iter->curr;
 	int err = 0;
@@ -661,11 +659,10 @@ iter_prepare_normal_entry(struct hist_entry_iter *iter __maybe_unused,
 static int
 iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry *he;
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, true);
 	if (he == NULL)
@@ -680,7 +677,6 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 			 struct addr_location *al __maybe_unused)
 {
 	struct hist_entry *he = iter->he;
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 
 	if (he == NULL)
@@ -688,7 +684,7 @@ iter_finish_normal_entry(struct hist_entry_iter *iter,
 
 	iter->he = NULL;
 
-	hists__inc_nr_samples(evsel__hists(evsel), he->filtered);
+	hists__inc_nr_samples(iter->hists, he->filtered);
 
 	return hist_entry__append_callchain(he, sample);
 }
@@ -720,8 +716,7 @@ static int
 iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
 				 struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
-	struct hists *hists = evsel__hists(evsel);
+	struct hists *hists = iter->hists;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
@@ -766,7 +761,6 @@ static int
 iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			       struct addr_location *al)
 {
-	struct perf_evsel *evsel = iter->evsel;
 	struct perf_sample *sample = iter->sample;
 	struct hist_entry **he_cache = iter->priv;
 	struct hist_entry *he;
@@ -800,7 +794,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 		}
 	}
 
-	he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
+	he = __hists__add_entry(iter->hists, al, iter->parent, NULL, NULL,
 				sample->period, sample->weight,
 				sample->transaction, sample->time, false);
 	if (he == NULL)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0afe15ba0277..6db118613ff5 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -86,6 +86,7 @@ struct hist_entry_iter {
 
 	bool hide_unresolved;
 
+	struct hists *hists;
 	struct perf_evsel *evsel;
 	struct perf_sample *sample;
 	struct perf_session *session;
-- 
2.2.2



* [PATCH 31/38] perf tools: Move BUILD_ID_SIZE definition to perf.h
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (29 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 30/38] perf hists: Pass hists struct to hist_entry_iter struct Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 32/38] perf report: Parallelize perf report using multi-thread Namhyung Kim
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

util/event.h includes util/build-id.h only for BUILD_ID_SIZE.  This
becomes a problem when util/tool.h includes util/event.h, because
util/tool.h is itself included by util/build-id.h: the result is a
circular dependency that ends in an incomplete type error.  Move the
definition to perf.h to break the cycle.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/perf.h          | 1 +
 tools/perf/util/build-id.h | 2 --
 tools/perf/util/dso.h      | 1 +
 tools/perf/util/event.h    | 1 -
 4 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index a03552849399..c32bee696f41 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -30,6 +30,7 @@ static inline unsigned long long rdclock(void)
 }
 
 #define MAX_NR_CPUS			256
+#define BUILD_ID_SIZE			20
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 85011222cc14..e71304c9c86f 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,8 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE 20
-
 #include "tool.h"
 #include "strlist.h"
 #include <linux/types.h>
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 9f1e67da9c01..0e5a9897d1bf 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -7,6 +7,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 27261320249a..1f86c279520e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.2.2



* [PATCH 32/38] perf report: Parallelize perf report using multi-thread
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (30 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 31/38] perf tools: Move BUILD_ID_SIZE definition to perf.h Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 33/38] perf tools: Add missing_threads rb tree Namhyung Kim
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Introduce perf_session__process_events_mt() to enable multi-thread
sample processing.  It allocates a struct perf_tool_mt for each thread
and fills in the needed info.

The session and hists event stats are counted per thread and summed
after processing finishes.  Similarly, hist entries are added to
per-thread hists first and then moved to the original hists using
hists__mt_resort().  This function reuses the hists__collapse_resort()
code, so it forces sort__need_collapse to true and skips the
collapsing function.

Note that most of the preprocessing stage is already done by processing
meta events in the dummy tracking evsel first.  We can find the
corresponding thread and map based on the sample time, and symbol
loading and dso cache access are protected by a pthread mutex.
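
The "count per thread, sum after join" scheme can be sketched as a
standalone program fragment.  Everything named demo_* below is
hypothetical and only loosely modelled on struct perf_tool_mt and
events_stats__add(); the worker body is a placeholder loop rather than
real sample processing.  It assumes POSIX threads are available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NR_TYPES 4

struct demo_stats {
	uint64_t total_period;
	uint32_t nr_events[DEMO_NR_TYPES];
};

/* Hypothetical per-worker context, loosely modelled on struct perf_tool_mt. */
struct demo_worker {
	struct demo_stats stats;	/* private to this worker */
	int idx;
	int nr_samples;
};

/* Field-by-field accumulation of src into dst. */
void demo_stats_add(struct demo_stats *dst, const struct demo_stats *src)
{
	int i;

	dst->total_period += src->total_period;
	for (i = 0; i < DEMO_NR_TYPES; i++)
		dst->nr_events[i] += src->nr_events[i];
}

/* Each worker touches only its own stats, so no locking is needed here. */
static void *demo_worker_fn(void *arg)
{
	struct demo_worker *w = arg;
	int i;

	for (i = 0; i < w->nr_samples; i++) {
		w->stats.total_period += 1;
		w->stats.nr_events[w->idx % DEMO_NR_TYPES]++;
	}
	return w;
}

/* Returns 0 on success; -1 on allocation or thread creation failure. */
int demo_process_mt(int nr_workers, int nr_samples, struct demo_stats *total)
{
	pthread_t *tids = calloc(nr_workers, sizeof(*tids));
	struct demo_worker *workers = calloc(nr_workers, sizeof(*workers));
	int i, started = 0, err = -1;

	if (tids == NULL || workers == NULL)
		goto out;

	for (i = 0; i < nr_workers; i++) {
		workers[i].idx = i;
		workers[i].nr_samples = nr_samples;
		if (pthread_create(&tids[i], NULL, demo_worker_fn, &workers[i]))
			break;
		started++;
	}

	/* Join first, then merge: after join the worker's stats are stable. */
	for (i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		demo_stats_add(total, &workers[i].stats);
	}

	if (started == nr_workers)
		err = 0;
out:
	free(tids);
	free(workers);
	return err;
}
```

Because the merge happens only after pthread_join(), the hot path needs
no atomics or locks on the counters.  The cost is one extra pass over
the per-worker objects at the end.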

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/hist.c    |  75 +++++++++++++++++++-----
 tools/perf/util/hist.h    |   3 +
 tools/perf/util/session.c | 142 ++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/session.h |   2 +
 tools/perf/util/tool.h    |  12 ++++
 5 files changed, 221 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index cbcfda5f1eac..a6bbe730c4af 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -945,7 +945,7 @@ void hist_entry__delete(struct hist_entry *he)
  * collapse the histogram
  */
 
-static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
+static bool hists__collapse_insert_entry(struct hists *hists,
 					 struct rb_root *root,
 					 struct hist_entry *he)
 {
@@ -982,6 +982,13 @@ static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
 	}
 	hists->nr_entries++;
 
+	/*
+	 * For multi-threaded report, he->hists points to a dummy
+	 * hists in the struct perf_tool_mt.  Please see
+	 * perf_session__process_events_mt().
+	 */
+	he->hists = hists;
+
 	rb_link_node(&he->rb_node_in, parent, p);
 	rb_insert_color(&he->rb_node_in, root);
 	return true;
@@ -1009,19 +1016,12 @@ static void hists__apply_filters(struct hists *hists, struct hist_entry *he)
 	hists__filter_entry_by_symbol(hists, he);
 }
 
-void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+static void __hists__collapse_resort(struct hists *hists, struct rb_root *root,
+				     struct ui_progress *prog)
 {
-	struct rb_root *root;
 	struct rb_node *next;
 	struct hist_entry *n;
 
-	if (!sort__need_collapse)
-		return;
-
-	hists->nr_entries = 0;
-
-	root = hists__get_rotate_entries_in(hists);
-
 	next = rb_first(root);
 
 	while (next) {
@@ -1044,6 +1044,27 @@ void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
 	}
 }
 
+void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+{
+	struct rb_root *root;
+
+	if (!sort__need_collapse)
+		return;
+
+	hists->nr_entries = 0;
+
+	root = hists__get_rotate_entries_in(hists);
+	__hists__collapse_resort(hists, root, prog);
+}
+
+void hists__mt_resort(struct hists *dst, struct hists *src)
+{
+	struct rb_root *root = src->entries_in;
+
+	sort__need_collapse = 1;
+	__hists__collapse_resort(dst, root, NULL);
+}
+
 static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
 {
 	struct perf_hpp_fmt *fmt;
@@ -1272,6 +1293,29 @@ void events_stats__inc(struct events_stats *stats, u32 type)
 	++stats->nr_events[type];
 }
 
+void events_stats__add(struct events_stats *dst, struct events_stats *src)
+{
+	int i;
+
+#define ADD(_field)  dst->_field += src->_field
+
+	ADD(total_period);
+	ADD(total_non_filtered_period);
+	ADD(total_lost);
+	ADD(total_invalid_chains);
+	ADD(nr_non_filtered_samples);
+	ADD(nr_lost_warned);
+	ADD(nr_unknown_events);
+	ADD(nr_invalid_chains);
+	ADD(nr_unknown_id);
+	ADD(nr_unprocessable_samples);
+
+	for (i = 0; i < PERF_RECORD_HEADER_MAX; i++)
+		ADD(nr_events[i]);
+
+#undef ADD
+}
+
 void hists__inc_nr_events(struct hists *hists, u32 type)
 {
 	events_stats__inc(&hists->stats, type);
@@ -1448,16 +1492,21 @@ int perf_hist_config(const char *var, const char *value)
 	return 0;
 }
 
-static int hists_evsel__init(struct perf_evsel *evsel)
+void __hists__init(struct hists *hists)
 {
-	struct hists *hists = evsel__hists(evsel);
-
 	memset(hists, 0, sizeof(*hists));
 	hists->entries_in_array[0] = hists->entries_in_array[1] = RB_ROOT;
 	hists->entries_in = &hists->entries_in_array[0];
 	hists->entries_collapsed = RB_ROOT;
 	hists->entries = RB_ROOT;
 	pthread_mutex_init(&hists->lock, NULL);
+}
+
+static int hists_evsel__init(struct perf_evsel *evsel)
+{
+	struct hists *hists = evsel__hists(evsel);
+
+	__hists__init(hists);
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 6db118613ff5..3ef30f632948 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -123,6 +123,7 @@ int hist_entry__sort_snprintf(struct hist_entry *he, char *bf, size_t size,
 void hist_entry__delete(struct hist_entry *he);
 
 void hists__output_resort(struct hists *hists, struct ui_progress *prog);
+void hists__mt_resort(struct hists *dst, struct hists *src);
 void hists__collapse_resort(struct hists *hists, struct ui_progress *prog);
 
 void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
@@ -135,6 +136,7 @@ void hists__inc_stats(struct hists *hists, struct hist_entry *h);
 void hists__inc_nr_events(struct hists *hists, u32 type);
 void hists__inc_nr_samples(struct hists *hists, bool filtered);
 void events_stats__inc(struct events_stats *stats, u32 type);
+void events_stats__add(struct events_stats *dst, struct events_stats *src);
 size_t events_stats__fprintf(struct events_stats *stats, FILE *fp);
 
 size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
@@ -178,6 +180,7 @@ static inline struct hists *evsel__hists(struct perf_evsel *evsel)
 }
 
 int hists__init(void);
+void __hists__init(struct hists *hists);
 
 struct perf_hpp {
 	char *buf;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0090eb8c6974..566d62e58928 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1476,6 +1476,148 @@ int perf_session__process_events(struct perf_session *session,
 	return err;
 }
 
+static void *processing_thread_idx(void *arg)
+{
+	struct perf_tool_mt *mt_tool = arg;
+	struct perf_session *session = mt_tool->session;
+	int fd = perf_data_file__fd(session->file);
+	u64 offset = session->header.index[mt_tool->idx].offset;
+	u64 size = session->header.index[mt_tool->idx].size;
+	u64 file_size = perf_data_file__size(session->file);
+
+	pr_debug("processing samples using thread [%d]\n", mt_tool->idx);
+	if (__perf_session__process_events(session, &mt_tool->stats,
+					   fd, offset, size, file_size,
+					   &mt_tool->tool) < 0) {
+		pr_err("processing samples failed (thread [%d])\n", mt_tool->idx);
+		return NULL;
+	}
+
+	pr_debug("processing samples done for thread [%d]\n", mt_tool->idx);
+	return arg;
+}
+
+int perf_session__process_events_mt(struct perf_session *session,
+				    struct perf_tool *tool, void *arg)
+{
+	struct perf_data_file *file = session->file;
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel;
+	u64 nr_entries = 0;
+	struct perf_tool_mt *mt_tools = NULL;
+	struct perf_tool_mt *mt;
+	pthread_t *th_id;
+	int err, i, k;
+	int nr_index = session->header.nr_index;
+	u64 size = perf_data_file__size(file);
+
+	if (perf_data_file__is_pipe(file) || !session->header.index) {
+		pr_err("data file doesn't contain the index table\n");
+		return -EINVAL;
+	}
+
+	if (perf_session__register_idle_thread(session) == NULL)
+		return -ENOMEM;
+
+	err = __perf_session__process_events(session, &evlist->stats,
+					     perf_data_file__fd(file),
+					     session->header.index[0].offset,
+					     session->header.index[0].size,
+					     size, tool);
+	if (err)
+		return err;
+
+	th_id = calloc(nr_index, sizeof(*th_id));
+	if (th_id == NULL)
+		goto out;
+
+	mt_tools = calloc(nr_index, sizeof(*mt_tools));
+	if (mt_tools == NULL)
+		goto out;
+
+	for (i = 1; i < nr_index; i++) {
+		mt = &mt_tools[i];
+
+		memcpy(&mt->tool, tool, sizeof(*tool));
+
+		mt->hists = calloc(evlist->nr_entries, sizeof(*mt->hists));
+		if (mt->hists == NULL)
+			goto err;
+
+		for (k = 0; k < evlist->nr_entries; k++)
+			__hists__init(&mt->hists[k]);
+
+		mt->session = session;
+		mt->tool.ordered_events = false;
+		mt->idx = i;
+		mt->priv = arg;
+
+		pthread_create(&th_id[i], NULL, processing_thread_idx, mt);
+	}
+
+	for (i = 1; i < nr_index; i++) {
+		pthread_join(th_id[i], (void **)&mt);
+		if (mt == NULL) {
+			err = -EINVAL;
+			continue;
+		}
+
+		events_stats__add(&evlist->stats, &mt->stats);
+
+		evlist__for_each(evlist, evsel) {
+			struct hists *hists = evsel__hists(evsel);
+
+			events_stats__add(&hists->stats,
+					  &mt->hists[evsel->idx].stats);
+
+			nr_entries += mt->hists[evsel->idx].nr_entries;
+		}
+	}
+
+	for (i = 1; i < nr_index; i++) {
+		mt = &mt_tools[i];
+
+		evlist__for_each(evlist, evsel) {
+			struct hists *hists = evsel__hists(evsel);
+
+			if (perf_evsel__is_dummy_tracking(evsel))
+				continue;
+
+			hists__mt_resort(hists, &mt->hists[evsel->idx]);
+
+			/* Non-group events are considered as leader */
+			if (symbol_conf.event_group &&
+			    !perf_evsel__is_group_leader(evsel)) {
+				struct hists *leader_hists;
+
+				leader_hists = evsel__hists(evsel->leader);
+				hists__match(leader_hists, hists);
+				hists__link(leader_hists, hists);
+			}
+		}
+	}
+
+out:
+	perf_tool__warn_about_errors(tool, &evlist->stats);
+
+	if (mt_tools) {
+		for (i = 1; i < nr_index; i++)
+			free(mt_tools[i].hists);
+		free(mt_tools);
+	}
+
+	free(th_id);
+	return err;
+
+err:
+	while (i-- > 1) {
+		pthread_cancel(th_id[i]);
+		pthread_join(th_id[i], NULL);
+	}
+
+	goto out;
+}
+
 bool perf_session__has_traces(struct perf_session *session, const char *msg)
 {
 	struct perf_evsel *evsel;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index c9a53ecf658d..90b5dce9ea79 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -50,6 +50,8 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 
 int perf_session__process_events(struct perf_session *session,
 				 struct perf_tool *tool);
+int perf_session__process_events_mt(struct perf_session *session,
+				    struct perf_tool *tool, void *arg);
 
 int perf_session_queue_event(struct perf_session *s, union perf_event *event,
 			     struct perf_tool *tool, struct perf_sample *sample,
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index bb2708bbfaca..a04826bbe991 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -2,6 +2,7 @@
 #define __PERF_TOOL_H
 
 #include <stdbool.h>
+#include "util/event.h"
 
 struct perf_session;
 union perf_event;
@@ -10,6 +11,7 @@ struct perf_evsel;
 struct perf_sample;
 struct perf_tool;
 struct machine;
+struct hists;
 
 typedef int (*event_sample)(struct perf_tool *tool, union perf_event *event,
 			    struct perf_sample *sample,
@@ -45,4 +47,14 @@ struct perf_tool {
 	bool		ordering_requires_timestamps;
 };
 
+struct perf_tool_mt {
+	struct perf_tool	tool;
+	struct events_stats	stats;
+	struct hists		*hists;
+	struct perf_session	*session;
+	int			idx;
+
+	void			*priv;
+};
+
 #endif /* __PERF_TOOL_H */
-- 
2.2.2



* [PATCH 33/38] perf tools: Add missing_threads rb tree
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (31 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 32/38] perf report: Parallelize perf report using multi-thread Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 34/38] perf record: Synthesize COMM event for a command line workload Namhyung Kim
                   ` (4 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

Sometimes it's possible to miss certain meta events like fork/exit, and
in that case the lookup can fail to find the thread in the machine's
rbtree.  Adding a thread to that tree is dangerous now that it is
accessed from multiple threads, and protecting it would add the overhead
of grabbing a lock for every search.  So add a separate missing_threads
tree and protect it with a mutex.  It's expected to be accessed only
if a thread is not found in the normal tree.
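
A rough standalone sketch of the two-level lookup follows.  It is not
the code from this patch: the demo_* names are invented, and plain
linked lists stand in for the rb trees to keep it short.  It assumes the
normal container is populated before the workers start and is only read
afterwards, which is what makes the lock-free fast path safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct thread; perf uses rb trees, not lists. */
struct demo_thread {
	int tid;
	int missing;
	struct demo_thread *next;
};

struct demo_machine {
	struct demo_thread *threads;		/* read-only while workers run */
	struct demo_thread *missing_threads;	/* guarded by 'lock' */
	pthread_mutex_t lock;
};

static struct demo_thread *demo_list_find(struct demo_thread *head, int tid)
{
	for (; head != NULL; head = head->next)
		if (head->tid == tid)
			return head;
	return NULL;
}

void demo_machine_init(struct demo_machine *m)
{
	m->threads = NULL;
	m->missing_threads = NULL;
	pthread_mutex_init(&m->lock, NULL);
}

/* Single-threaded setup phase only: populate the normal list. */
int demo_machine_add_thread(struct demo_machine *m, int tid)
{
	struct demo_thread *th = calloc(1, sizeof(*th));

	if (th == NULL)
		return -1;
	th->tid = tid;
	th->next = m->threads;
	m->threads = th;
	return 0;
}

/* Cleanup of the allocated nodes is omitted to keep the sketch short. */
struct demo_thread *demo_machine_find_thread(struct demo_machine *m, int tid)
{
	/* Fast path: the normal list is not modified by workers, so no lock. */
	struct demo_thread *th = demo_list_find(m->threads, tid);

	assert(m != NULL);
	if (th != NULL)
		return th;

	/* Slow path: find or create in the missing list under the mutex. */
	pthread_mutex_lock(&m->lock);
	th = demo_list_find(m->missing_threads, tid);
	if (th == NULL) {
		th = calloc(1, sizeof(*th));
		if (th != NULL) {
			th->tid = tid;
			th->missing = 1;
			th->next = m->missing_threads;
			m->missing_threads = th;
		}
	}
	pthread_mutex_unlock(&m->lock);
	return th;
}
```

The lock is taken only when the normal lookup fails, so the common case
pays nothing for it.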

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/thread-lookup-time.c |   8 ++-
 tools/perf/util/build-id.c            |   9 ++-
 tools/perf/util/machine.c             | 129 +++++++++++++++++++++-------------
 tools/perf/util/machine.h             |   2 +
 tools/perf/util/session.c             |   8 +--
 tools/perf/util/thread.h              |   1 +
 6 files changed, 101 insertions(+), 56 deletions(-)

diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
index 6237ecf8caae..04cdde9329d6 100644
--- a/tools/perf/tests/thread-lookup-time.c
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -7,7 +7,9 @@
 static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
 {
 	printf("thread: %d, start time: %"PRIu64" %s\n",
-	       th->tid, th->start_time, th->dead ? "(dead)" : "");
+	       th->tid, th->start_time,
+	       th->dead ? "(dead)" : th->exited ? "(exited)" :
+	       th->missing ? "(missing)" : "");
 	return 0;
 }
 
@@ -105,6 +107,8 @@ static int lookup_with_timestamp(struct machine *machine)
 			machine__findnew_thread_time(machine, 0, 0, 70000) == t3);
 
 	machine__delete_threads(machine);
+	machine__delete_dead_threads(machine);
+	machine__delete_missing_threads(machine);
 	return 0;
 }
 
@@ -146,6 +150,8 @@ static int lookup_without_timestamp(struct machine *machine)
 			machine__findnew_thread_time(machine, 0, 0, -1ULL) == t3);
 
 	machine__delete_threads(machine);
+	machine__delete_dead_threads(machine);
+	machine__delete_missing_threads(machine);
 	return 0;
 }
 
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index ffdc338df925..5b8974400422 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -60,7 +60,14 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
 		    event->fork.ppid, event->fork.ptid);
 
 	if (thread) {
-		rb_erase(&thread->rb_node, &machine->threads);
+		if (thread->dead)
+			rb_erase(&thread->rb_node, &machine->dead_threads);
+		else if (thread->missing)
+			rb_erase(&thread->rb_node, &machine->missing_threads);
+		else
+			rb_erase(&thread->rb_node, &machine->threads);
+
+		list_del(&thread->tid_node);
 		machine->last_match = NULL;
 		thread__delete(thread);
 	}
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 63d860dca74b..ec401f82efb3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -30,6 +30,7 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
 
 	machine->threads = RB_ROOT;
 	machine->dead_threads = RB_ROOT;
+	machine->missing_threads = RB_ROOT;
 	machine->last_match = NULL;
 
 	machine->vdso_info = NULL;
@@ -90,6 +91,19 @@ static void dsos__delete(struct dsos *dsos)
 	}
 }
 
+void machine__delete_missing_threads(struct machine *machine)
+{
+	struct rb_node *nd = rb_first(&machine->missing_threads);
+
+	while (nd) {
+		struct thread *t = rb_entry(nd, struct thread, rb_node);
+
+		nd = rb_next(nd);
+		rb_erase(&t->rb_node, &machine->missing_threads);
+		thread__delete(t);
+	}
+}
+
 void machine__delete_dead_threads(struct machine *machine)
 {
 	struct rb_node *nd = rb_first(&machine->dead_threads);
@@ -435,20 +449,14 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return __machine__findnew_thread(machine, pid, tid, false);
 }
 
-static struct thread *__machine__findnew_thread_time(struct machine *machine,
-						     pid_t pid, pid_t tid,
-						     u64 timestamp, bool create)
+static struct thread *machine__find_dead_thread_time(struct machine *machine,
+						     pid_t pid __maybe_unused,
+						     pid_t tid, u64 timestamp)
 {
-	struct thread *curr, *pos, *new;
-	struct thread *th = NULL;
-	struct rb_node **p;
+	struct thread *th, *pos;
+	struct rb_node **p = &machine->dead_threads.rb_node;
 	struct rb_node *parent = NULL;
 
-	curr = __machine__findnew_thread(machine, pid, tid, false);
-	if (curr && timestamp >= curr->start_time)
-		return curr;
-
-	p = &machine->dead_threads.rb_node;
 	while (*p != NULL) {
 		parent = *p;
 		th = rb_entry(parent, struct thread, rb_node);
@@ -462,10 +470,9 @@ static struct thread *__machine__findnew_thread_time(struct machine *machine,
 				}
 			}
 
-			if (timestamp >= th->start_time) {
-				machine__update_thread_pid(machine, th, pid);
+			if (timestamp >= th->start_time)
 				return th;
-			}
+
 			break;
 		}
 
@@ -475,50 +482,67 @@ static struct thread *__machine__findnew_thread_time(struct machine *machine,
 			p = &(*p)->rb_right;
 	}
 
-	if (!create)
-		return NULL;
+	return NULL;
+}
 
-	if (!curr && !*p)
-		return __machine__findnew_thread(machine, pid, tid, true);
+static struct thread *__machine__findnew_thread_time(struct machine *machine,
+						     pid_t pid, pid_t tid,
+						     u64 timestamp, bool create)
+{
+	struct thread *th, *new = NULL;
+	struct rb_node **p = &machine->missing_threads.rb_node;
+	struct rb_node *parent = NULL;
 
-	new = thread__new(pid, tid);
-	if (new == NULL)
-		return NULL;
+	static pthread_mutex_t missing_thread_lock = PTHREAD_MUTEX_INITIALIZER;
 
-	new->dead = true;
-	new->start_time = timestamp;
+	th = __machine__findnew_thread(machine, pid, tid, false);
+	if (th && timestamp >= th->start_time)
+		return th;
 
-	if (*p) {
-		list_for_each_entry(pos, &th->tid_node, tid_node) {
-			/* sort by time */
-			if (timestamp >= pos->start_time) {
-				th = pos;
-				break;
-			}
+	th = machine__find_dead_thread_time(machine, pid, tid, timestamp);
+	if (th)
+		return th;
+
+	pthread_mutex_lock(&missing_thread_lock);
+
+	while (*p != NULL) {
+		parent = *p;
+		th = rb_entry(parent, struct thread, rb_node);
+
+		if (th->tid == tid) {
+			pthread_mutex_unlock(&missing_thread_lock);
+			return th;
 		}
-		list_add_tail(&new->tid_node, &th->tid_node);
-	} else {
-		rb_link_node(&new->rb_node, parent, p);
-		rb_insert_color(&new->rb_node, &machine->dead_threads);
+
+		if (tid < th->tid)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
 	}
 
+	if (!create)
+		goto out;
+
+	new = thread__new(pid, tid);
+	if (new == NULL)
+		goto out;
+
+	/* missing threads are not bothered with timestamp */
+	new->start_time = 0;
+	new->missing = true;
+
 	/*
-	 * We have to initialize map_groups separately
-	 * after rb tree is updated.
-	 *
-	 * The reason is that we call machine__findnew_thread
-	 * within thread__init_map_groups to find the thread
-	 * leader and that would screwed the rb tree.
+	 * missing threads have their own map groups regardless of
+	 * leader for the sake of simplicity.  it's okay since the map
+	 * groups has no map in it anyway.
 	 */
-	if (thread__init_map_groups(new, machine)) {
-		if (!list_empty(&new->tid_node))
-			list_del(&new->tid_node);
-		else
-			rb_erase(&new->rb_node, &machine->dead_threads);
+	new->mg = map_groups__new(machine);
 
-		thread__delete(new);
-		return NULL;
-	}
+	rb_link_node(&new->rb_node, parent, p);
+	rb_insert_color(&new->rb_node, &machine->missing_threads);
+
+out:
+	pthread_mutex_unlock(&missing_thread_lock);
 
 	return new;
 }
@@ -1357,6 +1381,7 @@ static void machine__remove_thread(struct machine *machine, struct thread *th)
 
 	machine->last_match = NULL;
 	rb_erase(&th->rb_node, &machine->threads);
+	RB_CLEAR_NODE(&th->rb_node);
 
 	th->dead = true;
 
@@ -1918,6 +1943,14 @@ int machine__for_each_thread(struct machine *machine,
 				return rc;
 		}
 	}
+
+	for (nd = rb_first(&machine->missing_threads); nd; nd = rb_next(nd)) {
+		thread = rb_entry(nd, struct thread, rb_node);
+		rc = fn(thread, priv);
+		if (rc != 0)
+			return rc;
+	}
+
 	return rc;
 }
 
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 38ead24f0f47..d43310d246c1 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -31,6 +31,7 @@ struct machine {
 	char		  *root_dir;
 	struct rb_root	  threads;
 	struct rb_root	  dead_threads;
+	struct rb_root	  missing_threads;
 	struct thread	  *last_match;
 	struct vdso_info  *vdso_info;
 	struct dsos	  user_dsos;
@@ -116,6 +117,7 @@ void machines__set_comm_exec(struct machines *machines, bool comm_exec);
 struct machine *machine__new_host(void);
 int machine__init(struct machine *machine, const char *root_dir, pid_t pid);
 void machine__exit(struct machine *machine);
+void machine__delete_missing_threads(struct machine *machine);
 void machine__delete_dead_threads(struct machine *machine);
 void machine__delete_threads(struct machine *machine);
 void machine__delete(struct machine *machine);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 566d62e58928..49ded46104dd 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -138,14 +138,11 @@ struct perf_session *perf_session__new(struct perf_data_file *file,
 	return NULL;
 }
 
-static void perf_session__delete_dead_threads(struct perf_session *session)
-{
-	machine__delete_dead_threads(&session->machines.host);
-}
-
 static void perf_session__delete_threads(struct perf_session *session)
 {
 	machine__delete_threads(&session->machines.host);
+	machine__delete_dead_threads(&session->machines.host);
+	machine__delete_missing_threads(&session->machines.host);
 }
 
 static void perf_session_env__delete(struct perf_session_env *env)
@@ -167,7 +164,6 @@ static void perf_session_env__delete(struct perf_session_env *env)
 void perf_session__delete(struct perf_session *session)
 {
 	perf_session__destroy_kernel_maps(session);
-	perf_session__delete_dead_threads(session);
 	perf_session__delete_threads(session);
 	perf_session_env__delete(&session->header.env);
 	machines__exit(&session->machines);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 5209ad5adadf..88fee3d8c0dc 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -23,6 +23,7 @@ struct thread {
 	bool			comm_set;
 	bool			exited; /* if set thread has exited */
 	bool			dead; /* thread is in dead_threads list */
+	bool			missing; /* thread is in missing_threads list */
 	struct list_head	comm_list;
 	int			comm_len;
 	u64			db_id;
-- 
2.2.2



* [PATCH 34/38] perf record: Synthesize COMM event for a command line workload
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (32 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 33/38] perf tools: Add missing_threads rb tree Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 35/38] perf tools: Fix progress ui to support multi thread Namhyung Kim
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML,
	Frederic Weisbecker, Adrian Hunter, Stephane Eranian, Andi Kleen,
	David Ahern

When perf creates a new child to profile, the events are enabled on
exec().  In this case perf doesn't synthesize any event for the child
since they'll be generated during exec().  But there's a window
between the enabling and the event generation.

This used to be harmless since the samples in that window are only in
the kernel (so we always have the map) and the comm is overridden by a
later COMM event.  However it doesn't work anymore: those samples now
go to a missing thread while the COMM event creates a (current)
thread.  This leaves those early samples (like native_write_msr_safe)
with a pid (like ':15328') instead of a comm.

So synthesize a COMM event for the child explicitly before enabling so
that it can have a correct comm.  Note that at this time the comm will
be "perf" since the child has not exec-ed yet.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
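Note: the claim that the child still carries the parent's comm before
exec() can be checked with a small stand-alone program like the sketch
below.  It is not perf code and the helper names (read_comm,
comm_before_exec) are hypothetical.  It assumes Linux with /proc
mounted, and it holds the child on a pipe until the parent has read
/proc/<pid>/comm, so the read always happens before the exec.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read /proc/<pid>/comm into buf without the trailing newline. */
static int read_comm(pid_t pid, char *buf, size_t len)
{
	char path[64];
	FILE *fp;

	snprintf(path, sizeof(path), "/proc/%d/comm", (int)pid);
	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;

	if (fgets(buf, (int)len, fp) == NULL) {
		fclose(fp);
		return -1;
	}
	fclose(fp);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Fork a child that blocks on a pipe and execs only after the parent
 * releases it.  The child's comm, as seen before the exec, is stored
 * in 'before'.  Returns 0 on success, -1 on failure.
 */
static int comm_before_exec(char *before, size_t len)
{
	int go[2];
	pid_t pid;
	char c = 0;
	int ret;

	if (pipe(go) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(go[0]);
		close(go[1]);
		return -1;
	}

	if (pid == 0) {
		close(go[1]);
		/* wait until the parent lets the child go */
		if (read(go[0], &c, 1) != 1)
			_exit(1);
		execl("/bin/true", "true", (char *)NULL);
		_exit(127);
	}

	close(go[0]);

	/* the child has not exec-ed yet, so this is the inherited comm */
	ret = read_comm(pid, before, len);

	/* release the child and reap it */
	if (write(go[1], &c, 1) != 1)
		ret = -1;
	close(go[1]);
	waitpid(pid, NULL, 0);

	return ret;
}
```

Comparing the result with read_comm(getpid(), ...) in the parent
should give the same string, which is why the synthesized COMM event
in this patch reports "perf" rather than the workload name.
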
 tools/perf/builtin-record.c | 18 +++++++++++++++++-
 tools/perf/util/event.c     |  2 +-
 tools/perf/util/event.h     |  5 +++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ecf8e7293015..6f141f17c4ba 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -605,8 +605,24 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	/*
 	 * Let the child rip
 	 */
-	if (forks)
+	if (forks) {
+		union perf_event *comm_event;
+
+		comm_event = malloc(sizeof(*comm_event) + machine->id_hdr_size);
+		if (comm_event == NULL)
+			goto out_child;
+
+		err = perf_event__synthesize_comm(tool, comm_event,
+						  rec->evlist->threads->map[0],
+						  process_synthesized_event,
+						  machine);
+		free(comm_event);
+
+		if (err < 0)
+			goto out_child;
+
 		perf_evlist__start_workload(rec->evlist);
+	}
 
 	if (opts->initial_delay) {
 		usleep(opts->initial_delay * 1000);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5abf7086c97c..2ad7e2805400 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -127,7 +127,7 @@ static pid_t perf_event__prepare_comm(union perf_event *event, pid_t pid,
 	return tgid;
 }
 
-static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 					 union perf_event *event, pid_t pid,
 					 perf_event__handler_t process,
 					 struct machine *machine)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1f86c279520e..6df23199fea0 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -386,6 +386,11 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
 				       struct machine *machine,
 				       bool mmap_data);
 
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+				  union perf_event *event, pid_t pid,
+				  perf_event__handler_t process,
+				  struct machine *machine);
+
 size_t perf_event__fprintf_comm(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp);
-- 
2.2.2



* [PATCH 35/38] perf tools: Fix progress ui to support multi thread
  2015-03-03  3:07 [RFC/PATCHSET 00/38] perf tools: Speed-up perf report by using multi thread (v3) Namhyung Kim
                   ` (33 preceding siblings ...)
  2015-03-03  3:07 ` [PATCH 34/38] perf record: Synthesize COMM event for a command line workload Namhyung Kim
@ 2015-03-03  3:07 ` Namhyung Kim
  2015-03-03  3:07 ` [PATCH 36/38] perf report: Add --multi-thread option and config item Namhyung Kim
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 221+ messages in thread
From: Namhyung Kim @ 2015-03-03  3:07 UTC (