All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] perf tools: Small sample parsing speedup
@ 2017-10-31  9:29 Jiri Olsa
  2017-10-31  9:29 ` [PATCH 1/7] perf tools: Reset cursor arg instead of callchain_cursor Jiri Olsa
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

hi,
sending small speedup fix for sample parsing code
and few assorted fixes.

Also available in:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/fixes

thanks,
jirka


---
Jiri Olsa (7):
      perf tools: Reset cursor arg instead of callchain_cursor
      perf tools: Centralize perf_sample initialization
      perf tools: Add perf_evlist__parse_sample_timestamp function
      perf tools: Pass timestamp arg in perf_session__queue_event
      perf tools: Optimize sample parsing for ordered events
      perf tools: Remove perf_tool from event_op2
      perf tools: Remove perf_tool from event_op3

 tools/perf/builtin-inject.c      |  32 ++++++++++++++------------------
 tools/perf/builtin-kvm.c         |   8 ++++----
 tools/perf/builtin-script.c      |  22 ++++++++++------------
 tools/perf/builtin-stat.c        |  23 +++++++++++------------
 tools/perf/util/auxtrace.c       |  17 +++++++----------
 tools/perf/util/auxtrace.h       |  15 ++++++---------
 tools/perf/util/evlist.c         |  11 +++++++++++
 tools/perf/util/evlist.h         |   4 ++++
 tools/perf/util/evsel.c          |  71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 tools/perf/util/evsel.h          |   4 ++++
 tools/perf/util/header.c         |  16 +++++++---------
 tools/perf/util/header.h         |  15 ++++++---------
 tools/perf/util/machine.c        |   2 +-
 tools/perf/util/ordered-events.c |   3 +--
 tools/perf/util/ordered-events.h |   2 +-
 tools/perf/util/session.c        | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
 tools/perf/util/session.h        |   7 +++----
 tools/perf/util/stat.c           |   5 ++---
 tools/perf/util/stat.h           |   5 ++---
 tools/perf/util/tool.h           |   7 ++-----
 20 files changed, 208 insertions(+), 181 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/7] perf tools: Reset cursor arg instead of callchain_cursor
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:29 ` [PATCH 2/7] perf tools: Centralize perf_sample initialization Jiri Olsa
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

We already pass cursor into thread__resolve_callchain
function, so there's no point in resetting the global
instance.

Link: http://lkml.kernel.org/n/tip-puk015qvuppao9m1xtdy9v7j@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 94d8f1ccedd9..3f6512b0e6b3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2200,7 +2200,7 @@ int thread__resolve_callchain(struct thread *thread,
 {
 	int ret = 0;
 
-	callchain_cursor_reset(&callchain_cursor);
+	callchain_cursor_reset(cursor);
 
 	if (callchain_param.order == ORDER_CALLEE) {
 		ret = thread__resolve_callchain_sample(thread, cursor,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/7] perf tools: Centralize perf_sample initialization
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
  2017-10-31  9:29 ` [PATCH 1/7] perf tools: Reset cursor arg instead of callchain_cursor Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:29 ` [PATCH 3/7] perf tools: Add perf_evlist__parse_sample_timestamp function Jiri Olsa
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

Move the initialization bits into common place at the
beginning of the function.

Also removing some superfluous zero initialization for
addr and transaction, because we zero all the data at
the top.

Link: http://lkml.kernel.org/n/tip-1gv5t6fvv735t1rt3mxpy1h9@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f894893c203d..8bd013896b26 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1977,6 +1977,8 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	data->stream_id = data->id = data->time = -1ULL;
 	data->period = evsel->attr.sample_period;
 	data->cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+	data->id = -1ULL;
+	data->data_src = PERF_MEM_DATA_SRC_NONE;
 
 	if (event->header.type != PERF_RECORD_SAMPLE) {
 		if (!evsel->attr.sample_id_all)
@@ -1994,7 +1996,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	if (evsel->sample_size + sizeof(event->header) > event->header.size)
 		return -EFAULT;
 
-	data->id = -1ULL;
 	if (type & PERF_SAMPLE_IDENTIFIER) {
 		data->id = *array;
 		array++;
@@ -2024,7 +2025,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
-	data->addr = 0;
 	if (type & PERF_SAMPLE_ADDR) {
 		data->addr = *array;
 		array++;
@@ -2188,14 +2188,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
-	data->data_src = PERF_MEM_DATA_SRC_NONE;
 	if (type & PERF_SAMPLE_DATA_SRC) {
 		OVERFLOW_CHECK_u64(array);
 		data->data_src = *array;
 		array++;
 	}
 
-	data->transaction = 0;
 	if (type & PERF_SAMPLE_TRANSACTION) {
 		OVERFLOW_CHECK_u64(array);
 		data->transaction = *array;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/7] perf tools: Add perf_evlist__parse_sample_timestamp function
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
  2017-10-31  9:29 ` [PATCH 1/7] perf tools: Reset cursor arg instead of callchain_cursor Jiri Olsa
  2017-10-31  9:29 ` [PATCH 2/7] perf tools: Centralize perf_sample initialization Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:29 ` [PATCH 4/7] perf tools: Pass timestamp arg in perf_session__queue_event Jiri Olsa
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

Adding perf_evlist__parse_sample_timestamp to retrieve
the timestamp of the sample.

The idea is to use this function instead of the full sample
parsing before we queue the sample. At that time only the
timestamp is needed and we parse the sample once again
later on delivery.

Link: http://lkml.kernel.org/n/tip-o7syqo8lipj4or7renpu8e8y@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evlist.c | 11 ++++++++
 tools/perf/util/evlist.h |  4 +++
 tools/perf/util/evsel.c  | 65 +++++++++++++++++++++++++++++++++++++++++++-----
 tools/perf/util/evsel.h  |  4 +++
 4 files changed, 78 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c6c891e154a6..8e31729cca54 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1582,6 +1582,17 @@ int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *even
 	return perf_evsel__parse_sample(evsel, event, sample);
 }
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp)
+{
+	struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+	if (!evsel)
+		return -EFAULT;
+	return perf_evsel__parse_sample_timestamp(evsel, event, timestamp);
+}
+
 size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 {
 	struct perf_evsel *evsel;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 8c433e95bd9a..9acafbf6f7dd 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -204,6 +204,10 @@ u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
 int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
 			      struct perf_sample *sample);
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp);
+
 bool perf_evlist__valid_sample_type(struct perf_evlist *evlist);
 bool perf_evlist__valid_sample_id_all(struct perf_evlist *evlist);
 bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8bd013896b26..36f6e505fa99 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1956,6 +1956,20 @@ static inline bool overflow(const void *endp, u16 max_size, const void *offset,
 #define OVERFLOW_CHECK_u64(offset) \
 	OVERFLOW_CHECK(offset, sizeof(u64), sizeof(u64))
 
+static int
+perf_event__check_size(union perf_event *event, unsigned int sample_size)
+{
+	/*
+	 * The evsel's sample_size is based on PERF_SAMPLE_MASK which includes
+	 * up to PERF_SAMPLE_PERIOD.  After that overflow() must be used to
+	 * check the format does not go past the end of the event.
+	 */
+	if (sample_size + sizeof(event->header) > event->header.size)
+		return -EFAULT;
+
+	return 0;
+}
+
 int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 			     struct perf_sample *data)
 {
@@ -1988,12 +2002,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 
 	array = event->sample.array;
 
-	/*
-	 * The evsel's sample_size is based on PERF_SAMPLE_MASK which includes
-	 * up to PERF_SAMPLE_PERIOD.  After that overflow() must be used to
-	 * check the format does not go past the end of the event.
-	 */
-	if (evsel->sample_size + sizeof(event->header) > event->header.size)
+	if (perf_event__check_size(event, evsel->sample_size))
 		return -EFAULT;
 
 	if (type & PERF_SAMPLE_IDENTIFIER) {
@@ -2226,6 +2235,50 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	return 0;
 }
 
+int perf_evsel__parse_sample_timestamp(struct perf_evsel *evsel,
+				       union perf_event *event,
+				       u64 *timestamp)
+{
+	u64 type = evsel->attr.sample_type;
+	const u64 *array;
+
+	if (!(type & PERF_SAMPLE_TIME))
+		return -1;
+
+	if (event->header.type != PERF_RECORD_SAMPLE) {
+		struct perf_sample data = {
+			.time = -1ULL,
+		};
+
+		if (!evsel->attr.sample_id_all)
+			return -1;
+		if (perf_evsel__parse_id_sample(evsel, event, &data))
+			return -1;
+
+		*timestamp = data.time;
+		return 0;
+	}
+
+	array = event->sample.array;
+
+	if (perf_event__check_size(event, evsel->sample_size))
+		return -EFAULT;
+
+	if (type & PERF_SAMPLE_IDENTIFIER)
+		array++;
+
+	if (type & PERF_SAMPLE_IP)
+		array++;
+
+	if (type & PERF_SAMPLE_TID)
+		array++;
+
+	if (type & PERF_SAMPLE_TIME)
+		*timestamp = *array;
+
+	return 0;
+}
+
 size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 				     u64 read_format)
 {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 64782b19089d..fb005ef62945 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -337,6 +337,10 @@ static inline int perf_evsel__read_on_cpu_scaled(struct perf_evsel *evsel,
 int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 			     struct perf_sample *sample);
 
+int perf_evsel__parse_sample_timestamp(struct perf_evsel *evsel,
+				       union perf_event *event,
+				       u64 *timestamp);
+
 static inline struct perf_evsel *perf_evsel__next(struct perf_evsel *evsel)
 {
 	return list_entry(evsel->node.next, struct perf_evsel, node);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/7] perf tools: Pass timestamp arg in perf_session__queue_event
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
                   ` (2 preceding siblings ...)
  2017-10-31  9:29 ` [PATCH 3/7] perf tools: Add perf_evlist__parse_sample_timestamp function Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:29 ` [PATCH 5/7] perf tools: Optimize sample parsing for ordered events Jiri Olsa
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

There's no need to pass whole sample data, because it's
only timestamp that is used.

Link: http://lkml.kernel.org/n/tip-xd1hpoze3kgb1rb639o3vehb@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-kvm.c         | 2 +-
 tools/perf/util/ordered-events.c | 3 +--
 tools/perf/util/ordered-events.h | 2 +-
 tools/perf/util/session.c        | 6 +++---
 tools/perf/util/session.h        | 2 +-
 5 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 0af4c092b471..293589a9adab 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -753,7 +753,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 			return -1;
 		}
 
-		err = perf_session__queue_event(kvm->session, event, &sample, 0);
+		err = perf_session__queue_event(kvm->session, event, sample.time, 0);
 		/*
 		 * FIXME: Here we can't consume the event, as perf_session__queue_event will
 		 *        point to it, and it'll get possibly overwritten by the kernel.
diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index 4de398cfb577..7eb9115d8ad4 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -156,9 +156,8 @@ void ordered_events__delete(struct ordered_events *oe, struct ordered_event *eve
 }
 
 int ordered_events__queue(struct ordered_events *oe, union perf_event *event,
-			  struct perf_sample *sample, u64 file_offset)
+			  u64 timestamp, u64 file_offset)
 {
-	u64 timestamp = sample->time;
 	struct ordered_event *oevent;
 
 	if (!timestamp || timestamp == ~0ULL)
diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
index e11468a9a6e4..c2a5fd844b25 100644
--- a/tools/perf/util/ordered-events.h
+++ b/tools/perf/util/ordered-events.h
@@ -44,7 +44,7 @@ struct ordered_events {
 };
 
 int ordered_events__queue(struct ordered_events *oe, union perf_event *event,
-			  struct perf_sample *sample, u64 file_offset);
+			  u64 timestamp, u64 file_offset);
 void ordered_events__delete(struct ordered_events *oe, struct ordered_event *event);
 int ordered_events__flush(struct ordered_events *oe, enum oe_flush how);
 void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c09b748ab599..e591006c0d56 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -872,9 +872,9 @@ static int process_finished_round(struct perf_tool *tool __maybe_unused,
 }
 
 int perf_session__queue_event(struct perf_session *s, union perf_event *event,
-			      struct perf_sample *sample, u64 file_offset)
+			      u64 timestamp, u64 file_offset)
 {
-	return ordered_events__queue(&s->ordered_events, event, sample, file_offset);
+	return ordered_events__queue(&s->ordered_events, event, timestamp, file_offset);
 }
 
 static void callchain__lbr_callstack_printf(struct perf_sample *sample)
@@ -1516,7 +1516,7 @@ static s64 perf_session__process_event(struct perf_session *session,
 		return ret;
 
 	if (tool->ordered_events) {
-		ret = perf_session__queue_event(session, event, &sample, file_offset);
+		ret = perf_session__queue_event(session, event, sample.time, file_offset);
 		if (ret != -ETIME)
 			return ret;
 	}
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index cc1c5ea53c39..c0dd09aebb4b 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -52,7 +52,7 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 int perf_session__process_events(struct perf_session *session);
 
 int perf_session__queue_event(struct perf_session *s, union perf_event *event,
-			      struct perf_sample *sample, u64 file_offset);
+			      u64 timestamp, u64 file_offset);
 
 void perf_tool__fill_defaults(struct perf_tool *tool);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/7] perf tools: Optimize sample parsing for ordered events
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
                   ` (3 preceding siblings ...)
  2017-10-31  9:29 ` [PATCH 4/7] perf tools: Pass timestamp arg in perf_session__queue_event Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:40   ` Ingo Molnar
  2017-10-31  9:29 ` [PATCH 6/7] perf tools: Remove perf_tool from event_op2 Jiri Olsa
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

Currently when using ordered events we parse the sample
twice (the perf_evlist__parse_sample function). Once
before we queue the sample for sorting:

  perf_session__process_event
    perf_evlist__parse_sample(sample)
    perf_session__queue_event(sample.time)

And then when we deliver the sorted sample:

  ordered_events__deliver_event
    perf_evlist__parse_sample
    perf_session__deliver_event

We can skip the initial full sample parsing by using
perf_evlist__parse_sample_timestamp function, which
got introduced earlier. The new path looks like:

  perf_session__process_event
    perf_evlist__parse_sample_timestamp
    perf_session__queue_event

  ordered_events__deliver_event
    perf_session__deliver_event
      perf_evlist__parse_sample

It saves some instructions and is slightly faster:

Before:
 Performance counter stats for './perf.old report --stdio' (5 runs):

    64,396,007,225      cycles:u                                                      ( +-  0.97% )
   105,882,112,735      instructions:u            #    1.64  insn per cycle           ( +-  0.00% )

      21.618103465 seconds time elapsed                                          ( +-  1.12% )

After:
 Performance counter stats for './perf report --stdio' (5 runs):

    60,567,807,182      cycles:u                                                      ( +-  0.40% )
   104,853,333,514      instructions:u            #    1.73  insn per cycle           ( +-  0.00% )

      20.168895243 seconds time elapsed                                          ( +-  0.32% )

Link: http://lkml.kernel.org/n/tip-cjp2tuk0qkjs9dxzlpmm34ua@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-kvm.c  |  8 ++++----
 tools/perf/util/session.c | 41 ++++++++++++++++++-----------------------
 2 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 293589a9adab..24733aea25cb 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -740,20 +740,20 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 				   u64 *mmap_time)
 {
 	union perf_event *event;
-	struct perf_sample sample;
+	u64 timestamp;
 	s64 n = 0;
 	int err;
 
 	*mmap_time = ULLONG_MAX;
 	while ((event = perf_evlist__mmap_read(kvm->evlist, idx)) != NULL) {
-		err = perf_evlist__parse_sample(kvm->evlist, event, &sample);
+		err = perf_evlist__parse_sample_timestamp(kvm->evlist, event, &timestamp);
 		if (err) {
 			perf_evlist__mmap_consume(kvm->evlist, idx);
 			pr_err("Failed to parse sample\n");
 			return -1;
 		}
 
-		err = perf_session__queue_event(kvm->session, event, sample.time, 0);
+		err = perf_session__queue_event(kvm->session, event, timestamp, 0);
 		/*
 		 * FIXME: Here we can't consume the event, as perf_session__queue_event will
 		 *        point to it, and it'll get possibly overwritten by the kernel.
@@ -767,7 +767,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 
 		/* save time stamp of our first sample for this mmap */
 		if (n == 0)
-			*mmap_time = sample.time;
+			*mmap_time = timestamp;
 
 		/* limit events per mmap handled all at once */
 		n++;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e591006c0d56..91e787a4406d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -26,7 +26,6 @@
 
 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
-				       struct perf_sample *sample,
 				       struct perf_tool *tool,
 				       u64 file_offset);
 
@@ -106,17 +105,10 @@ static void perf_session__set_comm_exec(struct perf_session *session)
 static int ordered_events__deliver_event(struct ordered_events *oe,
 					 struct ordered_event *event)
 {
-	struct perf_sample sample;
 	struct perf_session *session = container_of(oe, struct perf_session,
 						    ordered_events);
-	int ret = perf_evlist__parse_sample(session->evlist, event->event, &sample);
-
-	if (ret) {
-		pr_err("Can't parse sample, err = %d\n", ret);
-		return ret;
-	}
 
-	return perf_session__deliver_event(session, event->event, &sample,
+	return perf_session__deliver_event(session, event->event,
 					   session->tool, event->file_offset);
 }
 
@@ -1327,20 +1319,26 @@ static int machines__deliver_event(struct machines *machines,
 
 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
-				       struct perf_sample *sample,
 				       struct perf_tool *tool,
 				       u64 file_offset)
 {
+	struct perf_sample sample;
 	int ret;
 
-	ret = auxtrace__process_event(session, event, sample, tool);
+	ret = perf_evlist__parse_sample(session->evlist, event, &sample);
+	if (ret) {
+		pr_err("Can't parse sample, err = %d\n", ret);
+		return ret;
+	}
+
+	ret = auxtrace__process_event(session, event, &sample, tool);
 	if (ret < 0)
 		return ret;
 	if (ret > 0)
 		return 0;
 
 	return machines__deliver_event(&session->machines, session->evlist,
-				       event, sample, tool, file_offset);
+				       event, &sample, tool, file_offset);
 }
 
 static s64 perf_session__process_user_event(struct perf_session *session,
@@ -1494,7 +1492,6 @@ static s64 perf_session__process_event(struct perf_session *session,
 {
 	struct perf_evlist *evlist = session->evlist;
 	struct perf_tool *tool = session->tool;
-	struct perf_sample sample;
 	int ret;
 
 	if (session->header.needs_swap)
@@ -1508,21 +1505,19 @@ static s64 perf_session__process_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, file_offset);
 
-	/*
-	 * For all kernel events we get the sample data
-	 */
-	ret = perf_evlist__parse_sample(evlist, event, &sample);
-	if (ret)
-		return ret;
-
 	if (tool->ordered_events) {
-		ret = perf_session__queue_event(session, event, sample.time, file_offset);
+		u64 timestamp;
+
+		ret = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
+		if (ret)
+			return ret;
+
+		ret = perf_session__queue_event(session, event, timestamp, file_offset);
 		if (ret != -ETIME)
 			return ret;
 	}
 
-	return perf_session__deliver_event(session, event, &sample, tool,
-					   file_offset);
+	return perf_session__deliver_event(session, event, tool, file_offset);
 }
 
 void perf_event_header__bswap(struct perf_event_header *hdr)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/7] perf tools: Remove perf_tool from event_op2
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
                   ` (4 preceding siblings ...)
  2017-10-31  9:29 ` [PATCH 5/7] perf tools: Optimize sample parsing for ordered events Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-10-31  9:29 ` [PATCH 7/7] perf tools: Remove perf_tool from event_op3 Jiri Olsa
  2017-11-01  8:39 ` [PATCH 0/7] perf tools: Small sample parsing speedup Namhyung Kim
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

Now when we keep perf_tool pointer inside perf_session,
there's no need to have perf_tool argument in the
event_op2 callback. Removing it.

Link: http://lkml.kernel.org/n/tip-dc99gim2w2919gdnzrl3tegh@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-inject.c | 26 ++++++++----------
 tools/perf/builtin-script.c | 22 +++++++--------
 tools/perf/builtin-stat.c   | 23 ++++++++--------
 tools/perf/util/auxtrace.c  | 10 +++----
 tools/perf/util/auxtrace.h  | 10 +++----
 tools/perf/util/header.c    | 16 +++++------
 tools/perf/util/header.h    | 15 ++++------
 tools/perf/util/session.c   | 67 +++++++++++++++++++--------------------------
 tools/perf/util/session.h   |  5 ++--
 tools/perf/util/stat.c      |  5 ++--
 tools/perf/util/stat.h      |  5 ++--
 tools/perf/util/tool.h      |  3 +-
 12 files changed, 88 insertions(+), 119 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 91e65093d3c2..6f73e716d1c0 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -85,12 +85,10 @@ static int perf_event__drop_oe(struct perf_tool *tool __maybe_unused,
 }
 #endif
 
-static int perf_event__repipe_op2_synth(struct perf_tool *tool,
-					union perf_event *event,
-					struct perf_session *session
-					__maybe_unused)
+static int perf_event__repipe_op2_synth(struct perf_session *session,
+					union perf_event *event)
 {
-	return perf_event__repipe_synth(tool, event);
+	return perf_event__repipe_synth(session->tool, event);
 }
 
 static int perf_event__repipe_attr(struct perf_tool *tool,
@@ -361,26 +359,24 @@ static int perf_event__repipe_exit(struct perf_tool *tool,
 	return err;
 }
 
-static int perf_event__repipe_tracing_data(struct perf_tool *tool,
-					   union perf_event *event,
-					   struct perf_session *session)
+static int perf_event__repipe_tracing_data(struct perf_session *session,
+					   union perf_event *event)
 {
 	int err;
 
-	perf_event__repipe_synth(tool, event);
-	err = perf_event__process_tracing_data(tool, event, session);
+	perf_event__repipe_synth(session->tool, event);
+	err = perf_event__process_tracing_data(session, event);
 
 	return err;
 }
 
-static int perf_event__repipe_id_index(struct perf_tool *tool,
-				       union perf_event *event,
-				       struct perf_session *session)
+static int perf_event__repipe_id_index(struct perf_session *session,
+				       union perf_event *event)
 {
 	int err;
 
-	perf_event__repipe_synth(tool, event);
-	err = perf_event__process_id_index(tool, event, session);
+	perf_event__repipe_synth(session->tool, event);
+	err = perf_event__process_id_index(session, event);
 
 	return err;
 }
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 89975e30c0ba..48080f2f2470 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2756,9 +2756,8 @@ static void script__setup_sample_type(struct perf_script *script)
 	}
 }
 
-static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
-				    union perf_event *event,
-				    struct perf_session *session)
+static int process_stat_round_event(struct perf_session *session,
+				    union perf_event *event)
 {
 	struct stat_round_event *round = &event->stat_round;
 	struct perf_evsel *counter;
@@ -2772,9 +2771,8 @@ static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
 	return 0;
 }
 
-static int process_stat_config_event(struct perf_tool *tool __maybe_unused,
-				     union perf_event *event,
-				     struct perf_session *session __maybe_unused)
+static int process_stat_config_event(struct perf_session *session __maybe_unused,
+				     union perf_event *event)
 {
 	perf_event__read_stat_config(&stat_config, &event->stat_config);
 	return 0;
@@ -2800,10 +2798,10 @@ static int set_maps(struct perf_script *script)
 }
 
 static
-int process_thread_map_event(struct perf_tool *tool,
-			     union perf_event *event,
-			     struct perf_session *session __maybe_unused)
+int process_thread_map_event(struct perf_session *session,
+			     union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_script *script = container_of(tool, struct perf_script, tool);
 
 	if (script->threads) {
@@ -2819,10 +2817,10 @@ int process_thread_map_event(struct perf_tool *tool,
 }
 
 static
-int process_cpu_map_event(struct perf_tool *tool __maybe_unused,
-			  union perf_event *event,
-			  struct perf_session *session __maybe_unused)
+int process_cpu_map_event(struct perf_session *session,
+			  union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_script *script = container_of(tool, struct perf_script, tool);
 
 	if (script->cpus) {
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 59af5a8419e2..718a6a5b4ee0 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2432,9 +2432,8 @@ static int __cmd_record(int argc, const char **argv)
 	return argc;
 }
 
-static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
-				    union perf_event *event,
-				    struct perf_session *session)
+static int process_stat_round_event(struct perf_session *session,
+				    union perf_event *event)
 {
 	struct stat_round_event *stat_round = &event->stat_round;
 	struct perf_evsel *counter;
@@ -2459,10 +2458,10 @@ static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
 }
 
 static
-int process_stat_config_event(struct perf_tool *tool,
-			      union perf_event *event,
-			      struct perf_session *session __maybe_unused)
+int process_stat_config_event(struct perf_session *session,
+			      union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_stat *st = container_of(tool, struct perf_stat, tool);
 
 	perf_event__read_stat_config(&stat_config, &event->stat_config);
@@ -2502,10 +2501,10 @@ static int set_maps(struct perf_stat *st)
 }
 
 static
-int process_thread_map_event(struct perf_tool *tool,
-			     union perf_event *event,
-			     struct perf_session *session __maybe_unused)
+int process_thread_map_event(struct perf_session *session,
+			     union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_stat *st = container_of(tool, struct perf_stat, tool);
 
 	if (st->threads) {
@@ -2521,10 +2520,10 @@ int process_thread_map_event(struct perf_tool *tool,
 }
 
 static
-int process_cpu_map_event(struct perf_tool *tool,
-			  union perf_event *event,
-			  struct perf_session *session __maybe_unused)
+int process_cpu_map_event(struct perf_session *session,
+			  union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_stat *st = container_of(tool, struct perf_stat, tool);
 	struct cpu_map *cpus;
 
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a33491416400..cdc4a3187b5f 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -899,9 +899,8 @@ static bool auxtrace__dont_decode(struct perf_session *session)
 	       session->itrace_synth_opts->dont_decode;
 }
 
-int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
-				      union perf_event *event,
-				      struct perf_session *session)
+int perf_event__process_auxtrace_info(struct perf_session *session,
+				      union perf_event *event)
 {
 	enum auxtrace_type type = event->auxtrace_info.type;
 
@@ -1173,9 +1172,8 @@ void events_stats__auxtrace_error_warn(const struct events_stats *stats)
 	}
 }
 
-int perf_event__process_auxtrace_error(struct perf_tool *tool __maybe_unused,
-				       union perf_event *event,
-				       struct perf_session *session)
+int perf_event__process_auxtrace_error(struct perf_session *session,
+				       union perf_event *event)
 {
 	if (auxtrace__dont_decode(session))
 		return 0;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 33b5e6cdf38c..e2329ac31acc 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -513,15 +513,13 @@ int perf_event__synthesize_auxtrace_info(struct auxtrace_record *itr,
 					 struct perf_tool *tool,
 					 struct perf_session *session,
 					 perf_event__handler_t process);
-int perf_event__process_auxtrace_info(struct perf_tool *tool,
-				      union perf_event *event,
-				      struct perf_session *session);
+int perf_event__process_auxtrace_info(struct perf_session *session,
+				      union perf_event *event);
 s64 perf_event__process_auxtrace(struct perf_tool *tool,
 				 union perf_event *event,
 				 struct perf_session *session);
-int perf_event__process_auxtrace_error(struct perf_tool *tool,
-				       union perf_event *event,
-				       struct perf_session *session);
+int perf_event__process_auxtrace_error(struct perf_session *session,
+				       union perf_event *event);
 int itrace_parse_synth_opts(const struct option *opt, const char *str,
 			    int unset);
 void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6e59dcca9df2..8dbcd76c3269 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3050,10 +3050,10 @@ int perf_event__synthesize_features(struct perf_tool *tool,
 	return 0;
 }
 
-int perf_event__process_feature(struct perf_tool *tool,
-				union perf_event *event,
-				struct perf_session *session __maybe_unused)
+int perf_event__process_feature(struct perf_session *session,
+				union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct feat_fd ff = { .fd = 0 };
 	struct feature_event *fe = (struct feature_event *)event;
 	int type = fe->header.type;
@@ -3392,9 +3392,8 @@ int perf_event__synthesize_tracing_data(struct perf_tool *tool, int fd,
 	return aligned_size;
 }
 
-int perf_event__process_tracing_data(struct perf_tool *tool __maybe_unused,
-				     union perf_event *event,
-				     struct perf_session *session)
+int perf_event__process_tracing_data(struct perf_session *session,
+				     union perf_event *event)
 {
 	ssize_t size_read, padding, size = event->tracing_data.size;
 	int fd = perf_data__fd(session->data);
@@ -3460,9 +3459,8 @@ int perf_event__synthesize_build_id(struct perf_tool *tool,
 	return err;
 }
 
-int perf_event__process_build_id(struct perf_tool *tool __maybe_unused,
-				 union perf_event *event,
-				 struct perf_session *session)
+int perf_event__process_build_id(struct perf_session *session,
+				 union perf_event *event)
 {
 	__event_process_build_id(&event->build_id,
 				 event->build_id.filename,
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index f7a16ee527b8..170854a43ad5 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -106,9 +106,8 @@ int perf_event__synthesize_features(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    perf_event__handler_t process);
 
-int perf_event__process_feature(struct perf_tool *tool,
-				union perf_event *event,
-				struct perf_session *session);
+int perf_event__process_feature(struct perf_session *session,
+				union perf_event *event);
 
 int perf_event__synthesize_attr(struct perf_tool *tool,
 				struct perf_event_attr *attr, u32 ids, u64 *id,
@@ -138,17 +137,15 @@ size_t perf_event__fprintf_event_update(union perf_event *event, FILE *fp);
 int perf_event__synthesize_tracing_data(struct perf_tool *tool,
 					int fd, struct perf_evlist *evlist,
 					perf_event__handler_t process);
-int perf_event__process_tracing_data(struct perf_tool *tool,
-				     union perf_event *event,
-				     struct perf_session *session);
+int perf_event__process_tracing_data(struct perf_session *session,
+				     union perf_event *event);
 
 int perf_event__synthesize_build_id(struct perf_tool *tool,
 				    struct dso *pos, u16 misc,
 				    perf_event__handler_t process,
 				    struct machine *machine);
-int perf_event__process_build_id(struct perf_tool *tool,
-				 union perf_event *event,
-				 struct perf_session *session);
+int perf_event__process_build_id(struct perf_session *session,
+				 union perf_event *event);
 bool is_perf_magic(u64 magic);
 
 #define NAME_ALIGN 64
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 91e787a4406d..e565a8332e11 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -198,12 +198,10 @@ void perf_session__delete(struct perf_session *session)
 	free(session);
 }
 
-static int process_event_synth_tracing_data_stub(struct perf_tool *tool
+static int process_event_synth_tracing_data_stub(struct perf_session *session
 						 __maybe_unused,
 						 union perf_event *event
-						 __maybe_unused,
-						 struct perf_session *session
-						__maybe_unused)
+						 __maybe_unused)
 {
 	dump_printf(": unhandled!\n");
 	return 0;
@@ -287,9 +285,8 @@ static s64 process_event_auxtrace_stub(struct perf_tool *tool __maybe_unused,
 	return event->auxtrace.size;
 }
 
-static int process_event_op2_stub(struct perf_tool *tool __maybe_unused,
-				  union perf_event *event __maybe_unused,
-				  struct perf_session *session __maybe_unused)
+static int process_event_op2_stub(struct perf_session *session __maybe_unused,
+				  union perf_event *event __maybe_unused)
 {
 	dump_printf(": unhandled!\n");
 	return 0;
@@ -297,9 +294,8 @@ static int process_event_op2_stub(struct perf_tool *tool __maybe_unused,
 
 
 static
-int process_event_thread_map_stub(struct perf_tool *tool __maybe_unused,
-				  union perf_event *event __maybe_unused,
-				  struct perf_session *session __maybe_unused)
+int process_event_thread_map_stub(struct perf_session *session __maybe_unused,
+				  union perf_event *event __maybe_unused)
 {
 	if (dump_trace)
 		perf_event__fprintf_thread_map(event, stdout);
@@ -309,9 +305,8 @@ int process_event_thread_map_stub(struct perf_tool *tool __maybe_unused,
 }
 
 static
-int process_event_cpu_map_stub(struct perf_tool *tool __maybe_unused,
-			       union perf_event *event __maybe_unused,
-			       struct perf_session *session __maybe_unused)
+int process_event_cpu_map_stub(struct perf_session *session __maybe_unused,
+			       union perf_event *event __maybe_unused)
 {
 	if (dump_trace)
 		perf_event__fprintf_cpu_map(event, stdout);
@@ -321,9 +316,8 @@ int process_event_cpu_map_stub(struct perf_tool *tool __maybe_unused,
 }
 
 static
-int process_event_stat_config_stub(struct perf_tool *tool __maybe_unused,
-				   union perf_event *event __maybe_unused,
-				   struct perf_session *session __maybe_unused)
+int process_event_stat_config_stub(struct perf_session *session __maybe_unused,
+				   union perf_event *event __maybe_unused)
 {
 	if (dump_trace)
 		perf_event__fprintf_stat_config(event, stdout);
@@ -332,10 +326,8 @@ int process_event_stat_config_stub(struct perf_tool *tool __maybe_unused,
 	return 0;
 }
 
-static int process_stat_stub(struct perf_tool *tool __maybe_unused,
-			     union perf_event *event __maybe_unused,
-			     struct perf_session *perf_session
-			     __maybe_unused)
+static int process_stat_stub(struct perf_session *perf_session __maybe_unused,
+			     union perf_event *event)
 {
 	if (dump_trace)
 		perf_event__fprintf_stat(event, stdout);
@@ -344,10 +336,8 @@ static int process_stat_stub(struct perf_tool *tool __maybe_unused,
 	return 0;
 }
 
-static int process_stat_round_stub(struct perf_tool *tool __maybe_unused,
-				   union perf_event *event __maybe_unused,
-				   struct perf_session *perf_session
-				   __maybe_unused)
+static int process_stat_round_stub(struct perf_session *perf_session __maybe_unused,
+				   union perf_event *event)
 {
 	if (dump_trace)
 		perf_event__fprintf_stat_round(event, stdout);
@@ -1372,37 +1362,37 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_HEADER_TRACING_DATA:
 		/* setup for reading amidst mmap */
 		lseek(fd, file_offset, SEEK_SET);
-		return tool->tracing_data(tool, event, session);
+		return tool->tracing_data(session, event);
 	case PERF_RECORD_HEADER_BUILD_ID:
-		return tool->build_id(tool, event, session);
+		return tool->build_id(session, event);
 	case PERF_RECORD_FINISHED_ROUND:
 		return tool->finished_round(tool, event, oe);
 	case PERF_RECORD_ID_INDEX:
-		return tool->id_index(tool, event, session);
+		return tool->id_index(session, event);
 	case PERF_RECORD_AUXTRACE_INFO:
-		return tool->auxtrace_info(tool, event, session);
+		return tool->auxtrace_info(session, event);
 	case PERF_RECORD_AUXTRACE:
 		/* setup for reading amidst mmap */
 		lseek(fd, file_offset + event->header.size, SEEK_SET);
 		return tool->auxtrace(tool, event, session);
 	case PERF_RECORD_AUXTRACE_ERROR:
 		perf_session__auxtrace_error_inc(session, event);
-		return tool->auxtrace_error(tool, event, session);
+		return tool->auxtrace_error(session, event);
 	case PERF_RECORD_THREAD_MAP:
-		return tool->thread_map(tool, event, session);
+		return tool->thread_map(session, event);
 	case PERF_RECORD_CPU_MAP:
-		return tool->cpu_map(tool, event, session);
+		return tool->cpu_map(session, event);
 	case PERF_RECORD_STAT_CONFIG:
-		return tool->stat_config(tool, event, session);
+		return tool->stat_config(session, event);
 	case PERF_RECORD_STAT:
-		return tool->stat(tool, event, session);
+		return tool->stat(session, event);
 	case PERF_RECORD_STAT_ROUND:
-		return tool->stat_round(tool, event, session);
+		return tool->stat_round(session, event);
 	case PERF_RECORD_TIME_CONV:
 		session->time_conv = event->time_conv;
-		return tool->time_conv(tool, event, session);
+		return tool->time_conv(session, event);
 	case PERF_RECORD_HEADER_FEATURE:
-		return tool->feature(tool, event, session);
+		return tool->feature(session, event);
 	default:
 		return -EINVAL;
 	}
@@ -2134,9 +2124,8 @@ int __perf_session__set_tracepoints_handlers(struct perf_session *session,
 	return err;
 }
 
-int perf_event__process_id_index(struct perf_tool *tool __maybe_unused,
-				 union perf_event *event,
-				 struct perf_session *session)
+int perf_event__process_id_index(struct perf_session *session,
+				 union perf_event *event)
 {
 	struct perf_evlist *evlist = session->evlist;
 	struct id_index_event *ie = &event->id_index;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index c0dd09aebb4b..c31dcb7e6fb2 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -119,9 +119,8 @@ int perf_session__deliver_synth_event(struct perf_session *session,
 				      union perf_event *event,
 				      struct perf_sample *sample);
 
-int perf_event__process_id_index(struct perf_tool *tool,
-				 union perf_event *event,
-				 struct perf_session *session);
+int perf_event__process_id_index(struct perf_session *session,
+				 union perf_event *event);
 
 int perf_event__synthesize_id_index(struct perf_tool *tool,
 				    perf_event__handler_t process,
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 203f5d8d11d1..53567ddfe958 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -366,9 +366,8 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	return 0;
 }
 
-int perf_event__process_stat_event(struct perf_tool *tool __maybe_unused,
-				   union perf_event *event,
-				   struct perf_session *session)
+int perf_event__process_stat_event(struct perf_session *session,
+				   union perf_event *event)
 {
 	struct perf_counts_values count;
 	struct stat_event *st = &event->stat;
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 490b78aa7230..a8a2b01ac124 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -107,9 +107,8 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 struct perf_tool;
 union perf_event;
 struct perf_session;
-int perf_event__process_stat_event(struct perf_tool *tool,
-				   union perf_event *event,
-				   struct perf_session *session);
+int perf_event__process_stat_event(struct perf_session *session,
+				   union perf_event *event);
 
 size_t perf_event__fprintf_stat(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_stat_round(union perf_event *event, FILE *fp);
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index d549e50db397..2eddeab74a6b 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -25,8 +25,7 @@ typedef int (*event_attr_op)(struct perf_tool *tool,
 			     union perf_event *event,
 			     struct perf_evlist **pevlist);
 
-typedef int (*event_op2)(struct perf_tool *tool, union perf_event *event,
-			 struct perf_session *session);
+typedef int (*event_op2)(struct perf_session *session, union perf_event *event);
 
 typedef int (*event_oe)(struct perf_tool *tool, union perf_event *event,
 			struct ordered_events *oe);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 7/7] perf tools: Remove perf_tool from event_op3
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
                   ` (5 preceding siblings ...)
  2017-10-31  9:29 ` [PATCH 6/7] perf tools: Remove perf_tool from event_op2 Jiri Olsa
@ 2017-10-31  9:29 ` Jiri Olsa
  2017-11-01  8:39 ` [PATCH 0/7] perf tools: Small sample parsing speedup Namhyung Kim
  7 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-10-31  9:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, David Ahern, Peter Zijlstra

Now when we keep perf_tool pointer inside perf_session,
there's no need to have perf_tool argument in the
event_op3 callback. Removing it.

Link: http://lkml.kernel.org/n/tip-78u9m0jbre3bn16l6guqfyrf@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-inject.c | 6 +++---
 tools/perf/util/auxtrace.c  | 7 +++----
 tools/perf/util/auxtrace.h  | 5 ++---
 tools/perf/util/session.c   | 8 +++-----
 tools/perf/util/tool.h      | 4 +---
 5 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 6f73e716d1c0..ef0b375b4091 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -130,10 +130,10 @@ static int copy_bytes(struct perf_inject *inject, int fd, off_t size)
 	return 0;
 }
 
-static s64 perf_event__repipe_auxtrace(struct perf_tool *tool,
-				       union perf_event *event,
-				       struct perf_session *session)
+static s64 perf_event__repipe_auxtrace(struct perf_session *session,
+				       union perf_event *event)
 {
+	struct perf_tool *tool = session->tool;
 	struct perf_inject *inject = container_of(tool, struct perf_inject,
 						  tool);
 	int ret;
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index cdc4a3187b5f..775a6f960dd8 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -919,9 +919,8 @@ int perf_event__process_auxtrace_info(struct perf_session *session,
 	}
 }
 
-s64 perf_event__process_auxtrace(struct perf_tool *tool,
-				 union perf_event *event,
-				 struct perf_session *session)
+s64 perf_event__process_auxtrace(struct perf_session *session,
+				 union perf_event *event)
 {
 	s64 err;
 
@@ -937,7 +936,7 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
 	if (!session->auxtrace || event->header.type != PERF_RECORD_AUXTRACE)
 		return -EINVAL;
 
-	err = session->auxtrace->process_auxtrace_event(session, event, tool);
+	err = session->auxtrace->process_auxtrace_event(session, event, session->tool);
 	if (err < 0)
 		return err;
 
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index e2329ac31acc..27842bf4dc1d 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -515,9 +515,8 @@ int perf_event__synthesize_auxtrace_info(struct auxtrace_record *itr,
 					 perf_event__handler_t process);
 int perf_event__process_auxtrace_info(struct perf_session *session,
 				      union perf_event *event);
-s64 perf_event__process_auxtrace(struct perf_tool *tool,
-				 union perf_event *event,
-				 struct perf_session *session);
+s64 perf_event__process_auxtrace(struct perf_session *session,
+				 union perf_event *event);
 int perf_event__process_auxtrace_error(struct perf_session *session,
 				       union perf_event *event);
 int itrace_parse_synth_opts(const struct option *opt, const char *str,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index e565a8332e11..2f3c4542570b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -274,10 +274,8 @@ static int skipn(int fd, off_t n)
 	return 0;
 }
 
-static s64 process_event_auxtrace_stub(struct perf_tool *tool __maybe_unused,
-				       union perf_event *event,
-				       struct perf_session *session
-				       __maybe_unused)
+static s64 process_event_auxtrace_stub(struct perf_session *session __maybe_unused,
+				       union perf_event *event)
 {
 	dump_printf(": unhandled!\n");
 	if (perf_data__is_pipe(session->data))
@@ -1374,7 +1372,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_AUXTRACE:
 		/* setup for reading amidst mmap */
 		lseek(fd, file_offset + event->header.size, SEEK_SET);
-		return tool->auxtrace(tool, event, session);
+		return tool->auxtrace(session, event);
 	case PERF_RECORD_AUXTRACE_ERROR:
 		perf_session__auxtrace_error_inc(session, event);
 		return tool->auxtrace_error(session, event);
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 2eddeab74a6b..b7d4fef9f683 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -26,13 +26,11 @@ typedef int (*event_attr_op)(struct perf_tool *tool,
 			     struct perf_evlist **pevlist);
 
 typedef int (*event_op2)(struct perf_session *session, union perf_event *event);
+typedef s64 (*event_op3)(struct perf_session *session, union perf_event *event);
 
 typedef int (*event_oe)(struct perf_tool *tool, union perf_event *event,
 			struct ordered_events *oe);
 
-typedef s64 (*event_op3)(struct perf_tool *tool, union perf_event *event,
-			 struct perf_session *session);
-
 enum show_feature_header {
 	SHOW_FEAT_NO_HEADER = 0,
 	SHOW_FEAT_HEADER,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 5/7] perf tools: Optimize sample parsing for ordered events
  2017-10-31  9:29 ` [PATCH 5/7] perf tools: Optimize sample parsing for ordered events Jiri Olsa
@ 2017-10-31  9:40   ` Ingo Molnar
  0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2017-10-31  9:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, lkml, Namhyung Kim, David Ahern,
	Peter Zijlstra


* Jiri Olsa <jolsa@kernel.org> wrote:

> Currently when using ordered events we parse the sample
> twice (the perf_evlist__parse_sample function). Once
> before we queue the sample for sorting:
> 
>   perf_session__process_event
>     perf_evlist__parse_sample(sample)
>     perf_session__queue_event(sample.time)
> 
> And then when we deliver the sorted sample:
> 
>   ordered_events__deliver_event
>     perf_evlist__parse_sample
>     perf_session__deliver_event
> 
> We can skip the initial full sample parsing by using
> perf_evlist__parse_sample_timestamp function, which
> got introduced earlier. The new path looks like:
> 
>   perf_session__process_event
>     perf_evlist__parse_sample_timestamp
>     perf_session__queue_event
> 
>   ordered_events__deliver_event
>     perf_session__deliver_event
>       perf_evlist__parse_sample
> 
> It saves some instructions and is slightly faster:
> 
> Before:
>  Performance counter stats for './perf.old report --stdio' (5 runs):
> 
>     64,396,007,225      cycles:u                                                      ( +-  0.97% )
>    105,882,112,735      instructions:u            #    1.64  insn per cycle           ( +-  0.00% )
> 
>       21.618103465 seconds time elapsed                                          ( +-  1.12% )
> 
> After:
>  Performance counter stats for './perf report --stdio' (5 runs):
> 
>     60,567,807,182      cycles:u                                                      ( +-  0.40% )
>    104,853,333,514      instructions:u            #    1.73  insn per cycle           ( +-  0.00% )
> 
>       20.168895243 seconds time elapsed                                          ( +-  0.32% )

That's a 7% speedup, not bad!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/7] perf tools: Small sample parsing speedup
  2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
                   ` (6 preceding siblings ...)
  2017-10-31  9:29 ` [PATCH 7/7] perf tools: Remove perf_tool from event_op3 Jiri Olsa
@ 2017-11-01  8:39 ` Namhyung Kim
  2017-11-01 11:50   ` Jiri Olsa
  7 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2017-11-01  8:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, lkml, Ingo Molnar, David Ahern,
	Peter Zijlstra, kernel-team

Hi Jiri,

On Tue, Oct 31, 2017 at 10:29:40AM +0100, Jiri Olsa wrote:
> hi,
> sending small speedup fix for sample parsing code
> and few assorted fixes.

Nice work, for patch 1-5:

  Acked-by: Namhyung Kim <namhyung@kernel.org>

But I'm not sure for patch 6 and 7 as tool and event arguments are
passed to others consistently.

Thanks,
Namhyung


> 
> Also available in:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/fixes
> 
> thanks,
> jirka
> 
> 
> ---
> Jiri Olsa (7):
>       perf tools: Reset cursor arg instead of callchain_cursor
>       perf tools: Centralize perf_sample initialization
>       perf tools: Add perf_evlist__parse_sample_timestamp function
>       perf tools: Pass timestamp arg in perf_session__queue_event
>       perf tools: Optimize sample parsing for ordered events
>       perf tools: Remove perf_tool from event_op2
>       perf tools: Remove perf_tool from event_op3
> 
>  tools/perf/builtin-inject.c      |  32 ++++++++++++++------------------
>  tools/perf/builtin-kvm.c         |   8 ++++----
>  tools/perf/builtin-script.c      |  22 ++++++++++------------
>  tools/perf/builtin-stat.c        |  23 +++++++++++------------
>  tools/perf/util/auxtrace.c       |  17 +++++++----------
>  tools/perf/util/auxtrace.h       |  15 ++++++---------
>  tools/perf/util/evlist.c         |  11 +++++++++++
>  tools/perf/util/evlist.h         |   4 ++++
>  tools/perf/util/evsel.c          |  71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
>  tools/perf/util/evsel.h          |   4 ++++
>  tools/perf/util/header.c         |  16 +++++++---------
>  tools/perf/util/header.h         |  15 ++++++---------
>  tools/perf/util/machine.c        |   2 +-
>  tools/perf/util/ordered-events.c |   3 +--
>  tools/perf/util/ordered-events.h |   2 +-
>  tools/perf/util/session.c        | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
>  tools/perf/util/session.h        |   7 +++----
>  tools/perf/util/stat.c           |   5 ++---
>  tools/perf/util/stat.h           |   5 ++---
>  tools/perf/util/tool.h           |   7 ++-----
>  20 files changed, 208 insertions(+), 181 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/7] perf tools: Small sample parsing speedup
  2017-11-01  8:39 ` [PATCH 0/7] perf tools: Small sample parsing speedup Namhyung Kim
@ 2017-11-01 11:50   ` Jiri Olsa
  0 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2017-11-01 11:50 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	David Ahern, Peter Zijlstra, kernel-team

On Wed, Nov 01, 2017 at 05:39:55PM +0900, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Tue, Oct 31, 2017 at 10:29:40AM +0100, Jiri Olsa wrote:
> > hi,
> > sending small speedup fix for sample parsing code
> > and few assorted fixes.
> 
> Nice work, for patch 1-5:
> 
>   Acked-by: Namhyung Kim <namhyung@kernel.org>
> 
> But I'm not sure for patch 6 and 7 as tool and event arguments are
> passed to others consistently.

I have it as a base for my other patches in queue and they
seemed standalone enough ;-) ... I removed it because I'm
adding some more and it got a little crowded

jirka

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-11-01 11:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-31  9:29 [PATCH 0/7] perf tools: Small sample parsing speedup Jiri Olsa
2017-10-31  9:29 ` [PATCH 1/7] perf tools: Reset cursor arg instead of callchain_cursor Jiri Olsa
2017-10-31  9:29 ` [PATCH 2/7] perf tools: Centralize perf_sample initialization Jiri Olsa
2017-10-31  9:29 ` [PATCH 3/7] perf tools: Add perf_evlist__parse_sample_timestamp function Jiri Olsa
2017-10-31  9:29 ` [PATCH 4/7] perf tools: Pass timestamp arg in perf_session__queue_event Jiri Olsa
2017-10-31  9:29 ` [PATCH 5/7] perf tools: Optimize sample parsing for ordered events Jiri Olsa
2017-10-31  9:40   ` Ingo Molnar
2017-10-31  9:29 ` [PATCH 6/7] perf tools: Remove perf_tool from event_op2 Jiri Olsa
2017-10-31  9:29 ` [PATCH 7/7] perf tools: Remove perf_tool from event_op3 Jiri Olsa
2017-11-01  8:39 ` [PATCH 0/7] perf tools: Small sample parsing speedup Namhyung Kim
2017-11-01 11:50   ` Jiri Olsa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.