linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFCv2 0/9] perf: Allow leader sampling on inherited events
@ 2014-08-25 14:45 Jiri Olsa
  2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
                   ` (9 more replies)
  0 siblings, 10 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jen-Cheng(Tommy) Huang,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Jiri Olsa

hi,
Jen-Cheng(Tommy) Huang reported the leader sampling not working
on children processes:
  http://www.mail-archive.com/linux-perf-users@vger.kernel.org/msg01644.html

The leader sampling (example below) lets the group leader event (cycles)
do the sampling and reads the rest of the group (cache-misses) via
PERF_FORMAT_GROUP format.

Example:
  $ perf record -e '{cycles,cache-misses}:S' <workload>
  $ perf report --group

  The perf report --group allows to see all events group
  data in single view.

The reason for leader sampling being switched off for inherited
events, is that the kernel does no allow PERF_FORMAT_GROUP format
on inherited events (which is used by leader sampling).

I switched on the PERF_FORMAT_GROUP format for inherited events
with few other fixies in patches:
  perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  perf: Allow PERF_FORMAT_GROUP format on inherited events

And I fixed perf tool code to be able to process data from
children processes.

Anyway, I might have missed some other reason why this was
never switched on in kernel, so sending this as RFC.

v2 changes:
   - detection of the kernel support
   - added perf script support for period column
   - separate kernel change for context_equiv change

thanks for comments,
jirka


Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
Jiri Olsa (9):
      perf: Remove redundant parent context check from context_equiv
      perf: Deny optimized switch for events read by PERF_SAMPLE_READ
      perf: Allow PERF_FORMAT_GROUP format on inherited events
      perf tools: Add support to traverse xyarrays
      perf tools: Add pr_warning_once debug macro
      perf tools: Add hash of periods for struct perf_sample_id
      perf tools: Allow PERF_FORMAT_GROUP for inherited events
      perf script: Add period data column
      perf script: Add period as a default output column

 kernel/events/core.c                     | 33 ++++++++++++++------------------
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/Makefile.perf                 |  1 +
 tools/perf/builtin-script.c              | 21 +++++++++++++++++----
 tools/perf/perf.h                        |  1 +
 tools/perf/tests/builtin-test.c          |  4 ++++
 tools/perf/tests/tests.h                 |  1 +
 tools/perf/tests/xyarray.c               | 33 ++++++++++++++++++++++++++++++++
 tools/perf/util/debug.h                  | 12 ++++++++++++
 tools/perf/util/evsel.c                  | 19 ++++++++++++++++++-
 tools/perf/util/evsel.h                  |  5 ++++-
 tools/perf/util/record.c                 | 12 ++++++++++++
 tools/perf/util/session.c                | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/xyarray.c                |  4 +++-
 tools/perf/util/xyarray.h                |  6 ++++++
 15 files changed, 214 insertions(+), 33 deletions(-)
 create mode 100644 tools/perf/tests/xyarray.c

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-09-02 10:50   ` Peter Zijlstra
  2014-08-25 14:45 ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

As described in commit:
  1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
we dont allow optimized switch for non cloned contexts.

There's no need now to check if one of the context is parent
of the other in context_equiv function.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d8cb4d21a346..4ad4ba2bc106 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2257,14 +2257,6 @@ static int context_equiv(struct perf_event_context *ctx1,
 	if (ctx1->pin_count || ctx2->pin_count)
 		return 0;
 
-	/* If ctx1 is the parent of ctx2 */
-	if (ctx1 == ctx2->parent_ctx && ctx1->generation == ctx2->parent_gen)
-		return 1;
-
-	/* If ctx2 is the parent of ctx1 */
-	if (ctx1->parent_ctx == ctx2 && ctx1->parent_gen == ctx2->generation)
-		return 1;
-
 	/*
 	 * If ctx1 and ctx2 have the same parent; we flatten the parent
 	 * hierarchy, see perf_event_init_context().
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
  2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-09-02 10:52   ` Peter Zijlstra
  2014-08-25 14:45 ` [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

The optimized task context switch for cloned perf events just
swaps whole perf event contexts (of current and next process)
if it finds them suitable. Events from the 'current' context
will now measure data of the 'next' context and vice versa.

This is ok for cases where we are not directly interested in
the event->count value of separate child events, like:
  - standard sampling, where we take 'period' value for the
    event count
  - counting, where we accumulate all events (children)
    into a single count value

But in case we read event by using the PERF_SAMPLE_READ sample
type, we are interested in direct event->count value measured
in specific task. Switching events within tasks for this kind
of measurements corrupts data.

Fixing this by setting/unsetting pin_count for perf event context
once cloned event with PERF_SAMPLE_READ read is added/removed.
The pin_count value != 0 makes the context not suitable for
optimized switch.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4ad4ba2bc106..ff6a17607ddb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1117,6 +1117,12 @@ ctx_group_list(struct perf_event *event, struct perf_event_context *ctx)
 		return &ctx->flexible_groups;
 }
 
+static bool is_clone_with_read(struct perf_event *event)
+{
+	return event->parent &&
+	       (event->attr.sample_type & PERF_SAMPLE_READ);
+}
+
 /*
  * Add a event from the lists for its context.
  * Must be called with ctx->mutex and ctx->lock held.
@@ -1148,6 +1154,9 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack++;
 
+	if (is_clone_with_read(event))
+		ctx->pin_count++;
+
 	list_add_rcu(&event->event_entry, &ctx->event_list);
 	if (!ctx->nr_events)
 		perf_pmu_rotate_start(ctx->pmu);
@@ -1313,6 +1322,9 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack--;
 
+	if (is_clone_with_read(event))
+		ctx->pin_count--;
+
 	ctx->nr_events--;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat--;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
  2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
  2014-08-25 14:45 ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-25 14:45 ` [PATCH 4/9] perf tools: Add support to traverse xyarrays Jiri Olsa
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

I assume the reason for this being disabled is the difficulty
to read child events once in perf overflow routine, thus the
perf_output_read_group function. The read syscall function
perf_event_read_group seems to handle this nicely.

My goal is to be able to read all events in group on leader
sample by using the PERF_SAMPLE_READ with PERF_FORMAT_GROUP
format. Once the monitored process forks, I need the child
processes/events do the same and store samples into parents
ring buffer.

So I need all events sample just to report their own value
(without child events being included). Thus switching the
perf_event_count call for simple read of event->count.

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index ff6a17607ddb..9d6b15c8f728 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4608,9 +4608,6 @@ static void perf_output_read_one(struct perf_output_handle *handle,
 	__output_copy(handle, values, n * sizeof(u64));
 }
 
-/*
- * XXX PERF_FORMAT_GROUP vs inherited events seems difficult.
- */
 static void perf_output_read_group(struct perf_output_handle *handle,
 			    struct perf_event *event,
 			    u64 enabled, u64 running)
@@ -4631,7 +4628,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 	if (leader != event)
 		leader->pmu->read(leader);
 
-	values[n++] = perf_event_count(leader);
+	values[n++] = local64_read(&leader->count);
 	if (read_format & PERF_FORMAT_ID)
 		values[n++] = primary_event_id(leader);
 
@@ -4644,7 +4641,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 		    (sub->state == PERF_EVENT_STATE_ACTIVE))
 			sub->pmu->read(sub);
 
-		values[n++] = perf_event_count(sub);
+		values[n++] = local64_read(&sub->count);
 		if (read_format & PERF_FORMAT_ID)
 			values[n++] = primary_event_id(sub);
 
@@ -6953,12 +6950,6 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	local64_set(&hwc->period_left, hwc->sample_period);
 
-	/*
-	 * we currently do not support PERF_FORMAT_GROUP on inherited events
-	 */
-	if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
-		goto err_ns;
-
 	pmu = perf_init_event(event);
 	if (!pmu)
 		goto err_ns;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 4/9] perf tools: Add support to traverse xyarrays
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (2 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-25 14:45 ` [PATCH 5/9] perf tools: Add pr_warning_once debug macro Jiri Olsa
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Adding xyarray__for_each define to allow sequentially
traverse xyarrays. It will be handy in following patch.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Makefile.perf        |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/xyarray.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/xyarray.c       |  4 +++-
 tools/perf/util/xyarray.h       |  6 ++++++
 6 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/xyarray.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 95e832b1bc3c..da8fc07190cb 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -426,6 +426,7 @@ endif
 LIB_OBJS += $(OUTPUT)tests/mmap-thread-lookup.o
 LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
 LIB_OBJS += $(OUTPUT)tests/switch-tracking.o
+LIB_OBJS += $(OUTPUT)tests/xyarray.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 6a4145e5ad2c..5c90326c8d1d 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -158,6 +158,10 @@ static struct test {
 		.func = test__switch_tracking,
 	},
 	{
+		.desc = "Test xyarray",
+		.func = test__xyarray,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index be8be10e3957..8c3e7d1108f3 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -49,6 +49,7 @@ int test__thread_mg_share(void);
 int test__hists_output(void);
 int test__hists_cumulate(void);
 int test__switch_tracking(void);
+int test__xyarray(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/xyarray.c b/tools/perf/tests/xyarray.c
new file mode 100644
index 000000000000..e1a1d6a45106
--- /dev/null
+++ b/tools/perf/tests/xyarray.c
@@ -0,0 +1,33 @@
+#include "tests.h"
+#include "xyarray.h"
+#include "debug.h"
+
+struct krava {
+	int a;
+};
+
+#define X 100
+#define Y 100
+
+int test__xyarray(void)
+{
+	struct xyarray *a;
+	struct krava *k;
+	int x, y;
+
+	a = xyarray__new(X, Y, sizeof(struct krava));
+	TEST_ASSERT_VAL("failed to allocate xyarray", a);
+
+	for (x = 0; x < X; x++) {
+		for (y = 0; y < Y; y++) {
+			k = xyarray__entry(a, x, y);
+			k->a = x * X + y;
+		}
+	}
+
+	y = 0;
+	xyarray__for_each(a, k)
+		TEST_ASSERT_VAL("wrong array value", k->a == y++);
+
+	return 0;
+}
diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6c536a..077e8240fe98 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -4,11 +4,13 @@
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
 {
 	size_t row_size = ylen * entry_size;
-	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+	size_t size = xlen * row_size;
+	struct xyarray *xy = zalloc(sizeof(*xy) + size);
 
 	if (xy != NULL) {
 		xy->entry_size = entry_size;
 		xy->row_size   = row_size;
+		xy->size       = size;
 	}
 
 	return xy;
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index c488a07275dd..e4efa075fd76 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -6,6 +6,7 @@
 struct xyarray {
 	size_t row_size;
 	size_t entry_size;
+	size_t size;
 	char contents[];
 };
 
@@ -17,4 +18,9 @@ static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 	return &xy->contents[x * xy->row_size + y * xy->entry_size];
 }
 
+#define xyarray__for_each(array, entry)					\
+	for (entry = (void *) &array->contents[0];			\
+	     (void *) entry < ((void *) array->contents + array->size);	\
+	     entry++)
+
 #endif /* _PERF_XYARRAY_H_ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 5/9] perf tools: Add pr_warning_once debug macro
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (3 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 4/9] perf tools: Add support to traverse xyarrays Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-25 14:45 ` [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Adding pr_warning_once debug macro to display
pr_warning message just once. It will be used
in following patch.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/debug.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/perf/util/debug.h b/tools/perf/util/debug.h
index be264d6f3b30..be0399c2703d 100644
--- a/tools/perf/util/debug.h
+++ b/tools/perf/util/debug.h
@@ -31,6 +31,18 @@ extern int debug_ordered_events;
 #define pr_debug3(fmt, ...) pr_debugN(3, pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__)
 
+#define pr_once(fmt, pr, ...)			\
+	do {					\
+		static int __warned;		\
+						\
+		if (!__warned) {		\
+			pr(fmt, ##__VA_ARGS__);	\
+			__warned = 1;		\
+		}				\
+	} while (0)
+
+#define pr_warning_once(fmt, ...) pr_once(fmt, pr_warning, ##__VA_ARGS__)
+
 #define pr_time_N(n, var, t, fmt, ...) \
 	eprintf_time(n, var, t, fmt, ##__VA_ARGS__)
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (4 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 5/9] perf tools: Add pr_warning_once debug macro Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-25 14:45 ` [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

With PERF_FORMAT_GROUP format on inherited events being allowed
in kernel, we can now allow leader sampling on inherited events.

But before we actually switch it on, we need to change the sorting
of PERF_SAMPLE_READ sample's data. Currently PERF_SAMPLE_READ values
are sorted on event id. Now when we'll get data from all children
processes we need to add TID as another sort key.

Adding hash of TIDs into each 'struct perf_sample_id' to
hold event values for different TIDs.

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c   | 13 +++++++
 tools/perf/util/evsel.h   |  5 ++-
 tools/perf/util/session.c | 93 ++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 104 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b38de5819323..507d458ded2c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -787,8 +787,21 @@ void perf_evsel__free_fd(struct perf_evsel *evsel)
 	evsel->fd = NULL;
 }
 
+static void free_sample_id(struct perf_evsel *evsel)
+{
+	struct perf_sample_id *sid;
+
+	if (evsel->sample_id) {
+		xyarray__for_each(evsel->sample_id, sid) {
+			if (sid->periods)
+				perf_sample_hash__delete(sid->periods);
+		}
+	}
+}
+
 void perf_evsel__free_id(struct perf_evsel *evsel)
 {
+	free_sample_id(evsel);
 	xyarray__delete(evsel->sample_id);
 	evsel->sample_id = NULL;
 	zfree(&evsel->id);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7bc314be6a7b..41c000fc018b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -29,6 +29,7 @@ struct perf_counts {
 };
 
 struct perf_evsel;
+struct perf_sample_hash;
 
 /*
  * Per fd, to map back from PERF_SAMPLE_ID to evsel, only used when there are
@@ -40,9 +41,11 @@ struct perf_sample_id {
 	struct perf_evsel	*evsel;
 
 	/* Holds total ID period value for PERF_SAMPLE_READ processing. */
-	u64			period;
+	struct perf_sample_hash	*periods;
 };
 
+void perf_sample_hash__delete(struct perf_sample_hash *hash);
+
 /** struct perf_evsel - event selector
  *
  * @name - Can be set to retain the original event name passed by the user,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6d2d50dea1d8..dcd2662b3b2e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1,4 +1,6 @@
 #include <linux/kernel.h>
+#include <linux/bitops.h>
+#include <linux/hash.h>
 #include <traceevent/event-parse.h>
 
 #include <byteswap.h>
@@ -733,6 +735,82 @@ static struct machine *
 	return &session->machines.host;
 }
 
+struct perf_sample_period {
+	struct hlist_node	node;
+	u64			value;
+	pid_t			tid;
+};
+
+#define PERF_SAMPLE__HLIST_BITS 8
+#define PERF_SAMPLE__HLIST_SIZE (1 << PERF_SAMPLE__HLIST_BITS)
+
+struct perf_sample_hash {
+	struct hlist_head heads[PERF_SAMPLE__HLIST_SIZE];
+};
+
+void perf_sample_hash__delete(struct perf_sample_hash *hash)
+{
+	int h;
+
+	for (h = 0; h < PERF_SAMPLE__HLIST_SIZE; h++) {
+		struct perf_sample_period *period;
+		struct hlist_head *head;
+		struct hlist_node *n;
+
+		head = &hash->heads[h];
+		hlist_for_each_entry_safe(period, n, head, node) {
+			hlist_del(&period->node);
+			free(period);
+		}
+	}
+
+	free(hash);
+}
+
+static struct perf_sample_period*
+findnew_hash_period(struct perf_sample_hash *hash, pid_t tid)
+{
+	struct perf_sample_period *period;
+	struct hlist_head *head;
+	int hash_val;
+
+	hash_val = hash_64(tid, PERF_SAMPLE__HLIST_BITS);
+	head = &hash->heads[hash_val];
+
+	hlist_for_each_entry(period, head, node) {
+		if (period->tid == tid)
+			return period;
+	}
+
+	period = zalloc(sizeof(*period));
+	if (period) {
+		period->tid = tid;
+		hlist_add_head(&period->node, &hash->heads[hash_val]);
+	}
+
+	return period;
+}
+
+static struct perf_sample_period*
+get_sample_period(struct perf_sample_id *sid, pid_t tid)
+{
+	struct perf_sample_hash *hash = sid->periods;
+	int i;
+
+	if (hash == NULL) {
+		hash = zalloc(sizeof(*hash));
+		if (hash == NULL)
+			return NULL;
+
+		for (i = 0; i < PERF_SAMPLE__HLIST_SIZE; ++i)
+			INIT_HLIST_HEAD(&hash->heads[i]);
+
+		sid->periods = hash;
+	}
+
+	return findnew_hash_period(hash, tid);
+}
+
 static int deliver_sample_value(struct perf_session *session,
 				struct perf_tool *tool,
 				union perf_event *event,
@@ -741,19 +819,22 @@ static int deliver_sample_value(struct perf_session *session,
 				struct machine *machine)
 {
 	struct perf_sample_id *sid;
+	struct perf_sample_period *period;
 
 	sid = perf_evlist__id2sid(session->evlist, v->id);
-	if (sid) {
-		sample->id     = v->id;
-		sample->period = v->value - sid->period;
-		sid->period    = v->value;
-	}
-
 	if (!sid || sid->evsel == NULL) {
 		++session->stats.nr_unknown_id;
 		return 0;
 	}
 
+	period = get_sample_period(sid, sample->tid);
+	if (period == NULL)
+		return -ENOMEM;
+
+	sample->id     = v->id;
+	sample->period = v->value - period->value;
+	period->value  = v->value;
+
 	return tool->sample(tool, event, sample, sid->evsel, machine);
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (5 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Swithing on leader sampling on inherited events.

Following command will now get data from all children processes:
  $ perf record -e '{cycles,cache-misses}:S' <workload>

Use following command to display the data:
  $ perf report --group

Adding warning for record command if the support
is not detected:
  $ perf record -e '{cycles,cache-misses}:S,{instructions,branch}:S' kill
  Disabling inherit for sample read events, no kernel support.
  ...

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/perf.h        |  1 +
 tools/perf/util/evsel.c  |  6 +++++-
 tools/perf/util/record.c | 12 ++++++++++++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 510c65f72858..54e1cb42277e 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -45,6 +45,7 @@ struct record_opts {
 	bool         call_graph_enabled;
 	bool	     group;
 	bool	     inherit_stat;
+	bool	     inherit_format_group;
 	bool	     no_buffering;
 	bool	     no_inherit;
 	bool	     no_inherit_set;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 507d458ded2c..7415159bbcea 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -586,7 +586,11 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 		 */
 		if (leader->nr_members > 1) {
 			attr->read_format |= PERF_FORMAT_GROUP;
-			attr->inherit = 0;
+			attr->inherit = opts->inherit_format_group;
+			if (!opts->inherit_format_group) {
+				pr_warning_once("Disabling inherit for sample read events, "
+						"no kernel support.\n");
+			}
 		}
 	}
 
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index cf69325b985f..22b1d9c79ac0 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -85,6 +85,12 @@ static void perf_probe_comm_exec(struct perf_evsel *evsel)
 	evsel->attr.comm_exec = 1;
 }
 
+static void perf_probe_inherit_format_group(struct perf_evsel *evsel)
+{
+	evsel->attr.inherit = 1;
+	evsel->attr.read_format |= PERF_FORMAT_GROUP;
+}
+
 bool perf_can_sample_identifier(void)
 {
 	return perf_probe_api(perf_probe_sample_identifier);
@@ -95,6 +101,11 @@ static bool perf_can_comm_exec(void)
 	return perf_probe_api(perf_probe_comm_exec);
 }
 
+static bool perf_can_inherit_format_group(void)
+{
+	return perf_probe_api(perf_probe_inherit_format_group);
+}
+
 void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts)
 {
 	struct perf_evsel *evsel;
@@ -111,6 +122,7 @@ void perf_evlist__config(struct perf_evlist *evlist, struct record_opts *opts)
 	if (evlist->cpus->map[0] < 0)
 		opts->no_inherit = true;
 
+	opts->inherit_format_group = perf_can_inherit_format_group();
 	use_comm_exec = perf_can_comm_exec();
 
 	evlist__for_each(evlist, evsel) {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 8/9] perf script: Add period data column
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (6 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-27 14:33   ` David Ahern
  2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
  2014-08-25 14:51 ` [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
  9 siblings, 2 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Adding period data column to be displayed in perf script.
It's possible to get period values using -f option, like:

  $ perf script -f comm,tid,time,period,ip,sym,dso
          :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
              ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
              ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c              | 12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 05f9a0a6784c..2625290196c1 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -115,7 +115,7 @@ OPTIONS
 -f::
 --fields::
         Comma separated list of fields to print. Options are:
-        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline.
+        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline, period.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 02dce9295e2c..7f08c8d4d633 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -44,6 +44,7 @@ enum perf_output_field {
 	PERF_OUTPUT_ADDR            = 1U << 10,
 	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
+	PERF_OUTPUT_PERIOD          = 1U << 13,
 };
 
 struct output_option {
@@ -63,6 +64,7 @@ struct output_option {
 	{.str = "addr",  .field = PERF_OUTPUT_ADDR},
 	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
+	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 };
 
 /* default set to maintain compatibility with current format */
@@ -229,6 +231,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_CPU))
 		return -EINVAL;
 
+	if (PRINT_FIELD(PERIOD) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_PERIOD, "PERIOD",
+					PERF_OUTPUT_PERIOD))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -448,6 +455,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 
 	print_sample_start(sample, thread, evsel);
 
+	if (PRINT_FIELD(PERIOD))
+		printf("%10" PRIu64 " ", sample->period);
+
 	if (PRINT_FIELD(EVNAME)) {
 		const char *evname = perf_evsel__name(evsel);
 		printf("%s: ", evname ? evname : "[unknown]");
@@ -1544,7 +1554,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff", parse_output_fields),
+		     "addr,symoff,period", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 9/9] perf script: Add period as a default output column
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (7 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
@ 2014-08-25 14:45 ` Jiri Olsa
  2014-08-27 14:40   ` David Ahern
  2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  2014-08-25 14:51 ` [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
  9 siblings, 2 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Adding period as a default output column in script command
fo hardware, software and raw events.

If PERF_SAMPLE_PERIOD sample type is defined in perf.data,
following will be displayed in perf script output:

  $ perf script
              ls  8034 57477.887209:     250000 task-clock:  ffffffff81361d72 memset ([kernel.kallsyms])
              ls  8034 57477.887464:     250000 task-clock:  ffffffff816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
              ls  8034 57477.887708:     250000 task-clock:  ffffffff811a94f0 do_munmap ([kernel.kallsyms])
              ls  8034 57477.887959:     250000 task-clock:        34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
              ls  8034 57477.888208:     250000 task-clock:        3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
              ls  8034 57477.888717:     250000 task-clock:  ffffffff814242c8 n_tty_write ([kernel.kallsyms])
              ls  8034 57477.889285:     250000 task-clock:        3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-script.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7f08c8d4d633..cbe37df81085 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -82,7 +82,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -93,7 +94,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -112,7 +114,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [RFCv2 0/9] perf: Allow leader sampling on inherited events
  2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (8 preceding siblings ...)
  2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
@ 2014-08-25 14:51 ` Jiri Olsa
  9 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-08-25 14:51 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Mon, Aug 25, 2014 at 04:45:34PM +0200, Jiri Olsa wrote:
> hi,
> Jen-Cheng(Tommy) Huang reported the leader sampling not working
> on children processes:
>   http://www.mail-archive.com/linux-perf-users@vger.kernel.org/msg01644.html
> 
> The leader sampling (example below) lets the group leader event (cycles)
> do the sampling and reads the rest of the group (cache-misses) via
> PERF_FORMAT_GROUP format.
> 
> Example:
>   $ perf record -e '{cycles,cache-misses}:S' <workload>
>   $ perf report --group
> 
>   The perf report --group allows to see all events group
>   data in single view.
> 
> The reason for leader sampling being switched off for inherited
> events, is that the kernel does no allow PERF_FORMAT_GROUP format
> on inherited events (which is used by leader sampling).
> 
> I switched on the PERF_FORMAT_GROUP format for inherited events
> with few other fixies in patches:
>   perf: Deny optimized switch for events read by PERF_SAMPLE_READ
>   perf: Allow PERF_FORMAT_GROUP format on inherited events
> 
> And I fixed perf tool code to be able to process data from
> children processes.
> 
> Anyway, I might have missed some other reason why this was
> never switched on in kernel, so sending this as RFC.
> 
> v2 changes:
>    - detection of the kernel support
>    - added perf script support for period column
>    - separate kernel change for context_equiv change

and as usuall I forgot.. also reachable in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/core_format_group

jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 8/9] perf script: Add period data column
  2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
@ 2014-08-27 14:33   ` David Ahern
  2014-10-17 16:10     ` Jiri Olsa
  2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  1 sibling, 1 reply; 39+ messages in thread
From: David Ahern @ 2014-08-27 14:33 UTC (permalink / raw)
  To: Jiri Olsa, linux-kernel
  Cc: Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	Frederic Weisbecker, Ingo Molnar, Jen-Cheng(Tommy) Huang,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On 8/25/14, 8:45 AM, Jiri Olsa wrote:
> Adding period data column to be displayed in perf script.
> It's possible to get period values using -f option, like:
>
>    $ perf script -f comm,tid,time,period,ip,sym,dso
>            :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
>            :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
>            :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
>            :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
>                ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
>                ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
>                ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
>                ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
>                ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
>                ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
>
> Cc: Andi Kleen <andi@firstfloor.org>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Stephane Eranian <eranian@google.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   tools/perf/Documentation/perf-script.txt |  2 +-
>   tools/perf/builtin-script.c              | 12 +++++++++++-
>   2 files changed, 12 insertions(+), 2 deletions(-)

Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 9/9] perf script: Add period as a default output column
  2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
@ 2014-08-27 14:40   ` David Ahern
  2014-10-17 16:11     ` Jiri Olsa
  2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  1 sibling, 1 reply; 39+ messages in thread
From: David Ahern @ 2014-08-27 14:40 UTC (permalink / raw)
  To: Jiri Olsa, linux-kernel
  Cc: Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	Frederic Weisbecker, Ingo Molnar, Jen-Cheng(Tommy) Huang,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

On 8/25/14, 8:45 AM, Jiri Olsa wrote:
> Adding period as a default output column in script command
> fo hardware, software and raw events.
>
> If PERF_SAMPLE_PERIOD sample type is defined in perf.data,
> following will be displayed in perf script output:
>
>    $ perf script
>                ls  8034 57477.887209:     250000 task-clock:  ffffffff81361d72 memset ([kernel.kallsyms])
>                ls  8034 57477.887464:     250000 task-clock:  ffffffff816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
>                ls  8034 57477.887708:     250000 task-clock:  ffffffff811a94f0 do_munmap ([kernel.kallsyms])
>                ls  8034 57477.887959:     250000 task-clock:        34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
>                ls  8034 57477.888208:     250000 task-clock:        3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
>                ls  8034 57477.888717:     250000 task-clock:  ffffffff814242c8 n_tty_write ([kernel.kallsyms])
>                ls  8034 57477.889285:     250000 task-clock:        3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)
>
> Cc: Andi Kleen<andi@firstfloor.org>
> Cc: Arnaldo Carvalho de Melo<acme@kernel.org>
> Cc: Corey Ashford<cjashfor@linux.vnet.ibm.com>
> Cc: David Ahern<dsahern@gmail.com>
> Cc: Frederic Weisbecker<fweisbec@gmail.com>
> Cc: Ingo Molnar<mingo@kernel.org>
> Cc: Jen-Cheng(Tommy) Huang<tommy24@gatech.edu>
> Cc: Namhyung Kim<namhyung@kernel.org>
> Cc: Paul Mackerras<paulus@samba.org>
> Cc: Peter Zijlstra<a.p.zijlstra@chello.nl>
> Cc: Stephane Eranian<eranian@google.com>
> Signed-off-by: Jiri Olsa<jolsa@kernel.org>


Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
@ 2014-09-02 10:50   ` Peter Zijlstra
  2014-09-08  9:43     ` Jiri Olsa
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-02 10:50 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Aug 25, 2014 at 04:45:35PM +0200, Jiri Olsa wrote:
> As described in commit:
>   1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
> we dont allow optimized switch for non cloned contexts.
> 
> There's no need now to check if one of the context is parent
> of the other in context_equiv function.
> 

OK, so talk to me about that patch; it looks like you're slowly
reverting: 5a3126d4fe7c ("perf: Fix the perf context switch
optimization").

So what exactly is the problem? I'm still jet-lagged and not seeing it.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-08-25 14:45 ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
@ 2014-09-02 10:52   ` Peter Zijlstra
  2014-09-08 10:00     ` Jiri Olsa
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-02 10:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Aug 25, 2014 at 04:45:36PM +0200, Jiri Olsa wrote:
> The optimized task context switch for cloned perf events just
> swaps whole perf event contexts (of current and next process)
> if it finds them suitable. Events from the 'current' context
> will now measure data of the 'next' context and vice versa.
> 
> This is ok for cases where we are not directly interested in
> the event->count value of separate child events, like:
>   - standard sampling, where we take 'period' value for the
>     event count
>   - counting, where we accumulate all events (children)
>     into a single count value
> 
> But in case we read event by using the PERF_SAMPLE_READ sample
> type, we are interested in direct event->count value measured
> in specific task. Switching events within tasks for this kind
> of measurements corrupts data.
> 
> Fixing this by setting/unsetting pin_count for perf event context
> once cloned event with PERF_SAMPLE_READ read is added/removed.
> The pin_count value != 0 makes the context not suitable for
> optimized switch.

no.. so the value of the counter is the sum of all the inherited events.
It doesn't matter if you flip it or not the sum is not affected.

PERF_SAMPLE_READ should return the value.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-02 10:50   ` Peter Zijlstra
@ 2014-09-08  9:43     ` Jiri Olsa
  2014-09-08  9:45       ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08  9:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Tue, Sep 02, 2014 at 12:50:36PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 25, 2014 at 04:45:35PM +0200, Jiri Olsa wrote:
> > As described in commit:
> >   1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
> > we dont allow optimized switch for non cloned contexts.
> > 
> > There's no need now to check if one of the context is parent
> > of the other in context_equiv function.
> > 
> 
> OK, so talk to me about that patch; it looks like you're slowly
> reverting: 5a3126d4fe7c ("perf: Fix the perf context switch
> optimization").
> 
> So what exactly is the problem? I'm still jet-lagged and not seeing it.

the problem is, that we cannot allow non-cloned context
to be part of the optimized switch for reasons explained
in commit 1f9a7268c67f

I did not realize there was a just single commit introducing
non-cloned context switch to revert ;-) attached

I haven't tested it yet.. will do with my other changes

thanks,
jirka


---
This reverts commit 5a3126d4fe7c311fe12f98fef0470f6cb582d1ef.

As explained in commit:
  1f9a7268c67f perf: Do not allow optimized switch for non-cloned events

we cannot allow optimized switch for non cloned contexts. It
suppressed it by forcing the condition for parent contexts.

But what we actually should do is to revert the full commit that
introduced the optimized switch for non-cloned contexts.
This way we get rid of unneeded (now) ctx->generation updates
for non-cloned contexts.
---
 kernel/events/core.c | 64 +++++++++++++++-------------------------------------
 1 file changed, 18 insertions(+), 46 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d8cb4d21a346..424e6d3b07b7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -916,7 +916,6 @@ static void unclone_ctx(struct perf_event_context *ctx)
 		put_ctx(ctx->parent_ctx);
 		ctx->parent_ctx = NULL;
 	}
-	ctx->generation++;
 }
 
 static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
@@ -1154,8 +1153,6 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	ctx->nr_events++;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat++;
-
-	ctx->generation++;
 }
 
 /*
@@ -1333,8 +1330,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	 */
 	if (event->state > PERF_EVENT_STATE_OFF)
 		event->state = PERF_EVENT_STATE_OFF;
-
-	ctx->generation++;
 }
 
 static void perf_group_detach(struct perf_event *event)
@@ -2243,38 +2238,22 @@ static void ctx_sched_out(struct perf_event_context *ctx,
 }
 
 /*
- * Test whether two contexts are equivalent, i.e. whether they have both been
- * cloned from the same version of the same context.
- *
- * Equivalence is measured using a generation number in the context that is
- * incremented on each modification to it; see unclone_ctx(), list_add_event()
- * and list_del_event().
+ * Test whether two contexts are equivalent, i.e. whether they
+ * have both been cloned from the same version of the same context
+ * and they both have the same number of enabled events.
+ * If the number of enabled events is the same, then the set
+ * of enabled events should be the same, because these are both
+ * inherited contexts, therefore we can't access individual events
+ * in them directly with an fd; we can only enable/disable all
+ * events via prctl, or enable/disable all events in a family
+ * via ioctl, which will have the same effect on both contexts.
  */
 static int context_equiv(struct perf_event_context *ctx1,
 			 struct perf_event_context *ctx2)
 {
-	/* Pinning disables the swap optimization */
-	if (ctx1->pin_count || ctx2->pin_count)
-		return 0;
-
-	/* If ctx1 is the parent of ctx2 */
-	if (ctx1 == ctx2->parent_ctx && ctx1->generation == ctx2->parent_gen)
-		return 1;
-
-	/* If ctx2 is the parent of ctx1 */
-	if (ctx1->parent_ctx == ctx2 && ctx1->parent_gen == ctx2->generation)
-		return 1;
-
-	/*
-	 * If ctx1 and ctx2 have the same parent; we flatten the parent
-	 * hierarchy, see perf_event_init_context().
-	 */
-	if (ctx1->parent_ctx && ctx1->parent_ctx == ctx2->parent_ctx &&
-			ctx1->parent_gen == ctx2->parent_gen)
-		return 1;
-
-	/* Unmatched */
-	return 0;
+	return ctx1->parent_ctx && ctx1->parent_ctx == ctx2->parent_ctx
+		&& ctx1->parent_gen == ctx2->parent_gen
+		&& !ctx1->pin_count && !ctx2->pin_count;
 }
 
 static void __perf_event_sync_stat(struct perf_event *event,
@@ -2354,7 +2333,7 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 {
 	struct perf_event_context *ctx = task->perf_event_ctxp[ctxn];
 	struct perf_event_context *next_ctx;
-	struct perf_event_context *parent, *next_parent;
+	struct perf_event_context *parent;
 	struct perf_cpu_context *cpuctx;
 	int do_switch = 1;
 
@@ -2366,18 +2345,10 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 		return;
 
 	rcu_read_lock();
-	next_ctx = next->perf_event_ctxp[ctxn];
-	if (!next_ctx)
-		goto unlock;
-
 	parent = rcu_dereference(ctx->parent_ctx);
-	next_parent = rcu_dereference(next_ctx->parent_ctx);
-
-	/* If neither context have a parent context; they cannot be clones. */
-	if (!parent || !next_parent)
-		goto unlock;
-
-	if (next_parent == ctx || next_ctx == parent || next_parent == parent) {
+	next_ctx = next->perf_event_ctxp[ctxn];
+	if (parent && next_ctx &&
+	    rcu_dereference(next_ctx->parent_ctx) == parent) {
 		/*
 		 * Looks like the two contexts are clones, so we might be
 		 * able to optimize the context switch.  We lock both
@@ -2405,7 +2376,6 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 		raw_spin_unlock(&next_ctx->lock);
 		raw_spin_unlock(&ctx->lock);
 	}
-unlock:
 	rcu_read_unlock();
 
 	if (do_switch) {
@@ -7398,6 +7368,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	perf_install_in_context(ctx, event, event->cpu);
+	++ctx->generation;
 	perf_unpin_context(ctx);
 	mutex_unlock(&ctx->mutex);
 
@@ -7484,6 +7455,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 	WARN_ON_ONCE(ctx->parent_ctx);
 	mutex_lock(&ctx->mutex);
 	perf_install_in_context(ctx, event, cpu);
+	++ctx->generation;
 	perf_unpin_context(ctx);
 	mutex_unlock(&ctx->mutex);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08  9:43     ` Jiri Olsa
@ 2014-09-08  9:45       ` Peter Zijlstra
  2014-09-08  9:48         ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08  9:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 974 bytes --]

On Mon, Sep 08, 2014 at 11:43:48AM +0200, Jiri Olsa wrote:
> On Tue, Sep 02, 2014 at 12:50:36PM +0200, Peter Zijlstra wrote:
> > On Mon, Aug 25, 2014 at 04:45:35PM +0200, Jiri Olsa wrote:
> > > As described in commit:
> > >   1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
> > > we dont allow optimized switch for non cloned contexts.
> > > 
> > > There's no need now to check if one of the context is parent
> > > of the other in context_equiv function.
> > > 
> > 
> > OK, so talk to me about that patch; it looks like you're slowly
> > reverting: 5a3126d4fe7c ("perf: Fix the perf context switch
> > optimization").
> > 
> > So what exactly is the problem? I'm still jet-lagged and not seeing it.
> 
> the problem is, that we cannot allow non-cloned context
> to be part of the optimized switch for reasons explained
> in commit 1f9a7268c67f

The thing is; I don't understand those reasons. That commit log doesn't
explain.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08  9:45       ` Peter Zijlstra
@ 2014-09-08  9:48         ` Peter Zijlstra
  2014-09-08 10:01           ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08  9:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 1283 bytes --]

On Mon, Sep 08, 2014 at 11:45:48AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 11:43:48AM +0200, Jiri Olsa wrote:
> > On Tue, Sep 02, 2014 at 12:50:36PM +0200, Peter Zijlstra wrote:
> > > On Mon, Aug 25, 2014 at 04:45:35PM +0200, Jiri Olsa wrote:
> > > > As described in commit:
> > > >   1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
> > > > we dont allow optimized switch for non cloned contexts.
> > > > 
> > > > There's no need now to check if one of the context is parent
> > > > of the other in context_equiv function.
> > > > 
> > > 
> > > OK, so talk to me about that patch; it looks like you're slowly
> > > reverting: 5a3126d4fe7c ("perf: Fix the perf context switch
> > > optimization").
> > > 
> > > So what exactly is the problem? I'm still jet-lagged and not seeing it.
> > 
> > the problem is, that we cannot allow non-cloned context
> > to be part of the optimized switch for reasons explained
> > in commit 1f9a7268c67f
> 
> The thing is; I don't understand those reasons. That commit log doesn't
> explain.

Ah wait, I finally see. I think we want to fix that exit path, not
disallow the cloning.

The thing is, by not allowing this optimization simple things like eg.
pipe-test say very expensive.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-09-02 10:52   ` Peter Zijlstra
@ 2014-09-08 10:00     ` Jiri Olsa
  2014-09-08 10:11       ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 10:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Tue, Sep 02, 2014 at 12:52:44PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 25, 2014 at 04:45:36PM +0200, Jiri Olsa wrote:
> > The optimized task context switch for cloned perf events just
> > swaps whole perf event contexts (of current and next process)
> > if it finds them suitable. Events from the 'current' context
> > will now measure data of the 'next' context and vice versa.
> > 
> > This is ok for cases where we are not directly interested in
> > the event->count value of separate child events, like:
> >   - standard sampling, where we take 'period' value for the
> >     event count
> >   - counting, where we accumulate all events (children)
> >     into a single count value
> > 
> > But in case we read event by using the PERF_SAMPLE_READ sample
> > type, we are interested in direct event->count value measured
> > in specific task. Switching events within tasks for this kind
> > of measurements corrupts data.
> > 
> > Fixing this by setting/unsetting pin_count for perf event context
> > once cloned event with PERF_SAMPLE_READ read is added/removed.
> > The pin_count value != 0 makes the context not suitable for
> > optimized switch.
> 
> no.. so the value of the counter is the sum of all the inherited events.
> It doesn't matter if you flip it or not the sum is not affected.
> 
> PERF_SAMPLE_READ should return the value.

so I want to be able to do the leader sampling over child processes

that means:
  - have event group with sampling leader, and the rest
    of the group events' periods being read on leader's sample
    via PERF_SAMPLE_READ sample_type
  - for each child process created I want it to do the
    same thing as the parent - sample on leader, read
    values of other events in group via PERF_SAMPLE_READ

Now, If I let the optimized switch enabled for above config,
I'll get wrong data, because the period counts of group events
are local to the child process. Optimized switch will move
it to another child.

jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08  9:48         ` Peter Zijlstra
@ 2014-09-08 10:01           ` Peter Zijlstra
  2014-09-08 11:39             ` Peter Zijlstra
  2014-09-08 12:01             ` Jiri Olsa
  0 siblings, 2 replies; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 10:01 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 745 bytes --]

On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:

> > The thing is; I don't understand those reasons. That commit log doesn't
> > explain.
> 
> Ah wait, I finally see. I think we want to fix that exit path, not
> disallow the cloning.
> 
> The thing is, by not allowing this optimization simple things like eg.
> pipe-test say very expensive.

So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
with exited task") that introduces the problem. Before that things would
work correctly afaict.

The exit would remove from the context but leave the event in existence.
Both the fd and the inherited events would have references to it, only
once those are gone do we destroy the actual event.



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-09-08 10:00     ` Jiri Olsa
@ 2014-09-08 10:11       ` Peter Zijlstra
  2014-09-08 16:39         ` Jiri Olsa
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 10:11 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

On Mon, Sep 08, 2014 at 12:00:19PM +0200, Jiri Olsa wrote:

> > no.. so the value of the counter is the sum of all the inherited events.
> > It doesn't matter if you flip it or not the sum is not affected.
> > 
> > PERF_SAMPLE_READ should return the value.
> 
> so I want to be able to do the leader sampling over child processes
> that means:
>   - have event group with sampling leader, and the rest
>     of the group events' periods being read on leader's sample
>     via PERF_SAMPLE_READ sample_type
>   - for each child process created I want it to do the
>     same thing as the parent - sample on leader, read
>     values of other events in group via PERF_SAMPLE_READ

The thing is, PERF_SAMPLE_READ should return the exact same thing as an
actual read(), and for read() the value reported is the sum of all
events.

I _think_ I know what you want, and I think I see why you'd want that,
but I'm not sure how to best do that.. Hmm.. maybe look at
'perf_event_attr::inherit_stat' that appears to be similar.

At which point read() vs inherit_stat might be 'broken'. Hard to know if
anybody uses that :-(

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 10:01           ` Peter Zijlstra
@ 2014-09-08 11:39             ` Peter Zijlstra
  2014-09-08 12:19               ` Jiri Olsa
  2014-09-08 12:32               ` Jiri Olsa
  2014-09-08 12:01             ` Jiri Olsa
  1 sibling, 2 replies; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 11:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 1527 bytes --]

On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> 
> > > The thing is; I don't understand those reasons. That commit log doesn't
> > > explain.
> > 
> > Ah wait, I finally see. I think we want to fix that exit path, not
> > disallow the cloning.
> > 
> > The thing is, by not allowing this optimization simple things like eg.
> > pipe-test say very expensive.
> 
> So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> with exited task") that introduces the problem. Before that things would
> work correctly afaict.
> 
> The exit would remove from the context but leave the event in existence.
> Both the fd and the inherited events would have references to it, only
> once those are gone do we destroy the actual event.

I have another 'problem' with 179033b3e064. What if you 'want' to
continue monitoring after the initial task died? Eg. if you want to
monitor crap that unconditionally daemonizes.

So at this point I would suggest we revert both 179033b3e064 and
1f9a7268c67f so that the context switch optimization works again for
simple 2 task things (pipe-test).

If we re-introduce 179033b3e064 (which I think makes sense) we need to
make sure it works with the context switch optimization (we could add
perf_event::task I think, it would clean up a few other things iirc) and
we'd have to make it conditional so we can still monitor daemons.

Or am I totally missing things again?

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 10:01           ` Peter Zijlstra
  2014-09-08 11:39             ` Peter Zijlstra
@ 2014-09-08 12:01             ` Jiri Olsa
  2014-09-08 13:34               ` Peter Zijlstra
  1 sibling, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 12:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> 
> > > The thing is; I don't understand those reasons. That commit log doesn't
> > > explain.
> > 
> > Ah wait, I finally see. I think we want to fix that exit path, not
> > disallow the cloning.
> > 
> > The thing is, by not allowing this optimization simple things like eg.
> > pipe-test say very expensive.
> 
> So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> with exited task") that introduces the problem. Before that things would
> work correctly afaict.

hum, I dont think so.. because the perf_remove_from_context set event
to PERF_EVENT_STATE_OFF state anyway.. thus making any new cloned events
disabled

the PERF_EVENT_STATE_EXIT should be just notification for poll to return HUP

> The exit would remove from the context but leave the event in existence.
> Both the fd and the inherited events would have references to it, only
> once those are gone do we destroy the actual event.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 11:39             ` Peter Zijlstra
@ 2014-09-08 12:19               ` Jiri Olsa
  2014-09-08 13:20                 ` Peter Zijlstra
  2014-09-08 12:32               ` Jiri Olsa
  1 sibling, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 12:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 01:39:58PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > 
> > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > explain.
> > > 
> > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > disallow the cloning.
> > > 
> > > The thing is, by not allowing this optimization simple things like eg.
> > > pipe-test say very expensive.
> > 
> > So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > with exited task") that introduces the problem. Before that things would
> > work correctly afaict.
> > 
> > The exit would remove from the context but leave the event in existence.
> > Both the fd and the inherited events would have references to it, only
> > once those are gone do we destroy the actual event.
> 
> I have another 'problem' with 179033b3e064. What if you 'want' to
> continue monitoring after the initial task died? Eg. if you want to
> monitor crap that unconditionally daemonizes.

right.. did not think of that.. need to check more, but
seems like just the check for children should be enough

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bf482ccbdbe1..341d0b47ca14 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3568,6 +3568,19 @@ static int perf_event_read_one(struct perf_event *event,
 	return n * sizeof(u64);
 }
 
+static bool is_event_hup(struct perf_event *event)
+{
+	bool no_children;
+
+	if (event->state != PERF_EVENT_STATE_EXIT)
+		return false;
+
+	mutex_lock(&event->child_mutex);
+	no_children = list_empty(&event->child_list);
+	mutex_unlock(&event->child_mutex);
+	return no_children;
+}
+
 /*
  * Read the performance event - simple non blocking version for now
  */
@@ -3582,8 +3595,7 @@ perf_read_hw(struct perf_event *event, char __user *buf, size_t count)
 	 * error state (i.e. because it was pinned but it couldn't be
 	 * scheduled on to the CPU at some point).
 	 */
-	if ((event->state == PERF_EVENT_STATE_ERROR) ||
-	    (event->state == PERF_EVENT_STATE_EXIT))
+	if ((event->state == PERF_EVENT_STATE_ERROR) || (is_event_hup(event)))
 		return 0;
 
 	if (count < event->read_size)
@@ -3614,7 +3626,7 @@ static unsigned int perf_poll(struct file *file, poll_table *wait)
 
 	poll_wait(file, &event->waitq, wait);
 
-	if (event->state == PERF_EVENT_STATE_EXIT)
+	if (is_event_hup(event))
 		return events;
 
 	/*

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 11:39             ` Peter Zijlstra
  2014-09-08 12:19               ` Jiri Olsa
@ 2014-09-08 12:32               ` Jiri Olsa
  1 sibling, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 12:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 01:39:58PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > 
> > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > explain.
> > > 
> > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > disallow the cloning.
> > > 
> > > The thing is, by not allowing this optimization simple things like eg.
> > > pipe-test say very expensive.
> > 
> > So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > with exited task") that introduces the problem. Before that things would
> > work correctly afaict.
> > 
> > The exit would remove from the context but leave the event in existence.
> > Both the fd and the inherited events would have references to it, only
> > once those are gone do we destroy the actual event.
> 
> I have another 'problem' with 179033b3e064. What if you 'want' to
> continue monitoring after the initial task died? Eg. if you want to
> monitor crap that unconditionally daemonizes.
> 
> So at this point I would suggest we revert both 179033b3e064 and
> 1f9a7268c67f so that the context switch optimization works again for
> simple 2 task things (pipe-test).
> 
> If we re-introduce 179033b3e064 (which I think makes sense) we need to
> make sure it works with the context switch optimization (we could add
> perf_event::task I think, it would clean up a few other things iirc) and
> we'd have to make it conditional so we can still monitor daemons.
> 
> Or am I totally missing things again?

as I said in previous email I dont think the 179033b3e064 introduced
the issue and I think we should keep 1f9a7268c67f to have correct
numbers ;-)

I'll think about the fix to keep non-cloned context in optimized switch

jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 12:19               ` Jiri Olsa
@ 2014-09-08 13:20                 ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 13:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 2109 bytes --]

On Mon, Sep 08, 2014 at 02:19:52PM +0200, Jiri Olsa wrote:
> > I have another 'problem' with 179033b3e064. What if you 'want' to
> > continue monitoring after the initial task died? Eg. if you want to
> > monitor crap that unconditionally daemonizes.
> 
> right.. did not think of that.. need to check more, but
> seems like just the check for children should be enough
> 

Indeed, that should work.

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index bf482ccbdbe1..341d0b47ca14 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3568,6 +3568,19 @@ static int perf_event_read_one(struct perf_event *event,
>  	return n * sizeof(u64);
>  }
>  
> +static bool is_event_hup(struct perf_event *event)
> +{
> +	bool no_children;
> +
> +	if (event->state != PERF_EVENT_STATE_EXIT)
> +		return false;
> +
> +	mutex_lock(&event->child_mutex);
> +	no_children = list_empty(&event->child_list);
> +	mutex_unlock(&event->child_mutex);
> +	return no_children;
> +}
> +
>  /*
>   * Read the performance event - simple non blocking version for now
>   */
> @@ -3582,8 +3595,7 @@ perf_read_hw(struct perf_event *event, char __user *buf, size_t count)
>  	 * error state (i.e. because it was pinned but it couldn't be
>  	 * scheduled on to the CPU at some point).
>  	 */
> -	if ((event->state == PERF_EVENT_STATE_ERROR) ||
> -	    (event->state == PERF_EVENT_STATE_EXIT))
> +	if ((event->state == PERF_EVENT_STATE_ERROR) || (is_event_hup(event)))
>  		return 0;

Do we want this? It seems like a fairly sensible thing to start a
counter and wait for the thing to die, only to then read the total
count. But with this on we get 0s.

I suppose Stephane's email got to you after you did this and we should
be dropping this thing entirely?

>  	if (count < event->read_size)
> @@ -3614,7 +3626,7 @@ static unsigned int perf_poll(struct file *file, poll_table *wait)
>  
>  	poll_wait(file, &event->waitq, wait);
>  
> -	if (event->state == PERF_EVENT_STATE_EXIT)
> +	if (is_event_hup(event))
>  		return events;
>  
>  	/*

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 12:01             ` Jiri Olsa
@ 2014-09-08 13:34               ` Peter Zijlstra
  2014-09-08 15:13                 ` Peter Zijlstra
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 13:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 995 bytes --]

On Mon, Sep 08, 2014 at 02:01:41PM +0200, Jiri Olsa wrote:
> On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > 
> > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > explain.
> > > 
> > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > disallow the cloning.
> > > 
> > > The thing is, by not allowing this optimization simple things like eg.
> > > pipe-test say very expensive.
> > 
> > So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > with exited task") that introduces the problem. Before that things would
> > work correctly afaict.
> 
> hum, I dont think so.. because the perf_remove_from_context set event
> to PERF_EVENT_STATE_OFF state anyway.. thus making any new cloned events
> disabled

Urgh, see I knew I was missing something.

Can't we fix that? Lemme check to see what relies on this.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 13:34               ` Peter Zijlstra
@ 2014-09-08 15:13                 ` Peter Zijlstra
  2014-09-08 16:45                   ` Jiri Olsa
  0 siblings, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-08 15:13 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]

On Mon, Sep 08, 2014 at 03:34:28PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 02:01:41PM +0200, Jiri Olsa wrote:
> > On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > > 
> > > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > > explain.
> > > > 
> > > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > > disallow the cloning.
> > > > 
> > > > The thing is, by not allowing this optimization simple things like eg.
> > > > pipe-test say very expensive.
> > > 
> > > So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > > with exited task") that introduces the problem. Before that things would
> > > work correctly afaict.
> > 
> > hum, I dont think so.. because the perf_remove_from_context set event
> > to PERF_EVENT_STATE_OFF state anyway.. thus making any new cloned events
> > disabled
> 
> Urgh, see I knew I was missing something.
> 
> Can't we fix that? Lemme check to see what relies on this.

2e2af50b1fab ("perf_events: Disable events when we detach them")

Seems to be about it. And I think we should solve that differently, but
the best I can come up with ties into the event->ctx mess we have in
that other thread.

The thing is, IOC_ENABLE/DISABLE and read() and such should act
(sanely and) independent from the attached state.

Its just that the whole event->ctx migration mess is making this
somewhat hard atm.

So things like perf_event_read() should not only check ctx->is_active
but also worry about event->attach_state & PERF_ATTACH_CONTEXT.

Now the biggest problem is that we cannot tell if its a temporary state
(move_group / migrate_context) or permanent (exit)... 

Urgh

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-09-08 10:11       ` Peter Zijlstra
@ 2014-09-08 16:39         ` Jiri Olsa
  0 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 16:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 12:11:19PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 12:00:19PM +0200, Jiri Olsa wrote:
> 
> > > no.. so the value of the counter is the sum of all the inherited events.
> > > It doesn't matter if you flip it or not the sum is not affected.
> > > 
> > > PERF_SAMPLE_READ should return the value.
> > 
> > so I want to be able to do the leader sampling over child processes
> > that means:
> >   - have event group with sampling leader, and the rest
> >     of the group events' periods being read on leader's sample
> >     via PERF_SAMPLE_READ sample_type
> >   - for each child process created I want it to do the
> >     same thing as the parent - sample on leader, read
> >     values of other events in group via PERF_SAMPLE_READ
> 
> The thing is, PERF_SAMPLE_READ should return the exact same thing as an
> actual read(), and for read() the value reported is the sum of all
> events.

maybe we could make this configurable? like via new per_event_attr::option bit,
keeping by default the behaviour you said

> 
> I _think_ I know what you want, and I think I see why you'd want that,
> but I'm not sure how to best do that.. Hmm.. maybe look at
> 'perf_event_attr::inherit_stat' that appears to be similar.
> 
> At which point read() vs inherit_stat might be 'broken'. Hard to know if
> anybody uses that :-(

I'll check, thanks

jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 15:13                 ` Peter Zijlstra
@ 2014-09-08 16:45                   ` Jiri Olsa
  2014-09-09 10:20                     ` Peter Zijlstra
  2014-09-10 13:57                     ` Peter Zijlstra
  0 siblings, 2 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-09-08 16:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 05:13:05PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 03:34:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 08, 2014 at 02:01:41PM +0200, Jiri Olsa wrote:
> > > On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > > > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > > > 
> > > > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > > > explain.
> > > > > 
> > > > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > > > disallow the cloning.
> > > > > 
> > > > > The thing is, by not allowing this optimization simple things like eg.
> > > > > pipe-test say very expensive.
> > > > 
> > > > So its 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > > > with exited task") that introduces the problem. Before that things would
> > > > work correctly afaict.
> > > 
> > > hum, I dont think so.. because the perf_remove_from_context set event
> > > to PERF_EVENT_STATE_OFF state anyway.. thus making any new cloned events
> > > disabled
> > 
> > Urgh, see I knew I was missing something.
> > 
> > Can't we fix that? Lemme check to see what relies on this.
> 
> 2e2af50b1fab ("perf_events: Disable events when we detach them")
> 
> Seems to be about it. And I think we should solve that differently, but
> the best I can come up with ties into the event->ctx mess we have in
> that other thread.
> 
> The thing is, IOC_ENABLE/DISABLE and read() and such should act
> (sanely and) independent from the attached state.
> 
> Its just that the whole event->ctx migration mess is making this
> somewhat hard atm.
> 
> So things like perf_event_read() should not only check ctx->is_active
> but also worry about event->attach_state & PERF_ATTACH_CONTEXT.
> 
> Now the biggest problem is that we cannot tell if its a temporary state
> (move_group / migrate_context) or permanent (exit)... 
> 
> Urgh

I just noticed that we initialize the child state with base parent
state not the real (immediate) parent.. which is what we want IMO

I wonder attached patch could fix the issue mentioned in:
  1f9a726 perf: Do not allow optimized switch for non-cloned events

now I need to recall what I used to test this ;-)

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e4d6924..561a4ea 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7794,6 +7794,7 @@ inherit_event(struct perf_event *parent_event,
 	      struct perf_event *group_leader,
 	      struct perf_event_context *child_ctx)
 {
+	enum perf_event_active_state parent_state = parent_event->state;
 	struct perf_event *child_event;
 	unsigned long flags;
 
@@ -7827,7 +7828,7 @@ inherit_event(struct perf_event *parent_event,
 	 * not its attr.disabled bit.  We hold the parent's mutex,
 	 * so we won't race with perf_event_{en, dis}able_family.
 	 */
-	if (parent_event->state >= PERF_EVENT_STATE_INACTIVE)
+	if (parent_state >= PERF_EVENT_STATE_INACTIVE)
 		child_event->state = PERF_EVENT_STATE_INACTIVE;
 	else
 		child_event->state = PERF_EVENT_STATE_OFF;

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 16:45                   ` Jiri Olsa
@ 2014-09-09 10:20                     ` Peter Zijlstra
  2014-09-10 13:57                     ` Peter Zijlstra
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-09 10:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

[-- Attachment #1: Type: text/plain, Size: 1340 bytes --]

On Mon, Sep 08, 2014 at 06:45:24PM +0200, Jiri Olsa wrote:
> I just noticed that we initialize the child state with base parent
> state not the real (immediate) parent.. which is what we want IMO
> 
> I wonder attached patch could fix the issue mentioned in:
>   1f9a726 perf: Do not allow optimized switch for non-cloned events
> 
> now I need to recall what I used to test this ;-)

Ah, very nice! Yes this might just do it.

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index e4d6924..561a4ea 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7794,6 +7794,7 @@ inherit_event(struct perf_event *parent_event,
>  	      struct perf_event *group_leader,
>  	      struct perf_event_context *child_ctx)
>  {
> +	enum perf_event_active_state parent_state = parent_event->state;
>  	struct perf_event *child_event;
>  	unsigned long flags;
>  
> @@ -7827,7 +7828,7 @@ inherit_event(struct perf_event *parent_event,
>  	 * not its attr.disabled bit.  We hold the parent's mutex,
>  	 * so we won't race with perf_event_{en, dis}able_family.
>  	 */
> -	if (parent_event->state >= PERF_EVENT_STATE_INACTIVE)
> +	if (parent_state >= PERF_EVENT_STATE_INACTIVE)
>  		child_event->state = PERF_EVENT_STATE_INACTIVE;
>  	else
>  		child_event->state = PERF_EVENT_STATE_OFF;

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-08 16:45                   ` Jiri Olsa
  2014-09-09 10:20                     ` Peter Zijlstra
@ 2014-09-10 13:57                     ` Peter Zijlstra
  2014-09-10 14:35                       ` Jiri Olsa
  1 sibling, 1 reply; 39+ messages in thread
From: Peter Zijlstra @ 2014-09-10 13:57 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Mon, Sep 08, 2014 at 06:45:24PM +0200, Jiri Olsa wrote:
> 
> now I need to recall what I used to test this ;-)
> 

Hint, if you'd mentioned that in the changelog.... :-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
  2014-09-10 13:57                     ` Peter Zijlstra
@ 2014-09-10 14:35                       ` Jiri Olsa
  2014-09-24 14:58                         ` [tip:perf/core] Revert "perf: Do not allow optimized switch for non-cloned events" tip-bot for Jiri Olsa
  0 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-09-10 14:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Stephane Eranian

On Wed, Sep 10, 2014 at 03:57:09PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 06:45:24PM +0200, Jiri Olsa wrote:
> > 
> > now I need to recall what I used to test this ;-)
> > 
> 
> Hint, if you'd mentioned that in the changelog.... :-)

will try much better this time.. perf test attached ;-)
seems like the child state fix is a good one

jirka


---
>From dc1068c627b332b2a5feb4deedfd96db8e056b5f Mon Sep 17 00:00:00 2001
From: Jiri Olsa <jolsa@redhat.com>
Date: Wed, 10 Sep 2014 16:23:51 +0200
Subject: [PATCH] perf tests: Add test for optimized sched switch bug

---
 tools/perf/Makefile.perf            |    1 +
 tools/perf/tests/builtin-test.c     |    4 +
 tools/perf/tests/optimized_switch.c |  169 +++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h            |    1 +
 4 files changed, 175 insertions(+), 0 deletions(-)
 create mode 100644 tools/perf/tests/optimized_switch.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index da8fc07..75603ac 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -427,6 +427,7 @@ LIB_OBJS += $(OUTPUT)tests/mmap-thread-lookup.o
 LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
 LIB_OBJS += $(OUTPUT)tests/switch-tracking.o
 LIB_OBJS += $(OUTPUT)tests/xyarray.o
+LIB_OBJS += $(OUTPUT)tests/optimized_switch.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 5c90326..e092d17 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -162,6 +162,10 @@ static struct test {
 		.func = test__xyarray,
 	},
 	{
+		.desc = "Test optimized switch",
+		.func = test__optimized_switch,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/optimized_switch.c b/tools/perf/tests/optimized_switch.c
new file mode 100644
index 0000000..68f8705
--- /dev/null
+++ b/tools/perf/tests/optimized_switch.c
@@ -0,0 +1,169 @@
+#include <unistd.h>
+#include <stdlib.h>
+#include <sys/prctl.h>
+#include <linux/compiler.h>
+#include "tests.h"
+#include "debug.h"
+#include "evsel.h"
+#include "evlist.h"
+#include "thread_map.h"
+
+#define GROUPS	10
+#define WORKERS	10
+#define BYTES	100
+
+struct worker {
+	union {
+		int pipefd[2];
+		struct {
+			int read;
+			int write;
+		} fd;
+	};
+};
+
+#define pr_errsys(s) \
+	pr_err(s " failed: %s\n", strerror(errno))
+
+static void worker(int fd)
+{
+	unsigned char c = 0;
+	int cnt = BYTES;
+	ssize_t r;
+
+	while (cnt--) {
+		r = read(fd, &c, sizeof(c));
+		if (r != 1) {
+			pr_errsys("read");
+			exit(-1);
+		}
+	}
+
+	prctl(0, 0, 0, 0, 0);
+	exit(0);
+}
+
+static void write_round(struct worker *workers)
+{
+	unsigned char c = 0;
+	ssize_t r;
+	int i, j;
+
+	for (j = 0; j < BYTES; j++) {
+		for (i = 0; i < WORKERS; i++) {
+			struct worker *w = &workers[i];
+
+			r = write(w->fd.write, &c, sizeof(c));
+			if (r != 1) {
+				pr_errsys("write");
+				exit(-1);
+			}
+		}
+	}
+}
+
+static int group(void)
+{
+	struct worker workers[WORKERS];
+	int i;
+
+	for (i = 0; i < WORKERS; i++) {
+		struct worker *w = &workers[i];
+		int pid;
+
+		if (pipe(w->pipefd)) {
+			pr_errsys("pipe");
+			return -1;
+		}
+
+		pid = fork();
+		if (pid == -1) {
+			pr_errsys("fork");
+			return -1;
+		} else if (!pid) {
+			worker(w->fd.read);
+		}
+	}
+
+	write_round(workers);
+
+	for (i = 0; i < WORKERS; i++)
+		wait(NULL);
+
+	return 0;
+}
+
+static int child(void)
+{
+	int i;
+
+	for (i = 0; i < GROUPS; i++)
+		TEST_ASSERT_VAL("group failed", !group());
+
+	return 0;
+}
+
+int test__optimized_switch(void)
+{
+	ssize_t r __maybe_unused;
+	unsigned char c = 0;
+	int ready[2], pid;
+	struct thread_map *threads;
+	struct perf_evsel *evsel;
+	u64 total;
+
+	if (pipe(ready)) {
+		pr_errsys("pipe");
+		return -1;
+	}
+
+	pid = fork();
+	if (pid == -1) {
+		pr_errsys("fork");
+		return -1;
+	}
+
+	if (!pid) {
+		r = read(ready[0], &c, sizeof(c));
+		if (r != 1) {
+			pr_errsys("read");
+			exit(-1);
+		}
+
+		exit(child());
+	}
+
+	threads = thread_map__new(-1, pid, UINT_MAX);
+
+	evsel = perf_evsel__newtp("syscalls", "sys_enter_prctl");
+	if (evsel == NULL) {
+		pr_debug("perf_evsel__open_per_thread failed\n");
+		return -1;
+	}
+
+	evsel->attr.inherit = 1;
+
+	if (perf_evsel__open_per_thread(evsel, threads) < 0) {
+		pr_err("perf_evsel__newtp failed\n");
+		return -1;
+	}
+
+	r = write(ready[1], &c, sizeof(c));
+	if (r != 1) {
+		pr_errsys("write");
+		return -1;
+	}
+
+	wait(NULL);
+
+	if (perf_evsel__read_on_cpu(evsel, 0, 0) < 0) {
+		pr_err("perf_evsel__read_on_cpu failed\n");
+		return -1;
+	}
+
+	total = evsel->counts->cpu[0].val;
+
+	pr_debug("total count %lu, expected count %d\n", total, GROUPS * WORKERS);
+	TEST_ASSERT_VAL("wrong count", total == GROUPS * WORKERS);
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 8c3e7d1..466184b 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -50,6 +50,7 @@ int test__hists_output(void);
 int test__hists_cumulate(void);
 int test__switch_tracking(void);
 int test__xyarray(void);
+int test__optimized_switch(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [tip:perf/core] Revert "perf: Do not allow optimized switch for non-cloned events"
  2014-09-10 14:35                       ` Jiri Olsa
@ 2014-09-24 14:58                         ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 39+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-09-24 14:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, eranian, hpa, mingo, jolsa, torvalds, peterz, acme,
	jolsa, fweisbec, tglx

Commit-ID:  802c8a61d4c9c794db863dcabb0006ab001a651b
Gitweb:     http://git.kernel.org/tip/802c8a61d4c9c794db863dcabb0006ab001a651b
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Fri, 12 Sep 2014 13:18:28 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 Sep 2014 14:48:13 +0200

Revert "perf: Do not allow optimized switch for non-cloned events"

This reverts commit 1f9a7268c67f0290837aada443d28fd953ddca90.

With the fix of the initial state for the cloned event we now correctly
handle the error described in:

  1f9a7268c67f perf: Do not allow optimized switch for non-cloned events

so we can revert it.

I made an automated test for this, but its not suitable for automated
perf tests framework. It needs to be customized for each machine (the
more cpu the higher numbers for GROUPS/WORKERS/BYTES) and it could take
longer time to hit the issue.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140910143535.GD2409@krava.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 132524c..b164cb0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2375,7 +2375,7 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	next_parent = rcu_dereference(next_ctx->parent_ctx);
 
 	/* If neither context have a parent context; they cannot be clones. */
-	if (!parent || !next_parent)
+	if (!parent && !next_parent)
 		goto unlock;
 
 	if (next_parent == ctx || next_ctx == parent || next_parent == parent) {

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 8/9] perf script: Add period data column
  2014-08-27 14:33   ` David Ahern
@ 2014-10-17 16:10     ` Jiri Olsa
  2014-10-17 18:22       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 39+ messages in thread
From: Jiri Olsa @ 2014-10-17 16:10 UTC (permalink / raw)
  To: David Ahern
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Wed, Aug 27, 2014 at 08:33:53AM -0600, David Ahern wrote:
> On 8/25/14, 8:45 AM, Jiri Olsa wrote:
> >Adding period data column to be displayed in perf script.
> >It's possible to get period values using -f option, like:
> >
> >   $ perf script -f comm,tid,time,period,ip,sym,dso
> >           :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> >           :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> >           :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> >           :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> >               ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
> >               ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
> >               ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
> >               ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
> >               ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
> >               ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
> >
> >Cc: Andi Kleen <andi@firstfloor.org>
> >Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> >Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> >Cc: David Ahern <dsahern@gmail.com>
> >Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >Cc: Ingo Molnar <mingo@kernel.org>
> >Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
> >Cc: Namhyung Kim <namhyung@kernel.org>
> >Cc: Paul Mackerras <paulus@samba.org>
> >Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >Cc: Stephane Eranian <eranian@google.com>
> >Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> >---
> >  tools/perf/Documentation/perf-script.txt |  2 +-
> >  tools/perf/builtin-script.c              | 12 +++++++++++-
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> Acked-by: David Ahern <dsahern@gmail.com>

Arnaldo,
could you please pick this one and patch 9/9?
just realized this was left behind ;-)

thanks,
jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 9/9] perf script: Add period as a default output column
  2014-08-27 14:40   ` David Ahern
@ 2014-10-17 16:11     ` Jiri Olsa
  0 siblings, 0 replies; 39+ messages in thread
From: Jiri Olsa @ 2014-10-17 16:11 UTC (permalink / raw)
  To: David Ahern
  Cc: Jiri Olsa, linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Wed, Aug 27, 2014 at 08:40:49AM -0600, David Ahern wrote:
> On 8/25/14, 8:45 AM, Jiri Olsa wrote:
> >Adding period as a default output column in script command
> >fo hardware, software and raw events.
> >
> >If PERF_SAMPLE_PERIOD sample type is defined in perf.data,
> >following will be displayed in perf script output:
> >
> >   $ perf script
> >               ls  8034 57477.887209:     250000 task-clock:  ffffffff81361d72 memset ([kernel.kallsyms])
> >               ls  8034 57477.887464:     250000 task-clock:  ffffffff816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
> >               ls  8034 57477.887708:     250000 task-clock:  ffffffff811a94f0 do_munmap ([kernel.kallsyms])
> >               ls  8034 57477.887959:     250000 task-clock:        34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
> >               ls  8034 57477.888208:     250000 task-clock:        3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
> >               ls  8034 57477.888717:     250000 task-clock:  ffffffff814242c8 n_tty_write ([kernel.kallsyms])
> >               ls  8034 57477.889285:     250000 task-clock:        3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)
> >
> >Cc: Andi Kleen<andi@firstfloor.org>
> >Cc: Arnaldo Carvalho de Melo<acme@kernel.org>
> >Cc: Corey Ashford<cjashfor@linux.vnet.ibm.com>
> >Cc: David Ahern<dsahern@gmail.com>
> >Cc: Frederic Weisbecker<fweisbec@gmail.com>
> >Cc: Ingo Molnar<mingo@kernel.org>
> >Cc: Jen-Cheng(Tommy) Huang<tommy24@gatech.edu>
> >Cc: Namhyung Kim<namhyung@kernel.org>
> >Cc: Paul Mackerras<paulus@samba.org>
> >Cc: Peter Zijlstra<a.p.zijlstra@chello.nl>
> >Cc: Stephane Eranian<eranian@google.com>
> >Signed-off-by: Jiri Olsa<jolsa@kernel.org>
> 
> 
> Acked-by: David Ahern <dsahern@gmail.com>

this one (referenced by earlier email) ;-)

thanks,
jirka

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 8/9] perf script: Add period data column
  2014-10-17 16:10     ` Jiri Olsa
@ 2014-10-17 18:22       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 39+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-10-17 18:22 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: David Ahern, Jiri Olsa, linux-kernel, Andi Kleen, Corey Ashford,
	Frederic Weisbecker, Ingo Molnar, Jen-Cheng(Tommy) Huang,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian

Em Fri, Oct 17, 2014 at 06:10:06PM +0200, Jiri Olsa escreveu:
> On Wed, Aug 27, 2014 at 08:33:53AM -0600, David Ahern wrote:
> > On 8/25/14, 8:45 AM, Jiri Olsa wrote:
> > >Adding period data column to be displayed in perf script.
> > >It's possible to get period values using -f option, like:
> > >
> > >   $ perf script -f comm,tid,time,period,ip,sym,dso
> > >           :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> > >           :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> > >           :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> > >           :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
> > >               ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
> > >               ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
> > >               ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
> > >               ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
> > >               ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
> > >               ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
> > >
> > >Cc: Andi Kleen <andi@firstfloor.org>
> > >Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > >Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> > >Cc: David Ahern <dsahern@gmail.com>
> > >Cc: Frederic Weisbecker <fweisbec@gmail.com>
> > >Cc: Ingo Molnar <mingo@kernel.org>
> > >Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
> > >Cc: Namhyung Kim <namhyung@kernel.org>
> > >Cc: Paul Mackerras <paulus@samba.org>
> > >Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > >Cc: Stephane Eranian <eranian@google.com>
> > >Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > >---
> > >  tools/perf/Documentation/perf-script.txt |  2 +-
> > >  tools/perf/builtin-script.c              | 12 +++++++++++-
> > >  2 files changed, 12 insertions(+), 2 deletions(-)
> > 
> > Acked-by: David Ahern <dsahern@gmail.com>
> 
> Arnaldo,
> could you please pick this one and patch 9/9?
> just realized this was left behind ;-)

Applied both.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [tip:perf/urgent] perf script: Add period data column
  2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
  2014-08-27 14:33   ` David Ahern
@ 2014-10-18  7:07   ` tip-bot for Jiri Olsa
  1 sibling, 0 replies; 39+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-10-18  7:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, tglx, fweisbec, jolsa, andi, mingo, acme, tommy24, paulus,
	namhyung, cjashfor, eranian, a.p.zijlstra, dsahern, linux-kernel

Commit-ID:  535aeaae7de821ba5d43ee2a204ee667ca95aae4
Gitweb:     http://git.kernel.org/tip/535aeaae7de821ba5d43ee2a204ee667ca95aae4
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 25 Aug 2014 16:45:42 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 17 Oct 2014 15:21:30 -0300

perf script: Add period data column

Adding period data column to be displayed in perf script.  It's possible
to get period values using -f option, like:

  $ perf script -f comm,tid,time,period,ip,sym,dso
          :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
              ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
              ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  2 +-
 tools/perf/builtin-script.c              | 12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 5a0160d..2149480 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -115,7 +115,7 @@ OPTIONS
 -f::
 --fields::
         Comma separated list of fields to print. Options are:
-        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline.
+        comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, srcline, period.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -f sw:comm,tid,time,ip,sym  and -f trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6b4925f..0659dff 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -44,6 +44,7 @@ enum perf_output_field {
 	PERF_OUTPUT_ADDR            = 1U << 10,
 	PERF_OUTPUT_SYMOFFSET       = 1U << 11,
 	PERF_OUTPUT_SRCLINE         = 1U << 12,
+	PERF_OUTPUT_PERIOD          = 1U << 13,
 };
 
 struct output_option {
@@ -63,6 +64,7 @@ struct output_option {
 	{.str = "addr",  .field = PERF_OUTPUT_ADDR},
 	{.str = "symoff", .field = PERF_OUTPUT_SYMOFFSET},
 	{.str = "srcline", .field = PERF_OUTPUT_SRCLINE},
+	{.str = "period", .field = PERF_OUTPUT_PERIOD},
 };
 
 /* default set to maintain compatibility with current format */
@@ -229,6 +231,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_CPU))
 		return -EINVAL;
 
+	if (PRINT_FIELD(PERIOD) &&
+		perf_evsel__check_stype(evsel, PERF_SAMPLE_PERIOD, "PERIOD",
+					PERF_OUTPUT_PERIOD))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -448,6 +455,9 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
 
 	print_sample_start(sample, thread, evsel);
 
+	if (PRINT_FIELD(PERIOD))
+		printf("%10" PRIu64 " ", sample->period);
+
 	if (PRINT_FIELD(EVNAME)) {
 		const char *evname = perf_evsel__name(evsel);
 		printf("%s: ", evname ? evname : "[unknown]");
@@ -1543,7 +1553,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "comma separated output fields prepend with 'type:'. "
 		     "Valid types: hw,sw,trace,raw. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
-		     "addr,symoff", parse_output_fields),
+		     "addr,symoff,period", parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [tip:perf/urgent] perf script: Add period as a default output column
  2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
  2014-08-27 14:40   ` David Ahern
@ 2014-10-18  7:07   ` tip-bot for Jiri Olsa
  1 sibling, 0 replies; 39+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-10-18  7:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: a.p.zijlstra, tommy24, dsahern, tglx, namhyung, linux-kernel,
	fweisbec, eranian, cjashfor, andi, mingo, hpa, paulus, acme,
	jolsa

Commit-ID:  e8564b710c6df2c3aeb56c507c22f4bcfa4c0b2d
Gitweb:     http://git.kernel.org/tip/e8564b710c6df2c3aeb56c507c22f4bcfa4c0b2d
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 25 Aug 2014 16:45:43 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 17 Oct 2014 15:22:19 -0300

perf script: Add period as a default output column

Adding period as a default output column in script command fo hardware,
software and raw events.

If PERF_SAMPLE_PERIOD sample type is defined in perf.data, following
will be displayed in perf script output:

  $ perf script
              ls  8034 57477.887209:     250000 task-clock:  ffffffff81361d72 memset ([kernel.kallsyms])
              ls  8034 57477.887464:     250000 task-clock:  ffffffff816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
              ls  8034 57477.887708:     250000 task-clock:  ffffffff811a94f0 do_munmap ([kernel.kallsyms])
              ls  8034 57477.887959:     250000 task-clock:        34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
              ls  8034 57477.888208:     250000 task-clock:        3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
              ls  8034 57477.888717:     250000 task-clock:  ffffffff814242c8 n_tty_write ([kernel.kallsyms])
              ls  8034 57477.889285:     250000 task-clock:        3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 0659dff..9708a12 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -82,7 +82,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -93,7 +94,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},
@@ -112,7 +114,8 @@ static struct {
 		.fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
 			      PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
 			      PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
-				  PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,
+			      PERF_OUTPUT_SYM | PERF_OUTPUT_DSO |
+			      PERF_OUTPUT_PERIOD,
 
 		.invalid_fields = PERF_OUTPUT_TRACE,
 	},

^ permalink raw reply related	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2014-10-18  7:08 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
2014-09-02 10:50   ` Peter Zijlstra
2014-09-08  9:43     ` Jiri Olsa
2014-09-08  9:45       ` Peter Zijlstra
2014-09-08  9:48         ` Peter Zijlstra
2014-09-08 10:01           ` Peter Zijlstra
2014-09-08 11:39             ` Peter Zijlstra
2014-09-08 12:19               ` Jiri Olsa
2014-09-08 13:20                 ` Peter Zijlstra
2014-09-08 12:32               ` Jiri Olsa
2014-09-08 12:01             ` Jiri Olsa
2014-09-08 13:34               ` Peter Zijlstra
2014-09-08 15:13                 ` Peter Zijlstra
2014-09-08 16:45                   ` Jiri Olsa
2014-09-09 10:20                     ` Peter Zijlstra
2014-09-10 13:57                     ` Peter Zijlstra
2014-09-10 14:35                       ` Jiri Olsa
2014-09-24 14:58                         ` [tip:perf/core] Revert "perf: Do not allow optimized switch for non-cloned events" tip-bot for Jiri Olsa
2014-08-25 14:45 ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
2014-09-02 10:52   ` Peter Zijlstra
2014-09-08 10:00     ` Jiri Olsa
2014-09-08 10:11       ` Peter Zijlstra
2014-09-08 16:39         ` Jiri Olsa
2014-08-25 14:45 ` [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 4/9] perf tools: Add support to traverse xyarrays Jiri Olsa
2014-08-25 14:45 ` [PATCH 5/9] perf tools: Add pr_warning_once debug macro Jiri Olsa
2014-08-25 14:45 ` [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
2014-08-25 14:45 ` [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
2014-08-27 14:33   ` David Ahern
2014-10-17 16:10     ` Jiri Olsa
2014-10-17 18:22       ` Arnaldo Carvalho de Melo
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
2014-08-27 14:40   ` David Ahern
2014-10-17 16:11     ` Jiri Olsa
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:51 ` [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).