linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Olsa <jolsa@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>, Andi Kleen <andi@firstfloor.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Corey Ashford <cjashfor@linux.vnet.ibm.com>,
	David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>,
	Namhyung Kim <namhyung@kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Stephane Eranian <eranian@google.com>
Subject: [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
Date: Mon, 25 Aug 2014 16:45:36 +0200	[thread overview]
Message-ID: <1408977943-16594-3-git-send-email-jolsa@kernel.org> (raw)
In-Reply-To: <1408977943-16594-1-git-send-email-jolsa@kernel.org>

The optimized task context switch for cloned perf events just
swaps whole perf event contexts (of current and next process)
if it finds them suitable. Events from the 'current' context
will now measure data of the 'next' context and vice versa.

This is ok for cases where we are not directly interested in
the event->count value of separate child events, like:
  - standard sampling, where we take 'period' value for the
    event count
  - counting, where we accumulate all events (children)
    into a single count value

But in case we read event by using the PERF_SAMPLE_READ sample
type, we are interested in direct event->count value measured
in specific task. Switching events within tasks for this kind
of measurements corrupts data.

Fixing this by setting/unsetting pin_count for perf event context
once cloned event with PERF_SAMPLE_READ read is added/removed.
The pin_count value != 0 makes the context not suitable for
optimized switch.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4ad4ba2bc106..ff6a17607ddb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1117,6 +1117,12 @@ ctx_group_list(struct perf_event *event, struct perf_event_context *ctx)
 		return &ctx->flexible_groups;
 }
 
+static bool is_clone_with_read(struct perf_event *event)
+{
+	return event->parent &&
+	       (event->attr.sample_type & PERF_SAMPLE_READ);
+}
+
 /*
  * Add a event from the lists for its context.
  * Must be called with ctx->mutex and ctx->lock held.
@@ -1148,6 +1154,9 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack++;
 
+	if (is_clone_with_read(event))
+		ctx->pin_count++;
+
 	list_add_rcu(&event->event_entry, &ctx->event_list);
 	if (!ctx->nr_events)
 		perf_pmu_rotate_start(ctx->pmu);
@@ -1313,6 +1322,9 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack--;
 
+	if (is_clone_with_read(event))
+		ctx->pin_count--;
+
 	ctx->nr_events--;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat--;
-- 
1.8.3.1


  parent reply	other threads:[~2014-08-25 14:46 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
2014-09-02 10:50   ` Peter Zijlstra
2014-09-08  9:43     ` Jiri Olsa
2014-09-08  9:45       ` Peter Zijlstra
2014-09-08  9:48         ` Peter Zijlstra
2014-09-08 10:01           ` Peter Zijlstra
2014-09-08 11:39             ` Peter Zijlstra
2014-09-08 12:19               ` Jiri Olsa
2014-09-08 13:20                 ` Peter Zijlstra
2014-09-08 12:32               ` Jiri Olsa
2014-09-08 12:01             ` Jiri Olsa
2014-09-08 13:34               ` Peter Zijlstra
2014-09-08 15:13                 ` Peter Zijlstra
2014-09-08 16:45                   ` Jiri Olsa
2014-09-09 10:20                     ` Peter Zijlstra
2014-09-10 13:57                     ` Peter Zijlstra
2014-09-10 14:35                       ` Jiri Olsa
2014-09-24 14:58                         ` [tip:perf/core] Revert "perf: Do not allow optimized switch for non-cloned events" tip-bot for Jiri Olsa
2014-08-25 14:45 ` Jiri Olsa [this message]
2014-09-02 10:52   ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Peter Zijlstra
2014-09-08 10:00     ` Jiri Olsa
2014-09-08 10:11       ` Peter Zijlstra
2014-09-08 16:39         ` Jiri Olsa
2014-08-25 14:45 ` [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 4/9] perf tools: Add support to traverse xyarrays Jiri Olsa
2014-08-25 14:45 ` [PATCH 5/9] perf tools: Add pr_warning_once debug macro Jiri Olsa
2014-08-25 14:45 ` [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
2014-08-25 14:45 ` [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
2014-08-27 14:33   ` David Ahern
2014-10-17 16:10     ` Jiri Olsa
2014-10-17 18:22       ` Arnaldo Carvalho de Melo
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
2014-08-27 14:40   ` David Ahern
2014-10-17 16:11     ` Jiri Olsa
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:51 ` [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1408977943-16594-3-git-send-email-jolsa@kernel.org \
    --to=jolsa@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=cjashfor@linux.vnet.ibm.com \
    --cc=dsahern@gmail.com \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=paulus@samba.org \
    --cc=tommy24@gatech.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).