linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Olsa <jolsa@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>,
	linux-kernel@vger.kernel.org, Andi Kleen <andi@firstfloor.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Corey Ashford <cjashfor@linux.vnet.ibm.com>,
	David Ahern <dsahern@gmail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>,
	Namhyung Kim <namhyung@kernel.org>,
	Paul Mackerras <paulus@samba.org>,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH 1/9] perf: Remove redundant parent context check from context_equiv
Date: Mon, 8 Sep 2014 11:43:48 +0200	[thread overview]
Message-ID: <20140908094348.GB1172@krava.brq.redhat.com> (raw)
In-Reply-To: <20140902105036.GH5806@worktop.ger.corp.intel.com>

On Tue, Sep 02, 2014 at 12:50:36PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 25, 2014 at 04:45:35PM +0200, Jiri Olsa wrote:
> > As described in commit:
> >   1f9a7268c67f perf: Do not allow optimized switch for non-cloned events
> > we dont allow optimized switch for non cloned contexts.
> > 
> > There's no need now to check if one of the context is parent
> > of the other in context_equiv function.
> > 
> 
> OK, so talk to me about that patch; it looks like you're slowly
> reverting: 5a3126d4fe7c ("perf: Fix the perf context switch
> optimization").
> 
> So what exactly is the problem? I'm still jet-lagged and not seeing it.

the problem is, that we cannot allow non-cloned context
to be part of the optimized switch for reasons explained
in commit 1f9a7268c67f

I did not realize there was a just single commit introducing
non-cloned context switch to revert ;-) attached

I haven't tested it yet.. will do with my other changes

thanks,
jirka


---
This reverts commit 5a3126d4fe7c311fe12f98fef0470f6cb582d1ef.

As explained in commit:
  1f9a7268c67f perf: Do not allow optimized switch for non-cloned events

we cannot allow optimized switch for non cloned contexts. It
suppressed it by forcing the condition for parent contexts.

But what we actually should do is to revert the full commit that
introduced the optimized switch for non-cloned contexts.
This way we get rid of unneeded (now) ctx->generation updates
for non-cloned contexts.
---
 kernel/events/core.c | 64 +++++++++++++++-------------------------------------
 1 file changed, 18 insertions(+), 46 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d8cb4d21a346..424e6d3b07b7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -916,7 +916,6 @@ static void unclone_ctx(struct perf_event_context *ctx)
 		put_ctx(ctx->parent_ctx);
 		ctx->parent_ctx = NULL;
 	}
-	ctx->generation++;
 }
 
 static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
@@ -1154,8 +1153,6 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	ctx->nr_events++;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat++;
-
-	ctx->generation++;
 }
 
 /*
@@ -1333,8 +1330,6 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	 */
 	if (event->state > PERF_EVENT_STATE_OFF)
 		event->state = PERF_EVENT_STATE_OFF;
-
-	ctx->generation++;
 }
 
 static void perf_group_detach(struct perf_event *event)
@@ -2243,38 +2238,22 @@ static void ctx_sched_out(struct perf_event_context *ctx,
 }
 
 /*
- * Test whether two contexts are equivalent, i.e. whether they have both been
- * cloned from the same version of the same context.
- *
- * Equivalence is measured using a generation number in the context that is
- * incremented on each modification to it; see unclone_ctx(), list_add_event()
- * and list_del_event().
+ * Test whether two contexts are equivalent, i.e. whether they
+ * have both been cloned from the same version of the same context
+ * and they both have the same number of enabled events.
+ * If the number of enabled events is the same, then the set
+ * of enabled events should be the same, because these are both
+ * inherited contexts, therefore we can't access individual events
+ * in them directly with an fd; we can only enable/disable all
+ * events via prctl, or enable/disable all events in a family
+ * via ioctl, which will have the same effect on both contexts.
  */
 static int context_equiv(struct perf_event_context *ctx1,
 			 struct perf_event_context *ctx2)
 {
-	/* Pinning disables the swap optimization */
-	if (ctx1->pin_count || ctx2->pin_count)
-		return 0;
-
-	/* If ctx1 is the parent of ctx2 */
-	if (ctx1 == ctx2->parent_ctx && ctx1->generation == ctx2->parent_gen)
-		return 1;
-
-	/* If ctx2 is the parent of ctx1 */
-	if (ctx1->parent_ctx == ctx2 && ctx1->parent_gen == ctx2->generation)
-		return 1;
-
-	/*
-	 * If ctx1 and ctx2 have the same parent; we flatten the parent
-	 * hierarchy, see perf_event_init_context().
-	 */
-	if (ctx1->parent_ctx && ctx1->parent_ctx == ctx2->parent_ctx &&
-			ctx1->parent_gen == ctx2->parent_gen)
-		return 1;
-
-	/* Unmatched */
-	return 0;
+	return ctx1->parent_ctx && ctx1->parent_ctx == ctx2->parent_ctx
+		&& ctx1->parent_gen == ctx2->parent_gen
+		&& !ctx1->pin_count && !ctx2->pin_count;
 }
 
 static void __perf_event_sync_stat(struct perf_event *event,
@@ -2354,7 +2333,7 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 {
 	struct perf_event_context *ctx = task->perf_event_ctxp[ctxn];
 	struct perf_event_context *next_ctx;
-	struct perf_event_context *parent, *next_parent;
+	struct perf_event_context *parent;
 	struct perf_cpu_context *cpuctx;
 	int do_switch = 1;
 
@@ -2366,18 +2345,10 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 		return;
 
 	rcu_read_lock();
-	next_ctx = next->perf_event_ctxp[ctxn];
-	if (!next_ctx)
-		goto unlock;
-
 	parent = rcu_dereference(ctx->parent_ctx);
-	next_parent = rcu_dereference(next_ctx->parent_ctx);
-
-	/* If neither context have a parent context; they cannot be clones. */
-	if (!parent || !next_parent)
-		goto unlock;
-
-	if (next_parent == ctx || next_ctx == parent || next_parent == parent) {
+	next_ctx = next->perf_event_ctxp[ctxn];
+	if (parent && next_ctx &&
+	    rcu_dereference(next_ctx->parent_ctx) == parent) {
 		/*
 		 * Looks like the two contexts are clones, so we might be
 		 * able to optimize the context switch.  We lock both
@@ -2405,7 +2376,6 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 		raw_spin_unlock(&next_ctx->lock);
 		raw_spin_unlock(&ctx->lock);
 	}
-unlock:
 	rcu_read_unlock();
 
 	if (do_switch) {
@@ -7398,6 +7368,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	perf_install_in_context(ctx, event, event->cpu);
+	++ctx->generation;
 	perf_unpin_context(ctx);
 	mutex_unlock(&ctx->mutex);
 
@@ -7484,6 +7455,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 	WARN_ON_ONCE(ctx->parent_ctx);
 	mutex_lock(&ctx->mutex);
 	perf_install_in_context(ctx, event, cpu);
+	++ctx->generation;
 	perf_unpin_context(ctx);
 	mutex_unlock(&ctx->mutex);
 
-- 
1.8.3.1


  reply	other threads:[~2014-09-08  9:44 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-25 14:45 [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 1/9] perf: Remove redundant parent context check from context_equiv Jiri Olsa
2014-09-02 10:50   ` Peter Zijlstra
2014-09-08  9:43     ` Jiri Olsa [this message]
2014-09-08  9:45       ` Peter Zijlstra
2014-09-08  9:48         ` Peter Zijlstra
2014-09-08 10:01           ` Peter Zijlstra
2014-09-08 11:39             ` Peter Zijlstra
2014-09-08 12:19               ` Jiri Olsa
2014-09-08 13:20                 ` Peter Zijlstra
2014-09-08 12:32               ` Jiri Olsa
2014-09-08 12:01             ` Jiri Olsa
2014-09-08 13:34               ` Peter Zijlstra
2014-09-08 15:13                 ` Peter Zijlstra
2014-09-08 16:45                   ` Jiri Olsa
2014-09-09 10:20                     ` Peter Zijlstra
2014-09-10 13:57                     ` Peter Zijlstra
2014-09-10 14:35                       ` Jiri Olsa
2014-09-24 14:58                         ` [tip:perf/core] Revert "perf: Do not allow optimized switch for non-cloned events" tip-bot for Jiri Olsa
2014-08-25 14:45 ` [PATCH 2/9] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
2014-09-02 10:52   ` Peter Zijlstra
2014-09-08 10:00     ` Jiri Olsa
2014-09-08 10:11       ` Peter Zijlstra
2014-09-08 16:39         ` Jiri Olsa
2014-08-25 14:45 ` [PATCH 3/9] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 4/9] perf tools: Add support to traverse xyarrays Jiri Olsa
2014-08-25 14:45 ` [PATCH 5/9] perf tools: Add pr_warning_once debug macro Jiri Olsa
2014-08-25 14:45 ` [PATCH 6/9] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
2014-08-25 14:45 ` [PATCH 7/9] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
2014-08-25 14:45 ` [PATCH 8/9] perf script: Add period data column Jiri Olsa
2014-08-27 14:33   ` David Ahern
2014-10-17 16:10     ` Jiri Olsa
2014-10-17 18:22       ` Arnaldo Carvalho de Melo
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:45 ` [PATCH 9/9] perf script: Add period as a default output column Jiri Olsa
2014-08-27 14:40   ` David Ahern
2014-10-17 16:11     ` Jiri Olsa
2014-10-18  7:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2014-08-25 14:51 ` [RFCv2 0/9] perf: Allow leader sampling on inherited events Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140908094348.GB1172@krava.brq.redhat.com \
    --to=jolsa@redhat.com \
    --cc=acme@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=cjashfor@linux.vnet.ibm.com \
    --cc=dsahern@gmail.com \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=tommy24@gatech.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).