linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Mackerras <paulus@samba.org>
To: Ingo Molnar <mingo@elte.hu>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] perf_counter: Prevent oopses from per-cpu software counters
Date: Thu, 5 Feb 2009 15:52:21 +1100	[thread overview]
Message-ID: <18826.28805.120988.303819@drongo.ozlabs.ibm.com> (raw)

Impact: oops fix

Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
counter as a per-cpu counter would reliably crash the system, because
it calls __task_delta_exec with a null pointer.  And indeed, a "task
clock" counter only makes sense as a per-task counter.  Similarly,
counting page faults, context switches or cpu migrations only makes
sense for a per-task counter.

This fixes the problem by disallowing the use of the task clock,
page fault, context switch and cpu migration software counters as
per-cpu counters, since they all require a task context to obtain their
data.  The only software counter that can be used as a per-cpu counter
is the cpu clock counter (PERF_COUNT_CPU_CYCLES).

In order for sw_perf_counter_init to be able to tell whether we are
setting up a per-task or a per-cpu counter, this arranges for counter->ctx
to be initialized earlier, in perf_counter_alloc.

The other minor change this makes is to ensure that if sw_perf_counter_init
fails, we don't try to initialize the counter as a hardware counter.
Since the user has passed a negative event type (and it isn't raw), they
clearly don't intend it to be interpreted as a hardware event.  This
matters now that sw_perf_counter_init can fail for valid software event
types (because of the check that the counter is a per-task counter).

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
This is in my perfcounters.git tree master branch at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters.git master

Ingo - pull if you agree with it.  The check in sw_perf_counter_init
may need to be modified if we add more software counters, of course,
but that can happen later.

diff --git a/kernel/perf_counter.c b/kernel/perf_counter.c
index f27a7e9..47d91da 100644
--- a/kernel/perf_counter.c
+++ b/kernel/perf_counter.c
@@ -502,7 +502,6 @@ perf_install_in_context(struct perf_counter_context *ctx,
 {
 	struct task_struct *task = ctx->task;
 
-	counter->ctx = ctx;
 	if (!task) {
 		/*
 		 * Per cpu counters are installed via an smp call and
@@ -1564,6 +1563,15 @@ sw_perf_counter_init(struct perf_counter *counter)
 {
 	const struct hw_perf_counter_ops *hw_ops = NULL;
 
+	/*
+	 * PERF_COUNT_CPU_CLOCK is the only type that is valid
+	 * for a per-cpu counter.  All the others require a task
+	 * context and thus can only be used as per-task counters.
+	 */
+	if (!counter->ctx->task &&
+	    counter->hw_event.type != PERF_COUNT_CPU_CLOCK)
+		return NULL;
+
 	switch (counter->hw_event.type) {
 	case PERF_COUNT_CPU_CLOCK:
 		hw_ops = &perf_ops_cpu_clock;
@@ -1592,6 +1600,7 @@ sw_perf_counter_init(struct perf_counter *counter)
 static struct perf_counter *
 perf_counter_alloc(struct perf_counter_hw_event *hw_event,
 		   int cpu,
+		   struct perf_counter_context *ctx,
 		   struct perf_counter *group_leader,
 		   gfp_t gfpflags)
 {
@@ -1623,6 +1632,7 @@ perf_counter_alloc(struct perf_counter_hw_event *hw_event,
 	counter->wakeup_pending		= 0;
 	counter->group_leader		= group_leader;
 	counter->hw_ops			= NULL;
+	counter->ctx			= ctx;
 
 	counter->state = PERF_COUNTER_STATE_INACTIVE;
 	if (hw_event->disabled)
@@ -1631,7 +1641,7 @@ perf_counter_alloc(struct perf_counter_hw_event *hw_event,
 	hw_ops = NULL;
 	if (!hw_event->raw && hw_event->type < 0)
 		hw_ops = sw_perf_counter_init(counter);
-	if (!hw_ops)
+	else
 		hw_ops = hw_perf_counter_init(counter);
 
 	if (!hw_ops) {
@@ -1707,7 +1717,8 @@ sys_perf_counter_open(struct perf_counter_hw_event *hw_event_uptr __user,
 	}
 
 	ret = -EINVAL;
-	counter = perf_counter_alloc(&hw_event, cpu, group_leader, GFP_KERNEL);
+	counter = perf_counter_alloc(&hw_event, cpu, ctx, group_leader,
+				     GFP_KERNEL);
 	if (!counter)
 		goto err_put_context;
 
@@ -1777,15 +1788,14 @@ inherit_counter(struct perf_counter *parent_counter,
 		parent_counter = parent_counter->parent;
 
 	child_counter = perf_counter_alloc(&parent_counter->hw_event,
-					    parent_counter->cpu, group_leader,
-					    GFP_KERNEL);
+					   parent_counter->cpu, child_ctx,
+					   group_leader, GFP_KERNEL);
 	if (!child_counter)
 		return NULL;
 
 	/*
 	 * Link it up in the child's context:
 	 */
-	child_counter->ctx = child_ctx;
 	child_counter->task = child;
 	list_add_counter(child_counter, child_ctx);
 	child_ctx->nr_counters++;


             reply	other threads:[~2009-02-05  4:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-05  4:52 Paul Mackerras [this message]
2009-02-05 14:22 ` [PATCH] perf_counter: Prevent oopses from per-cpu software counters Ingo Molnar
2009-02-06  2:40   ` Paul Mackerras
2009-02-06  4:34   ` Paul Mackerras
2009-02-06  6:35     ` Zhang, Yanmin
2009-02-06 15:31     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18826.28805.120988.303819@drongo.ozlabs.ibm.com \
    --to=paulus@samba.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=yanmin_zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).