linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] perf: fix corruption of sibling list with hotplug
@ 2014-11-05 16:11 Mark Rutland
  2014-11-05 16:11 ` [PATCH 1/1] " Mark Rutland
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Rutland @ 2014-11-05 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, drew.richardson, mingo, a.p.zijlstra, vincent.weaver,
	will.deacon, Mark Rutland

Hi,

The following patch prevents events from being freed while still part of a
sibling list, which can happen if all references to CPU-bound events in a
sibling list are dropped while the relevant CPU is offline.

We currnetly rely on a cross call to __perf_remove_from_context to remove an
event from its sibling list when it is destroyed, but this can fail for
CPU-bound events if the CPU is already offline. We then free the event without
having removed it from the sibling list, leaving siblings pointing at memory
that may be reallocated at any time.

To work around this I forcefully tear apart CPU-bound event groups upon CPU hot
unplug. We already force the events out of their context, and they'll never be
scheduled again. We could instead try to tear apart the groups from another CPU
if the cross-call fails, but that seemed more complex.

Without this patch, the below test case (when run with sufficient privileges)
can trigger a dereference of a garbage pointer in perf_group_detach, leading to
an Oops and a possible panic.

Thanks,
Mark.

---->8----
#include <errno.h>
#include <fcntl.h>
#include <linux/perf_event.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define CPU 1
#define DELAY_SECS 4

#define _STR(x) #x
#define STR(x) _STR(x)
#define CPU_S STR(CPU)

int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
			       int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int cpufd;

/* Arbitrary CPU-bound event */
struct perf_event_attr attr = {
	.type = PERF_TYPE_SOFTWARE,
	.config = PERF_COUNT_SW_CONTEXT_SWITCHES,
	.sample_period = 100000000,
	.sample_type = PERF_SAMPLE_TIME,
};

void trigger(void)
{
	int leader, follower;

	leader = sys_perf_event_open(&attr, -1, CPU, -1, 0);
	if (leader < 0)
		exit(errno);

	follower = sys_perf_event_open(&attr, -1, CPU, leader, 0);
	if (follower < 0)
		exit(errno);

	/* Make cpu_call_funcion on CPU fail when killing the follower */
	if (write(cpufd, "0", 1) != 1)
		exit(EIO);

	close(follower);

	/* Ensure cpu_call_function succeeds when killing the leader */
	if (write(cpufd, "1", 1) != 1)
		exit(EIO);

	/* hope something else re-uses the memory */
	sleep(DELAY_SECS);

	close(leader);
}

int main(void)
{
	cpufd = open("/sys/devices/system/cpu/cpu" CPU_S "/online", O_WRONLY);
	if (cpufd < 0)
		exit(errno);

	/* make sure the CPU is online to begin with */
	if (write(cpufd, "1", 1) != 1)
		exit(EIO);

	for (;;)
		trigger();

	return 0;
}
---->8----

Mark Rutland (1):
  perf: fix corruption of sibling list with hotplug

 kernel/events/core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/1] perf: fix corruption of sibling list with hotplug
  2014-11-05 16:11 [PATCH 0/1] perf: fix corruption of sibling list with hotplug Mark Rutland
@ 2014-11-05 16:11 ` Mark Rutland
  2014-11-16  9:49   ` [tip:perf/urgent] perf: Fix " tip-bot for Mark Rutland
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Rutland @ 2014-11-05 16:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, drew.richardson, mingo, a.p.zijlstra, vincent.weaver,
	will.deacon, Mark Rutland

When a CPU hotplugged out, we call perf_remove_from_context (via
perf_event_exit_cpu) to rip each CPU-bound event out of its PMU's cpu
context, but leave siblings grouped together. Freeing of these events is
left to the mercy of the usual refcounting.

When a CPU-bound event's refcount drops to zero we cross-call to
__perf_remove_from_context to clean it up, detaching grouped siblings.
This works when the relevant CPU is online, but will fail if the CPU is
currently offline, and we won't detach the event from its siblings
before freeing the event, leaving the sibling list corrupt. If the
sibling list is later walked (e.g. because the CPU cam online again
before a remaining sibling's refcount drops to zero), we will walk the
now corrupted siblings list, potentially dereferencing garbage values.

Given that the events should never be scheduled again (as we removed
them from their context), we can simply detatch siblings when the CPU
goes down in the first place. If the CPU comes back online, the
redundant call to __perf_remove_from_context is safe.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Drew Richardson <drew.richardson@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
---
 kernel/events/core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2b02c9f..1cd5eef 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1562,8 +1562,10 @@ static void perf_remove_from_context(struct perf_event *event, bool detach_group
 
 	if (!task) {
 		/*
-		 * Per cpu events are removed via an smp call and
-		 * the removal is always successful.
+		 * Per cpu events are removed via an smp call. The removal can
+		 * fail if the CPU is currently offline, but in that case we
+		 * already called __perf_remove_from_context from
+		 * perf_event_exit_cpu.
 		 */
 		cpu_function_call(event->cpu, __perf_remove_from_context, &re);
 		return;
@@ -8117,7 +8119,7 @@ static void perf_pmu_rotate_stop(struct pmu *pmu)
 
 static void __perf_event_exit_context(void *__info)
 {
-	struct remove_event re = { .detach_group = false };
+	struct remove_event re = { .detach_group = true };
 	struct perf_event_context *ctx = __info;
 
 	perf_pmu_rotate_stop(ctx->pmu);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [tip:perf/urgent] perf: Fix corruption of sibling list with hotplug
  2014-11-05 16:11 ` [PATCH 1/1] " Mark Rutland
@ 2014-11-16  9:49   ` tip-bot for Mark Rutland
  0 siblings, 0 replies; 3+ messages in thread
From: tip-bot for Mark Rutland @ 2014-11-16  9:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: vincent.weaver, tglx, linux-kernel, mingo, will.deacon, acme,
	hpa, peterz, drew.richardson, mark.rutland, torvalds

Commit-ID:  226424eee809251ec23bd4b09d8efba09c10fc3c
Gitweb:     http://git.kernel.org/tip/226424eee809251ec23bd4b09d8efba09c10fc3c
Author:     Mark Rutland <mark.rutland@arm.com>
AuthorDate: Wed, 5 Nov 2014 16:11:44 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 09:45:46 +0100

perf: Fix corruption of sibling list with hotplug

When a CPU hotplugged out, we call perf_remove_from_context() (via
perf_event_exit_cpu()) to rip each CPU-bound event out of its PMU's cpu
context, but leave siblings grouped together. Freeing of these events is
left to the mercy of the usual refcounting.

When a CPU-bound event's refcount drops to zero we cross-call to
__perf_remove_from_context() to clean it up, detaching grouped siblings.

This works when the relevant CPU is online, but will fail if the CPU is
currently offline, and we won't detach the event from its siblings
before freeing the event, leaving the sibling list corrupt. If the
sibling list is later walked (e.g. because the CPU cam online again
before a remaining sibling's refcount drops to zero), we will walk the
now corrupted siblings list, potentially dereferencing garbage values.

Given that the events should never be scheduled again (as we removed
them from their context), we can simply detatch siblings when the CPU
goes down in the first place. If the CPU comes back online, the
redundant call to __perf_remove_from_context() is safe.

Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: vincent.weaver@maine.edu
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1415203904-25308-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2b02c9f..1cd5eef 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1562,8 +1562,10 @@ static void perf_remove_from_context(struct perf_event *event, bool detach_group
 
 	if (!task) {
 		/*
-		 * Per cpu events are removed via an smp call and
-		 * the removal is always successful.
+		 * Per cpu events are removed via an smp call. The removal can
+		 * fail if the CPU is currently offline, but in that case we
+		 * already called __perf_remove_from_context from
+		 * perf_event_exit_cpu.
 		 */
 		cpu_function_call(event->cpu, __perf_remove_from_context, &re);
 		return;
@@ -8117,7 +8119,7 @@ static void perf_pmu_rotate_stop(struct pmu *pmu)
 
 static void __perf_event_exit_context(void *__info)
 {
-	struct remove_event re = { .detach_group = false };
+	struct remove_event re = { .detach_group = true };
 	struct perf_event_context *ctx = __info;
 
 	perf_pmu_rotate_stop(ctx->pmu);

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-11-16  9:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-05 16:11 [PATCH 0/1] perf: fix corruption of sibling list with hotplug Mark Rutland
2014-11-05 16:11 ` [PATCH 1/1] " Mark Rutland
2014-11-16  9:49   ` [tip:perf/urgent] perf: Fix " tip-bot for Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).