From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964774AbbDMVCX (ORCPT ); Mon, 13 Apr 2015 17:02:23 -0400 Received: from www.linutronix.de ([62.245.132.108]:33727 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932939AbbDMVCK (ORCPT ); Mon, 13 Apr 2015 17:02:10 -0400 Message-Id: <20150413210035.020255651@linutronix.de> User-Agent: quilt/0.63-1 Date: Mon, 13 Apr 2015 21:02:18 -0000 From: Thomas Gleixner To: LKML Cc: Peter Zijlstra , Ingo Molnar , Stephane Eranian , Jiri Olsa , stable@vger.kernel.org, Arnaldo Carvalho de Melo Subject: [patch 1/5] perf: Fixup hrtimer forward wreckage References: <20150413210009.682000343@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Content-Disposition: inline; filename=perf-remove-hrtimer-forward-wreckage.patch X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org perf_event_mux_interval_ms_store() tries to apply an updated hrtimer_interval to all possible cpus in a completely unsafe way. The changelog of the offending commit says: "In the 5th version, we handle the reprogramming of the hrtimer using hrtimer_forward_now(). That way, we sync up to new timer value quickly (suggested by Jiri Olsa)." The hrtimer reprogramming is completely unrelated to hrtimer_forward_now(). hrtimer_forward_now() merily forwards the expiry time of the hrtimer past now and that's where things get really bad in this code: The forward function is called on enqueued timers. That means the update of the expiry time corrupts the ordering of the hrtimer rbtree. The proper way to update hrtimers on remote cpus is to use a smp function call on all online cpus and perform the update locally by canceling the timer, forwarding and restarting it. Fixes: 62b856397927 "perf: Add sysfs entry to adjust multiplexing interval per PMU" Signed-off-by: Thomas Gleixner Cc: Stephane Eranian Cc: Jiri Olsa Cc: stable@vger.kernel.org Cc: Arnaldo Carvalho de Melo --- kernel/events/core.c | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) Index: linux/kernel/events/core.c =================================================================== --- linux.orig/kernel/events/core.c +++ linux/kernel/events/core.c @@ -6827,13 +6827,27 @@ perf_event_mux_interval_ms_show(struct d return snprintf(page, PAGE_SIZE-1, "%d\n", pmu->hrtimer_interval_ms); } +static void perf_event_mux_update_interval(void *info) +{ + struct pmu *pmu = info; + struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context); + int was_armed = hrtimer_cancel(&cpuctx->hrtimer); + + /* Update the interval and restart the timer with the new interval */ + cpuctx->hrtimer_interval = ms_to_ktime(pmu->hrtimer_interval_ms); + if (was_armed) { + hrtimer_forward_now(&cpuctx->hrtimer, cpuctx->hrtimer_interval); + hrtimer_start_expires(&cpuctx->hrtimer, HRTIMER_MODE_ABS); + } +} + static ssize_t perf_event_mux_interval_ms_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct pmu *pmu = dev_get_drvdata(dev); - int timer, cpu, ret; + int timer, ret; ret = kstrtoint(buf, 0, &timer); if (ret) @@ -6842,22 +6856,12 @@ perf_event_mux_interval_ms_store(struct if (timer < 1) return -EINVAL; - /* same value, noting to do */ - if (timer == pmu->hrtimer_interval_ms) - return count; - - pmu->hrtimer_interval_ms = timer; - - /* update all cpuctx for this PMU */ - for_each_possible_cpu(cpu) { - struct perf_cpu_context *cpuctx; - cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu); - cpuctx->hrtimer_interval = ns_to_ktime(NSEC_PER_MSEC * timer); - - if (hrtimer_active(&cpuctx->hrtimer)) - hrtimer_forward_now(&cpuctx->hrtimer, cpuctx->hrtimer_interval); + if (timer != pmu->hrtimer_interval_ms) { + get_online_cpus(); + pmu->hrtimer_interval_ms = timer; + on_each_cpu(perf_event_mux_update_interval, pmu, 1); + put_online_cpus(); } - return count; } static DEVICE_ATTR_RW(perf_event_mux_interval_ms);