From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757973Ab1BKSWH (ORCPT ); Fri, 11 Feb 2011 13:22:07 -0500 Received: from bombadil.infradead.org ([18.85.46.34]:55313 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757906Ab1BKSWE (ORCPT ); Fri, 11 Feb 2011 13:22:04 -0500 Subject: Re: perf on 2.6.38-rc4 wedges my box From: Peter Zijlstra To: David Ahern Cc: Jeff Moyer , linux-kernel@vger.kernel.org, Ingo Molnar , Paul Mackerras , Arnaldo Carvalho de Melo In-Reply-To: <1297446806.5226.44.camel@laptop> References: <1297373905.5226.31.camel@laptop> <4D55654F.6090408@gmail.com> <1297446806.5226.44.camel@laptop> Content-Type: text/plain; charset="UTF-8" Date: Fri, 11 Feb 2011 19:23:09 +0100 Message-ID: <1297448589.5226.47.camel@laptop> Mime-Version: 1.0 X-Mailer: Evolution 2.30.3 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 2011-02-11 at 18:53 +0100, Peter Zijlstra wrote: > On Fri, 2011-02-11 at 09:35 -0700, David Ahern wrote: > > I'm guessing in your case perf is using hardware cycles for profiling. > > > > I was able to reproduce the lockup in a VM which uses cpu-clock for > > profiling - like Jeff's case. The VM is running Fedora 14 with > > 2.6.38-rc4. > > > Ah, indeed, when I use: > > perf record -gfe task-clock -- ./aio-stress -O -o 0 -r 4 -d 32 -b 16 /dev/sdb > > things did come apart, something like the below cured that problem (but > did show the pending softirq thing and triggered something iffy in the > backtrace code -- will have to stare at those still) So while this doesn't explain these weird things, it should have at least one race less -- hrtimer_init() on a possible still running timer didn't seem like a very good idea, also since hrtimers are nsec the whole freq thing seemed unnecessary. --- kernel/perf_event.c | 37 +++++++++++++++++++++++++++++++++---- 1 files changed, 33 insertions(+), 4 deletions(-) diff --git a/kernel/perf_event.c b/kernel/perf_event.c index 999835b..08428b3 100644 --- a/kernel/perf_event.c +++ b/kernel/perf_event.c @@ -5051,6 +5051,10 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer) u64 period; event = container_of(hrtimer, struct perf_event, hw.hrtimer); + + if (event->state < PERF_EVENT_STATE_ACTIVE) + return HRTIMER_NORESTART; + event->pmu->read(event); perf_sample_data_init(&data, 0); @@ -5077,9 +5081,6 @@ static void perf_swevent_start_hrtimer(struct perf_event *event) if (!is_sampling_event(event)) return; - hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); - hwc->hrtimer.function = perf_swevent_hrtimer; - period = local64_read(&hwc->period_left); if (period) { if (period < 0) @@ -5102,7 +5103,31 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event) ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer); local64_set(&hwc->period_left, ktime_to_ns(remaining)); - hrtimer_cancel(&hwc->hrtimer); + hrtimer_try_to_cancel(&hwc->hrtimer); + } +} + +static void perf_swevent_init_hrtimer(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + + if (!is_sampling_event(event)) + return; + + hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hwc->hrtimer.function = perf_swevent_hrtimer; + + /* + * Since hrtimers have a fixed rate, we can do a static freq->period + * mapping and avoid the whole period adjust feedback stuff. + */ + if (event->attr.freq) { + long freq = event->attr.sample_freq; + + event->attr.sample_period = NSEC_PER_SEC / freq; + hwc->sample_period = event->attr.sample_period; + local64_set(&hwc->period_left, hwc->sample_period); + event->attr.freq = 0; } } @@ -5158,6 +5183,8 @@ static int cpu_clock_event_init(struct perf_event *event) if (event->attr.config != PERF_COUNT_SW_CPU_CLOCK) return -ENOENT; + perf_swevent_init_hrtimer(event); + return 0; } @@ -5235,6 +5262,8 @@ static int task_clock_event_init(struct perf_event *event) if (event->attr.config != PERF_COUNT_SW_TASK_CLOCK) return -ENOENT; + perf_swevent_init_hrtimer(event); + return 0; }