From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752406Ab2BPKZL (ORCPT ); Thu, 16 Feb 2012 05:25:11 -0500
Received: from youngberry.canonical.com ([91.189.89.112]:52011 "EHLO
	youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751454Ab2BPKZI (ORCPT ); Thu, 16 Feb 2012 05:25:08 -0500
MIME-Version: 1.0
In-Reply-To: <1329323900.2293.150.camel@twins>
References: <1328578047.1724.17.camel@dave-Dell-System-XPS-L502X>
	<1329323900.2293.150.camel@twins>
Date: Thu, 16 Feb 2012 18:25:05 +0800
Message-ID:
Subject: Re: oprofile and ARM A9 hardware counter
From: Ming Lei
To: Peter Zijlstra
Cc: eranian@gmail.com, "Shilimkar, Santosh" , David Long ,
	b-cousson@ti.com, mans@mansr.com, will.deacon@arm.com, linux-arm ,
	Ingo Molnar , Linux Kernel Mailing List
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Thu, Feb 16, 2012 at 12:38 AM, Peter Zijlstra wrote:
>
> So what this patch seems to do is put that filter on period in
> perf_ctx_adjust_freq(). Not making sense.. nor can I see a HZ
> dependency, perf_ctx_adjust_freq() uses TICK_NSEC as time base.

Yes, you are right. I remember observing it on -rc1, when Stephane's
unthrottling patch had not been merged yet.

Today I investigated the problem further on -rc3 and found that it
seems to be caused by the ARM PMU code. The patch below may fix the
problem; about 40000 sample events are now generated by the command:

	'perf record -e cycles -F 4000 ./noploop 10 && perf report -D | tail -20'

armpmu_event_update may be called in the tick path, so the running
counter can already have overflowed, which produces a very large
'delta'. A wrong count is then stored into event->count and
event->hw.freq_count_stamp.
As a result the two variables get out of sync, an invalid, overly large
period is computed and written to the PMU, and far fewer sample events
are generated.

Will, this patch simplifies the 'delta' computation and no longer uses
the overflow flag, even though the flag can be read directly from
PMOVSR. Could you comment on the patch?

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 5bb91bf..789700a 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -193,13 +193,8 @@ again:
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
-	new_raw_count &= armpmu->max_period;
-	prev_raw_count &= armpmu->max_period;
-
-	if (overflow)
-		delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
-	else
-		delta = new_raw_count - prev_raw_count;
+	delta = (armpmu->max_period - prev_raw_count + new_raw_count
+							+ 1) & armpmu->max_period;
 
 	local64_add(delta, &event->count);
 	local64_sub(delta, &hwc->period_left);

thanks,
--
Ming Lei
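
For illustration, the two 'delta' computations in the diff can be
compared in a standalone C fragment. This is only a minimal sketch, not
kernel code: MAX_PERIOD, masked_delta(), flag_based_delta() and
self_check() are hypothetical names, MAX_PERIOD stands in for
armpmu->max_period on a 32-bit counter, the u64 types are an assumption,
and the counter is assumed to wrap at most once between two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for armpmu->max_period on a 32-bit counter. */
#define MAX_PERIOD 0xffffffffULL

/* Delta as in the added lines: modular subtraction, no overflow flag. */
static uint64_t masked_delta(uint64_t prev_raw, uint64_t new_raw)
{
	return (MAX_PERIOD - prev_raw + new_raw + 1) & MAX_PERIOD;
}

/* Delta as in the removed lines: the result depends on the caller's flag. */
static uint64_t flag_based_delta(uint64_t prev_raw, uint64_t new_raw,
				 int overflow)
{
	new_raw &= MAX_PERIOD;
	prev_raw &= MAX_PERIOD;

	if (overflow)
		return MAX_PERIOD - prev_raw + new_raw + 1;
	return new_raw - prev_raw;
}

/* Illustrative check: the counter wrapped from 0xfffffff0 to 0x10. */
void self_check(void)
{
	/* 16 counts up to the wrap plus 16 after it. */
	assert(masked_delta(0xfffffff0ULL, 0x10ULL) == 0x20);
	/* Same answer when the caller knows about the wrap. */
	assert(flag_based_delta(0xfffffff0ULL, 0x10ULL, 1) == 0x20);
	/* Without the flag the unsigned subtraction underflows. */
	assert(flag_based_delta(0xfffffff0ULL, 0x10ULL, 0) > MAX_PERIOD);
}
```

In this sketch masked_delta() returns the same value whether or not the
caller knows a wrap happened, while flag_based_delta() with overflow == 0
returns a value far above MAX_PERIOD for the same pair of reads.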