From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752977Ab2BPPAg (ORCPT ); Thu, 16 Feb 2012 10:00:36 -0500
Received: from cam-admin0.cambridge.arm.com ([217.140.96.50]:50395 "EHLO
	cam-admin0.cambridge.arm.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751377Ab2BPPAf (ORCPT );
	Thu, 16 Feb 2012 10:00:35 -0500
Date: Thu, 16 Feb 2012 15:00:04 +0000
From: Will Deacon
To: Ming Lei
Cc: Peter Zijlstra , "eranian@gmail.com" , "Shilimkar, Santosh" ,
	David Long , "b-cousson@ti.com" , "mans@mansr.com" , linux-arm ,
	Ingo Molnar , Linux Kernel Mailing List
Subject: Re: oprofile and ARM A9 hardware counter
Message-ID: <20120216150004.GE2641@mudshark.cambridge.arm.com>
References: <1329323900.2293.150.camel@twins>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Thread-Topic: oprofile and ARM A9 hardware counter
Accept-Language: en-GB, en-US
Content-Language: en-US
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 16, 2012 at 10:25:05AM +0000, Ming Lei wrote:
> On Thu, Feb 16, 2012 at 12:38 AM, Peter Zijlstra wrote:
> >
> > So what this patch seems to do is put that filter on period in
> > perf_ctx_adjust_freq(). Not making sense.. nor can I see a HZ
> > dependency, perf_ctx_adjust_freq() uses TICK_NSEC as time base.
>
> Yes, you are right, I remembered it was observed it on -rc1, and
> Stephane's unthrottling
> patch was not merged at that time. Today I investigated the problem
> further on -rc3 and found that seems the problem is caused by arm pmu code.

As I reported previously, Stephane's patch is causing warnings on -rc3:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2012-February/084391.html

so I'd like to get to the bottom of that before changing anything else.
I'd also like to know why this has only been reported on OMAP4 and I can't
reproduce it on my boards.

> The patch below may fix the problem, now about 40000 sample events
> can be generated on the command:
>
> 'perf record -e cycles -F 4000 ./noploop 10&& perf report -D | tail -20'
>
> armpmu_event_update may be called in tick path, so the running counter
> will be overflowed and produce a great value of 'delta', then a mistaken
> count is stored into event->count and event->hw.freq_count_stamp. Finally
> the two variables are not synchronous, then a invalid and large period is
> computed and written to pmu, and sample events are decreased much.

Hmm, so are you observing an event overflow during the tick handler? This
should be fine unless the new value has wrapped past the previous one (i.e.
more than 2^32 events have occurred). I find this extremely unlikely for
sample-based profiling unless you have some major IRQ latency issues...

The only way I can think of improving this (bearing in mind that at some
point we're limited by 32 bits of counter) is to check for overflow in the
tick path and then invoke the PMU irq handler if there is an overflow, but
that's really not very nice.

> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 5bb91bf..789700a 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -193,13 +193,8 @@ again:
>                              new_raw_count) != prev_raw_count)
>                 goto again;
>
> -       new_raw_count &= armpmu->max_period;
> -       prev_raw_count &= armpmu->max_period;
> -
> -       if (overflow)
> -               delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
> -       else
> -               delta = new_raw_count - prev_raw_count;
> +       delta = (armpmu->max_period - prev_raw_count + new_raw_count
> +                       + 1) & armpmu->max_period;

This breaks when more than max_period events have passed. See a737823d
("ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency").

Will
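
To make the two delta computations in the quoted diff concrete, here is a
minimal standalone sketch. It assumes a 32-bit counter (max_period =
0xffffffff) and raw counts that already fit in the counter width. The names
MAX_PERIOD, delta_with_flag and delta_masked are made up for illustration
and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a 32-bit hardware counter. */
#define MAX_PERIOD 0xffffffffULL

/*
 * Mirrors the lines removed by the quoted diff: the caller passes an
 * explicit overflow flag, and a full period is added when it is set.
 */
static uint64_t delta_with_flag(uint64_t prev, uint64_t now, int overflow)
{
	prev &= MAX_PERIOD;
	now &= MAX_PERIOD;

	if (overflow)
		return MAX_PERIOD - prev + now + 1;

	/* Unsigned subtraction: wraps to a huge value when now < prev. */
	return now - prev;
}

/*
 * Mirrors the line added by the quoted diff: a difference modulo
 * (MAX_PERIOD + 1), with no overflow flag.
 */
static uint64_t delta_masked(uint64_t prev, uint64_t now)
{
	/* Sketch precondition: both raw counts fit in the counter. */
	assert(prev <= MAX_PERIOD && now <= MAX_PERIOD);

	return (MAX_PERIOD - prev + now + 1) & MAX_PERIOD;
}
```

With prev = 0xfffffff0 and now = 0x10, both variants agree on 0x20 when the
overflow flag is set. Without the flag, delta_with_flag returns
0xffffffff00000020, which looks like the large 'delta' described in the
quoted report. With prev = 0x100, now = 0x180 and the flag set (the counter
has wrapped past its previous value), delta_with_flag returns 0x100000080
while delta_masked returns 0x80, so a full period of events is dropped.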