In-Reply-To: <20150506045239.GA12393@gmail.com>
References: <1430848712-28064-1-git-send-email-mingo@kernel.org>
 <1430848712-28064-47-git-send-email-mingo@kernel.org>
 <20150506045239.GA12393@gmail.com>
From: Andy Lutomirski
Date: Wed, 6 May 2015 19:52:19 -0700
Subject: Re: [PATCH 207/208] x86/fpu: Add FPU performance measurement subsystem
To: Ingo Molnar
Cc: Borislav Petkov, Fenghua Yu, Dave Hansen, Thomas Gleixner,
 Linus Torvalds, Oleg Nesterov, "linux-kernel@vger.kernel.org",
 "H. Peter Anvin"

On May 6, 2015 10:22 AM, "Ingo Molnar" wrote:
>
>
> * Andy Lutomirski wrote:
>
> > On May 5, 2015 11:30 PM, "Ingo Molnar" wrote:
> > >
> > > Add a short FPU performance suite that runs once during bootup.
> > >
> > > It can be enabled via CONFIG_X86_DEBUG_FPU_PERFORMANCE=y.
> >
> > Neat!
> >
> > Can you change "cycles" to "TSC ticks"? They're not quite the same thing.
>
> Yeah, with constant TSC we have the magic TSC frequency that is used
> by RDTSC.
>
> I'm torn: 'TSC ticks' will mean very little to most people reading
> that output. We could convert it to nsecs with a little bit of
> calibration - but that makes it depend on small differences in CPU
> model frequencies, while the (cached) cycle costs are typically
> constant per microarchitecture.

Isn't it dependent on the ratio of max turbo frequency to TSC freq?
Typical non-ultra-mobile systems should be at or near max turbo
frequency during bootup.
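
To make the units concrete, here is a minimal sketch of the two
conversions discussed above: TSC ticks to nanoseconds, and TSC ticks to
an estimate of core cycles via the core-to-TSC frequency ratio. The
function names and the kHz parameters are hypothetical, chosen only for
illustration; this is not code from the patch, and it assumes the core
runs at one steady frequency for the whole measured interval.

```c
#include <stdint.h>

/*
 * Convert a TSC tick count to nanoseconds.
 * tsc_khz is the (constant) TSC frequency in kHz; the caller is assumed
 * to have obtained it from calibration. Hypothetical helper.
 * Note: ticks * 1000000 overflows 64 bits for very large tick counts;
 * that is acceptable for short micro-benchmark intervals only.
 */
static inline uint64_t tsc_ticks_to_ns(uint64_t ticks, uint64_t tsc_khz)
{
	/* ticks / (tsc_khz * 1000) seconds == ticks * 10^6 / tsc_khz ns */
	return ticks * 1000000ULL / tsc_khz;
}

/*
 * Estimate core clock cycles from TSC ticks.
 * Assumes the core ran at a steady core_khz (for example, max turbo)
 * during the whole interval, so cycles scale by core_khz / tsc_khz.
 * Hypothetical helper.
 */
static inline uint64_t tsc_ticks_to_core_cycles(uint64_t ticks,
						uint64_t tsc_khz,
						uint64_t core_khz)
{
	return ticks * core_khz / tsc_khz;
}
```

With made-up figures of tsc_khz = 2400000 and core_khz = 3600000,
240 TSC ticks correspond to 100 ns and roughly 360 core cycles, which
is why the two units are "not quite the same thing" whenever the core
is not clocked at the TSC frequency.
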
>
> I suspect we could snatch a performance counter temporarily, to get
> the real cycles count, and maybe even add a uops column. Most of this
> needs to run in kernel space, so it's not a tooling project.

This will suck under KVM without extra care. I know, because I'm
working on a similar userspace tool that uses RDPMC.

Another option would be rdmsr(MSR_IA32_APERF), but that isn't
available under KVM either.

>
> I also wanted to add cache-cold numbers which are very interesting as
> well, just awfully hard to measure in a stable fashion. For cache-cold
> numbers the natural unit would be memory bus cycles.

Yeah, maybe it's worth wiring up perf counters at some point.

--Andy