From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932227AbcDAIQy (ORCPT ); Fri, 1 Apr 2016 04:16:54 -0400
Received: from mga14.intel.com ([192.55.52.115]:23208 "EHLO mga14.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1758544AbcDAIQu convert rfc822-to-8bit (ORCPT );
        Fri, 1 Apr 2016 04:16:50 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,426,1455004800"; d="scan'208";a="77154645"
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
Subject: Re: [PATCH] x86: Calculate MHz using APERF/MPERF for cpuinfo and scaling_cur_freq
From: Stephane Gasparini
In-Reply-To: <20160401080328.GC3448@twins.programming.kicks-ass.net>
Date: Fri, 1 Apr 2016 10:16:42 +0200
Cc: Len Brown, x86@kernel.org, linux-pm@vger.kernel.org,
        linux-kernel@vger.kernel.org, Len Brown
Content-Transfer-Encoding: 8BIT
Message-Id: <9222CD56-C603-449C-A049-E518DAFA6883@linux.intel.com>
References: <6e0c25e64e0fb65a42dfc63ad5f660302e07cd87.1459485198.git.len.brown@intel.com>
        <52f711be59539723358bea1aa3c368910a68b46d.1459485198.git.len.brown@intel.com>
        <20160401080328.GC3448@twins.programming.kicks-ass.net>
To: Peter Zijlstra
X-Mailer: Apple Mail (2.3124)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--
Steph

> On Apr 1, 2016, at 10:03 AM, Peter Zijlstra wrote:
>
> On Fri, Apr 01, 2016 at 12:37:00AM -0400, Len Brown wrote:
>> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
>> new file mode 100644
>> index 0000000..9380102
>> --- /dev/null
>> +++ b/arch/x86/kernel/cpu/aperfmperf.c
>> @@ -0,0 +1,76 @@
>> +/*
>> + * x86 APERF/MPERF KHz calculation
>> + * Used by /proc/cpuinfo and /sys/.../cpufreq/scaling_cur_freq
>> + *
>> + * Copyright (C) 2015 Intel Corp.
>> + * Author: Len Brown
>> + *
>> + * This file is licensed under GPLv2.
>> + */
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +struct aperfmperf_sample {
>> +	unsigned int khz;
>> +	unsigned long jiffies;
>> +	unsigned long long aperf;
>> +	unsigned long long mperf;
>> +};
>> +
>> +static DEFINE_PER_CPU(struct aperfmperf_sample, samples);
>> +
>> +/*
>> + * aperfmperf_snapshot_khz()
>> + * On the current CPU, snapshot APERF, MPERF, and jiffies
>> + * unless we already did it within 100ms
>> + * calculate kHz, save snapshot
>> + */
>> +static void aperfmperf_snapshot_khz(void *dummy)
>> +{
>> +	unsigned long long aperf, aperf_delta;
>> +	unsigned long long mperf, mperf_delta;
>> +	unsigned long long numerator;
>
> u64 is less typing ;-)
>
>> +	struct aperfmperf_sample *s = &get_cpu_var(samples);
>> +
>> +	/* Cache KHz for 100 ms */
>> +	if (time_before(jiffies, s->jiffies + HZ/10))
>> +		goto out;
>
> This puts in a lower bound, but afaict there is no upper bound. Both
> users appear to be userspace controlled.
>
> That is; if userspace doesn't request a freq reading we can go without
> reading this for a very long time.
>
>> +
>> +	rdmsrl(MSR_IA32_APERF, aperf);
>> +	rdmsrl(MSR_IA32_MPERF, mperf);
>> +
>> +	aperf_delta = aperf - s->aperf;
>> +	mperf_delta = mperf - s->mperf;
>
> That means these delta's can be arbitrarily large, in fact the MSRs can
> have wrapped however many times.

A 64-bit counter counts up to 18 446 744 073 709 551 615, so even assuming
a 10 GHz frequency, and if my math is right, it takes more than 58 years
for the MSR to wrap around, assuming the device always runs at max
frequency.

>
>> +
>> +	/*
>> +	 * There is no architectural guarantee that MPERF
>> +	 * increments faster than we can read it.
>> +	 */
>> +	if (mperf_delta == 0)
>> +		goto out;
>> +
>> +	numerator = cpu_khz * aperf_delta;
>
> And since delta can be any 64bit value as per the msr range, this
> multiplication can overflow.
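To put rough numbers on the two concerns above, here is a minimal sketch in
plain userspace C (not kernel code). The helper names wrap_seconds(),
product_overflows() and scaled_khz() are hypothetical and do not come from the
patch. It suggests that a raw 64-bit counter at 10 GHz does need about 58
years to wrap, but the 64-bit product cpu_khz * aperf_delta can overflow much
sooner: assuming cpu_khz = 3000000 (a 3 GHz part), after roughly 6.1e12 APERF
cycles, which is about 34 minutes at full speed. The widened multiply relies
on unsigned __int128, a GCC/Clang extension, so treat it as one possible
approach only.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a free-running 64-bit counter ticking at hz wraps. */
uint64_t wrap_seconds(uint64_t hz)
{
	assert(hz != 0);
	return UINT64_MAX / hz;
}

/* Nonzero if base_khz * aperf_delta no longer fits in 64 bits. */
int product_overflows(uint64_t base_khz, uint64_t aperf_delta)
{
	return base_khz != 0 && aperf_delta > UINT64_MAX / base_khz;
}

/*
 * Compute base_khz * aperf_delta / mperf_delta using a 128-bit
 * intermediate so the product cannot overflow. The quotient is
 * saturated at UINT64_MAX if it does not fit in 64 bits.
 * A zero mperf_delta is a precondition violation here; the patch
 * bails out before dividing in that case.
 */
uint64_t scaled_khz(uint64_t base_khz, uint64_t aperf_delta,
		    uint64_t mperf_delta)
{
	unsigned __int128 wide;

	assert(mperf_delta != 0);
	wide = (unsigned __int128)base_khz * aperf_delta / mperf_delta;
	return wide > UINT64_MAX ? UINT64_MAX : (uint64_t)wide;
}
```

With base_khz = 3000000, product_overflows() turns true once aperf_delta
exceeds UINT64_MAX / 3000000, while scaled_khz() still returns 3000000 for two
equal deltas of 1ULL << 62.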
>
>> +	s->khz = div64_u64(numerator, mperf_delta);
>> +	s->jiffies = jiffies;
>> +	s->aperf = aperf;
>> +	s->mperf = mperf;
>> +
>> +out:
>> +	put_cpu_var(samples);
>> +}
>> +
>> +unsigned int aperfmperf_khz_on_cpu(int cpu)
>> +{
>> +	if (!cpu_khz)
>> +		return 0;
>> +
>> +	if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
>> +		return 0;
>
> You could do the jiffy compare here; avoiding the IPI.
>
>> +
>> +	smp_call_function_single(cpu, aperfmperf_snapshot_khz, NULL, 1);
>> +
>> +	return per_cpu(samples.khz, cpu);
>> +}
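The suggestion to do the jiffy compare in the caller, before sending the IPI,
could look roughly like the userspace sketch below. tick_before() mimics the
wrap-safe comparison done by the kernel's time_before() macro; struct
khz_sample, sample_is_fresh() and the hz parameter are hypothetical stand-ins
for the per-CPU sample and for HZ. This is a sketch under those assumptions,
not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU sample kept by the patch. */
struct khz_sample {
	unsigned int khz;
	unsigned long stamp;	/* tick count when the sample was taken */
};

/*
 * True if tick a is earlier than tick b, even when the counter has
 * wrapped in between. Relies on two's-complement conversion of the
 * unsigned difference, as the kernel's time_before() does.
 */
bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* True if the sample is younger than hz / 10 ticks, i.e. 100 ms. */
bool sample_is_fresh(const struct khz_sample *s, unsigned long now,
		     unsigned long hz)
{
	return tick_before(now, s->stamp + hz / 10);
}
```

A caller would return the cached khz when sample_is_fresh() is true and only
fall back to the cross-CPU call when it is not.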