From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751573AbdEHFPw (ORCPT ); Mon, 8 May 2017 01:15:52 -0400
Received: from mail-oi0-f47.google.com ([209.85.218.47]:36271 "EHLO
	mail-oi0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751165AbdEHFPu (ORCPT ); Mon, 8 May 2017 01:15:50 -0400
MIME-Version: 1.0
In-Reply-To: <20170508040119.GA17010@vireshk-i7>
References: <4366682.tsferJN35u@aspire.rjw.lan>
	<2185243.flNrap3qq1@aspire.rjw.lan>
	<3300960.HE4b3sK4dn@aspire.rjw.lan>
	<2997922.DidfPadJuT@aspire.rjw.lan>
	<20170508040119.GA17010@vireshk-i7>
From: Wanpeng Li
Date: Mon, 8 May 2017 13:15:49 +0800
Message-ID:
Subject: Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
To: Viresh Kumar
Cc: "Rafael J. Wysocki", Linux PM, Peter Zijlstra, LKML,
	Srinivas Pandruvada, Juri Lelli, Vincent Guittot, Patrick Bellasi,
	Joel Fernandes, Morten Rasmussen, Ingo Molnar, Thomas Gleixner
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2017-05-08 12:01 GMT+08:00 Viresh Kumar :
> On 08-05-17, 11:49, Wanpeng Li wrote:
>> Hi Rafael,
>> 2017-03-22 7:08 GMT+08:00 Rafael J. Wysocki :
>> > From: Rafael J. Wysocki
>> >
>> > The way the schedutil governor uses the PELT metric causes it to
>> > underestimate the CPU utilization in some cases.
>> >
>> > That can be easily demonstrated by running kernel compilation on
>> > a Sandy Bridge Intel processor, running turbostat in parallel with
>> > it and looking at the values written to the MSR_IA32_PERF_CTL
>> > register. Namely, the expected result would be that when all CPUs
>> > were 100% busy, all of them would be requested to run in the maximum
>> > P-state, but observation shows that this clearly isn't the case.
>> > The CPUs run in the maximum P-state for a while and then are
>> > requested to run slower and go back to the maximum P-state after
>> > a while again. That causes the actual frequency of the processor to
>> > visibly oscillate below the sustainable maximum in a jittery fashion
>> > which clearly is not desirable.
>> >
>> > That has been attributed to CPU utilization metric updates on task
>> > migration that cause the total utilization value for the CPU to be
>> > reduced by the utilization of the migrated task. If that happens,
>> > the schedutil governor may see a CPU utilization reduction and will
>> > attempt to reduce the CPU frequency accordingly right away. That
>> > may be premature, though, for example if the system is generally
>> > busy and there are other runnable tasks waiting to be run on that
>> > CPU already.
>> >
>> > This is unlikely to be an issue on systems where cpufreq policies are
>> > shared between multiple CPUs, because in those cases the policy
>> > utilization is computed as the maximum of the CPU utilization values
>>
>> Sorry for one question maybe not associated with this patch. If the
>> cpufreq policy is shared between multiple CPUs, the function
>> intel_cpufreq_target() just updates IA32_PERF_CTL MSR of the cpu
>> which is managing this policy, I wonder whether other cpus which are
>> affected should also update their per-logical cpu's IA32_PERF_CTL MSR?
>
> The CPUs share the policy when they share their freq/voltage rails and so
> changing perf state of one CPU should result in that changing for all the CPUs
> in that policy. Otherwise, they can't be considered to be part of the same
> policy.
>
> That's why this code is changing it only for policy->cpu alone.

I see, thanks for the explanation.

Regards,
Wanpeng Li
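
For reference, the shared-policy behaviour discussed above can be sketched
roughly as follows. This is only an illustration, not the kernel's code:
struct demo_policy, demo_policy_util() and demo_target_freq() are made-up
names, and the linear utilization-to-frequency mapping is a simplifying
assumption. It shows why one CPU losing a migrated task does not lower the
request when the policy utilization is the maximum over all CPUs in the
policy, and why a single request covers the whole policy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for a cpufreq policy shared by
 * several CPUs. None of these names come from the kernel sources.
 */
struct demo_policy {
	unsigned int cpu;            /* CPU managing the policy ("policy->cpu") */
	size_t nr_cpus;              /* number of CPUs covered by the policy */
	const unsigned long *util;   /* per-CPU utilization, 0..max_util */
	unsigned long max_util;      /* utilization value meaning "100% busy" */
	unsigned int max_freq_khz;   /* highest frequency of the policy */
};

/* Policy utilization: the maximum of the per-CPU utilization values. */
unsigned long demo_policy_util(const struct demo_policy *p)
{
	unsigned long best = 0;

	assert(p->nr_cpus > 0);
	for (size_t i = 0; i < p->nr_cpus; i++) {
		if (p->util[i] > best)
			best = p->util[i];
	}
	return best;
}

/*
 * One frequency request for the whole policy. The plain linear scaling
 * is an assumption made for this sketch; a real governor may add margin.
 * Because the CPUs share their freq/voltage rails, the result would be
 * applied once, through the managing CPU, rather than per CPU.
 */
unsigned int demo_target_freq(const struct demo_policy *p)
{
	unsigned long long scaled;

	scaled = (unsigned long long)p->max_freq_khz * demo_policy_util(p);
	return (unsigned int)(scaled / p->max_util);
}
```

With per-CPU utilization {128, 256, 512, 0} out of 1024, the request follows
the busiest CPU (512), so a drop on CPU 0 alone after a migration does not
pull the shared frequency below what the other CPUs still need.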