From: "Doug Smythies"
To: "'Rafael J. Wysocki'", "'Francisco Jerez'"
Cc: "'Rafael J. Wysocki'", "'Linux PM'", "'LKML'", "'Len Brown'",
    "'Srinivas Pandruvada'", "'Peter Zijlstra'", "'Giovanni Gherdovich'"
References: <3169564.ZRsPWhXyMD@kreacher> <87mu5wre1v.fsf@intel.com> <87a71vraus.fsf@intel.com>
Subject: RE: [RFC/RFT][PATCH] cpufreq: intel_pstate: Work in passive mode with HWP enabled
Date: Tue, 26 May 2020 08:51:48 -0700
Message-ID: <000801d63375$927946a0$b76bd3e0$@net>
X-Mailing-List: linux-pm@vger.kernel.org

On 2020.05.26 01:19 Rafael J. Wysocki wrote:
> On Mon, May 25, 2020 at 10:57 PM Francisco Jerez wrote:
> >
> > "Rafael J. Wysocki" writes:
> > > On Mon, May 25, 2020 at 3:39 AM Francisco Jerez
> >
> > Why not HWP_MIN_PERF? That would leave the HWP quite some room for
> > maneuvering (the whole HWP_MIN_PERF-HWP_MAX_PERF P-state range, it's not
> > like P-state selection would be entirely driven by the kernel), while
> > avoiding every failure scenario I was describing in my previous reply.

I have re-done my tests. The problem that I observe seems specific to
HWP itself, not to this patch or its use in passive mode: I see exactly
the same thing with intel_pstate/powersave. See [1], detail A.

Test: still a simple single-threaded load sweep, at a 347 hertz
work/sleep frequency.

What do I see? Unexpected frequency drops at around 70% load.
Example, from trace, after the CPU had been running at 4.8 GHz for a
while:

Old EPP ; New EPP ; Freq (GHz) ; Load (%) ; Duration (ms)
     80 ;      82 ;       4.57 ;    61.94 ;        20.001
     82 ;      80 ;       4.57 ;    62.47 ;        40.003
     80 ;      44 ;       3.73 ;    68.63 ;        62.009  <<<< What? Why did the frequency drop? Why the long duration?
     44 ;       0 ;       1.96 ;   100.23 ;        19.995  <<<< Even lower frequency; the load overruns.
      0 ;      73 ;       4.56 ;    82.93 ;        40.07   <<<< O.K., recovered, but the damage is done.
     73 ;      46 ;       2.36 ;    79.19 ;        20.94   <<<  Now things oscillate a little.
     46 ;       0 ;     1.9884 ;   100.24 ;        20.99
      0 ;      75 ;     4.5624 ;    82.1  ;        41.002  <<<  Event ends. Next event in 487 milliseconds.

Observation: events are often, but not always, preceded by a longer
than normal duration. However, long durations are also not out of the
ordinary in passive mode. And yes, the above trace was with DELAY_HWP
at 20,000, but I do have trace examples with it at 5,000; this was
just a particularly good example.

Observation (from looking at a lot of trace data): there are phase
delays between the two systems, intel_cpufreq and HWP, and sometimes
they seem to oscillate a little and fight each other. There may be
some problematic work/sleep frequencies where the oscillation builds
into a full-blown resonance.

Why does HWP drop the frequency? This system is otherwise fairly
idle, so maybe because the PLL drops down during the non-work
periods. Maybe HWP thinks the system is idle and drops the frequency.
I can eliminate the overruns by disabling deep idle states, such that
the PLL vote is never relinquished, but that is not a fair test.

Note that the above response can be "tuned". If we take the
conversion algorithm from target frequency to EPP and introduce an
offset, the above can be improved. At what cost? More sluggishness,
for a large positive offset. So the overruns just move from the
steady-state part of the task to when the task starts. I did not try
to find out whether there is a "sweet spot" between offset and system
response, and I do not think there is value added in trying.

Note: with the original settings, I rarely observe a problem with the
step-function response to a new task.

> Actually, I have been thinking about the HWP min as an alternative
> that may be worth evaluating.
>
> However, I would rather set the HWP min to something like 80% of the
> cpufreq request.

Yes, this is a good idea and should not suffer from the two servo
systems fighting each other. I got 0 overruns, versus 2240 overruns
with no min limitation (100 second test).

As for INTEL_CPUFREQ_TRANSITION_DELAY_HWP, I'll probably use 10
milliseconds moving forward, because that is what I am most familiar
with from work on this driver years ago. But I did not observe any
issue with 5 milliseconds.
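For anyone who wants a feel for the kind of load involved: roughly, it
is a fixed-work/sleep loop like the sketch below. This is only an
illustration, not my actual test program; the busy-work size
(WORK_LOOPS) and the overrun bookkeeping are simplified and would need
tuning to land near the 70% load region of interest.

/*
 * Simplified sketch of a 347 hertz work/sleep load: each period does a
 * fixed amount of busy work, sleeps for the remainder, and counts an
 * "overrun" whenever the work does not finish within its period.
 */
#include <stdio.h>
#include <time.h>

#define PERIOD_NS (1000000000L / 347)   /* ~2.88 ms per period */
#define WORK_LOOPS 200000L              /* placeholder work size */
#define PERIODS 34700UL                 /* ~100 seconds at 347 Hz */

static long long now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

int main(void)
{
        long long next = now_ns();
        unsigned long overruns = 0;
        volatile double sink = 0.0;

        for (unsigned long i = 0; i < PERIODS; i++) {
                next += PERIOD_NS;

                for (long j = 0; j < WORK_LOOPS; j++)
                        sink += j * 0.5;        /* the fixed work packet */

                long long t = now_ns();
                if (t >= next) {
                        overruns++;             /* work spilled past its period */
                } else {
                        struct timespec ts = { 0, (long)(next - t) };
                        nanosleep(&ts, NULL);
                }
        }
        printf("overruns: %lu\n", overruns);
        return 0;
}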
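On the frequency-to-EPP point: the sketch below is only meant to show
where such an offset would act. The linear mapping, the sign
convention, and the clamping are illustrative assumptions, not the
conversion algorithm from the patch.

/*
 * Illustrative only: a linear map from the requested frequency to an
 * EPP value (0 = maximum performance, 255 = maximum energy saving),
 * with a tunable offset applied to the result.  A positive offset
 * lowers the resulting EPP; the clamps keep it in the valid range.
 */
static int freq_to_epp(unsigned int req_khz, unsigned int min_khz,
                       unsigned int max_khz, int offset)
{
        int epp = 255 - (int)((255U * (req_khz - min_khz)) / (max_khz - min_khz));

        epp -= offset;
        if (epp < 0)
                epp = 0;
        if (epp > 255)
                epp = 255;

        return epp;
}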
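And on the HWP min idea quoted above: a minimal sketch of what limiting
the HWP minimum to ~80% of the cpufreq request could look like. The
function names, the packing helper, and the 80% fraction are just for
illustration and do not match the driver.

/*
 * Derive an HWP minimum performance level from the cpufreq request
 * (~80% of it, floored at the lowest supported level) and pack it into
 * the IA32_HWP_REQUEST layout (min perf in bits 7:0, max perf in
 * bits 15:8, EPP in bits 31:24).
 */
#include <stdint.h>

static unsigned int hwp_min_from_request(unsigned int requested_perf,
                                         unsigned int lowest_perf)
{
        unsigned int min_perf = (requested_perf * 80) / 100;

        return min_perf > lowest_perf ? min_perf : lowest_perf;
}

static uint64_t hwp_request_pack(unsigned int min_perf, unsigned int max_perf,
                                 unsigned int epp)
{
        return (uint64_t)(min_perf & 0xff) |
               ((uint64_t)(max_perf & 0xff) << 8) |
               ((uint64_t)(epp & 0xff) << 24);
}

With a floor like that, HWP can still pick anything between the floor
and the HWP max, so the two control loops should not fight the way they
do in the traces above.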
[1] http://www.smythies.com/~doug/linux/intel_pstate/passive-hwp/passive-hwp-but-active-powersave.png

Other replaced graphs:

http://www.smythies.com/~doug/linux/intel_pstate/passive-hwp/passive-hwp-ondemand.png
http://www.smythies.com/~doug/linux/intel_pstate/passive-hwp/passive-hwp-schedutil.png

... Doug