From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8573AC43215 for ; Fri, 15 Nov 2019 20:49:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 407C320733 for ; Fri, 15 Nov 2019 20:49:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727089AbfKOUta (ORCPT ); Fri, 15 Nov 2019 15:49:30 -0500 Received: from mga07.intel.com ([134.134.136.100]:30791 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726828AbfKOUt3 (ORCPT ); Fri, 15 Nov 2019 15:49:29 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 15 Nov 2019 12:49:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,309,1569308400"; d="scan'208";a="208245199" Received: from spandruv-desk.jf.intel.com ([10.54.75.31]) by orsmga003.jf.intel.com with ESMTP; 15 Nov 2019 12:49:26 -0800 From: Srinivas Pandruvada To: corbet@lwn.net, andriy.shevchenko@intel.com, prarit@redhat.com Cc: rafael.j.wysocki@intel.co, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, platform-driver-x86@vger.kernel.org, Srinivas Pandruvada Subject: [PATCH] admin guide/pm: Admin guide for intel-speed-select Date: Fri, 15 Nov 2019 12:49:25 -0800 Message-Id: <20191115204925.55181-1-srinivas.pandruvada@linux.intel.com> X-Mailer: git-send-email 2.17.2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Added documentation to configure servers to use Intel(R) Speed Select Technology using intel-speed-select tool. Signed-off-by: Srinivas Pandruvada --- .../admin-guide/pm/intel-speed-select.rst | 934 ++++++++++++++++++ .../admin-guide/pm/working-state.rst | 1 + 2 files changed, 935 insertions(+) create mode 100644 Documentation/admin-guide/pm/intel-speed-select.rst diff --git a/Documentation/admin-guide/pm/intel-speed-select.rst b/Documentation/admin-guide/pm/intel-speed-select.rst new file mode 100644 index 000000000000..c2ce57ebc268 --- /dev/null +++ b/Documentation/admin-guide/pm/intel-speed-select.rst @@ -0,0 +1,934 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================================= +Intel® Speed Select Technology (Intel® SST) : User Guide +========================================================= + +The Intel® Speed Select Technology (Intel® SST) provides a powerful new +collection of features that give more granular control over CPU performance. +With Intel® SST, one server can be configured for power and performance for a +variety of diverse workload requirements. + +Refer to the links below for an overview of the technology: + +- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html +- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf + +These capabilities are further enhanced in some of the newer generations of +server platforms where these features can be enumerated and controlled +dynamically without pre-configuring via BIOS setup options. This dynamic +configuration is done via mailbox commands to the hardware. One way to enumerate +and configure these features is by using the Intel® Speed Select utility. + +This document explains how to use the Intel® Speed Select tool to enumerate and +control Intel® SST features. This document gives example commands and explains +how these commands change the power and performance profile of the system under +test. Using this tool as an example, customers can replicate the messaging +implemented in the tool in their production software. + + +intel-speed-select configuration tool +------------------------------------- +Most Linux distribution packages include the "intel-speed-select" tool. If not, +it can be built by downloading the Linux kernel tree from kernel.org. Once +downloaded, the tool can be built without building the full kernel. + +From the kernel tree, run the following commands: + +# cd tools/power/x86/intel-speed-select/ + +# make + +# make install + +**Getting Help** + +To get help with the tool, execute the command below: + +#intel-speed-select --help + +The top-level help describes arguments and features. Notice that there is a +multi-level help structure in the tool. For example: + +#intel-speed-select perf-profile --help + +This brings help for the feature "perf-profile" + +#intel-speed-select perf-profile info --help + +This brings help for the command "info". + + +Intel® Speed Select Technology - Performance Profile (Intel® SST-PP) +---------------------------------------- + +This feature allows configuration of a server dynamically based on workload +performance requirements. This helps users during deployment as they do not have +to choose a specific server configuration statically. This Intel® Speed Select +Technology - Performance Profile (Intel® SST-PP) feature introduces a mechanism +that allows multiple optimized performance profiles per system. Each profile +defines a set of CPUs that need to be online and rest offline to sustain a +guaranteed base frequency. Once the user issues a command to use a specific +performance profile and meet CPU online/offline requirement, the user can expect +a change in the base frequency dynamically. This feature is called +"perf-profile" when using the Intel® Speed Select tool. + +**Number or performance levels** + +There can be multiple performance profiles on a system. To get the number of +profiles, execute the command below: + +#intel-speed-select perf-profile get-config-levels + package-0 + die-0 + cpu-0 + get-config-levels:4 + package-1 + die-0 + cpu-14 + get-config-levels:4 + +On this system under test, there are 4 performance profiles in addition to the +base performance profile (which is performance level 0). + +**Lock/Unlock status** + +Even if there are multiple performance profiles, it is possible that that they +are locked. If they are locked, users cannot issue a command to change the +performance state. It is possible that there is a BIOS setup to unlock or check +with your system vendor. + +To check if the system is locked, execute the following command: + +#intel-speed-select perf-profile get-lock-status + package-0 + die-0 + cpu-0 + get-lock-status:0 + package-1 + die-0 + cpu-14 + get-lock-status:0 + +In this case, lock status is 0, which means that the system is unlocked. + +**Properties of a performance level** + +To get properties of a specific performance level (For example for the level 0, below), execute the command below: + +#intel-speed-select perf-profile info -l 0 + package-0 + die-0 + cpu-0 + perf-profile-level-0 + cpu-count:28 + + enable-cpu-mask:000003ff,f0003fff + + enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41 + + thermal-design-power-ratio:26 + + base-frequency(MHz):2600 + + speed-select-turbo-freq:disabled + + speed-select-base-freq:disabled + ... + ... + +Here -l option is used to specify a performance level. + +If the option -l is omitted, then this command will print information about all +the performance levels. The above command is printing properties of the +performance level 0. + +For this performance profile, the list of CPUs displayed by the +"enable-cpu-mask/enable-cpu-list" at the max can be "online." When that +condition is met, then base frequency of 2600 MHz can be maintained. To +understand more, execute "intel-speed-select perf-profile info" for performance +level 4. + +#intel-speed-select perf-profile info -l 4 + package-0 + die-0 + cpu-0 + perf-profile-level-4 + cpu-count:28 + + enable-cpu-mask:000000fa,f0000faf + + enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39 + + thermal-design-power-ratio:28 + + base-frequency(MHz):2800 + + speed-select-turbo-freq:disabled + + speed-select-base-freq:unsupported + ... + ... + +There are fewer CPUs in the “enable-cpu-mask/enable-cpu-list.” Consequently, if +the user only keeps these CPUs online and the rest "offline," then the base +frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0. + +**Get current performance level** + +To get the current performance level, execute: + +# intel-speed-select perf-profile get-config-current-level + package-0 + die-0 + cpu-0 + get-config-current_level:0 + +First verify that the base_frequency displayed by the cpufreq sysfs is correct: + +# cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency + +2600000 + +This matches the base-frequency (MHz) field value displayed from the +"perf-profile info" command for performance level 0(cpufreq frequency is in +KHz). + +To check if the average frequency is equal to the base frequency for a 100% busy +workload, disable turbo: + +# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo + +Then runs a busy workload on all CPUs, for example: + +#stress -c 64 + +To verify the base frequency, run turbostat: + +#turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1 + +======= ==== ===== ========= +Package Core CPU Bzy_MHz +======= ==== ===== ========= + - - 2600 +0 0 0 2600 +0 1 1 2600 +0 2 2 2600 +0 3 3 2600 +0 4 4 2600 +. . . . +. . . . +======= ==== ===== ========= + +**Changing performance level** + +To the change the performance level to 4, execute: + +#intel-speed-select -d perf-profile set-config-level -l 4 -o + package-0 + die-0 + cpu-0 + perf-profile + set_tdp_level:success + +In the command above, "-o" is optional. If it is specified, then it will also +offline CPUs which are not present in the enable_cpu_mask for this performance +level. + +Now if the base_frequency is checked: + +#cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency + +2800000 + +Which shows that the base frequency now increased from 2600 MHz at performance +level 0 to 2800 MHz at performance level 4. As a result, any workload, which can +use fewer CPUs, can see a boost of 200 MHz compared to performance level 0. + +**Check presence of other Intel® SST features** + +Each of the performance profiles also specifies weather there is support of +other two Intel® SST features (Intel® Speed Select Technology - Base Frequency +(Intel® SST-BF) and Intel® Speed Select Technology - Turbo Frequency (Intel® +SST-TF)). + +For example, from the output of "perf-profile info" above, for level 0 and level +4: + +For level 0: + speed-select-turbo-freq:disabled + + speed-select-base-freq:disabled + +For level 4: + + speed-select-turbo-freq:disabled + + speed-select-base-freq:unsupported + +Given these results, the “speed-select-base-freq” (Intel® SST-BF) in level 4 +changed from "disabled" to "unsupported" compared to performance level 0. + +This means that at performance level 4, the "speed-select-base-freq" feature is +not supported. However, at performance level 0, this feature is "supported", but +currently "disabled", meaning the user has not activated this feature. Whereas +"speed-select-turbo-freq" (Intel® SST-TF) is supported at both performance +levels, but currently not activated by the user. + +The Intel® SST-BF and the Intel® SST-TF features are built on a foundation +technology called Intel® Speed Select Technology - Core Power (Intel® SST-CP). +The platform firmware enables this feature when Intel® SST-BF or Intel® SST-TF +is supported on a platform. + +Intel® Speed Select Technology – Core Power (Intel® SST-CP) +---------------------------------------- + +Intel® Speed Select Technology – Core Power (Intel® SST-CP) is an interface that +allows users to define per core priority. This defines a mechanism to distribute +power among cores when there is a power constrained scenario. This defines a +class of service (CLOS) configuration. + +The user can configure up to 4 class of service configurations. Each CLOS group +configuration allows definitions of parameters, which affects how the frequency +can be limited and power is distributed. Each CPU core can be tied to a class of +service and hence an associated priority. The granularity is at core level not +at per CPU level. + +**Enable CLOS based prioritization** + +To use CLOS based prioritization feature, firmware must be informed to enable +and use a priority type. There is a default per platform priority type, which +can be changed with optional command line parameter. + +To enable and check the options, execute: + +# intel-speed-select core-power enable --help + +Intel(R) Speed Select Technology + +Enable core-power for a package/die + Clos Enable: Specify priority type with [--priority|-p] + 0: Proportional, 1: Ordered + +There are two types of priority types: + +- Ordered +Priority for ordered throttling is defined based on the index of the assigned +CLOS group. Where CLOS0 gets highest priority (throttled last). + +Priority order is: +CLOS0 > CLOS1 > CLOS2 > CLOS3. + +- Proportional +When proportional priority is used, there is an additional parameter called +frequency_weight, which can be specified per CLOS group. The goal of +proportional priority is to provide each core with the requested min., then +distribute all remaining (excess/deficit) budgets in proportion to a defined +weight. This proportional priority can be configured using "core-power config" +command. + +To enable with the platform default priority type, execute: + +# intel-speed-select core-power enable + package-0 + die-0 + cpu-0 + core-power + enable:success + package-1 + die-0 + cpu-6 + core-power + enable:success + +The scope of this enable is per package or die scoped when a package contains +multiple dies. To check if CLOS is enabled and get priority type, "core-power +info" command can be used. For example to check the status of core-power feature +on CPU 0, execute: + +#intel-speed-select -c 0 core-power info + package-0 + die-0 + cpu-0 + core-power + enable-status:1 + priority-type:0 + + +**Configuring CLOS groups** + +Each CLOS group has its own attributes including min, max, freq_weight and +desired. These parameters can be configured with "core-power config" command. +Defaults will be used if user skips setting a parameter except clos id, which is +mandatory. + +#intel-speed-select core-power config --help + +Intel(R) Speed Select Technology + +Set core-power configuration for one of the four clos ids + Specify targeted clos id with [--clos|-c] + + Specify clos Proportional Priority [--weight|-w] + + Specify clos min in MHz with [--min|-n] + + Specify clos max in MHz with [--max|-m] + + Specify clos desired in MHz with [--desired|-d] + +For example: +#intel-speed-select core-power config -c 0 + +Intel(R) Speed Select Technology + +clos epp is not specified, default: 0 + +clos frequency weight is not specified, default: 0 + +clos min is not specified, default: 0 MHz + +clos max is not specified, default: 25500 MHz + +clos desired is not specified, default: 0 + package-0 + die-0 + cpu-0 + core-power + config:success + package-1 + die-0 + cpu-6 + core-power + config:success + +The user has the option to change defaults. For example, the user can change the +"min" and set the base frequency to always get guaranteed base frequency. + +**Get the current CLOS configuration** + +To check the current configuration, "core-power get-config" can be used. For +example, to get the configuration of CLOS 0: + +#intel-speed-select core-power get-config -c 0 + package-0 + die-0 + cpu-0 + core-power + + clos:0 + + epp:0 + + clos-proportional-priority:0 + + clos-min:0 MHz + + clos-max:25500 MHz + + clos-desired:0 + + package-1 + die-0 + cpu-6 + core-power + clos:0 + + epp:0 + + clos-proportional-priority:0 + + clos-min:0 MHz + + clos-max:25500 MHz + + clos-desired:0 + +**Associating a CPU with a CLOS group** + +To associate a CPU to a CLOS group "core-power assoc" command can be used. + +# intel-speed-select core-power assoc --help +Associate a clos id to a CPU + Specify targeted clos id with [--clos|-c] + + +For example to associate CPU 10 to CLOS group 3, execute: + +#intel-speed-select -c 10 core-power assoc -c 3 + package-1 + die-0 + cpu-10 + core-power + assoc:success + +Once a CPU is associated, its sibling CPUs are also associated to a CLOS group. +Once associated, avoid changing Linux “cpufreq” subsystem scaling frequency +limits. + +To check the existing association for a CPU, "core-power get-assoc" command can +be used. For example, to get association of CPU 10, execute: + +#intel-speed-select -c 10 core-power get-assoc + package-1 + die-0 + cpu-10 + get-assoc + clos:3 + +This shows that CPU 10 is part of a CLOS group 3. + + +**Disable CLOS based prioritization** + +To disable, execute: + +# intel-speed-select core-power disable + +Some features like Intel® SST-TF can only be enabled when CLOS based prioritization +is enabled. For this reason, disabling while Intel® SST-TF is enabled can cause +Intel® SST-TF to fail. This will cause the "disable" command to display an error +if Intel® SST-TF is already enabled. In turn, to disable, the Intel® SST-TF +feature must be disabled first. + +Intel® Speed Select Technology - Base Frequency (Intel® SST-BF) +---------------------------------- + +The Intel® Speed Select Technology - Base Frequency (Intel® SST-BF) feature lets +the user control base frequency. If some critical workload threads demand +constant high guaranteed performance, then this feature can be used to execute +the thread at higher base frequency on specific sets of CPUs (high priority +CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs. +This feature does not require offline of the low priority CPUs. + +The support of Intel® SST-BF depends on the Intel® Speed Select Technology - +Performance Profile (Intel® SST-PP) performance level configuration. It is +possible that only certain performance levels support Intel® SST-BF. It is also +possible that only base performance level (level = 0) has support of Intel® +SST-BF. Consequently, first select the desired performance level to enable this +feature. + +In the system under test here, Intel® SST-BF is supported at the base +performance level 0, but currently disabled. + +#intel-speed-select -c 0 perf-profile info -l 0 + package-0 + die-0 + cpu-0 + perf-profile-level-0 + ... + ... + speed-select-base-freq:disabled + + ... + ... + +Before enabling Intel® SST-BF and measuring its impact on a workload +performance, execute some workload and measure performance and get a baseline +performance to compare against. + +Here the user wants more guaranteed performance. For this reason, it is likely +that turbo is disabled. To disable turbo, execute: + +#echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo + +Based on the output of the "intel-speed-select perf-profile info -l 0" base +frequency of guaranteed frequency 2600 MHz. + + +**Measure baseline performance for comparison** + +To compare, pick a multi-threaded workload where each thread can be scheduled on +separate CPUs. "Hackbench pipe" test is a good example on how to improve +performance using Intel® SST-BF. + +Below, the workload is measuring average scheduler wakeup latency, so a lower +number means better performance: + +# taskset -c 3,4 perf bench -r 100 sched pipe + +# Running 'sched/pipe' benchmark: + +# Executed 1000000 pipe operations between two processes + + Total time: 6.102 [sec] + + 6.102445 usecs/op + + 163868 ops/sec + +While running the above test, if we take turbostat output, it will show us that +2 of the CPUs are busy and reaching max. frequency (which would be the base +frequency as the turbo is disabled). + +======= ==== ==== ======= +Package Core CPU Bzy_MHz +======= ==== ==== ======= +0 0 0 1000 +0 1 1 1005 +0 2 2 1000 +0 3 3 2600 +0 4 4 2600 +0 5 5 1000 +0 6 6 1000 +0 7 7 1005 +0 8 8 1005 +0 9 9 1000 +0 10 10 1000 +0 11 11 995 +0 12 12 1000 +0 13 13 1000 +======= ==== ==== ======= + +From the above turbostat output, both CPU 3 and 4 are very busy and reaching +full guaranteed frequency of 2600 MHz. + +**Intel® SST-BF Capabilities** + +To get capabilities of Intel® SST-BF for the current performance level 0, +execute: + +#intel-speed-select base-freq info -l 0 + package-0 + die-0 + cpu-0 + speed-select-base-freq + high-priority-base-frequency(MHz):3000 + + high-priority-cpu-mask:00000216,00002160 + + high-priority-cpu-list:5,6,8,13,33,34,36,41 + + low-priority-base-frequency(MHz):2400 + + tjunction-temperature(C):125 + + thermal-design-power(W):205 + +The above capabilities show that there are some CPUs on this system that can +offer base frequency of 3000 MHz compared to the standard base frequency at this +performance levels. Nevertheless, these CPUs are fixed, and they are presented +via high-priority-cpu-list/high-priority-cpu-mask. But if this Intel® SST-BF +feature is selected, the low priorities CPUs (which are not in +high-priority-cpu-list) can only offer up to 2400 MHz. As a result, if this +clipping of low priority CPUs is acceptable, then the user can enable Intel® +SST-BF feature particularly for the above "sched pipe" workload since only two +CPUs are used, they can be scheduled on high priority CPUs and can get boost of +400 MHz. + +**Enable Intel® SST-BF** + +To enable Intel® SST-BF feature, execute: + +#intel-speed-select base-freq enable -a + package-0 + die-0 + cpu-0 + base-freq + enable:success + package-1 + die-0 + cpu-14 + base-freq + enable:success + +In this case, -a option is optional. This not only enables Intel® SST-BF, but it +also adjusts the priority of cores using Intel® Speed Select Technology – Core +Power (Intel® SST-CP) features. This option sets the minimum performance of each +Intel® Speed Select Technology - Performance Profile (Intel® SST-PP) class to +maximum performance so that the hardware will give maximum performance possible +for each CPU. + +If -a option is not used, then the following steps are required before enabling +Intel® SST-BF: + +- Discover Intel® SST-BF and note low and high priority base frequency +- Note the high prioity CPU list +- Enable CLOS using core-power feature set +- Configure CLOS parameters. Use CLOS.min to set to minimum performance +- Subscribe desired CPUs to CLOS groups + +With this configuration, if the same workload is executed by pinning the +workload to high priority CPUs (CPU 5 and 6 in this case): + +#taskset -c 5,6 perf bench -r 100 sched pipe + +# Running 'sched/pipe' benchmark: + +# Executed 1000000 pipe operations between two processes + + Total time: 5.627 [sec] + + 5.627922 usecs/op + 177685 ops/sec + +This way, by enabling Intel® SST-BF, the performance of this benchmark is +improved (latency reduced) by 7.79%. From the turbostat output, it can be +observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz. + +======= ==== ==== ======= +Package Core CPU Bzy_MHz +======= ==== ==== ======= +0 0 0 2151 +0 1 1 2166 +0 2 2 2175 +0 3 3 2175 +0 4 4 2175 +0 5 5 3000 +0 6 6 3000 +0 7 7 2180 +0 8 8 2662 +0 9 9 2176 +0 10 10 2175 +0 11 11 2176 +0 12 12 2176 +0 13 13 2661 +======= ==== ==== ======= + +**Disable Intel® SST-BF** + +To disable the Intel® SST-BF feature, execute: + +#intel-speed-select base-freq disable -a + + +Intel® Speed Select Technology - Turbo Frequency (Intel® SST-TF) +------------------------------------ + +This feature enables the ability to set different "All core turbo ratio limits" +to cores based on the priority. By using this feature, some cores can be +configured to get higher turbo frequency by designating them as high priority at +the cost of lower or no turbo frequency on the low priority cores. + +For this reason, this feature is only useful when system is busy utilizing all +CPUs, but the user wants some configurable option to get high performance on +some CPUs. + +The support of Intel® Speed Select Technology - Turbo Frequency (Intel® SST-TF) +depends on the Intel® Speed Select Technology - Performance Profile (Intel® +SST-PP) performance level configuration. It is possible that only a certain +performance level supports Intel® SST-TF. It is also possible that only the base +performance level (level = 0) has the support of Intel® SST-TF. Hence, first +select the desired performance level to enable this feature. + +In the system under test here, Intel® SST-TF is supported at the base +performance level 0, but currently disabled. + +#intel-speed-select -c 0 perf-profile info -l 0 + package-0 + die-0 + cpu-0 + perf-profile-level-0 + ... + ... + speed-select-turbo-freq:disabled + + ... + ... + + +To check if performance can be improved using Intel® SST-TF feature, get the turbo +frequency properties with Intel® SST-TF enabled and compare to the base turbo +capability of this system. + +**Get Base turbo capability** + +To get the base turbo capability of performance level 0, execute: + +#intel-speed-select perf-profile info -l 0 + package-0 + die-0 + cpu-0 + perf-profile-level-0 + + ... + + ... + + turbo-ratio-limits-sse + bucket-0 + core-count:2 + + max-turbo-frequency(MHz):3200 + + bucket-1 + core-count:4 + + max-turbo-frequency(MHz):3100 + bucket-2 + core-count:6 + + max-turbo-frequency(MHz):3100 + bucket-3 + core-count:8 + + max-turbo-frequency(MHz):3100 + bucket-4 + core-count:10 + + max-turbo-frequency(MHz):3100 + bucket-5 + core-count:12 + + max-turbo-frequency(MHz):3100 + bucket-6 + core-count:14 + + max-turbo-frequency(MHz):3100 + bucket-7 + core-count:16 + + max-turbo-frequency(MHz):3100 + +Based on the data above, when all the CPUS are busy, the max. frequency of 3100 +MHz can be achieved. If there is some busy workload on cpu 0 - 11 (e.g. stress) +and on CPU 12 and 13, execute "hackbench pipe" workload: + +# taskset -c 12,13 perf bench -r 100 sched pipe + +# Running 'sched/pipe' benchmark: + +# Executed 1000000 pipe operations between two processes + + Total time: 5.705 [sec] + + 5.705488 usecs/op + 175269 ops/sec + +======= ==== ==== ======= +Package Core CPU Bzy_MHz +======= ==== ==== ======= +0 0 0 3000 +0 1 1 3000 +0 2 2 3000 +0 3 3 3000 +0 4 4 3000 +0 5 5 3100 +0 6 6 3100 +0 7 7 3000 +0 8 8 3100 +0 9 9 3000 +0 10 10 3000 +0 11 11 3000 +0 12 12 3100 +0 13 13 3100 +======= ==== ==== ======= + +Based on turbostat output, the performance is limited by frequency cap of 3100 +MHz. To check if the hackbench performance can be improved for CPU 12 and CPU +13, first check the capability of the Intel® SST-TF feature for this performance +level. + +**Get Intel® SST-TF Capability** + +To get the capability, the "turbo-freq info" command can be used: + +#intel-speed-select turbo-freq info -l 0 + speed-select-turbo-freq + bucket-0 + high-priority-cores-count:2 + + high-priority-max-frequency(MHz):3200 + + high-priority-max-avx2-frequency(MHz):3200 + + high-priority-max-avx512-frequency(MHz):3100 + + bucket-1 + high-priority-cores-count:4 + + high-priority-max-frequency(MHz):3100 + + high-priority-max-avx2-frequency(MHz):3000 + + high-priority-max-avx512-frequency(MHz):2900 + + bucket-2 + high-priority-cores-count:6 + + high-priority-max-frequency(MHz):3100 + + high-priority-max-avx2-frequency(MHz):3000 + + high-priority-max-avx512-frequency(MHz):2900 + + speed-select-turbo-freq-clip-frequencies + low-priority-max-frequency(MHz):2600 + + low-priority-max-avx2-frequency(MHz):2400 + + low-priority-max-avx512-frequency(MHz):2100 + +Based on the output above, there is an Intel® SST-TF bucket for which there are +two high priority cores. If only two high priority cores are set, then max. +turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz +more than the base turbo capability for all cores. + +In turn, for the hackbench workload, two CPUs can be set as high priority and +rest as low priority. One side effect is that once enabled, the low priority +cores will be clipped to a lower frequency of 2600 MHz. + +**Enable Intel® SST-TF** + +To enable Intel® SST-TF, execute: + +#intel-speed-select -c 12,13 turbo-freq enable -a + +Intel(R) Speed Select Technology + package-0 + die-0 + cpu-12 + turbo-freq + enable:success + package-0 + die-0 + cpu-13 + turbo-freq + enable:success + + package--1 + die-0 + cpu-63 + turbo-freq --auto + enable:success + +In this case, the option "-a" is optional. If set, it enables Intel® SST-TF +feature and also sets the CPUs to high and and low priority using Intel® Speed +Select Technology – Core Power (Intel® SST-CP) features. The CPU numbers passed +with "-c" arguments are marked as high priority, including its siblings. + +If -a option is not used, then the following steps are required before enabling +Intel® SST-TF: + +- Discover Intel® SST-TF and note buckets of high priority cores and maximum +frequency + +- Enable CLOS using core-power feature set - Configure CLOS parameters + +- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency + +If the same hackbench workload is executed, schedule hackbench threads on high +priority CPUs: + +#taskset -c 12,13 perf bench -r 100 sched pipe + +# Running 'sched/pipe' benchmark: + +# Executed 1000000 pipe operations between two processes + + Total time: 5.510 [sec] + + 5.510165 usecs/op + 180826 ops/sec + +This improved performance by around 3.3% improvement on a busy system. Here the +turbostat output will show that the CPU 12 and CPU 13 are getting 100 MHz boost. + +======= ==== ==== ======= +Package Core CPU Bzy_MHz +======= ==== ==== ======= +... +0 12 12 3200 +0 13 13 3200 + +======= ==== ==== ======= diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst index fc298eb1234b..06d1403b1597 100644 --- a/Documentation/admin-guide/pm/working-state.rst +++ b/Documentation/admin-guide/pm/working-state.rst @@ -11,3 +11,4 @@ Working-State Power Management cpufreq intel_pstate intel_epb + intel-speed-select -- 2.17.2