* [PATCH v4 0/3] Add support for frequency invariance to AMD EPYC Zen2
@ 2020-11-12 18:26 Giovanni Gherdovich
  2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-11-12 18:26 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Len Brown, Rafael J. Wysocki
  Cc: Jon Grimm, Nathan Fontenot, Yazen Ghannam, Thomas Lendacky, Mel Gorman, Pu Wen, Viresh Kumar, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Doug Smythies, x86, linux-pm, linux-kernel, linux-acpi, Giovanni Gherdovich

v3 at https://lore.kernel.org/lkml/20201110200519.18180-1-ggherdovich@suse.cz/

Changes wrt v3:

- Correct the #ifdef guard for cppc_get_perf_caps() from CONFIG_ACPI to
  CONFIG_ACPI_CPPC_LIB (reported by "kernel test robot <lkp@intel.com>")

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Cover Letter from v3:

v2 at https://lore.kernel.org/lkml/20201110183054.15883-1-ggherdovich@suse.cz/

Changes wrt v2:

- "code golf" on the function init_freq_invariance_cppc(). Make better use
  of the "secondary" argument to init_freq_invariance(), which was introduced
  at b56e7d45e807 ("x86, sched: Don't enable static key when starting
  secondary CPUs") to deal with CPU hotplug.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Cover Letter from v2:

v1 at https://lore.kernel.org/lkml/20201110083936.31994-1-ggherdovich@suse.cz/

Changes wrt v1:

- made initialization safe under CPU hotplug. The function
  init_freq_invariance_cppc now lets only the first caller into
  init_freq_invariance().

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Cover Letter from v1:

This series adds support for frequency invariant accounting on AMD EPYC Zen2
(aka "Rome"). The first patch by Nathan lays out the foundation by querying
ACPI infrastructure for the max boost frequency of the system. Specifically,
this value is available via the CPPC machinery; the previous EPYC generation,
namely Zen aka "Naples", doesn't implement that and frequency invariance won't
be supported.

The second patch sets the estimate for freq_max to be the midpoint between
max_boost and max_P, as that works slightly better in practice.

A side effect of this series is to provide, with the invariant schedutil
governor, a suitable baseline to evaluate a (still work-in-progress)
CPPC-based cpufreq driver for the AMD platform (see
https://lore.kernel.org/lkml/cover.1562781484.git.Janakarajan.Natarajan@amd.com)
if/when it is resubmitted.

Giovanni Gherdovich (2):
  x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  x86: Print ratio freq_max/freq_base used in frequency invariance calculations

Nathan Fontenot (1):
  x86, sched: Calculate frequency invariance for AMD systems

 arch/x86/include/asm/topology.h |  8 ++++
 arch/x86/kernel/smpboot.c       | 79 ++++++++++++++++++++++++++++++---
 drivers/acpi/cppc_acpi.c        |  3 ++
 3 files changed, 85 insertions(+), 5 deletions(-)

--
2.26.2

^ permalink raw reply [flat|nested] 15+ messages in thread
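The midpoint estimate mentioned in the cover letter can be illustrated with a small sketch. This is not the code from patch 2/3 (which is not shown in this thread); it assumes the max_boost/max_P ratio is kept in 1024-based fixed point, mirroring the kernel's SCHED_CAPACITY_SCALE, so that max_P itself corresponds to a ratio of exactly 1024. The names CAPACITY_SCALE and midpoint_ratio() are hypothetical.

```c
#include <stdint.h>

/* Fixed-point scale; assumed here to mirror SCHED_CAPACITY_SCALE (1024). */
#define CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper: boost_ratio is max_boost / max_P in fixed point.
 * max_P maps to CAPACITY_SCALE (a ratio of 1.0), so the midpoint between
 * max_boost and max_P is the average of the two fixed-point values.
 */
static uint64_t midpoint_ratio(uint64_t boost_ratio)
{
	return (boost_ratio + CAPACITY_SCALE) / 2;
}
```

For instance, a boost ratio of 1536 (1.5x) would yield 1280 (1.25x) under these assumptions.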
* [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems
  2020-11-12 18:26 [PATCH v4 0/3] Add support for frequency invariance to AMD EPYC Zen2 Giovanni Gherdovich
@ 2020-11-12 18:26 ` Giovanni Gherdovich
  2020-11-14 18:23   ` kernel test robot
                     ` (2 more replies)
  2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
  2020-11-12 18:26 ` [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations Giovanni Gherdovich
  2 siblings, 3 replies; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-11-12 18:26 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Len Brown, Rafael J. Wysocki
  Cc: Jon Grimm, Nathan Fontenot, Yazen Ghannam, Thomas Lendacky, Mel Gorman, Pu Wen, Viresh Kumar, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Doug Smythies, x86, linux-pm, linux-kernel, linux-acpi, Nathan Fontenot, Giovanni Gherdovich

From: Nathan Fontenot <nathan.fontenot@amd.com>

This is the first pass in creating the ability to calculate the
frequency invariance on AMD systems. This approach uses the CPPC
highest performance and nominal performance values that range from
0 - 255 instead of a high and base frequency. This is because we do
not have the ability on AMD to get a highest frequency value.

On AMD systems the highest performance and nominal performance
values do correspond to the highest and base frequencies for the system
so using them should produce an appropriate ratio but some tweaking
is likely necessary.

Due to CPPC being initialized later in boot than when the frequency
invariant calculation is currently made, I had to create a callback
from the CPPC init code to do the calculation after we have CPPC
data.

Special thanks to "kernel test robot <lkp@intel.com>" for reporting that
compilation of drivers/acpi/cppc_acpi.c is conditional on
CONFIG_ACPI_CPPC_LIB, not just CONFIG_ACPI.

Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> [ ggherdovich@suse.cz: made safe under CPU hotplug, edited changelog ] Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> --- arch/x86/include/asm/topology.h | 8 ++++ arch/x86/kernel/smpboot.c | 76 ++++++++++++++++++++++++++++++--- drivers/acpi/cppc_acpi.c | 3 ++ 3 files changed, 82 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index f4234575f3fd..f66691d34f4a 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -218,4 +218,12 @@ static inline void arch_set_max_freq_ratio(bool turbo_disabled) } #endif +#ifdef CONFIG_ACPI_CPPC_LIB +void init_freq_invariance_cppc(void); +#else +static inline void init_freq_invariance_cppc(void) +{ +} +#endif + #endif /* _ASM_X86_TOPOLOGY_H */ diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index de776b2e6046..a4ab5cf6aeab 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -82,6 +82,10 @@ #include <asm/hw_irq.h> #include <asm/stackprotector.h> +#ifdef CONFIG_ACPI_CPPC_LIB +#include <acpi/cppc_acpi.h> +#endif + /* representing HT siblings of each logical CPU */ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map); EXPORT_PER_CPU_SYMBOL(cpu_sibling_map); @@ -148,7 +152,7 @@ static inline void smpboot_restore_warm_reset_vector(void) *((volatile u32 *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 0; } -static void init_freq_invariance(bool secondary); +static void init_freq_invariance(bool secondary, bool cppc_ready); /* * Report back to the Boot Processor during boot time or to the caller processor @@ -186,7 +190,7 @@ static void smp_callin(void) */ set_cpu_sibling_map(raw_smp_processor_id()); - init_freq_invariance(true); + init_freq_invariance(true, false); /* * Get our bogomips. 
@@ -1340,7 +1344,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) set_sched_topology(x86_topology); set_cpu_sibling_map(0); - init_freq_invariance(false); + init_freq_invariance(false, false); smp_sanity_check(); switch (apic_intr_mode) { @@ -2027,6 +2031,46 @@ static bool intel_set_max_freq_ratio(void) return true; } +#ifdef CONFIG_ACPI_CPPC_LIB +static bool amd_set_max_freq_ratio(void) +{ + struct cppc_perf_caps perf_caps; + u64 highest_perf, nominal_perf; + u64 perf_ratio; + int rc; + + rc = cppc_get_perf_caps(0, &perf_caps); + if (rc) { + pr_debug("Could not retrieve perf counters (%d)\n", rc); + return false; + } + + highest_perf = perf_caps.highest_perf; + nominal_perf = perf_caps.nominal_perf; + + if (!highest_perf || !nominal_perf) { + pr_debug("Could not retrieve highest or nominal performance\n"); + return false; + } + + perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf); + if (!perf_ratio) { + pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n"); + return false; + } + + arch_turbo_freq_ratio = perf_ratio; + arch_set_max_freq_ratio(false); + + return true; +} +#else +static bool amd_set_max_freq_ratio(void) +{ + return false; +} +#endif + static void init_counter_refs(void) { u64 aperf, mperf; @@ -2038,7 +2082,7 @@ static void init_counter_refs(void) this_cpu_write(arch_prev_mperf, mperf); } -static void init_freq_invariance(bool secondary) +static void init_freq_invariance(bool secondary, bool cppc_ready) { bool ret = false; @@ -2054,6 +2098,12 @@ static void init_freq_invariance(bool secondary) if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) ret = intel_set_max_freq_ratio(); + else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { + if (!cppc_ready) { + return; + } + ret = amd_set_max_freq_ratio(); + } if (ret) { init_counter_refs(); @@ -2063,6 +2113,22 @@ static void init_freq_invariance(bool secondary) } } +#ifdef CONFIG_ACPI_CPPC_LIB +static DEFINE_MUTEX(freq_invariance_lock); + +void 
init_freq_invariance_cppc(void) +{ + static bool secondary; + + mutex_lock(&freq_invariance_lock); + + init_freq_invariance(secondary, true); + secondary = true; + + mutex_unlock(&freq_invariance_lock); +} +#endif + static void disable_freq_invariance_workfn(struct work_struct *work) { static_branch_disable(&arch_scale_freq_key); @@ -2112,7 +2178,7 @@ void arch_scale_freq_tick(void) schedule_work(&disable_freq_invariance_work); } #else -static inline void init_freq_invariance(bool secondary) +static inline void init_freq_invariance(bool secondary, bool cppc_ready) { } #endif /* CONFIG_X86_64 */ diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 7a99b19bb893..368b13cb975d 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -39,6 +39,7 @@ #include <linux/ktime.h> #include <linux/rwsem.h> #include <linux/wait.h> +#include <linux/topology.h> #include <acpi/cppc_acpi.h> @@ -850,6 +851,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr) goto out_free; } + init_freq_invariance_cppc(); + kfree(output.pointer); return 0; -- 2.26.2 ^ permalink raw reply related [flat|nested] 15+ messages in thread
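As a rough illustration of the arithmetic in amd_set_max_freq_ratio() from the patch above, the following user-space sketch computes the same highest/nominal ratio in 1024-based fixed point and rejects the same degenerate inputs. It is only an analogue: the function name perf_ratio_from_cppc(), the CAPACITY_SCALE macro, and any sample values are hypothetical, and no CPPC registers are read here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's SCHED_CAPACITY_SCALE (1024). */
#define CAPACITY_SCALE 1024ULL

/*
 * Hypothetical user-space analogue of the ratio computation:
 * ratio = highest_perf * 1024 / nominal_perf.
 * Returns false if either input is zero or the ratio truncates to zero,
 * matching the early-return checks in the patch.
 */
static bool perf_ratio_from_cppc(uint64_t highest_perf, uint64_t nominal_perf,
				 uint64_t *ratio)
{
	uint64_t r;

	if (!highest_perf || !nominal_perf)
		return false;

	r = highest_perf * CAPACITY_SCALE / nominal_perf;
	if (!r)
		return false;

	*ratio = r;
	return true;
}
```

With made-up inputs highest_perf = 255 and nominal_perf = 170, the computed ratio is 1536, i.e. 1.5 in fixed point.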
* Re: [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems 2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich @ 2020-11-14 18:23 ` kernel test robot 2020-11-26 9:58 ` Peter Zijlstra 2020-12-03 9:13 ` [tip: sched/core] " tip-bot2 for Nathan Fontenot 2020-12-11 9:34 ` tip-bot2 for Nathan Fontenot 2 siblings, 1 reply; 15+ messages in thread From: kernel test robot @ 2020-11-14 18:23 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 10168 bytes --] Hi Giovanni, Thank you for the patch! Yet something to improve: [auto build test ERROR on tip/x86/core] [also build test ERROR on tip/master v5.10-rc3 next-20201113] [cannot apply to bp/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Giovanni-Gherdovich/Add-support-for-frequency-invariance-to-AMD-EPYC-Zen2/20201113-022732 base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970 config: arm64-randconfig-r013-20201114 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 9a85643cd357e412cff69067bb5c4840e228c2ab) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/0day-ci/linux/commit/3331764ab450bfb6ef0f9a3df70b9ec4f948e54f git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Giovanni-Gherdovich/Add-support-for-frequency-invariance-to-AMD-EPYC-Zen2/20201113-022732 git checkout 3331764ab450bfb6ef0f9a3df70b9ec4f948e54f # save the attached .config to linux build tree 
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> drivers/acpi/cppc_acpi.c:854:2: error: implicit declaration of function 'init_freq_invariance_cppc' [-Werror,-Wimplicit-function-declaration] init_freq_invariance_cppc(); ^ 1 error generated. vim +/init_freq_invariance_cppc +854 drivers/acpi/cppc_acpi.c 645 646 /* 647 * An example CPC table looks like the following. 648 * 649 * Name(_CPC, Package() 650 * { 651 * 17, 652 * NumEntries 653 * 1, 654 * // Revision 655 * ResourceTemplate(){Register(PCC, 32, 0, 0x120, 2)}, 656 * // Highest Performance 657 * ResourceTemplate(){Register(PCC, 32, 0, 0x124, 2)}, 658 * // Nominal Performance 659 * ResourceTemplate(){Register(PCC, 32, 0, 0x128, 2)}, 660 * // Lowest Nonlinear Performance 661 * ResourceTemplate(){Register(PCC, 32, 0, 0x12C, 2)}, 662 * // Lowest Performance 663 * ResourceTemplate(){Register(PCC, 32, 0, 0x130, 2)}, 664 * // Guaranteed Performance Register 665 * ResourceTemplate(){Register(PCC, 32, 0, 0x110, 2)}, 666 * // Desired Performance Register 667 * ResourceTemplate(){Register(SystemMemory, 0, 0, 0, 0)}, 668 * .. 669 * .. 670 * .. 671 * 672 * } 673 * Each Register() encodes how to access that specific register. 674 * e.g. a sample PCC entry has the following encoding: 675 * 676 * Register ( 677 * PCC, 678 * AddressSpaceKeyword 679 * 8, 680 * //RegisterBitWidth 681 * 8, 682 * //RegisterBitOffset 683 * 0x30, 684 * //RegisterAddress 685 * 9 686 * //AccessSize (subspace ID) 687 * 0 688 * ) 689 * } 690 */ 691 692 /** 693 * acpi_cppc_processor_probe - Search for per CPU _CPC objects. 694 * @pr: Ptr to acpi_processor containing this CPU's logical ID. 695 * 696 * Return: 0 for success or negative value for err. 
697 */ 698 int acpi_cppc_processor_probe(struct acpi_processor *pr) 699 { 700 struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL}; 701 union acpi_object *out_obj, *cpc_obj; 702 struct cpc_desc *cpc_ptr; 703 struct cpc_reg *gas_t; 704 struct device *cpu_dev; 705 acpi_handle handle = pr->handle; 706 unsigned int num_ent, i, cpc_rev; 707 int pcc_subspace_id = -1; 708 acpi_status status; 709 int ret = -EFAULT; 710 711 /* Parse the ACPI _CPC table for this CPU. */ 712 status = acpi_evaluate_object_typed(handle, "_CPC", NULL, &output, 713 ACPI_TYPE_PACKAGE); 714 if (ACPI_FAILURE(status)) { 715 ret = -ENODEV; 716 goto out_buf_free; 717 } 718 719 out_obj = (union acpi_object *) output.pointer; 720 721 cpc_ptr = kzalloc(sizeof(struct cpc_desc), GFP_KERNEL); 722 if (!cpc_ptr) { 723 ret = -ENOMEM; 724 goto out_buf_free; 725 } 726 727 /* First entry is NumEntries. */ 728 cpc_obj = &out_obj->package.elements[0]; 729 if (cpc_obj->type == ACPI_TYPE_INTEGER) { 730 num_ent = cpc_obj->integer.value; 731 } else { 732 pr_debug("Unexpected entry type(%d) for NumEntries\n", 733 cpc_obj->type); 734 goto out_free; 735 } 736 cpc_ptr->num_entries = num_ent; 737 738 /* Second entry should be revision. 
*/ 739 cpc_obj = &out_obj->package.elements[1]; 740 if (cpc_obj->type == ACPI_TYPE_INTEGER) { 741 cpc_rev = cpc_obj->integer.value; 742 } else { 743 pr_debug("Unexpected entry type(%d) for Revision\n", 744 cpc_obj->type); 745 goto out_free; 746 } 747 cpc_ptr->version = cpc_rev; 748 749 if (!is_cppc_supported(cpc_rev, num_ent)) 750 goto out_free; 751 752 /* Iterate through remaining entries in _CPC */ 753 for (i = 2; i < num_ent; i++) { 754 cpc_obj = &out_obj->package.elements[i]; 755 756 if (cpc_obj->type == ACPI_TYPE_INTEGER) { 757 cpc_ptr->cpc_regs[i-2].type = ACPI_TYPE_INTEGER; 758 cpc_ptr->cpc_regs[i-2].cpc_entry.int_value = cpc_obj->integer.value; 759 } else if (cpc_obj->type == ACPI_TYPE_BUFFER) { 760 gas_t = (struct cpc_reg *) 761 cpc_obj->buffer.pointer; 762 763 /* 764 * The PCC Subspace index is encoded inside 765 * the CPC table entries. The same PCC index 766 * will be used for all the PCC entries, 767 * so extract it only once. 768 */ 769 if (gas_t->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) { 770 if (pcc_subspace_id < 0) { 771 pcc_subspace_id = gas_t->access_width; 772 if (pcc_data_alloc(pcc_subspace_id)) 773 goto out_free; 774 } else if (pcc_subspace_id != gas_t->access_width) { 775 pr_debug("Mismatched PCC ids.\n"); 776 goto out_free; 777 } 778 } else if (gas_t->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) { 779 if (gas_t->address) { 780 void __iomem *addr; 781 782 addr = ioremap(gas_t->address, gas_t->bit_width/8); 783 if (!addr) 784 goto out_free; 785 cpc_ptr->cpc_regs[i-2].sys_mem_vaddr = addr; 786 } 787 } else { 788 if (gas_t->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE || !cpc_ffh_supported()) { 789 /* Support only PCC ,SYS MEM and FFH type regs */ 790 pr_debug("Unsupported register type: %d\n", gas_t->space_id); 791 goto out_free; 792 } 793 } 794 795 cpc_ptr->cpc_regs[i-2].type = ACPI_TYPE_BUFFER; 796 memcpy(&cpc_ptr->cpc_regs[i-2].cpc_entry.reg, gas_t, sizeof(*gas_t)); 797 } else { 798 pr_debug("Err in entry:%d in CPC table of CPU:%d \n", i, 
pr->id); 799 goto out_free; 800 } 801 } 802 per_cpu(cpu_pcc_subspace_idx, pr->id) = pcc_subspace_id; 803 804 /* 805 * Initialize the remaining cpc_regs as unsupported. 806 * Example: In case FW exposes CPPC v2, the below loop will initialize 807 * LOWEST_FREQ and NOMINAL_FREQ regs as unsupported 808 */ 809 for (i = num_ent - 2; i < MAX_CPC_REG_ENT; i++) { 810 cpc_ptr->cpc_regs[i].type = ACPI_TYPE_INTEGER; 811 cpc_ptr->cpc_regs[i].cpc_entry.int_value = 0; 812 } 813 814 815 /* Store CPU Logical ID */ 816 cpc_ptr->cpu_id = pr->id; 817 818 /* Parse PSD data for this CPU */ 819 ret = acpi_get_psd(cpc_ptr, handle); 820 if (ret) 821 goto out_free; 822 823 /* Register PCC channel once for all PCC subspace ID. */ 824 if (pcc_subspace_id >= 0 && !pcc_data[pcc_subspace_id]->pcc_channel_acquired) { 825 ret = register_pcc_channel(pcc_subspace_id); 826 if (ret) 827 goto out_free; 828 829 init_rwsem(&pcc_data[pcc_subspace_id]->pcc_lock); 830 init_waitqueue_head(&pcc_data[pcc_subspace_id]->pcc_write_wait_q); 831 } 832 833 /* Everything looks okay */ 834 pr_debug("Parsed CPC struct for CPU: %d\n", pr->id); 835 836 /* Add per logical CPU nodes for reading its feedback counters. */ 837 cpu_dev = get_cpu_device(pr->id); 838 if (!cpu_dev) { 839 ret = -EINVAL; 840 goto out_free; 841 } 842 843 /* Plug PSD data into this CPU's CPC descriptor. 
*/ 844 per_cpu(cpc_desc_ptr, pr->id) = cpc_ptr; 845 846 ret = kobject_init_and_add(&cpc_ptr->kobj, &cppc_ktype, &cpu_dev->kobj, 847 "acpi_cppc"); 848 if (ret) { 849 per_cpu(cpc_desc_ptr, pr->id) = NULL; 850 kobject_put(&cpc_ptr->kobj); 851 goto out_free; 852 } 853 > 854 init_freq_invariance_cppc(); 855 856 kfree(output.pointer); 857 return 0; 858 859 out_free: 860 /* Free all the mapped sys mem areas for this CPU */ 861 for (i = 2; i < cpc_ptr->num_entries; i++) { 862 void __iomem *addr = cpc_ptr->cpc_regs[i-2].sys_mem_vaddr; 863 864 if (addr) 865 iounmap(addr); 866 } 867 kfree(cpc_ptr); 868 869 out_buf_free: 870 kfree(output.pointer); 871 return ret; 872 } 873 EXPORT_SYMBOL_GPL(acpi_cppc_processor_probe); 874 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org [-- Attachment #2: config.gz --] [-- Type: application/gzip, Size: 37803 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems 2020-11-14 18:23 ` kernel test robot @ 2020-11-26 9:58 ` Peter Zijlstra 2020-11-26 11:55 ` Giovanni Gherdovich 0 siblings, 1 reply; 15+ messages in thread From: Peter Zijlstra @ 2020-11-26 9:58 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 2949 bytes --] On Sun, Nov 15, 2020 at 02:23:57AM +0800, kernel test robot wrote: > Hi Giovanni, > > Thank you for the patch! Yet something to improve: > > [auto build test ERROR on tip/x86/core] > [also build test ERROR on tip/master v5.10-rc3 next-20201113] > [cannot apply to bp/for-next] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: https://github.com/0day-ci/linux/commits/Giovanni-Gherdovich/Add-support-for-frequency-invariance-to-AMD-EPYC-Zen2/20201113-022732 > base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970 > config: arm64-randconfig-r013-20201114 (attached as .config) > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 9a85643cd357e412cff69067bb5c4840e228c2ab) > reproduce (this is a W=1 build): > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # install arm64 cross compiling tool for clang build > # apt-get install binutils-aarch64-linux-gnu > # https://github.com/0day-ci/linux/commit/3331764ab450bfb6ef0f9a3df70b9ec4f948e54f > git remote add linux-review https://github.com/0day-ci/linux > git fetch --no-tags linux-review Giovanni-Gherdovich/Add-support-for-frequency-invariance-to-AMD-EPYC-Zen2/20201113-022732 > git checkout 3331764ab450bfb6ef0f9a3df70b9ec4f948e54f > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 > > If you fix the 
issue, kindly add following tag as appropriate > Reported-by: kernel test robot <lkp@intel.com> > > All errors (new ones prefixed by >>): > > >> drivers/acpi/cppc_acpi.c:854:2: error: implicit declaration of function 'init_freq_invariance_cppc' [-Werror,-Wimplicit-function-declaration] > init_freq_invariance_cppc(); > ^ > 1 error generated. I'll try it with the below change... --- --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -220,10 +220,7 @@ static inline void arch_set_max_freq_rat #ifdef CONFIG_ACPI_CPPC_LIB void init_freq_invariance_cppc(void); -#else -static inline void init_freq_invariance_cppc(void) -{ -} +#define init_freq_invariance_cppc init_freq_invariance_cppc #endif #endif /* _ASM_X86_TOPOLOGY_H */ --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -689,6 +689,10 @@ static bool is_cppc_supported(int revisi * } */ +#ifndef init_freq_invariance_cppc +static inline void init_freq_invariance_cppc(void) { } +#endif + /** * acpi_cppc_processor_probe - Search for per CPU _CPC objects. * @pr: Ptr to acpi_processor containing this CPU's logical ID. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems 2020-11-26 9:58 ` Peter Zijlstra @ 2020-11-26 11:55 ` Giovanni Gherdovich 0 siblings, 0 replies; 15+ messages in thread From: Giovanni Gherdovich @ 2020-11-26 11:55 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 1621 bytes --] On Thu, 2020-11-26 at 10:58 +0100, Peter Zijlstra wrote: > On Sun, Nov 15, 2020 at 02:23:57AM +0800, kernel test robot wrote: > > [...] > > All errors (new ones prefixed by >>): > > > > > > drivers/acpi/cppc_acpi.c:854:2: error: implicit declaration of function 'init_freq_invariance_cppc' [-Werror,-Wimplicit-function-declaration] > > > > init_freq_invariance_cppc(); > > ^ > > 1 error generated. > > I'll try it with the below change... > > --- > --- a/arch/x86/include/asm/topology.h > +++ b/arch/x86/include/asm/topology.h > @@ -220,10 +220,7 @@ static inline void arch_set_max_freq_rat > > #ifdef CONFIG_ACPI_CPPC_LIB > void init_freq_invariance_cppc(void); > -#else > -static inline void init_freq_invariance_cppc(void) > -{ > -} > +#define init_freq_invariance_cppc init_freq_invariance_cppc > #endif > > #endif /* _ASM_X86_TOPOLOGY_H */ > --- a/drivers/acpi/cppc_acpi.c > +++ b/drivers/acpi/cppc_acpi.c > @@ -689,6 +689,10 @@ static bool is_cppc_supported(int revisi > * } > */ > > +#ifndef init_freq_invariance_cppc > +static inline void init_freq_invariance_cppc(void) { } > +#endif > + > /** > * acpi_cppc_processor_probe - Search for per CPU _CPC objects. > * @pr: Ptr to acpi_processor containing this CPU's logical ID. Thanks, and sorry for not replying. I was about to send an amended patch using the technique "#define funcname funcname" as you do here, but I was placing the macro definition in include/asm-generic/topology.h and it wasn't great. Your change is better, thanks for fixing it. Giovanni ^ permalink raw reply [flat|nested] 15+ messages in thread
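The "#define funcname funcname" technique discussed in this exchange can be sketched in a single file. The idea, as used in Peter's change, is that an architecture header both declares the function and defines a macro of the same name, so generic code can test for it with #ifndef and supply a no-op fallback otherwise. Everything below is illustrative: arch_hook(), HAVE_ARCH_HOOK, ARCH_HOOK_IS_STUB, and probe() are hypothetical names, and the two "headers" are collapsed into one translation unit.

```c
/* Stands in for an arch header; HAVE_ARCH_HOOK is a hypothetical switch. */
#ifdef HAVE_ARCH_HOOK
void arch_hook(void);
/* Self-referential macro: harmless to the call site, visible to #ifndef. */
#define arch_hook arch_hook
#endif

/* Stands in for generic code that wants to call the hook. */
#ifndef arch_hook
static inline void arch_hook(void) { }	/* no-op fallback */
#define ARCH_HOOK_IS_STUB 1		/* illustration only */
#else
#define ARCH_HOOK_IS_STUB 0
#endif

/* Calls the hook unconditionally and reports which variant was compiled in. */
static int probe(void)
{
	arch_hook();
	return ARCH_HOOK_IS_STUB;
}
```

Built without HAVE_ARCH_HOOK, the fallback is selected and the call compiles on any architecture, which is the property the arm64 build failure above was missing.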
* [tip: sched/core] x86, sched: Calculate frequency invariance for AMD systems
  2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
  2020-11-14 18:23   ` kernel test robot
@ 2020-12-03  9:13   ` tip-bot2 for Nathan Fontenot
  2020-12-11  9:34   ` tip-bot2 for Nathan Fontenot
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Nathan Fontenot @ 2020-12-03 9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Giovanni Gherdovich, Nathan Fontenot, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     0edb0fb35fa687e633322d23e5f44b7cfd21a5c5
Gitweb:        https://git.kernel.org/tip/0edb0fb35fa687e633322d23e5f44b7cfd21a5c5
Author:        Nathan Fontenot <nathan.fontenot@amd.com>
AuthorDate:    Thu, 12 Nov 2020 19:26:12 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 03 Dec 2020 10:00:34 +01:00

x86, sched: Calculate frequency invariance for AMD systems

This is the first pass in creating the ability to calculate the
frequency invariance on AMD systems. This approach uses the CPPC
highest performance and nominal performance values that range from
0 - 255 instead of a high and base frequency. This is because we do
not have the ability on AMD to get a highest frequency value.

On AMD systems the highest performance and nominal performance
values do correspond to the highest and base frequencies for the system
so using them should produce an appropriate ratio but some tweaking
is likely necessary.

Due to CPPC being initialized later in boot than when the frequency
invariant calculation is currently made, I had to create a callback
from the CPPC init code to do the calculation after we have CPPC
data.

Special thanks to "kernel test robot <lkp@intel.com>" for reporting that
compilation of drivers/acpi/cppc_acpi.c is conditional on
CONFIG_ACPI_CPPC_LIB, not just CONFIG_ACPI.

[ ggherdovich@suse.cz: made safe under CPU hotplug, edited changelog ] Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201112182614.10700-2-ggherdovich@suse.cz --- arch/x86/include/asm/topology.h | 5 ++- arch/x86/kernel/smpboot.c | 76 +++++++++++++++++++++++++++++--- drivers/acpi/cppc_acpi.c | 7 +++- 3 files changed, 83 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index f423457..488a8e8 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -218,4 +218,9 @@ static inline void arch_set_max_freq_ratio(bool turbo_disabled) } #endif +#ifdef CONFIG_ACPI_CPPC_LIB +void init_freq_invariance_cppc(void); +#define init_freq_invariance_cppc init_freq_invariance_cppc +#endif + #endif /* _ASM_X86_TOPOLOGY_H */ diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index de776b2..a4ab5cf 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -82,6 +82,10 @@ #include <asm/hw_irq.h> #include <asm/stackprotector.h> +#ifdef CONFIG_ACPI_CPPC_LIB +#include <acpi/cppc_acpi.h> +#endif + /* representing HT siblings of each logical CPU */ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map); EXPORT_PER_CPU_SYMBOL(cpu_sibling_map); @@ -148,7 +152,7 @@ static inline void smpboot_restore_warm_reset_vector(void) *((volatile u32 *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 0; } -static void init_freq_invariance(bool secondary); +static void init_freq_invariance(bool secondary, bool cppc_ready); /* * Report back to the Boot Processor during boot time or to the caller processor @@ -186,7 +190,7 @@ static void smp_callin(void) */ set_cpu_sibling_map(raw_smp_processor_id()); - init_freq_invariance(true); + init_freq_invariance(true, false); /* * Get our bogomips. 
@@ -1340,7 +1344,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) set_sched_topology(x86_topology); set_cpu_sibling_map(0); - init_freq_invariance(false); + init_freq_invariance(false, false); smp_sanity_check(); switch (apic_intr_mode) { @@ -2027,6 +2031,46 @@ out: return true; } +#ifdef CONFIG_ACPI_CPPC_LIB +static bool amd_set_max_freq_ratio(void) +{ + struct cppc_perf_caps perf_caps; + u64 highest_perf, nominal_perf; + u64 perf_ratio; + int rc; + + rc = cppc_get_perf_caps(0, &perf_caps); + if (rc) { + pr_debug("Could not retrieve perf counters (%d)\n", rc); + return false; + } + + highest_perf = perf_caps.highest_perf; + nominal_perf = perf_caps.nominal_perf; + + if (!highest_perf || !nominal_perf) { + pr_debug("Could not retrieve highest or nominal performance\n"); + return false; + } + + perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf); + if (!perf_ratio) { + pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n"); + return false; + } + + arch_turbo_freq_ratio = perf_ratio; + arch_set_max_freq_ratio(false); + + return true; +} +#else +static bool amd_set_max_freq_ratio(void) +{ + return false; +} +#endif + static void init_counter_refs(void) { u64 aperf, mperf; @@ -2038,7 +2082,7 @@ static void init_counter_refs(void) this_cpu_write(arch_prev_mperf, mperf); } -static void init_freq_invariance(bool secondary) +static void init_freq_invariance(bool secondary, bool cppc_ready) { bool ret = false; @@ -2054,6 +2098,12 @@ static void init_freq_invariance(bool secondary) if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) ret = intel_set_max_freq_ratio(); + else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { + if (!cppc_ready) { + return; + } + ret = amd_set_max_freq_ratio(); + } if (ret) { init_counter_refs(); @@ -2063,6 +2113,22 @@ static void init_freq_invariance(bool secondary) } } +#ifdef CONFIG_ACPI_CPPC_LIB +static DEFINE_MUTEX(freq_invariance_lock); + +void init_freq_invariance_cppc(void) +{ + static bool 
secondary; + + mutex_lock(&freq_invariance_lock); + + init_freq_invariance(secondary, true); + secondary = true; + + mutex_unlock(&freq_invariance_lock); +} +#endif + static void disable_freq_invariance_workfn(struct work_struct *work) { static_branch_disable(&arch_scale_freq_key); @@ -2112,7 +2178,7 @@ error: schedule_work(&disable_freq_invariance_work); } #else -static inline void init_freq_invariance(bool secondary) +static inline void init_freq_invariance(bool secondary, bool cppc_ready) { } #endif /* CONFIG_X86_64 */ diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 7a99b19..a852dc4 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -39,6 +39,7 @@ #include <linux/ktime.h> #include <linux/rwsem.h> #include <linux/wait.h> +#include <linux/topology.h> #include <acpi/cppc_acpi.h> @@ -688,6 +689,10 @@ static bool is_cppc_supported(int revision, int num_ent) * } */ +#ifndef init_freq_invariance_cppc +static inline void init_freq_invariance_cppc(void) { } +#endif + /** * acpi_cppc_processor_probe - Search for per CPU _CPC objects. * @pr: Ptr to acpi_processor containing this CPU's logical ID. @@ -850,6 +855,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr) goto out_free; } + init_freq_invariance_cppc(); + kfree(output.pointer); return 0; ^ permalink raw reply related [flat|nested] 15+ messages in thread
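The hotplug-safe entry point init_freq_invariance_cppc() in the commit above relies on a function-local static flag guarded by a mutex, so that only the first caller is treated as the primary initializer and every later caller is passed secondary = true. A minimal user-space sketch of that pattern follows, using a pthread mutex in place of the kernel mutex. The names do_init(), init_once_then_secondary(), and the two counters are hypothetical stand-ins, added only so the behaviour can be observed.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counters added for illustration; they are not part of the patch. */
static int primary_calls, secondary_calls;

/* Hypothetical stand-in for init_freq_invariance(secondary, cppc_ready). */
static void do_init(bool secondary)
{
	if (secondary)
		secondary_calls++;
	else
		primary_calls++;
}

/* Invoked once per CPU probe; only the first caller sees secondary == false. */
static void init_once_then_secondary(void)
{
	static bool secondary;	/* zero-initialized, so false on first call */

	pthread_mutex_lock(&init_lock);
	do_init(secondary);
	secondary = true;
	pthread_mutex_unlock(&init_lock);
}
```

Calling init_once_then_secondary() three times results in one primary call and two secondary calls, regardless of which thread gets there first.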
* [tip: sched/core] x86, sched: Calculate frequency invariance for AMD systems
  2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
  2020-11-14 18:23   ` kernel test robot
  2020-12-03  9:13   ` [tip: sched/core] " tip-bot2 for Nathan Fontenot
@ 2020-12-11  9:34   ` tip-bot2 for Nathan Fontenot
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Nathan Fontenot @ 2020-12-11 9:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Nathan Fontenot, Giovanni Gherdovich, Peter Zijlstra (Intel), Ingo Molnar, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     41ea667227bad5c247d76e6605054e96e4d95f51
Gitweb:        https://git.kernel.org/tip/41ea667227bad5c247d76e6605054e96e4d95f51
Author:        Nathan Fontenot <nathan.fontenot@amd.com>
AuthorDate:    Thu, 12 Nov 2020 19:26:12 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 11 Dec 2020 10:26:00 +01:00

x86, sched: Calculate frequency invariance for AMD systems

This is the first pass in creating the ability to calculate the
frequency invariance on AMD systems. This approach uses the CPPC
highest performance and nominal performance values that range from
0 - 255 instead of a high and base frequency. This is because we do
not have the ability on AMD to get a highest frequency value.

On AMD systems the highest performance and nominal performance
values do correspond to the highest and base frequencies for the system
so using them should produce an appropriate ratio but some tweaking
is likely necessary.

Due to CPPC being initialized later in boot than when the frequency
invariant calculation is currently made, I had to create a callback
from the CPPC init code to do the calculation after we have CPPC
data.

Special thanks to "kernel test robot <lkp@intel.com>" for reporting that
compilation of drivers/acpi/cppc_acpi.c is conditional on
CONFIG_ACPI_CPPC_LIB, not just CONFIG_ACPI.

[ ggherdovich@suse.cz: made safe under CPU hotplug, edited changelog. ] Signed-off-by: Nathan Fontenot <nathan.fontenot@amd.com> Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20201112182614.10700-2-ggherdovich@suse.cz --- arch/x86/include/asm/topology.h | 5 ++- arch/x86/kernel/smpboot.c | 76 +++++++++++++++++++++++++++++--- drivers/acpi/cppc_acpi.c | 7 +++- 3 files changed, 83 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h index f423457..488a8e8 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -218,4 +218,9 @@ static inline void arch_set_max_freq_ratio(bool turbo_disabled) } #endif +#ifdef CONFIG_ACPI_CPPC_LIB +void init_freq_invariance_cppc(void); +#define init_freq_invariance_cppc init_freq_invariance_cppc +#endif + #endif /* _ASM_X86_TOPOLOGY_H */ diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index de776b2..a4ab5cf 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -82,6 +82,10 @@ #include <asm/hw_irq.h> #include <asm/stackprotector.h> +#ifdef CONFIG_ACPI_CPPC_LIB +#include <acpi/cppc_acpi.h> +#endif + /* representing HT siblings of each logical CPU */ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map); EXPORT_PER_CPU_SYMBOL(cpu_sibling_map); @@ -148,7 +152,7 @@ static inline void smpboot_restore_warm_reset_vector(void) *((volatile u32 *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 0; } -static void init_freq_invariance(bool secondary); +static void init_freq_invariance(bool secondary, bool cppc_ready); /* * Report back to the Boot Processor during boot time or to the caller processor @@ -186,7 +190,7 @@ static void smp_callin(void) */ set_cpu_sibling_map(raw_smp_processor_id()); - init_freq_invariance(true); + init_freq_invariance(true, false); /* * Get our bogomips. 
@@ -1340,7 +1344,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) set_sched_topology(x86_topology); set_cpu_sibling_map(0); - init_freq_invariance(false); + init_freq_invariance(false, false); smp_sanity_check(); switch (apic_intr_mode) { @@ -2027,6 +2031,46 @@ out: return true; } +#ifdef CONFIG_ACPI_CPPC_LIB +static bool amd_set_max_freq_ratio(void) +{ + struct cppc_perf_caps perf_caps; + u64 highest_perf, nominal_perf; + u64 perf_ratio; + int rc; + + rc = cppc_get_perf_caps(0, &perf_caps); + if (rc) { + pr_debug("Could not retrieve perf counters (%d)\n", rc); + return false; + } + + highest_perf = perf_caps.highest_perf; + nominal_perf = perf_caps.nominal_perf; + + if (!highest_perf || !nominal_perf) { + pr_debug("Could not retrieve highest or nominal performance\n"); + return false; + } + + perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf); + if (!perf_ratio) { + pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n"); + return false; + } + + arch_turbo_freq_ratio = perf_ratio; + arch_set_max_freq_ratio(false); + + return true; +} +#else +static bool amd_set_max_freq_ratio(void) +{ + return false; +} +#endif + static void init_counter_refs(void) { u64 aperf, mperf; @@ -2038,7 +2082,7 @@ static void init_counter_refs(void) this_cpu_write(arch_prev_mperf, mperf); } -static void init_freq_invariance(bool secondary) +static void init_freq_invariance(bool secondary, bool cppc_ready) { bool ret = false; @@ -2054,6 +2098,12 @@ static void init_freq_invariance(bool secondary) if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) ret = intel_set_max_freq_ratio(); + else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { + if (!cppc_ready) { + return; + } + ret = amd_set_max_freq_ratio(); + } if (ret) { init_counter_refs(); @@ -2063,6 +2113,22 @@ static void init_freq_invariance(bool secondary) } } +#ifdef CONFIG_ACPI_CPPC_LIB +static DEFINE_MUTEX(freq_invariance_lock); + +void init_freq_invariance_cppc(void) +{ + static bool 
secondary; + + mutex_lock(&freq_invariance_lock); + + init_freq_invariance(secondary, true); + secondary = true; + + mutex_unlock(&freq_invariance_lock); +} +#endif + static void disable_freq_invariance_workfn(struct work_struct *work) { static_branch_disable(&arch_scale_freq_key); @@ -2112,7 +2178,7 @@ error: schedule_work(&disable_freq_invariance_work); } #else -static inline void init_freq_invariance(bool secondary) +static inline void init_freq_invariance(bool secondary, bool cppc_ready) { } #endif /* CONFIG_X86_64 */ diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c index 7a99b19..a852dc4 100644 --- a/drivers/acpi/cppc_acpi.c +++ b/drivers/acpi/cppc_acpi.c @@ -39,6 +39,7 @@ #include <linux/ktime.h> #include <linux/rwsem.h> #include <linux/wait.h> +#include <linux/topology.h> #include <acpi/cppc_acpi.h> @@ -688,6 +689,10 @@ static bool is_cppc_supported(int revision, int num_ent) * } */ +#ifndef init_freq_invariance_cppc +static inline void init_freq_invariance_cppc(void) { } +#endif + /** * acpi_cppc_processor_probe - Search for per CPU _CPC objects. * @pr: Ptr to acpi_processor containing this CPU's logical ID. @@ -850,6 +855,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr) goto out_free; } + init_freq_invariance_cppc(); + kfree(output.pointer); return 0; ^ permalink raw reply related [flat|nested] 15+ messages in thread
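The ratio computed by amd_set_max_freq_ratio() in the patch above is a fixed-point value: highest_perf divided by nominal_perf, scaled by SCHED_CAPACITY_SCALE (1024 in the kernel). A minimal userspace sketch of that arithmetic follows; perf_ratio_fixed() and CAPACITY_SCALE are hypothetical names chosen for illustration, and plain 64-bit division stands in for the kernel's div_u64().

```c
#include <stdint.h>

/* Same value as the kernel's SCHED_CAPACITY_SCALE (1 << 10). */
#define CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper mirroring the arithmetic in amd_set_max_freq_ratio():
 * returns highest_perf / nominal_perf as a fixed-point number scaled by
 * 1024, or 0 when either input is zero (the patch bails out in that case).
 */
static uint64_t perf_ratio_fixed(uint64_t highest_perf, uint64_t nominal_perf)
{
	if (!highest_perf || !nominal_perf)
		return 0;

	return highest_perf * CAPACITY_SCALE / nominal_perf;
}
```

For instance, with made-up CPPC values highest_perf = 255 and nominal_perf = 170, the helper returns 1536, which reads as a ratio of 1.5 in units of 1024.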
* [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  2020-11-12 18:26 [PATCH v4 0/3] Add support for frequency invariance to AMD EPYC Zen2 Giovanni Gherdovich
  2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
@ 2020-11-12 18:26 ` Giovanni Gherdovich
  2020-12-03  9:13   ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
                     ` (2 more replies)
  2020-11-12 18:26 ` [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations Giovanni Gherdovich
  2 siblings, 3 replies; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-11-12 18:26 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Len Brown, Rafael J . Wysocki
  Cc: Jon Grimm, Nathan Fontenot, Yazen Ghannam, Thomas Lendacky,
	Mel Gorman, Pu Wen, Viresh Kumar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Doug Smythies, x86, linux-pm, linux-kernel,
	linux-acpi, Giovanni Gherdovich

Frequency invariant accounting calculations need the ratio
freq_curr/freq_max, but freq_max is unknown as it depends on dynamic power
allocation between cores: AMD EPYC CPUs implement "Core Performance Boost".
Three candidates are considered to estimate this value:

- maximum non-boost frequency
- maximum boost frequency
- the mid point between the above two

Experimental data on an AMD EPYC Zen2 machine slightly favors the third
option, which is applied with this patch.

The analysis uses the ondemand cpufreq governor as baseline, and compares
it with schedutil in a number of configurations. Using the freq_max value
described above offers a moderate advantage in performance and efficiency:

sugov-max (freq_max=max_boost) performs the worst on tbench: less
throughput and reduced efficiency than the other invariant-schedutil
options (see "Data Overview" below). Consider that tbench is generally a
problematic case as no schedutil version currently is better than ondemand.

sugov-P0 (freq_max=max_P) is the worst on dbench, while the other sugov's
can surpass ondemand with less filesystem latency and slightly increased
efficiency.

1. DATA OVERVIEW
2. DETAILED PERFORMANCE TABLES
3. POWER CONSUMPTION TABLE

1. DATA OVERVIEW
================

sugov-noinv : non-invariant schedutil governor
sugov-max   : invariant schedutil, freq_max=max_boost
sugov-mid   : invariant schedutil, freq_max=midpoint
sugov-P0    : invariant schedutil, freq_max=max_P
perfgov     : performance governor
driver      : acpi_cpufreq
machine     : AMD EPYC 7742 (Zen2, aka "Rome"), dual socket,
              128 cores / 256 threads, SATA SSD storage, 250G of memory,
              XFS filesystem

Benchmarks are described in the next section.
Tilde (~) means the value is the same as baseline.

-------------------------------------------------------------------------------------
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0  better if
-------------------------------------------------------------------------------------
                               PERFORMANCE RATIOS
tbench      1.00      1.44     0.90         0.87       0.93       0.93      higher
dbench      1.00      0.91     0.95         0.94       0.94       1.06      lower
kernbench   1.00      0.93     ~            ~          ~          0.97      lower
gitsource   1.00      0.66     0.97         0.96       ~          0.95      lower
-------------------------------------------------------------------------------------
                           PERFORMANCE-PER-WATT RATIOS
tbench      1.00      1.16     0.84         0.84       0.88       0.85      higher
dbench      1.00      1.03     1.02         1.02       1.02       0.93      higher
kernbench   1.00      1.05     ~            ~          ~          ~         higher
gitsource   1.00      1.46     1.04         1.04       ~          1.05      higher

2. DETAILED PERFORMANCE TABLES
==============================

Benchmark          : tbench4 (i.e. dbench4 over the network, actually loopback)
Varying parameter  : number of clients
Unit               : MB/sec (higher is better)

                 5.9.0-ondemand (BASELINE)                  5.9.0-perfgov              5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean     1     427.19 +- 0.16% (        )     778.35 +- 0.10% (  82.20%)     346.92 +- 0.14% ( -18.79%)
Hmean     2     853.82 +- 0.09% (        )    1536.23 +- 0.03% (  79.93%)     694.36 +- 0.05% ( -18.68%)
Hmean     4    1657.54 +- 0.12% (        )    2938.18 +- 0.12% (  77.26%)    1362.81 +- 0.11% ( -17.78%)
Hmean     8    3301.87 +- 0.06% (        )    5679.10 +- 0.04% (  72.00%)    2693.35 +- 0.04% ( -18.43%)
Hmean    16    6139.65 +- 0.05% (        )    9498.81 +- 0.04% (  54.71%)    4889.97 +- 0.17% ( -20.35%)
Hmean    32   11170.28 +- 0.09% (        )   17393.25 +- 0.08% (  55.71%)    9104.55 +- 0.09% ( -18.49%)
Hmean    64   19322.97 +- 0.17% (        )   31573.91 +- 0.08% (  63.40%)   18552.52 +- 0.40% (  -3.99%)
Hmean   128   30383.71 +- 0.11% (        )   37416.91 +- 0.15% (  23.15%)   25938.70 +- 0.41% ( -14.63%)
Hmean   256   31143.96 +- 0.41% (        )   30908.76 +- 0.88% (  -0.76%)   29754.32 +- 0.24% (  -4.46%)
Hmean   512   30858.49 +- 0.26% (        )   38524.60 +- 1.19% (  24.84%)   42080.39 +- 0.56% (  36.37%)
Hmean  1024   39187.37 +- 0.19% (        )   36213.86 +- 0.26% (  -7.59%)   39555.98 +- 0.12% (   0.94%)

                           5.9.0-sugov-max                5.9.0-sugov-mid                 5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean     1     352.59 +- 1.03% ( -17.46%)     352.08 +- 0.75% ( -17.58%)     352.31 +- 1.48% ( -17.53%)
Hmean     2     697.32 +- 0.08% ( -18.33%)     700.16 +- 0.20% ( -18.00%)     696.79 +- 0.06% ( -18.39%)
Hmean     4    1369.88 +- 0.04% ( -17.35%)    1369.72 +- 0.07% ( -17.36%)    1365.91 +- 0.05% ( -17.59%)
Hmean     8    2696.79 +- 0.04% ( -18.33%)    2711.06 +- 0.04% ( -17.89%)    2715.10 +- 0.61% ( -17.77%)
Hmean    16    4725.03 +- 0.03% ( -23.04%)    4875.65 +- 0.02% ( -20.59%)    4953.05 +- 0.28% ( -19.33%)
Hmean    32    9231.65 +- 0.10% ( -17.36%)    8704.89 +- 0.27% ( -22.07%)   10562.02 +- 0.36% (  -5.45%)
Hmean    64   15364.27 +- 0.19% ( -20.49%)   17786.64 +- 0.15% (  -7.95%)   19665.40 +- 0.22% (   1.77%)
Hmean   128   42100.58 +- 0.13% (  38.56%)   34946.28 +- 0.13% (  15.02%)   38635.79 +- 0.06% (  27.16%)
Hmean   256   30660.23 +- 1.08% (  -1.55%)   32307.67 +- 0.54% (   3.74%)   31153.27 +- 0.12% (   0.03%)
Hmean   512   24604.32 +- 0.14% ( -20.27%)   40408.50 +- 1.10% (  30.95%)   38800.29 +- 1.23% (  25.74%)
Hmean  1024   35535.47 +- 0.28% (  -9.32%)   41070.38 +- 2.56% (   4.81%)   31308.29 +- 2.52% ( -20.11%)

Benchmark          : dbench (filesystem stressor)
Varying parameter  : number of clients
Unit               : seconds (lower is better)

NOTE-1: This dbench version measures the average latency of a set of filesystem
        operations, as we found the traditional dbench metric (throughput) to be
        misleading.
NOTE-2: Due to high variability, we partition the original dataset and apply
        statistical bootstrapping (a resampling method). Accuracy is reported in the
        form of 95% confidence intervals.

                     5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean     1      98.79 +-   0.92 (        )      83.36 +-   0.82 (  15.62%)      84.82 +-   0.92 (  14.14%)
SubAmean     2     116.00 +-   0.89 (        )     102.12 +-   0.77 (  11.96%)     109.63 +-   0.89 (   5.49%)
SubAmean     4     149.90 +-   1.03 (        )     132.12 +-   0.91 (  11.86%)     143.90 +-   1.15 (   4.00%)
SubAmean     8     182.41 +-   1.13 (        )     159.86 +-   0.93 (  12.36%)     165.82 +-   1.03 (   9.10%)
SubAmean    16     237.83 +-   1.23 (        )     219.46 +-   1.14 (   7.72%)     229.28 +-   1.19 (   3.59%)
SubAmean    32     334.34 +-   1.49 (        )     309.94 +-   1.42 (   7.30%)     321.19 +-   1.36 (   3.93%)
SubAmean    64     576.61 +-   2.16 (        )     540.75 +-   2.00 (   6.22%)     551.27 +-   1.99 (   4.39%)
SubAmean   128    1350.07 +-   4.14 (        )    1205.47 +-   3.20 (  10.71%)    1280.26 +-   3.75 (   5.17%)
SubAmean   256    3444.42 +-   7.97 (        )    3698.00 +-  27.43 (  -7.36%)    3494.14 +-   7.81 (  -1.44%)
SubAmean  2048   39457.89 +-  29.01 (        )   34105.33 +-  41.85 (  13.57%)   39688.52 +-  36.26 (  -0.58%)

                               5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean     1      85.68 +-   1.04 (  13.27%)      84.16 +-   0.84 (  14.81%)      83.99 +-   0.90 (  14.99%)
SubAmean     2     108.42 +-   0.95 (   6.54%)     109.91 +-   1.39 (   5.24%)     112.06 +-   0.91 (   3.39%)
SubAmean     4     136.90 +-   1.04 (   8.67%)     137.59 +-   0.93 (   8.21%)     136.55 +-   0.95 (   8.91%)
SubAmean     8     163.15 +-   0.96 (  10.56%)     166.07 +-   1.02 (   8.96%)     165.81 +-   0.99 (   9.10%)
SubAmean    16     224.86 +-   1.12 (   5.45%)     223.83 +-   1.06 (   5.89%)     230.66 +-   1.19 (   3.01%)
SubAmean    32     320.51 +-   1.38 (   4.13%)     322.85 +-   1.49 (   3.44%)     321.96 +-   1.46 (   3.70%)
SubAmean    64     553.25 +-   1.93 (   4.05%)     554.19 +-   2.08 (   3.89%)     562.26 +-   2.22 (   2.49%)
SubAmean   128    1264.35 +-   3.72 (   6.35%)    1256.99 +-   3.46 (   6.89%)    2018.97 +-  18.79 ( -49.55%)
SubAmean   256    3466.25 +-   8.25 (  -0.63%)    3450.58 +-   8.44 (  -0.18%)    5032.12 +-  38.74 ( -46.09%)
SubAmean  2048   39133.10 +-  45.71 (   0.82%)   39905.95 +-  34.33 (  -1.14%)   53811.86 +- 193.04 ( -36.38%)

Benchmark          : kernbench (kernel compilation)
Varying parameter  : number of jobs
Unit               : seconds (lower is better)

                  5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean     2     471.71 +- 26.61% (        )     409.88 +- 16.99% (  13.11%)     430.63 +-  0.18% (   8.71%)
Amean     4     211.87 +-  0.58% (        )     194.03 +-  0.74% (   8.42%)     215.33 +-  0.64% (  -1.63%)
Amean     8     109.79 +-  1.27% (        )     101.43 +-  1.53% (   7.61%)     111.05 +-  1.95% (  -1.15%)
Amean    16      59.50 +-  1.28% (        )      55.61 +-  1.35% (   6.55%)      59.65 +-  1.78% (  -0.24%)
Amean    32      34.94 +-  1.22% (        )      32.36 +-  1.95% (   7.41%)      35.44 +-  0.63% (  -1.43%)
Amean    64      22.58 +-  0.38% (        )      20.97 +-  1.28% (   7.11%)      22.41 +-  1.73% (   0.74%)
Amean   128      17.72 +-  0.44% (        )      16.68 +-  0.32% (   5.88%)      17.65 +-  0.96% (   0.37%)
Amean   256      16.44 +-  0.53% (        )      15.76 +-  0.32% (   4.18%)      16.76 +-  0.60% (  -1.93%)
Amean   512      16.54 +-  0.21% (        )      15.62 +-  0.41% (   5.53%)      16.84 +-  0.85% (  -1.83%)

                            5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean     2     421.30 +-  0.24% (  10.69%)     419.26 +-  0.15% (  11.12%)     414.38 +-  0.33% (  12.15%)
Amean     4     217.81 +-  5.53% (  -2.80%)     211.63 +-  0.99% (   0.12%)     208.43 +-  0.47% (   1.63%)
Amean     8     108.80 +-  0.43% (   0.90%)     108.48 +-  1.44% (   1.19%)     108.59 +-  3.08% (   1.09%)
Amean    16      58.84 +-  0.74% (   1.12%)      58.37 +-  0.94% (   1.91%)      57.78 +-  0.78% (   2.90%)
Amean    32      34.04 +-  2.00% (   2.59%)      34.28 +-  1.18% (   1.91%)      33.98 +-  2.21% (   2.75%)
Amean    64      22.22 +-  1.69% (   1.60%)      22.27 +-  1.60% (   1.38%)      22.25 +-  1.41% (   1.47%)
Amean   128      17.55 +-  0.24% (   0.97%)      17.53 +-  0.94% (   1.04%)      17.49 +-  0.43% (   1.30%)
Amean   256      16.51 +-  0.46% (  -0.40%)      16.48 +-  0.48% (  -0.19%)      16.44 +-  1.21% (   0.00%)
Amean   512      16.50 +-  0.35% (   0.19%)      16.35 +-  0.42% (   1.14%)      16.37 +-  0.33% (   0.99%)

Benchmark          : gitsource (time to run the git unit test suite)
Varying parameter  : none
Unit               : seconds (lower is better)

                  5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean          1035.76 +-  0.30% (        )     688.21 +-  0.04% (  33.56%)    1003.85 +-  0.14% (   3.08%)

                            5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean           995.82 +-  0.08% (   3.86%)    1011.98 +-  0.03% (   2.30%)     986.87 +-  0.19% (   4.72%)

3. POWER CONSUMPTION TABLE
==========================

Average power consumption (watts).

--------------------------------------------------------------------------
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0
--------------------------------------------------------------------------
tbench4     227.25    281.83   244.17       236.76     241.50     247.99
dbench4     151.97    161.87   157.08       158.10     158.06     153.73
kernbench   162.78    167.22   162.90       164.19     164.65     164.72
gitsource   133.65    139.00   133.04       134.43     134.18     134.32

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a4ab5cf6aeab..c5dd5f6199d9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2054,6 +2054,8 @@ static bool amd_set_max_freq_ratio(void)
 	}

 	perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf);
+	/* midpoint between max_boost and max_P */
+	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
 	if (!perf_ratio) {
 		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
 		return false;
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread
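The two-line change above averages the boost ratio with SCHED_CAPACITY_SCALE, which is the fixed-point representation of 1.0 (that is, freq_max equal to max_P). A minimal sketch of that step in isolation, assuming a userspace build; midpoint_ratio() and CAPACITY_SCALE are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Same value as the kernel's SCHED_CAPACITY_SCALE (1 << 10). */
#define CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper: given max_boost/max_P as a fixed-point ratio
 * scaled by 1024, return the midpoint between that ratio and 1.0,
 * using the same add-and-shift as the patch.
 */
static uint64_t midpoint_ratio(uint64_t boost_ratio)
{
	return (boost_ratio + CAPACITY_SCALE) >> 1;
}
```

With a made-up boost ratio of 1536 (1.5), the midpoint is 1280 (1.25); a ratio of 1024 (no boost headroom) is left unchanged.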
* [tip: sched/core] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
@ 2020-12-03  9:13   ` tip-bot2 for Giovanni Gherdovich
  2020-12-04 17:03   ` [PATCH v4 RESEND] " Giovanni Gherdovich
  2020-12-11  9:34   ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Giovanni Gherdovich @ 2020-12-03  9:13 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     46609527577d1def0af29ca5b56cffeeea771ada
Gitweb:        https://git.kernel.org/tip/46609527577d1def0af29ca5b56cffeeea771ada
Author:        Giovanni Gherdovich <ggherdovich@suse.cz>
AuthorDate:    Thu, 12 Nov 2020 19:26:13 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 03 Dec 2020 10:00:35 +01:00

x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC

Frequency invariant accounting calculations need the ratio
freq_curr/freq_max, but freq_max is unknown as it depends on dynamic power
allocation between cores: AMD EPYC CPUs implement "Core Performance Boost".
Three candidates are considered to estimate this value:

- maximum non-boost frequency
- maximum boost frequency
- the mid point between the above two

Experimental data on an AMD EPYC Zen2 machine slightly favors the third
option, which is applied with this patch.

The analysis uses the ondemand cpufreq governor as baseline, and compares
it with schedutil in a number of configurations. Using the freq_max value
described above offers a moderate advantage in performance and efficiency:

sugov-max (freq_max=max_boost) performs the worst on tbench: less
throughput and reduced efficiency than the other invariant-schedutil
options (see "Data Overview" below). Consider that tbench is generally a
problematic case as no schedutil version currently is better than ondemand.

sugov-P0 (freq_max=max_P) is the worst on dbench, while the other sugov's
can surpass ondemand with less filesystem latency and slightly increased
efficiency.

1. DATA OVERVIEW
2. DETAILED PERFORMANCE TABLES
3. POWER CONSUMPTION TABLE

1. DATA OVERVIEW
================

sugov-noinv : non-invariant schedutil governor
sugov-max   : invariant schedutil, freq_max=max_boost
sugov-mid   : invariant schedutil, freq_max=midpoint
sugov-P0    : invariant schedutil, freq_max=max_P
perfgov     : performance governor
driver      : acpi_cpufreq
machine     : AMD EPYC 7742 (Zen2, aka "Rome"), dual socket,
              128 cores / 256 threads, SATA SSD storage, 250G of memory,
              XFS filesystem

Benchmarks are described in the next section.
Tilde (~) means the value is the same as baseline.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201112182614.10700-3-ggherdovich@suse.cz
---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a4ab5cf..c5dd5f6 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2054,6 +2054,8 @@ static bool amd_set_max_freq_ratio(void)
 	}

 	perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf);
+	/* midpoint between max_boost and max_P */
+	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
 	if (!perf_ratio) {
 		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
 		return false;

^ permalink raw reply related	[flat|nested] 15+ messages in thread
* [PATCH v4 RESEND] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
  2020-12-03  9:13   ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
@ 2020-12-04 17:03   ` Giovanni Gherdovich
  2020-12-04 17:10     ` Giovanni Gherdovich
  2020-12-11  9:34   ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
  2 siblings, 1 reply; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-12-04 17:03 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Len Brown, Rafael J . Wysocki
  Cc: Nathan Fontenot, x86, linux-pm, linux-kernel, linux-acpi,
	Giovanni Gherdovich

Frequency invariant accounting calculations need the ratio
freq_curr/freq_max, but freq_max is unknown as it depends on dynamic power
allocation between cores: AMD EPYC CPUs implement "Core Performance Boost".
Three candidates are considered to estimate this value:

- maximum non-boost frequency
- maximum boost frequency
- the mid point between the above two

Experimental data on an AMD EPYC Zen2 machine slightly favors the third
option, which is applied with this patch.

The analysis uses the ondemand cpufreq governor as baseline, and compares
it with schedutil in a number of configurations. Using the freq_max value
described above offers a moderate advantage in performance and efficiency:

sugov-max (freq_max=max_boost) performs the worst on tbench: less
throughput and reduced efficiency than the other invariant-schedutil
options (see "Data Overview" below). Consider that tbench is generally a
problematic case as no schedutil version currently is better than ondemand.

sugov-P0 (freq_max=max_P) is the worst on dbench, while the other sugov's
can surpass ondemand with less filesystem latency and slightly increased
efficiency.

1. DATA OVERVIEW
2. DETAILED PERFORMANCE TABLES
3. POWER CONSUMPTION TABLE

1. DATA OVERVIEW
================

sugov-noinv : non-invariant schedutil governor
sugov-max   : invariant schedutil, freq_max=max_boost
sugov-mid   : invariant schedutil, freq_max=midpoint
sugov-P0    : invariant schedutil, freq_max=max_P
perfgov     : performance governor
driver      : acpi_cpufreq
machine     : AMD EPYC 7742 (Zen2, aka "Rome"), dual socket,
              128 cores / 256 threads, SATA SSD storage, 250G of memory,
              XFS filesystem

Benchmarks are described in the next section.
Tilde (~) means the value is the same as baseline.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0  better if
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                               PERFORMANCE RATIOS
tbench      1.00      1.44     0.90         0.87       0.93       0.93      higher
dbench      1.00      0.91     0.95         0.94       0.94       1.06      lower
kernbench   1.00      0.93     ~            ~          ~          0.97      lower
gitsource   1.00      0.66     0.97         0.96       ~          0.95      lower
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                           PERFORMANCE-PER-WATT RATIOS
tbench      1.00      1.16     0.84         0.84       0.88       0.85      higher
dbench      1.00      1.03     1.02         1.02       1.02       0.93      higher
kernbench   1.00      1.05     ~            ~          ~          ~         higher
gitsource   1.00      1.46     1.04         1.04       ~          1.05      higher

2. DETAILED PERFORMANCE TABLES
==============================

Benchmark          : tbench4 (i.e. dbench4 over the network, actually loopback)
Varying parameter  : number of clients
Unit               : MB/sec (higher is better)

                 5.9.0-ondemand (BASELINE)                  5.9.0-perfgov              5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean     1     427.19 +- 0.16% (        )     778.35 +- 0.10% (  82.20%)     346.92 +- 0.14% ( -18.79%)
Hmean     2     853.82 +- 0.09% (        )    1536.23 +- 0.03% (  79.93%)     694.36 +- 0.05% ( -18.68%)
Hmean     4    1657.54 +- 0.12% (        )    2938.18 +- 0.12% (  77.26%)    1362.81 +- 0.11% ( -17.78%)
Hmean     8    3301.87 +- 0.06% (        )    5679.10 +- 0.04% (  72.00%)    2693.35 +- 0.04% ( -18.43%)
Hmean    16    6139.65 +- 0.05% (        )    9498.81 +- 0.04% (  54.71%)    4889.97 +- 0.17% ( -20.35%)
Hmean    32   11170.28 +- 0.09% (        )   17393.25 +- 0.08% (  55.71%)    9104.55 +- 0.09% ( -18.49%)
Hmean    64   19322.97 +- 0.17% (        )   31573.91 +- 0.08% (  63.40%)   18552.52 +- 0.40% (  -3.99%)
Hmean   128   30383.71 +- 0.11% (        )   37416.91 +- 0.15% (  23.15%)   25938.70 +- 0.41% ( -14.63%)
Hmean   256   31143.96 +- 0.41% (        )   30908.76 +- 0.88% (  -0.76%)   29754.32 +- 0.24% (  -4.46%)
Hmean   512   30858.49 +- 0.26% (        )   38524.60 +- 1.19% (  24.84%)   42080.39 +- 0.56% (  36.37%)
Hmean  1024   39187.37 +- 0.19% (        )   36213.86 +- 0.26% (  -7.59%)   39555.98 +- 0.12% (   0.94%)

                           5.9.0-sugov-max                5.9.0-sugov-mid                 5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean     1     352.59 +- 1.03% ( -17.46%)     352.08 +- 0.75% ( -17.58%)     352.31 +- 1.48% ( -17.53%)
Hmean     2     697.32 +- 0.08% ( -18.33%)     700.16 +- 0.20% ( -18.00%)     696.79 +- 0.06% ( -18.39%)
Hmean     4    1369.88 +- 0.04% ( -17.35%)    1369.72 +- 0.07% ( -17.36%)    1365.91 +- 0.05% ( -17.59%)
Hmean     8    2696.79 +- 0.04% ( -18.33%)    2711.06 +- 0.04% ( -17.89%)    2715.10 +- 0.61% ( -17.77%)
Hmean    16    4725.03 +- 0.03% ( -23.04%)    4875.65 +- 0.02% ( -20.59%)    4953.05 +- 0.28% ( -19.33%)
Hmean    32    9231.65 +- 0.10% ( -17.36%)    8704.89 +- 0.27% ( -22.07%)   10562.02 +- 0.36% (  -5.45%)
Hmean    64   15364.27 +- 0.19% ( -20.49%)   17786.64 +- 0.15% (  -7.95%)   19665.40 +- 0.22% (   1.77%)
Hmean   128   42100.58 +- 0.13% (  38.56%)   34946.28 +- 0.13% (  15.02%)   38635.79 +- 0.06% (  27.16%)
Hmean   256   30660.23 +- 1.08% (  -1.55%)   32307.67 +- 0.54% (   3.74%)   31153.27 +- 0.12% (   0.03%)
Hmean   512   24604.32 +- 0.14% ( -20.27%)   40408.50 +- 1.10% (  30.95%)   38800.29 +- 1.23% (  25.74%)
Hmean  1024   35535.47 +- 0.28% (  -9.32%)   41070.38 +- 2.56% (   4.81%)   31308.29 +- 2.52% ( -20.11%)

Benchmark          : dbench (filesystem stressor)
Varying parameter  : number of clients
Unit               : seconds (lower is better)

NOTE-1: This dbench version measures the average latency of a set of filesystem
        operations, as we found the traditional dbench metric (throughput) to be
        misleading.
NOTE-2: Due to high variability, we partition the original dataset and apply
        statistical bootstrapping (a resampling method). Accuracy is reported in the
        form of 95% confidence intervals.

                     5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean     1      98.79 +-   0.92 (        )      83.36 +-   0.82 (  15.62%)      84.82 +-   0.92 (  14.14%)
SubAmean     2     116.00 +-   0.89 (        )     102.12 +-   0.77 (  11.96%)     109.63 +-   0.89 (   5.49%)
SubAmean     4     149.90 +-   1.03 (        )     132.12 +-   0.91 (  11.86%)     143.90 +-   1.15 (   4.00%)
SubAmean     8     182.41 +-   1.13 (        )     159.86 +-   0.93 (  12.36%)     165.82 +-   1.03 (   9.10%)
SubAmean    16     237.83 +-   1.23 (        )     219.46 +-   1.14 (   7.72%)     229.28 +-   1.19 (   3.59%)
SubAmean    32     334.34 +-   1.49 (        )     309.94 +-   1.42 (   7.30%)     321.19 +-   1.36 (   3.93%)
SubAmean    64     576.61 +-   2.16 (        )     540.75 +-   2.00 (   6.22%)     551.27 +-   1.99 (   4.39%)
SubAmean   128    1350.07 +-   4.14 (        )    1205.47 +-   3.20 (  10.71%)    1280.26 +-   3.75 (   5.17%)
SubAmean   256    3444.42 +-   7.97 (        )    3698.00 +-  27.43 (  -7.36%)    3494.14 +-   7.81 (  -1.44%)
SubAmean  2048   39457.89 +-  29.01 (        )   34105.33 +-  41.85 (  13.57%)   39688.52 +-  36.26 (  -0.58%)

                               5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean     1      85.68 +-   1.04 (  13.27%)      84.16 +-   0.84 (  14.81%)      83.99 +-   0.90 (  14.99%)
SubAmean     2     108.42 +-   0.95 (   6.54%)     109.91 +-   1.39 (   5.24%)     112.06 +-   0.91 (   3.39%)
SubAmean     4     136.90 +-   1.04 (   8.67%)     137.59 +-   0.93 (   8.21%)     136.55 +-   0.95 (   8.91%)
SubAmean     8     163.15 +-   0.96 (  10.56%)     166.07 +-   1.02 (   8.96%)     165.81 +-   0.99 (   9.10%)
SubAmean    16     224.86 +-   1.12 (   5.45%)     223.83 +-   1.06 (   5.89%)     230.66 +-   1.19 (   3.01%)
SubAmean    32     320.51 +-   1.38 (   4.13%)     322.85 +-   1.49 (   3.44%)     321.96 +-   1.46 (   3.70%)
SubAmean    64     553.25 +-   1.93 (   4.05%)     554.19 +-   2.08 (   3.89%)     562.26 +-   2.22 (   2.49%)
SubAmean   128    1264.35 +-   3.72 (   6.35%)    1256.99 +-   3.46 (   6.89%)    2018.97 +-  18.79 ( -49.55%)
SubAmean   256    3466.25 +-   8.25 (  -0.63%)    3450.58 +-   8.44 (  -0.18%)    5032.12 +-  38.74 ( -46.09%)
SubAmean  2048   39133.10 +-  45.71 (   0.82%)   39905.95 +-  34.33 (  -1.14%)   53811.86 +- 193.04 ( -36.38%)

Benchmark          : kernbench (kernel compilation)
Varying parameter  : number of jobs
Unit               : seconds (lower is better)

                  5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean     2     471.71 +- 26.61% (        )     409.88 +- 16.99% (  13.11%)     430.63 +-  0.18% (   8.71%)
Amean     4     211.87 +-  0.58% (        )     194.03 +-  0.74% (   8.42%)     215.33 +-  0.64% (  -1.63%)
Amean     8     109.79 +-  1.27% (        )     101.43 +-  1.53% (   7.61%)     111.05 +-  1.95% (  -1.15%)
Amean    16      59.50 +-  1.28% (        )      55.61 +-  1.35% (   6.55%)      59.65 +-  1.78% (  -0.24%)
Amean    32      34.94 +-  1.22% (        )      32.36 +-  1.95% (   7.41%)      35.44 +-  0.63% (  -1.43%)
Amean    64      22.58 +-  0.38% (        )      20.97 +-  1.28% (   7.11%)      22.41 +-  1.73% (   0.74%)
Amean   128      17.72 +-  0.44% (        )      16.68 +-  0.32% (   5.88%)      17.65 +-  0.96% (   0.37%)
Amean   256      16.44 +-  0.53% (        )      15.76 +-  0.32% (   4.18%)      16.76 +-  0.60% (  -1.93%)
Amean   512      16.54 +-  0.21% (        )      15.62 +-  0.41% (   5.53%)      16.84 +-  0.85% (  -1.83%)

                            5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean     2     421.30 +-  0.24% (  10.69%)     419.26 +-  0.15% (  11.12%)     414.38 +-  0.33% (  12.15%)
Amean     4     217.81 +-  5.53% (  -2.80%)     211.63 +-  0.99% (   0.12%)     208.43 +-  0.47% (   1.63%)
Amean     8     108.80 +-  0.43% (   0.90%)     108.48 +-  1.44% (   1.19%)     108.59 +-  3.08% (   1.09%)
Amean    16      58.84 +-  0.74% (   1.12%)      58.37 +-  0.94% (   1.91%)      57.78 +-  0.78% (   2.90%)
Amean    32      34.04 +-  2.00% (   2.59%)      34.28 +-  1.18% (   1.91%)      33.98 +-  2.21% (   2.75%)
Amean    64      22.22 +-  1.69% (   1.60%)      22.27 +-  1.60% (   1.38%)      22.25 +-  1.41% (   1.47%)
Amean   128      17.55 +-  0.24% (   0.97%)      17.53 +-  0.94% (   1.04%)      17.49 +-  0.43% (   1.30%)
Amean   256      16.51 +-  0.46% (  -0.40%)      16.48 +-  0.48% (  -0.19%)      16.44 +-  1.21% (   0.00%)
Amean   512      16.50 +-  0.35% (   0.19%)      16.35 +-  0.42% (   1.14%)      16.37 +-  0.33% (   0.99%)

Benchmark          : gitsource (time to run the git unit test suite)
Varying parameter  : none
Unit               : seconds (lower is better)

                  5.9.0-ondemand (BASELINE)                   5.9.0-perfgov               5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean          1035.76 +-  0.30% (        )     688.21 +-  0.04% (  33.56%)    1003.85 +-  0.14% (   3.08%)

                            5.9.0-sugov-max                 5.9.0-sugov-mid                  5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean           995.82 +-  0.08% (   3.86%)    1011.98 +-  0.03% (   2.30%)     986.87 +-  0.19% (   4.72%)

3. POWER CONSUMPTION TABLE
==========================

Average power consumption (watts).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
tbench4     227.25    281.83   244.17       236.76     241.50     247.99
dbench4     151.97    161.87   157.08       158.10     158.06     153.73
kernbench   162.78    167.22   162.90       164.19     164.65     164.72
gitsource   133.65    139.00   133.04       134.43     134.18     134.32

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201112182614.10700-3-ggherdovich@suse.cz
---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a4ab5cf6aeab..c5dd5f6199d9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2054,6 +2054,8 @@ static bool amd_set_max_freq_ratio(void)
 	}

 	perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf);
+	/* midpoint between max_boost and max_P */
+	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
 	if (!perf_ratio) {
 		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
 		return false;
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread
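The message above does not spell out how the performance-per-watt ratios were obtained. One reading that appears consistent with the published numbers is: take the performance ratio (inverted for "lower is better" benchmarks) and divide it by the power ratio against the ondemand baseline. The sketch below encodes that reading as an assumption; perf_per_watt_ratio() is a hypothetical helper, not the author's tooling.

```c
/*
 * Hypothetical helper, assuming perf-per-watt is computed as
 * (relative performance) / (relative power), both against baseline.
 *
 * perf_ratio:       value from the PERFORMANCE RATIOS table
 * higher_is_better: nonzero for throughput, zero for time or latency
 * watts:            average power of the configuration under test
 * baseline_watts:   average power of the baseline (ondemand)
 */
static double perf_per_watt_ratio(double perf_ratio, int higher_is_better,
				  double watts, double baseline_watts)
{
	/* For time-like metrics a smaller ratio means better performance. */
	double perf = higher_is_better ? perf_ratio : 1.0 / perf_ratio;

	return perf / (watts / baseline_watts);
}
```

Fed with the perfgov figures for tbench (1.44, 281.83 W against 227.25 W), this lands close to the 1.16 reported in the overview; the dbench and gitsource perfgov rows come out near 1.03 and 1.46 in the same way.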
* Re: [PATCH v4 RESEND] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  2020-12-04 17:03 ` [PATCH v4 RESEND] " Giovanni Gherdovich
@ 2020-12-04 17:10 ` Giovanni Gherdovich
  0 siblings, 0 replies; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-12-04 17:10 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Len Brown, Rafael J . Wysocki
  Cc: Nathan Fontenot, x86, linux-pm, linux-kernel, linux-acpi

On Fri, 2020-12-04 at 18:03 +0100, Giovanni Gherdovich wrote:
> Frequency invariant accounting calculations need the ratio
> freq_curr/freq_max, but freq_max is unknown as it depends on dynamic power
> allocation between cores: AMD EPYC CPUs implement "Core Performance Boost".
> Three candidates are considered to estimate this value:
> [...]
>
> Benchmarks are described in the next section.
> Tilde (~) means the value is the same as baseline.
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>             ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0   better if
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> [...]

Hello,

this patch is currently merged in the tip tree, branch sched/core as commit
46609527577d1def0af29ca5b56cffeeea771ada.

Unfortunately in the original commit message I used "----------" for making
table headers, and git dropped all the commit message after that sign,
i.e. the benchmark results and my signed-off-by.

In this "resend" I replaced the offending sign and the new commit message
should make it intact to the destination tree.


Thanks,
Giovanni

^ permalink raw reply [flat|nested] 15+ messages in thread
* [tip: sched/core] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC
  2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
  2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
  2020-12-04 17:03 ` [PATCH v4 RESEND] " Giovanni Gherdovich
@ 2020-12-11  9:34 ` tip-bot2 for Giovanni Gherdovich
  2 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Giovanni Gherdovich @ 2020-12-11 9:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Giovanni Gherdovich, Peter Zijlstra (Intel), Ingo Molnar, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     976df7e5730e3ec8a7e192c09c10ce6e8db07e65
Gitweb:        https://git.kernel.org/tip/976df7e5730e3ec8a7e192c09c10ce6e8db07e65
Author:        Giovanni Gherdovich <ggherdovich@suse.cz>
AuthorDate:    Thu, 12 Nov 2020 19:26:13 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 11 Dec 2020 10:29:55 +01:00

x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC

Frequency invariant accounting calculations need the ratio
freq_curr/freq_max, but freq_max is unknown as it depends on dynamic power
allocation between cores: AMD EPYC CPUs implement "Core Performance Boost".
Three candidates are considered to estimate this value:

  - maximum non-boost frequency
  - maximum boost frequency
  - the mid point between the above two

Experimental data on an AMD EPYC Zen2 machine slightly favors the third
option, which is applied with this patch.

The analysis uses the ondemand cpufreq governor as baseline, and compares
it with schedutil in a number of configurations. Using the freq_max value
described above offers a moderate advantage in performance and efficiency:

sugov-max (freq_max=max_boost) performs the worst on tbench: less
throughput and reduced efficiency than the other invariant-schedutil
options (see "Data Overview" below). Consider that tbench is generally a
problematic case as no schedutil version currently is better than ondemand.

sugov-P0 (freq_max=max_P) is the worst on dbench, while the other sugov's
can surpass ondemand with less filesystem latency and slightly increased
efficiency.

1. DATA OVERVIEW
2. DETAILED PERFORMANCE TABLES
3. POWER CONSUMPTION TABLE

1. DATA OVERVIEW
================

sugov-noinv : non-invariant schedutil governor
sugov-max   : invariant schedutil, freq_max=max_boost
sugov-mid   : invariant schedutil, freq_max=midpoint
sugov-P0    : invariant schedutil, freq_max=max_P
perfgov     : performance governor

driver      : acpi_cpufreq
machine     : AMD EPYC 7742 (Zen2, aka "Rome"), dual socket,
              128 cores / 256 threads, SATA SSD storage, 250G of memory,
              XFS filesystem

Benchmarks are described in the next section.
Tilde (~) means the value is the same as baseline.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0   better if
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                              PERFORMANCE RATIOS
tbench          1.00     1.44         0.90       0.87       0.93      0.93   higher
dbench          1.00     0.91         0.95       0.94       0.94      1.06   lower
kernbench       1.00     0.93            ~          ~          ~      0.97   lower
gitsource       1.00     0.66         0.97       0.96          ~      0.95   lower
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                         PERFORMANCE-PER-WATT RATIOS
tbench          1.00     1.16         0.84       0.84       0.88      0.85   higher
dbench          1.00     1.03         1.02       1.02       1.02      0.93   higher
kernbench       1.00     1.05            ~          ~          ~         ~   higher
gitsource       1.00     1.46         1.04       1.04          ~      1.05   higher

2. DETAILED PERFORMANCE TABLES
==============================

Benchmark         : tbench4 (i.e. dbench4 over the network, actually loopback)
Varying parameter : number of clients
Unit              : MB/sec (higher is better)

               5.9.0-ondemand (BASELINE)      5.9.0-perfgov                  5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean 1          427.19 +-   0.16% (       )    778.35 +-   0.10% ( 82.20%)    346.92 +-   0.14% (-18.79%)
Hmean 2          853.82 +-   0.09% (       )   1536.23 +-   0.03% ( 79.93%)    694.36 +-   0.05% (-18.68%)
Hmean 4         1657.54 +-   0.12% (       )   2938.18 +-   0.12% ( 77.26%)   1362.81 +-   0.11% (-17.78%)
Hmean 8         3301.87 +-   0.06% (       )   5679.10 +-   0.04% ( 72.00%)   2693.35 +-   0.04% (-18.43%)
Hmean 16        6139.65 +-   0.05% (       )   9498.81 +-   0.04% ( 54.71%)   4889.97 +-   0.17% (-20.35%)
Hmean 32       11170.28 +-   0.09% (       )  17393.25 +-   0.08% ( 55.71%)   9104.55 +-   0.09% (-18.49%)
Hmean 64       19322.97 +-   0.17% (       )  31573.91 +-   0.08% ( 63.40%)  18552.52 +-   0.40% ( -3.99%)
Hmean 128      30383.71 +-   0.11% (       )  37416.91 +-   0.15% ( 23.15%)  25938.70 +-   0.41% (-14.63%)
Hmean 256      31143.96 +-   0.41% (       )  30908.76 +-   0.88% ( -0.76%)  29754.32 +-   0.24% ( -4.46%)
Hmean 512      30858.49 +-   0.26% (       )  38524.60 +-   1.19% ( 24.84%)  42080.39 +-   0.56% ( 36.37%)
Hmean 1024     39187.37 +-   0.19% (       )  36213.86 +-   0.26% ( -7.59%)  39555.98 +-   0.12% (  0.94%)

               5.9.0-sugov-max                5.9.0-sugov-mid                5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hmean 1          352.59 +-   1.03% (-17.46%)    352.08 +-   0.75% (-17.58%)    352.31 +-   1.48% (-17.53%)
Hmean 2          697.32 +-   0.08% (-18.33%)    700.16 +-   0.20% (-18.00%)    696.79 +-   0.06% (-18.39%)
Hmean 4         1369.88 +-   0.04% (-17.35%)   1369.72 +-   0.07% (-17.36%)   1365.91 +-   0.05% (-17.59%)
Hmean 8         2696.79 +-   0.04% (-18.33%)   2711.06 +-   0.04% (-17.89%)   2715.10 +-   0.61% (-17.77%)
Hmean 16        4725.03 +-   0.03% (-23.04%)   4875.65 +-   0.02% (-20.59%)   4953.05 +-   0.28% (-19.33%)
Hmean 32        9231.65 +-   0.10% (-17.36%)   8704.89 +-   0.27% (-22.07%)  10562.02 +-   0.36% ( -5.45%)
Hmean 64       15364.27 +-   0.19% (-20.49%)  17786.64 +-   0.15% ( -7.95%)  19665.40 +-   0.22% (  1.77%)
Hmean 128      42100.58 +-   0.13% ( 38.56%)  34946.28 +-   0.13% ( 15.02%)  38635.79 +-   0.06% ( 27.16%)
Hmean 256      30660.23 +-   1.08% ( -1.55%)  32307.67 +-   0.54% (  3.74%)  31153.27 +-   0.12% (  0.03%)
Hmean 512      24604.32 +-   0.14% (-20.27%)  40408.50 +-   1.10% ( 30.95%)  38800.29 +-   1.23% ( 25.74%)
Hmean 1024     35535.47 +-   0.28% ( -9.32%)  41070.38 +-   2.56% (  4.81%)  31308.29 +-   2.52% (-20.11%)

Benchmark         : dbench (filesystem stressor)
Varying parameter : number of clients
Unit              : seconds (lower is better)

NOTE-1: This dbench version measures the average latency of a set of
        filesystem operations, as we found the traditional dbench metric
        (throughput) to be misleading.
NOTE-2: Due to high variability, we partition the original dataset and
        apply statistical bootstrapping (a resampling method). Accuracy is
        reported in the form of 95% confidence intervals.

               5.9.0-ondemand (BASELINE)      5.9.0-perfgov                  5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean 1        98.79 +-    0.92 (       )     83.36 +-    0.82 ( 15.62%)     84.82 +-    0.92 ( 14.14%)
SubAmean 2       116.00 +-    0.89 (       )    102.12 +-    0.77 ( 11.96%)    109.63 +-    0.89 (  5.49%)
SubAmean 4       149.90 +-    1.03 (       )    132.12 +-    0.91 ( 11.86%)    143.90 +-    1.15 (  4.00%)
SubAmean 8       182.41 +-    1.13 (       )    159.86 +-    0.93 ( 12.36%)    165.82 +-    1.03 (  9.10%)
SubAmean 16      237.83 +-    1.23 (       )    219.46 +-    1.14 (  7.72%)    229.28 +-    1.19 (  3.59%)
SubAmean 32      334.34 +-    1.49 (       )    309.94 +-    1.42 (  7.30%)    321.19 +-    1.36 (  3.93%)
SubAmean 64      576.61 +-    2.16 (       )    540.75 +-    2.00 (  6.22%)    551.27 +-    1.99 (  4.39%)
SubAmean 128    1350.07 +-    4.14 (       )   1205.47 +-    3.20 ( 10.71%)   1280.26 +-    3.75 (  5.17%)
SubAmean 256    3444.42 +-    7.97 (       )   3698.00 +-   27.43 ( -7.36%)   3494.14 +-    7.81 ( -1.44%)
SubAmean 2048  39457.89 +-   29.01 (       )  34105.33 +-   41.85 ( 13.57%)  39688.52 +-   36.26 ( -0.58%)

               5.9.0-sugov-max                5.9.0-sugov-mid                5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
SubAmean 1        85.68 +-    1.04 ( 13.27%)     84.16 +-    0.84 ( 14.81%)     83.99 +-    0.90 ( 14.99%)
SubAmean 2       108.42 +-    0.95 (  6.54%)    109.91 +-    1.39 (  5.24%)    112.06 +-    0.91 (  3.39%)
SubAmean 4       136.90 +-    1.04 (  8.67%)    137.59 +-    0.93 (  8.21%)    136.55 +-    0.95 (  8.91%)
SubAmean 8       163.15 +-    0.96 ( 10.56%)    166.07 +-    1.02 (  8.96%)    165.81 +-    0.99 (  9.10%)
SubAmean 16      224.86 +-    1.12 (  5.45%)    223.83 +-    1.06 (  5.89%)    230.66 +-    1.19 (  3.01%)
SubAmean 32      320.51 +-    1.38 (  4.13%)    322.85 +-    1.49 (  3.44%)    321.96 +-    1.46 (  3.70%)
SubAmean 64      553.25 +-    1.93 (  4.05%)    554.19 +-    2.08 (  3.89%)    562.26 +-    2.22 (  2.49%)
SubAmean 128    1264.35 +-    3.72 (  6.35%)   1256.99 +-    3.46 (  6.89%)   2018.97 +-   18.79 (-49.55%)
SubAmean 256    3466.25 +-    8.25 ( -0.63%)   3450.58 +-    8.44 ( -0.18%)   5032.12 +-   38.74 (-46.09%)
SubAmean 2048  39133.10 +-   45.71 (  0.82%)  39905.95 +-   34.33 ( -1.14%)  53811.86 +-  193.04 (-36.38%)

Benchmark         : kernbench (kernel compilation)
Varying parameter : number of jobs
Unit              : seconds (lower is better)

               5.9.0-ondemand (BASELINE)      5.9.0-perfgov                  5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean 2          471.71 +-  26.61% (       )    409.88 +-  16.99% ( 13.11%)    430.63 +-   0.18% (  8.71%)
Amean 4          211.87 +-   0.58% (       )    194.03 +-   0.74% (  8.42%)    215.33 +-   0.64% ( -1.63%)
Amean 8          109.79 +-   1.27% (       )    101.43 +-   1.53% (  7.61%)    111.05 +-   1.95% ( -1.15%)
Amean 16          59.50 +-   1.28% (       )     55.61 +-   1.35% (  6.55%)     59.65 +-   1.78% ( -0.24%)
Amean 32          34.94 +-   1.22% (       )     32.36 +-   1.95% (  7.41%)     35.44 +-   0.63% ( -1.43%)
Amean 64          22.58 +-   0.38% (       )     20.97 +-   1.28% (  7.11%)     22.41 +-   1.73% (  0.74%)
Amean 128         17.72 +-   0.44% (       )     16.68 +-   0.32% (  5.88%)     17.65 +-   0.96% (  0.37%)
Amean 256         16.44 +-   0.53% (       )     15.76 +-   0.32% (  4.18%)     16.76 +-   0.60% ( -1.93%)
Amean 512         16.54 +-   0.21% (       )     15.62 +-   0.41% (  5.53%)     16.84 +-   0.85% ( -1.83%)

               5.9.0-sugov-max                5.9.0-sugov-mid                5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean 2          421.30 +-   0.24% ( 10.69%)    419.26 +-   0.15% ( 11.12%)    414.38 +-   0.33% ( 12.15%)
Amean 4          217.81 +-   5.53% ( -2.80%)    211.63 +-   0.99% (  0.12%)    208.43 +-   0.47% (  1.63%)
Amean 8          108.80 +-   0.43% (  0.90%)    108.48 +-   1.44% (  1.19%)    108.59 +-   3.08% (  1.09%)
Amean 16          58.84 +-   0.74% (  1.12%)     58.37 +-   0.94% (  1.91%)     57.78 +-   0.78% (  2.90%)
Amean 32          34.04 +-   2.00% (  2.59%)     34.28 +-   1.18% (  1.91%)     33.98 +-   2.21% (  2.75%)
Amean 64          22.22 +-   1.69% (  1.60%)     22.27 +-   1.60% (  1.38%)     22.25 +-   1.41% (  1.47%)
Amean 128         17.55 +-   0.24% (  0.97%)     17.53 +-   0.94% (  1.04%)     17.49 +-   0.43% (  1.30%)
Amean 256         16.51 +-   0.46% ( -0.40%)     16.48 +-   0.48% ( -0.19%)     16.44 +-   1.21% (  0.00%)
Amean 512         16.50 +-   0.35% (  0.19%)     16.35 +-   0.42% (  1.14%)     16.37 +-   0.33% (  0.99%)

Benchmark         : gitsource (time to run the git unit test suite)
Varying parameter : none
Unit              : seconds (lower is better)

               5.9.0-ondemand (BASELINE)      5.9.0-perfgov                  5.9.0-sugov-noinv
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean            1035.76 +-   0.30% (       )    688.21 +-   0.04% ( 33.56%)   1003.85 +-   0.14% (  3.08%)

               5.9.0-sugov-max                5.9.0-sugov-mid                5.9.0-sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Amean             995.82 +-   0.08% (  3.86%)   1011.98 +-   0.03% (  2.30%)    986.87 +-   0.19% (  4.72%)

3. POWER CONSUMPTION TABLE
==========================

Average power consumption (watts).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
            ondemand  perfgov  sugov-noinv  sugov-max  sugov-mid  sugov-P0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
tbench4       227.25   281.83       244.17     236.76     241.50    247.99
dbench4       151.97   161.87       157.08     158.10     158.06    153.73
kernbench     162.78   167.22       162.90     164.19     164.65    164.72
gitsource     133.65   139.00       133.04     134.43     134.18    134.32

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201112182614.10700-3-ggherdovich@suse.cz
---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a4ab5cf..c5dd5f6 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2054,6 +2054,8 @@ static bool amd_set_max_freq_ratio(void)
 	}
 
 	perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf);
+	/* midpoint between max_boost and max_P */
+	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
 	if (!perf_ratio) {
 		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
 		return false;

^ permalink raw reply related [flat|nested] 15+ messages in thread
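The commit message above does not spell out how the PERFORMANCE-PER-WATT RATIOS are obtained. A plausible reading, offered here as an assumption rather than the author's stated method, is that the performance gain relative to ondemand is divided by the power ratio relative to ondemand, with "lower is better" metrics inverted first. The sketch below uses a hypothetical helper name, `perf_per_watt_ratio()`, and reproduces the published figures for the entries tried.

```c
/*
 * Hypothetical helper: derive a performance-per-watt ratio from a row of
 * the PERFORMANCE RATIOS table and a row of the POWER CONSUMPTION TABLE.
 * Assumption (not stated in the commit message): "lower is better"
 * metrics are inverted so that a larger number always means a gain.
 */
static double perf_per_watt_ratio(double perf_ratio, int higher_is_better,
				  double watts, double baseline_watts)
{
	/* performance gain relative to the ondemand baseline */
	double gain = higher_is_better ? perf_ratio : 1.0 / perf_ratio;

	/* power drawn relative to the ondemand baseline */
	double power = watts / baseline_watts;

	return gain / power;
}
```

For tbench under perfgov, 1.44 / (281.83 / 227.25) is about 1.16, and for gitsource under perfgov, (1 / 0.66) / (139.00 / 133.65) is about 1.46, both matching the table. The inputs are already rounded to two decimals, so other entries may differ from the published values in the last digit.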
* [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations
  2020-11-12 18:26 [PATCH v4 0/3] Add support for frequency invariance to AMD EPYC Zen2 Giovanni Gherdovich
  2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
  2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
@ 2020-11-12 18:26 ` Giovanni Gherdovich
  2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
  2020-12-11  9:34 ` tip-bot2 for Giovanni Gherdovich
  2 siblings, 2 replies; 15+ messages in thread
From: Giovanni Gherdovich @ 2020-11-12 18:26 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Len Brown, Rafael J . Wysocki
  Cc: Jon Grimm, Nathan Fontenot, Yazen Ghannam, Thomas Lendacky,
	Mel Gorman, Pu Wen, Viresh Kumar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Doug Smythies, x86, linux-pm, linux-kernel,
	linux-acpi, Giovanni Gherdovich

The value freq_max/freq_base is a fundamental component of frequency
invariance calculations. It may come from a variety of sources such as MSRs
or ACPI data; tracking it down when troubleshooting a system could be
non-trivial. It is worth saving it in the kernel logs.

    # dmesg | grep 'Estimated ratio of average max'
    [   14.024036] smpboot: Estimated ratio of average max frequency by base frequency (times 1024): 1289

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
---
 arch/x86/kernel/smpboot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5dd5f6199d9..3577bb756d64 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2110,6 +2110,7 @@ static void init_freq_invariance(bool secondary, bool cppc_ready)
 	if (ret) {
 		init_counter_refs();
 		static_branch_enable(&arch_scale_freq_key);
+		pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
 	} else {
 		pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
 	}
-- 
2.26.2

^ permalink raw reply related [flat|nested] 15+ messages in thread
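The logged number is a fixed-point value in units of 1024, so it takes a little arithmetic to read. The sketch below shows one way to decode it; both helper names are hypothetical, and the second function assumes the midpoint patch (2/3 of this series) is applied, which holds for this series but not for every kernel that prints the message.

```c
#include <stdint.h>

/* Fixed-point 1.0, matching the kernel's SCHED_CAPACITY_SCALE (1 << 10). */
#define SCHED_CAPACITY_SCALE 1024ULL

/* Convert the logged value (ratio times 1024) into a plain ratio. */
static double logged_ratio_to_double(uint64_t logged)
{
	return (double)logged / (double)SCHED_CAPACITY_SCALE;
}

/*
 * Assuming the logged value is the midpoint between the boost ratio and
 * 1024, recover the approximate boost ratio (times 1024). The ">> 1" in
 * the kernel drops one bit, so the true value may be larger by one.
 * Returns 0 for inputs below 512, which such a midpoint cannot produce.
 */
static uint64_t boost_ratio_from_midpoint(uint64_t logged)
{
	if (2 * logged < SCHED_CAPACITY_SCALE)
		return 0;
	return 2 * logged - SCHED_CAPACITY_SCALE;
}
```

Applied to the value in the log line above, 1289 / 1024 is roughly 1.26, and undoing the midpoint gives 2 * 1289 - 1024 = 1554, i.e. a max_boost/max_P ratio of roughly 1.52 on that machine.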
* [tip: sched/core] x86: Print ratio freq_max/freq_base used in frequency invariance calculations
  2020-11-12 18:26 ` [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations Giovanni Gherdovich
@ 2020-12-03  9:13 ` tip-bot2 for Giovanni Gherdovich
  2020-12-11  9:34 ` tip-bot2 for Giovanni Gherdovich
  1 sibling, 0 replies; 15+ messages in thread
From: tip-bot2 for Giovanni Gherdovich @ 2020-12-03 9:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Giovanni Gherdovich, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     24f326686c925a848d603a2c22f1f6ed1b7786e2
Gitweb:        https://git.kernel.org/tip/24f326686c925a848d603a2c22f1f6ed1b7786e2
Author:        Giovanni Gherdovich <ggherdovich@suse.cz>
AuthorDate:    Thu, 12 Nov 2020 19:26:14 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 03 Dec 2020 10:00:35 +01:00

x86: Print ratio freq_max/freq_base used in frequency invariance calculations

The value freq_max/freq_base is a fundamental component of frequency
invariance calculations. It may come from a variety of sources such as MSRs
or ACPI data; tracking it down when troubleshooting a system could be
non-trivial. It is worth saving it in the kernel logs.

    # dmesg | grep 'Estimated ratio of average max'
    [   14.024036] smpboot: Estimated ratio of average max frequency by base frequency (times 1024): 1289

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201112182614.10700-4-ggherdovich@suse.cz
---
 arch/x86/kernel/smpboot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5dd5f6..3577bb7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2110,6 +2110,7 @@ static void init_freq_invariance(bool secondary, bool cppc_ready)
 	if (ret) {
 		init_counter_refs();
 		static_branch_enable(&arch_scale_freq_key);
+		pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
 	} else {
 		pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
 	}

^ permalink raw reply related [flat|nested] 15+ messages in thread
* [tip: sched/core] x86: Print ratio freq_max/freq_base used in frequency invariance calculations
  2020-11-12 18:26 ` [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations Giovanni Gherdovich
  2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
@ 2020-12-11  9:34 ` tip-bot2 for Giovanni Gherdovich
  1 sibling, 0 replies; 15+ messages in thread
From: tip-bot2 for Giovanni Gherdovich @ 2020-12-11 9:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Giovanni Gherdovich, Peter Zijlstra (Intel), Ingo Molnar, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     3149cd55302748df771dc1c8c10f34b1cbce88ed
Gitweb:        https://git.kernel.org/tip/3149cd55302748df771dc1c8c10f34b1cbce88ed
Author:        Giovanni Gherdovich <ggherdovich@suse.cz>
AuthorDate:    Thu, 12 Nov 2020 19:26:14 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 11 Dec 2020 10:30:23 +01:00

x86: Print ratio freq_max/freq_base used in frequency invariance calculations

The value freq_max/freq_base is a fundamental component of frequency
invariance calculations. It may come from a variety of sources such as MSRs
or ACPI data; tracking it down when troubleshooting a system could be
non-trivial. It is worth saving it in the kernel logs.

    # dmesg | grep 'Estimated ratio of average max'
    [   14.024036] smpboot: Estimated ratio of average max frequency by base frequency (times 1024): 1289

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201112182614.10700-4-ggherdovich@suse.cz
---
 arch/x86/kernel/smpboot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c5dd5f6..3577bb7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2110,6 +2110,7 @@ static void init_freq_invariance(bool secondary, bool cppc_ready)
 	if (ret) {
 		init_counter_refs();
 		static_branch_enable(&arch_scale_freq_key);
+		pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
 	} else {
 		pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
 	}

^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-12-11 9:36 UTC | newest]

Thread overview: 15+ messages
2020-11-12 18:26 [PATCH v4 0/3] Add support for frequency invariance to AMD EPYC Zen2 Giovanni Gherdovich
2020-11-12 18:26 ` [PATCH v4 1/3] x86, sched: Calculate frequency invariance for AMD systems Giovanni Gherdovich
2020-11-14 18:23 ` kernel test robot
2020-11-26  9:58 ` Peter Zijlstra
2020-11-26 11:55 ` Giovanni Gherdovich
2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Nathan Fontenot
2020-12-11  9:34 ` tip-bot2 for Nathan Fontenot
2020-11-12 18:26 ` [PATCH v4 2/3] x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC Giovanni Gherdovich
2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
2020-12-04 17:03 ` [PATCH v4 RESEND] " Giovanni Gherdovich
2020-12-04 17:10 ` Giovanni Gherdovich
2020-12-11  9:34 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
2020-11-12 18:26 ` [PATCH v4 3/3] x86: Print ratio freq_max/freq_base used in frequency invariance calculations Giovanni Gherdovich
2020-12-03  9:13 ` [tip: sched/core] " tip-bot2 for Giovanni Gherdovich
2020-12-11  9:34 ` tip-bot2 for Giovanni Gherdovich