Greetings,

FYI, we noticed a +85.7% improvement of hackbench.throughput due to commit:

commit: 9824134a551e01246f3cb34287609db5d0d4f514 ("sched: limit cpu search and rotate search window for scalability")
url: https://github.com/0day-ci/linux/commits/subhra-mazumdar/Improve-scheduler-scalability-for-fast-path/20180424-143304

in testcase: hackbench
on test machine: 192 threads Intel(R) Xeon(R) CPU E7-8890 v4 @ 2.20GHz with 512G memory
with the following parameters:

	nr_threads: 50%
	mode: process
	ipc: socket
	cpufreq_governor: performance

test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/25%/debian-x86_64-2016-08-31.cgz/300s/lkp-bdw-de1/TCP_RR/netperf

commit:
  c3e41edefc ("sched: introduce per-cpu var next_cpu to track search limit")
  9824134a55 ("sched: limit cpu search and rotate search window for scalability")

c3e41edefc91d2e6 9824134a551e01246f3cb34287
---------------- --------------------------
       %stddev     %change         %stddev
           \          |                \
    168534 ± 5%      +85.7%      313043 ± 3%    hackbench.throughput
 2.656e+09 ± 5%      -11.1%   2.362e+09 ± 4%    hackbench.time.involuntary_context_switches
    896204           +75.4%     1571534        hackbench.time.minor_page_faults
      8353 ± 3%      +14.2%        9540 ± 2%    hackbench.time.user_time
 7.767e+09 ± 7%      -21.0%   6.139e+09 ± 3%    hackbench.time.voluntary_context_switches
 1.152e+09           +80.0%   2.074e+09        hackbench.workload
    446617           +13.2%      505758        meminfo.SUnreclaim
    556029           +12.2%      623806        meminfo.Slab
      1.00 ± 11%       +0.9        1.91 ± 7%    mpstat.cpu.idle%
      6.20 ± 2%        +1.2        7.43        mpstat.cpu.usr%
    936095 ± 24%     +71.5%     1605186 ± 16%  numa-numastat.node0.local_node
    950015 ± 23%     +70.9%     1623757 ± 16%  numa-numastat.node0.numa_hit
  19907406 ± 8%      -15.3%    16869937 ± 3%    softirqs.RCU
   1455625 ± 8%     +182.7%     4115065        softirqs.SCHED
      2005 ± 3%      -11.9%        1766        vmstat.procs.r
  15505501 ± 2%      -15.2%    13152358        vmstat.system.cs
   3331308 ± 6%      -28.6%     2379228        vmstat.system.in
 2.656e+09 ± 5%      -11.1%   2.362e+09 ± 4%    time.involuntary_context_switches
    896204           +75.4%     1571534        time.minor_page_faults
      8353 ± 3%      +14.2%        9540 ± 2%    time.user_time
 7.767e+09 ± 7%      -21.0%   6.139e+09 ± 3%    time.voluntary_context_switches
      1858 ± 16%     -31.0%        1282 ± 11%  numa-meminfo.node0.PageTables
    315.00 ± 64%    +409.8%        1606 ± 71%  numa-meminfo.node1.Inactive(anon)
      1403 ± 37%   +6457.3%       91998 ± 98%  numa-meminfo.node1.Shmem
    104041 ± 89%     -88.6%       11896 ± 27%  numa-meminfo.node3.Active
    104040 ± 89%     -88.6%       11896 ± 27%  numa-meminfo.node3.Active(anon)
     54704 ±128%     -79.1%       11459 ± 26%  numa-meminfo.node3.AnonPages
      5717 ± 8%      -10.3%        5127        numa-meminfo.node3.Mapped
 1.655e+08 ± 11%    +226.5%   5.404e+08 ± 3%    cpuidle.C1.time
  15928871 ± 18%    +375.2%    75693280 ± 2%    cpuidle.C1.usage
  56272051 ± 12%    +209.9%   1.744e+08 ± 8%    cpuidle.C1E.time
   1144146 ± 16%    +239.9%     3888817 ± 10%  cpuidle.C1E.usage

[ASCII plot: hackbench.throughput. Y-axis 0 to 350000. Samples marked 'O' fall between about 250000 and 325000; samples marked '+' stay between about 150000 and 175000.]

[ASCII plot: hackbench.workload. Y-axis 0 to 2.5e+09. Samples marked 'O' fall between about 1.5e+09 and 2.5e+09; samples marked '+' stay at about 1e+09.]

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Xiaolong