Greeting,

FYI, we noticed a 1.5% improvement of unixbench.score due to commit:

commit: e7f28850eadc14c0976f7872f2ddfef7a0a1d9f4 ("[PATCH 3/3] sched/numa: Limit the amount of imbalance that can exist at fork time")
url: https://github.com/0day-ci/linux/commits/Mel-Gorman/Revisit-NUMA-imbalance-tolerance-and-fork-balancing/20201117-214609
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git dc824eb898534cd8e34582874dae3bb7cf2fa008

in testcase: unixbench
on test machine: 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory
with following parameters:

	runtime: 300s
	nr_task: 30%
	test: pipe
	cpufreq_governor: performance
	ucode: 0x4003003

test-description: UnixBench is the original BYTE UNIX benchmark suite; it aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml   # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2sp4/pipe/unixbench/0x4003003

commit:
  b619be42c0 ("sched/numa: Allow a floating imbalance between NUMA nodes")
  e7f28850ea ("sched/numa: Limit the amount of imbalance that can exist at fork time")

b619be42c0eab221 e7f28850eadc14c0976f7872f2d
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     41663            +1.5%      42289        unixbench.score
     17225            -2.7%      16754        unixbench.time.involuntary_context_switches
 2.025e+10            +1.4%  2.054e+10        unixbench.workload
      0.30 ±101%      +0.3        0.65 ±  8%  perf-profile.calltrace.cycles-pp.__entry_text_start.read
      0.38 ± 13%      +0.1        0.46 ± 10%  perf-profile.self.cycles-pp.write
      7969 ±  6%     +13.7%       9064 ±  7%  numa-vmstat.node0.nr_kernel_stack
      9007 ±  5%     -11.6%       7966 ±  8%  numa-vmstat.node1.nr_kernel_stack
     11370 ± 14%     -18.2%       9295 ± 12%  numa-vmstat.node1.nr_slab_reclaimable
      7971 ±  6%     +13.7%       9063 ±  7%  numa-meminfo.node0.KernelStack
     45482 ± 14%     -18.2%      37186 ± 12%  numa-meminfo.node1.KReclaimable
      9000 ±  5%     -11.4%       7974 ±  8%  numa-meminfo.node1.KernelStack
     45482 ± 14%     -18.2%      37186 ± 12%  numa-meminfo.node1.SReclaimable
     27.83 ±  5%     -20.4%      22.16 ± 12%  sched_debug.cfs_rq:/.load_avg.avg
    260.86 ±  7%     -13.0%     226.86 ± 12%  sched_debug.cfs_rq:/.load_avg.max
     64.01 ± 30%     -77.2%      14.62 ±173%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     64.01 ± 30%     -77.2%      14.62 ±173%  sched_debug.cfs_rq:/.removed.util_avg.max
    140029 ±  6%     -15.5%     118380 ±  9%  sched_debug.cpu.avg_idle.stddev
      6.87 ±117%     -70.0%       2.06 ±  5%  perf-stat.i.MPKI
 2.058e+10            +1.6%  2.092e+10        perf-stat.i.branch-instructions
 1.776e+08            +1.5%  1.802e+08        perf-stat.i.branch-misses
 3.084e+10            +1.6%  3.134e+10        perf-stat.i.dTLB-loads
 1.937e+10            +1.6%  1.968e+10        perf-stat.i.dTLB-stores
 1.831e+08            +2.0%  1.868e+08        perf-stat.i.iTLB-load-misses
 1.025e+11            +1.6%  1.042e+11        perf-stat.i.instructions
      1.28            +2.0%       1.31        perf-stat.i.ipc
    737.53            +1.6%     749.52        perf-stat.i.metric.M/sec
      0.67            -1.3%       0.66        perf-stat.overall.cpi
      1.49            +1.3%       1.51        perf-stat.overall.ipc
 2.056e+10            +1.6%  2.089e+10        perf-stat.ps.branch-instructions
 1.774e+08            +1.4%    1.8e+08        perf-stat.ps.branch-misses
  3.08e+10            +1.6%   3.13e+10        perf-stat.ps.dTLB-loads
 1.935e+10            +1.6%  1.965e+10        perf-stat.ps.dTLB-stores
 1.829e+08            +2.0%  1.865e+08        perf-stat.ps.iTLB-load-misses
 1.024e+11            +1.6%   1.04e+11        perf-stat.ps.instructions
 4.017e+13            +1.7%  4.085e+13        perf-stat.total.instructions
     40627 ±  4%      +9.9%      44633 ±  2%  softirqs.CPU10.SCHED
     40722            +7.9%      43959 ±  3%  softirqs.CPU11.SCHED
     14454 ±  5%     +16.4%      16827 ±  8%  softirqs.CPU15.RCU
     14800 ±  8%     +21.4%      17968 ± 10%  softirqs.CPU16.RCU
     15254 ±  7%     +16.9%      17835 ±  7%  softirqs.CPU17.RCU
     14676 ± 11%     +19.3%      17502 ±  8%  softirqs.CPU18.RCU
     15098 ±  6%     +15.7%      17472 ±  8%  softirqs.CPU19.RCU
     14311 ±  5%     +23.0%      17595 ±  6%  softirqs.CPU21.RCU
     15728 ±  3%     +14.2%      17965 ± 10%  softirqs.CPU22.RCU
     15758 ±  6%     +14.3%      18005 ±  8%  softirqs.CPU23.RCU
     15700           +14.8%      18018 ±  6%  softirqs.CPU49.RCU
     15386 ±  3%     +15.4%      17757 ±  8%  softirqs.CPU50.RCU
     16064 ±  3%     +14.9%      18455 ±  8%  softirqs.CPU52.RCU
     16072 ±  3%     +19.5%      19200 ±  4%  softirqs.CPU54.RCU
     16371 ±  4%     +12.9%      18479 ±  5%  softirqs.CPU58.RCU
     15825 ±  3%     +14.5%      18116 ±  6%  softirqs.CPU59.RCU
     16359 ±  5%     +13.6%      18592 ±  7%  softirqs.CPU60.RCU
     16020 ±  7%     +14.7%      18370 ±  7%  softirqs.CPU62.RCU
     15940 ±  6%     +17.6%      18740 ±  8%  softirqs.CPU63.RCU
     15520 ±  4%     +25.0%      19403 ±  7%  softirqs.CPU64.RCU
     16212 ±  8%     +19.4%      19354 ± 11%  softirqs.CPU65.RCU
     16164 ±  7%     +19.1%      19247 ±  9%  softirqs.CPU67.RCU
     16678 ±  6%     +17.5%      19592 ±  9%  softirqs.CPU68.RCU
     16328 ±  6%     +19.7%      19551 ±  6%  softirqs.CPU69.RCU
     16351 ±  4%     +17.2%      19155 ±  7%  softirqs.CPU70.RCU
     15636 ±  2%     +11.1%      17370 ±  6%  softirqs.CPU72.RCU
     15764           +13.9%      17949 ±  7%  softirqs.CPU75.RCU
     15899 ±  3%     +13.3%      18015 ±  7%  softirqs.CPU76.RCU
     16157 ±  4%     +11.2%      17967 ±  8%  softirqs.CPU77.RCU
     15480 ±  2%     +14.4%      17716 ± 10%  softirqs.CPU91.RCU
     16142           +10.9%      17893 ±  7%  softirqs.CPU93.RCU
     16424 ±  3%     +12.7%      18503 ±  7%  softirqs.CPU95.RCU
     38301 ±  7%     -12.0%      33723 ±  8%  softirqs.CPU95.SCHED
   1550393 ±  2%     +10.6%    1714970 ±  6%  softirqs.RCU
     56868            -4.8%      54162        interrupts.CAL:Function_call_interrupts
    788.75 ± 25%     -30.4%     548.75 ± 15%  interrupts.CPU1.CAL:Function_call_interrupts
    120.75 ± 37%     -56.5%      52.50 ± 41%  interrupts.CPU10.RES:Rescheduling_interrupts
    498.50 ±  9%     -12.7%     435.25 ±  2%  interrupts.CPU12.CAL:Function_call_interrupts
     94.50 ± 14%     -51.6%      45.75 ± 37%  interrupts.CPU12.RES:Rescheduling_interrupts
    658.25 ± 27%     -32.4%     445.25 ±  2%  interrupts.CPU14.CAL:Function_call_interrupts
    613.75 ± 24%     -33.6%     407.50 ± 22%  interrupts.CPU2.CAL:Function_call_interrupts
    626.25 ± 19%     -26.8%     458.25 ±  5%  interrupts.CPU23.CAL:Function_call_interrupts
    851.00 ± 19%     -30.7%     590.00 ±  6%  interrupts.CPU25.CAL:Function_call_interrupts
    474.75 ±  4%     +16.4%     552.50 ±  7%  interrupts.CPU32.CAL:Function_call_interrupts
     58.75 ± 10%     +59.6%      93.75 ± 13%  interrupts.CPU36.RES:Rescheduling_interrupts
     73.00 ± 29%    +111.0%     154.00 ± 59%  interrupts.CPU37.RES:Rescheduling_interrupts
      3002 ± 43%     -85.0%     449.25 ±112%  interrupts.CPU40.NMI:Non-maskable_interrupts
      3002 ± 43%     -85.0%     449.25 ±112%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
      3355 ± 28%     -77.4%     757.50 ± 93%  interrupts.CPU42.NMI:Non-maskable_interrupts
      3355 ± 28%     -77.4%     757.50 ± 93%  interrupts.CPU42.PMI:Performance_monitoring_interrupts
      3557 ± 29%     -51.7%       1718 ± 61%  interrupts.CPU46.NMI:Non-maskable_interrupts
      3557 ± 29%     -51.7%       1718 ± 61%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
      2004 ± 43%     -55.9%     884.75 ± 80%  interrupts.CPU61.NMI:Non-maskable_interrupts
      2004 ± 43%     -55.9%     884.75 ± 80%  interrupts.CPU61.PMI:Performance_monitoring_interrupts
    609.25 ± 88%    +496.3%       3632 ± 64%  interrupts.CPU62.NMI:Non-maskable_interrupts
    609.25 ± 88%    +496.3%       3632 ± 64%  interrupts.CPU62.PMI:Performance_monitoring_interrupts
     52.75 ± 56%    +133.2%     123.00 ± 42%  interrupts.CPU63.RES:Rescheduling_interrupts
    441.25 ± 78%    +744.6%       3727 ± 64%  interrupts.CPU69.NMI:Non-maskable_interrupts
    441.25 ± 78%    +744.6%       3727 ± 64%  interrupts.CPU69.PMI:Performance_monitoring_interrupts
     48.50 ± 58%    +152.6%     122.50 ± 37%  interrupts.CPU69.RES:Rescheduling_interrupts
    408.25 ± 74%    +610.8%       2901 ± 67%  interrupts.CPU70.NMI:Non-maskable_interrupts
    408.25 ± 74%    +610.8%       2901 ± 67%  interrupts.CPU70.PMI:Performance_monitoring_interrupts
     57.00 ± 68%     +95.6%     111.50 ± 32%  interrupts.CPU70.RES:Rescheduling_interrupts
     57.00 ± 45%    +120.2%     125.50 ± 30%  interrupts.CPU71.RES:Rescheduling_interrupts
    712.50 ± 24%     -34.3%     468.00 ±  7%  interrupts.CPU75.CAL:Function_call_interrupts
      1236 ±111%    +258.1%       4426 ± 20%  interrupts.CPU75.NMI:Non-maskable_interrupts
      1236 ±111%    +258.1%       4426 ± 20%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
      1953 ± 42%     -84.3%     306.75 ±108%  interrupts.CPU88.NMI:Non-maskable_interrupts
      1953 ± 42%     -84.3%     306.75 ±108%  interrupts.CPU88.PMI:Performance_monitoring_interrupts
      1077 ± 39%     -47.5%     565.50 ± 11%  interrupts.CPU90.CAL:Function_call_interrupts

unixbench.score

  [per-sample ASCII trend plot; y-axis 41400..42600: base-commit samples (+) cluster around 41600-41800, patched-commit samples (O) around 42200-42500]

unixbench.workload

  [per-sample ASCII trend plot; y-axis 2.01e+10..2.08e+10: base-commit samples (+) cluster around 2.02e+10-2.03e+10, patched-commit samples (O) around 2.05e+10-2.07e+10]

 [*] bisect-good sample
 [O] bisect-bad sample

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Oliver Sang
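For readers unfamiliar with the comparison-table format: the %change column is the relative delta between the two commits' mean values. A minimal Python sketch (the `pct_change` helper is illustrative, not part of lkp-tests) reproducing the headline numbers from the table:

```python
def pct_change(base, patched):
    """Relative change between the two commits' means, as in the %change column."""
    return (patched - base) / base * 100

# Headline values copied from the comparison table above.
print(f"unixbench.score:          {pct_change(41663, 42289):+.1f}%")        # +1.5%
print(f"unixbench.workload:       {pct_change(2.025e10, 2.054e10):+.1f}%")  # +1.4%
print(f"involuntary ctx switches: {pct_change(17225, 16754):+.1f}%")        # -2.7%
```

The ±N% annotations next to each value are the relative standard deviation across the benchmark's repeated runs, so rows with large ±N% (e.g. the NMI/PMI interrupt counts) are far noisier than the headline score.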