Greetings,

FYI, we noticed a -13.0% regression of will-it-scale.per_process_ops due to commit:

commit: 63e02a2a3292d8815eac7be438c8c73d72a7bb93 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: will-it-scale
on test machine: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
with the following parameters:

        test: poll1
        cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition to that, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -7.0% regression       |
| test machine     | 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory |
| test parameters  | cpufreq_governor=performance                                        |
|                  | test=writeseek1                                                     |
+------------------+---------------------------------------------------------------------+
| testcase: change | aim9: aim9.brk_test.ops_per_sec -9.9% regression                    |
| test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory    |
| test parameters  | cpufreq_governor=performance                                        |
|                  | test=brk_test                                                       |
|                  | testtime=300s                                                       |
+------------------+---------------------------------------------------------------------+

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/poll1/will-it-scale

commit:
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438
---------------- --------------------------
         %stddev     %change           %stddev
             \          |                  \
   7435674            -13.0%     6465918        will-it-scale.per_process_ops
   5868564            -10.4%     5256868        will-it-scale.per_thread_ops
      0.56             +8.0%        0.61 ±  2%  will-it-scale.scalability
      1947             -2.0%        1908        will-it-scale.time.system_time
    562.79             +6.9%      601.69        will-it-scale.time.user_time
      8.06              +0.8        8.86 ±  3%  mpstat.cpu.usr%
      4969 ± 83%      -84.5%      769.00 ±  6%  numa-meminfo.node1.Inactive(anon)
    116.75 ± 63%      +90.1%      222.00 ±  9%  numa-vmstat.node0.nr_mlock
    116.75 ± 63%      +90.1%      222.00 ±  9%  numa-vmstat.node0.nr_unevictable
    116.75 ± 63%      +90.1%      222.00 ±  9%  numa-vmstat.node0.nr_zone_unevictable
      1242 ± 83%      -84.6%      191.25 ±  6%  numa-vmstat.node1.nr_inactive_anon
      1242 ± 83%      -84.6%      191.25 ±  6%  numa-vmstat.node1.nr_zone_inactive_anon
   1414780             +7.7%     1524182 ±  3%  sched_debug.cfs_rq:/.min_vruntime.max
    144.71 ± 12%      +17.8%      170.42 ±  2%  sched_debug.cfs_rq:/.runnable_load_avg.max
   -568616            -29.5%     -400842        sched_debug.cfs_rq:/.spread0.min
    202980 ± 13%      +56.8%      318219 ±  6%  sched_debug.cpu.avg_idle.min
    173545 ±  3%      -13.9%      149414 ±  5%  sched_debug.cpu.avg_idle.stddev
 2.906e+12             -7.9%   2.676e+12        perf-stat.branch-instructions
      0.01 ±  2%        +2.0        2.00        perf-stat.branch-miss-rate%
 2.405e+08         +22170.9%   5.356e+10        perf-stat.branch-misses
      1.15            +11.6%        1.28        perf-stat.cpi
 3.659e+12             -9.3%   3.318e+12        perf-stat.dTLB-loads
      0.00 ±  6%        +0.0        0.00 ±  3%  perf-stat.dTLB-store-miss-rate%
 2.869e+12             -8.8%   2.616e+12        perf-stat.dTLB-stores
 1.406e+13             -9.7%    1.27e+13        perf-stat.instructions
      0.87            -10.4%        0.78        perf-stat.ipc
     13.72 ±  2%       -13.7        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
     24.53 ±  2%        -0.2       24.30 ±  3%  perf-profile.calltrace.cycles.copy_user_generic_string._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     12.15 ±  3%        -0.2       11.98 ±  3%  perf-profile.calltrace.cycles.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      9.57 ±  3%        -0.1        9.48 ±  4%  perf-profile.calltrace.cycles.__fget.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      5.79 ±  6%        -0.0        5.75 ±  3%  perf-profile.calltrace.cycles.fput.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     32.25 ±  2%        +1.5       33.78 ±  3%  perf-profile.calltrace.cycles._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      3.99 ±  5%        +1.6        5.56 ±  3%  perf-profile.calltrace.cycles.__might_fault._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     65.36 ±  2%        +2.0       67.34 ±  2%  perf-profile.calltrace.cycles.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     68.87 ±  2%        +3.1       72.01 ±  2%  perf-profile.calltrace.cycles.sys_poll.entry_SYSCALL_64_fastpath
      7.33 ± 35%        +3.7       11.05 ± 23%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     71.48 ±  2%        +3.9       75.41 ±  2%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
      9.50 ± 25%        +4.0       13.49 ± 19%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     10.06 ± 23%        +4.0       14.05 ± 18%  perf-profile.calltrace.cycles.secondary_startup_64
      9.66 ± 24%        +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
      9.66 ± 24%        +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      9.66 ± 24%        +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
      2.25 ±  3%        +5.4        7.67 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_after_hwframe
     13.72 ±  2%       -13.7        0.00        perf-profile.children.cycles.entry_SYSCALL_64
     24.53 ±  2%        -0.2       24.31 ±  3%  perf-profile.children.cycles.copy_user_generic_string
     12.16 ±  3%        -0.2       11.99 ±  3%  perf-profile.children.cycles.__fget_light
      9.57 ±  3%        -0.1        9.48 ±  4%  perf-profile.children.cycles.__fget
      5.79 ±  6%        -0.0        5.75 ±  3%  perf-profile.children.cycles.fput
     32.25 ±  2%        +1.5       33.78 ±  3%  perf-profile.children.cycles._copy_from_user
      3.99 ±  5%        +1.6        5.56 ±  3%  perf-profile.children.cycles.__might_fault
     65.36 ±  2%        +2.0       67.34 ±  2%  perf-profile.children.cycles.do_sys_poll
     68.87 ±  2%        +3.1       72.01 ±  2%  perf-profile.children.cycles.sys_poll
      7.42 ± 34%        +3.7       11.14 ± 22%  perf-profile.children.cycles.poll_idle
     71.61 ±  2%        +3.9       75.50 ±  2%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
      9.88 ± 23%        +4.0       13.87 ± 19%  perf-profile.children.cycles.cpuidle_enter_state
     10.06 ± 23%        +4.0       14.05 ± 18%  perf-profile.children.cycles.secondary_startup_64
     10.06 ± 23%        +4.0       14.05 ± 18%  perf-profile.children.cycles.cpu_startup_entry
      9.66 ± 24%        +4.0       13.66 ± 19%  perf-profile.children.cycles.start_secondary
     10.06 ± 23%        +4.0       14.05 ± 18%  perf-profile.children.cycles.do_idle
      2.25 ±  3%        +5.4        7.67 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_after_hwframe
     13.72 ±  2%       -13.7        0.00        perf-profile.self.cycles.entry_SYSCALL_64
     24.21 ±  2%        -0.3       23.93 ±  2%  perf-profile.self.cycles.copy_user_generic_string
      9.47 ±  3%        -0.1        9.41 ±  4%  perf-profile.self.cycles.__fget
      5.69 ±  5%        +0.0        5.71 ±  3%  perf-profile.self.cycles.fput
     13.55 ±  4%        +0.7       14.24        perf-profile.self.cycles.do_sys_poll
      7.41 ± 34%        +3.7       11.07 ± 22%  perf-profile.self.cycles.poll_idle
      2.25 ±  3%        +5.4        7.67 ±  3%  perf-profile.self.cycles.entry_SYSCALL_64_after_hwframe

[ASCII trend plots not reproduced. Panels: will-it-scale.per_process_ops, perf-stat.instructions, perf-stat.branch-instructions, perf-stat.branch-misses, perf-stat.dTLB-stores, perf-stat.branch-miss-rate_, perf-stat.ipc, perf-stat.cpi, will-it-scale.time.user_time]

[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-sb03: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/writeseek1/will-it-scale

commit:
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438
---------------- --------------------------
         %stddev     %change           %stddev
             \          |                  \
   1902014             -7.0%     1768039        will-it-scale.per_process_ops
   1557647             -6.3%     1459046        will-it-scale.per_thread_ops
      0.52             +4.0%        0.54        will-it-scale.scalability
      2293             -1.8%        2251        will-it-scale.time.system_time
    216.11            +19.7%      258.70        will-it-scale.time.user_time
 1.453e+08 ±  6%      +21.7%   1.769e+08 ±  9%  cpuidle.POLL.time
      3.43              +0.8        4.26        mpstat.cpu.usr%
    284863 ±  6%      +12.9%      321561 ±  3%  softirqs.RCU
      7178 ±  6%      -11.3%        6368        slabinfo.kmalloc-96.active_objs
      7218 ±  5%      -10.6%        6450        slabinfo.kmalloc-96.num_objs
     72.27 ±  6%      +19.5%       86.39 ±  7%  sched_debug.cfs_rq:/.load_avg.avg
    107.67 ±  3%      +31.1%      141.11 ± 19%  sched_debug.cfs_rq:/.load_avg.stddev
     50035 ± 23%      +17.3%       58672 ± 24%  sched_debug.cpu.load.stddev
      7.58 ± 21%      +65.4%       12.54 ± 11%  sched_debug.cpu.nr_uninterruptible.max
 3.143e+12             -4.7%   2.995e+12        perf-stat.branch-instructions
      0.01 ±  2%        +1.0        0.97        perf-stat.branch-miss-rate%
 3.791e+08 ±  3%    +7525.5%   2.891e+10        perf-stat.branch-misses
  2.54e+08             +1.0%   2.566e+08        perf-stat.cache-misses
      1.03             +6.3%        1.10        perf-stat.cpi
 6.671e+12             -4.7%   6.361e+12        perf-stat.dTLB-loads
 4.722e+12             -5.0%   4.485e+12        perf-stat.dTLB-stores
     35.63 ± 12%       -29.7        5.89 ± 20%  perf-stat.iTLB-load-miss-rate%
 8.119e+08 ±  8%     +829.8%   7.549e+09 ±  2%  perf-stat.iTLB-loads
 1.563e+13             -5.3%    1.48e+13        perf-stat.instructions
      0.97             -5.9%        0.91        perf-stat.ipc
      5.97              -6.0        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
      7.43 ±  2%        -0.1        7.29 ±  3%  perf-profile.calltrace.cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
      9.10 ±  2%        -0.1        9.00 ±  3%  perf-profile.calltrace.cycles.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      9.43 ±  2%        -0.1        9.33 ±  3%  perf-profile.calltrace.cycles.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
     19.45              -0.1       19.39 ±  2%  perf-profile.calltrace.cycles.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
     19.14              -0.0       19.10        perf-profile.calltrace.cycles.copy_user_generic_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
     21.14              +0.0       21.15 ±  2%  perf-profile.calltrace.cycles.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
      9.16 ± 10%        +0.0        9.20 ± 41%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     41.59              +0.1       41.71 ±  2%  perf-profile.calltrace.cycles.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write
     11.09 ±  8%        +0.2       11.24 ± 31%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%        +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%        +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%        +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
     11.68 ±  7%        +0.2       11.90 ± 27%  perf-profile.calltrace.cycles.secondary_startup_64
     45.10              +0.3       45.37 ±  2%  perf-profile.calltrace.cycles.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write.sys_write
     51.69              +0.3       52.02 ±  2%  perf-profile.calltrace.cycles.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
     50.28              +0.4       50.63 ±  2%  perf-profile.calltrace.cycles.generic_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
     61.80              +0.8       62.60 ±  3%  perf-profile.calltrace.cycles.vfs_write.sys_write.entry_SYSCALL_64_fastpath
      4.92              +0.9        5.80 ±  5%  perf-profile.calltrace.cycles.__fdget_pos.sys_lseek.entry_SYSCALL_64_fastpath
      4.96              +0.9        5.86 ±  3%  perf-profile.calltrace.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
      8.74              +1.0        9.75 ±  6%  perf-profile.calltrace.cycles.sys_lseek.entry_SYSCALL_64_fastpath
     69.88              +1.6       71.49 ±  3%  perf-profile.calltrace.cycles.sys_write.entry_SYSCALL_64_fastpath
     80.00              +2.9       82.90 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
      5.97              -6.0        0.00        perf-profile.children.cycles.entry_SYSCALL_64
      7.43 ±  2%        -0.1        7.29 ±  3%  perf-profile.children.cycles.find_lock_entry
      9.10 ±  2%        -0.1        9.00 ±  3%  perf-profile.children.cycles.shmem_getpage_gfp
      9.43 ±  2%        -0.1        9.33 ±  3%  perf-profile.children.cycles.shmem_write_begin
     19.45              -0.1       19.39 ±  2%  perf-profile.children.cycles.copyin
     19.14              -0.0       19.11        perf-profile.children.cycles.copy_user_generic_string
     21.14              +0.0       21.15 ±  2%  perf-profile.children.cycles.iov_iter_copy_from_user_atomic
      9.46 ±  9%        +0.1        9.56 ± 36%  perf-profile.children.cycles.poll_idle
     41.60              +0.1       41.72 ±  2%  perf-profile.children.cycles.generic_perform_write
     11.21 ±  8%        +0.2       11.37 ± 31%  perf-profile.children.cycles.start_secondary
     11.56 ±  7%        +0.2       11.76 ± 27%  perf-profile.children.cycles.cpuidle_enter_state
     11.69 ±  7%        +0.2       11.90 ± 27%  perf-profile.children.cycles.do_idle
     11.68 ±  7%        +0.2       11.90 ± 27%  perf-profile.children.cycles.secondary_startup_64
     11.68 ±  7%        +0.2       11.90 ± 27%  perf-profile.children.cycles.cpu_startup_entry
     45.10              +0.3       45.37 ±  2%  perf-profile.children.cycles.__generic_file_write_iter
     51.72              +0.3       52.03 ±  2%  perf-profile.children.cycles.__vfs_write
     50.28              +0.4       50.63 ±  2%  perf-profile.children.cycles.generic_file_write_iter
     61.84              +0.8       62.62 ±  3%  perf-profile.children.cycles.vfs_write
      8.74              +1.0        9.75 ±  6%  perf-profile.children.cycles.sys_lseek
      3.81              +1.6        5.38 ±  5%  perf-profile.children.cycles.__fget_light
     69.93              +1.6       71.50 ±  3%  perf-profile.children.cycles.sys_write
      9.88              +1.8       11.67 ±  3%  perf-profile.children.cycles.__fdget_pos
     80.23              +2.7       82.94 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
      5.97              -6.0        0.00        perf-profile.self.cycles.entry_SYSCALL_64
     18.93              -0.1       18.84 ±  2%  perf-profile.self.cycles.copy_user_generic_string
      9.39 ±  8%        +0.0        9.42 ± 35%  perf-profile.self.cycles.poll_idle

***************************************************************************************************
lkp-ivb-d03: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-ivb-d03/brk_test/aim9/300s

commit:
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438
---------------- --------------------------
         %stddev     %change           %stddev
             \          |                  \
   4124214             -9.9%     3717599        aim9.brk_test.ops_per_sec
    272.29             -4.9%      259.03        aim9.time.system_time
     27.71            +47.2%       40.78        aim9.time.user_time
     12605 ±  9%      -27.0%        9203 ± 10%  cpuidle.POLL.usage
      3.24 ±  2%        +1.4        4.62        mpstat.cpu.usr%
      4007 ±  3%       -9.2%        3639 ±  4%  slabinfo.anon_vma_chain.num_objs
      9.80             -1.9%        9.61        turbostat.CorWatt
     30309             -1.3%       29929        vmstat.system.cs
     18905             -1.1%       18689        vmstat.system.in
    716.67 ± 11%      -22.7%      554.33 ±  6%  sched_debug.cfs_rq:/.load_avg.avg
      1.00 ± 11%      -79.2%        0.21 ±173%  sched_debug.cfs_rq:/.nr_spread_over.min
      0.45 ± 55%      +70.3%        0.76 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    521.82 ±  3%      -10.2%      468.57 ±  2%  sched_debug.cfs_rq:/.util_avg.avg
      1.96 ±  7%      +34.0%        2.62 ±  9%  sched_debug.cpu.nr_running.max
      0.68 ± 15%      +42.9%        0.98 ± 15%  sched_debug.cpu.nr_running.stddev
      0.06 ± 19%        +0.9        0.92        perf-stat.branch-miss-rate%
 3.583e+08 ±  5%    +1125.0%   4.389e+09 ± 28%  perf-stat.branch-misses
   9163065             -1.8%     8997254        perf-stat.context-switches
      0.56 ±  2%      +12.8%        0.63 ±  4%  perf-stat.cpi
      0.06 ±132%        +0.2        0.23 ±  6%  perf-stat.dTLB-load-miss-rate%
 4.062e+08 ±142%     +234.1%   1.357e+09 ±  8%  perf-stat.dTLB-load-misses
   9061724 ± 12%      +22.0%    11056158 ±  6%  perf-stat.dTLB-store-misses
     11.72 ± 24%        -6.6        5.08 ± 33%  perf-stat.iTLB-load-miss-rate%
   4.4e+08 ± 29%     +135.5%   1.036e+09 ± 23%  perf-stat.iTLB-loads
      1.80 ±  2%      -11.2%        1.60 ±  3%  perf-stat.ipc
     14.11 ± 88%        -2.6       11.50 ± 86%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%        -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%        -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%        -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     12.86 ± 92%        -2.4       10.45 ± 97%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     45.20 ±  3%        -1.4       43.82        perf-profile.calltrace.cycles-pp.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     16.60 ±  3%        -0.9       15.74 ±  3%  perf-profile.calltrace.cycles-pp.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     56.05 ±  2%        -0.8       55.25        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
     14.60 ±  3%        -0.7       13.88 ±  2%  perf-profile.calltrace.cycles-pp.__vma_adjust.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     54.84 ±  3%        -0.7       54.15        perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath
     11.52 ±  9%        -0.1       11.46        perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
      6.30 ±  5%        +0.2        6.48 ±  3%  perf-profile.calltrace.cycles-pp.security_vm_enough_memory_mm.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     27.40 ±  3%        +0.8       28.18 ±  4%  perf-profile.calltrace.cycles-pp.secondary_startup_64
     12.40 ± 94%        +3.3       15.73 ± 62%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
     13.18 ± 88%        +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
     13.18 ± 88%        +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
     13.18 ± 88%        +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
     13.14 ± 88%        +3.4       16.53 ± 57%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
     14.22 ± 88%        -2.6       11.63 ± 85%  perf-profile.children.cycles-pp.start_secondary
     45.83 ±  3%        -1.2       44.59        perf-profile.children.cycles-pp.do_brk_flags
     56.30 ±  2%        -0.9       55.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
     17.05 ±  3%        -0.8       16.24 ±  3%  perf-profile.children.cycles-pp.vma_merge
     15.45 ±  3%        -0.7       14.79 ±  2%  perf-profile.children.cycles-pp.__vma_adjust
     55.47 ±  3%        -0.6       54.88        perf-profile.children.cycles-pp.sys_brk
     12.21 ±  8%        -0.1       12.08        perf-profile.children.cycles-pp.perf_event_mmap
      6.40 ±  5%        +0.2        6.57 ±  3%  perf-profile.children.cycles-pp.security_vm_enough_memory_mm
     27.41 ±  3%        +0.8       28.19 ±  4%  perf-profile.children.cycles-pp.do_idle
     27.30 ±  3%        +0.8       28.07 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
     27.40 ±  3%        +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.secondary_startup_64
     27.40 ±  3%        +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.cpu_startup_entry
     25.27              +0.9       26.19        perf-profile.children.cycles-pp.intel_idle
     13.18 ± 88%        +3.4       16.55 ± 57%  perf-profile.children.cycles-pp.start_kernel
      4.82 ±  9%        +0.0        4.83 ±  5%  perf-profile.self.cycles-pp.__vma_adjust
      5.25 ±  9%        +0.0        5.29 ±  2%  perf-profile.self.cycles-pp.perf_event_mmap
      5.33 ±  3%        +0.4        5.75 ±  3%  perf-profile.self.cycles-pp.do_brk_flags
     25.26              +0.9       26.19        perf-profile.self.cycles-pp.intel_idle

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Xiaolong
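
P.S. For anyone cross-checking the figures in this report, the %change column and two of the derived perf-stat values can be recomputed from the raw numbers. The following is only a minimal Python sketch: the helper names pct_change and miss_rate_pct are illustrative and not part of lkp-tests, and it assumes %change is the relative difference of the tested commit against the parent commit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def miss_rate_pct(misses: float, total: float) -> float:
    """Misses as a percentage of the total event count."""
    return misses / total * 100.0


# (parent commit value, tested commit value), copied from the report
headline = {
    "will-it-scale.per_process_ops (poll1)": (7435674, 6465918),
    "will-it-scale.per_process_ops (writeseek1)": (1902014, 1768039),
    "aim9.brk_test.ops_per_sec": (4124214, 3717599),
}

for name, (base, new) in headline.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

# poll1, tested commit: perf-stat.branch-misses / perf-stat.branch-instructions
print(f"branch-miss-rate: {miss_rate_pct(5.356e10, 2.676e12):.2f}%")

# poll1, tested commit: ipc is the reciprocal of cpi
print(f"ipc from cpi: {1 / 1.28:.2f}")
```

With the values above this prints -13.0%, -7.0% and -9.9% for the three headline metrics, 2.00% for the branch miss rate and 0.78 for IPC, which agrees with the reported rows.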