Greeting,

FYI, we noticed a -5.5% regression of will-it-scale.per_process_ops due to commit:

commit: fec92278217ba01b4a3b9f9ec0f6a392069cdbd0 ("[RFC PATCH 12/13] fs/userfaultfd: kmem-cache for wait-queue objects")
url: https://github.com/0day-ci/linux/commits/Nadav-Amit/fs-userfaultfd-support-iouring-and-polling/20201129-085119
base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next

in testcase: will-it-scale
on test machine: 104 threads Skylake with 192G memory
with following parameters:

	nr_task: 50%
	mode: process
	test: brk1
	cpufreq_governor: performance
	ucode: 0x2006a08

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -11.0% regression            |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=16                                                                |
|                  | test=brk1                                                                 |
|                  | ucode=0x5003003                                                           |
+------------------+---------------------------------------------------------------------------+

If you fix the issue, kindly add following tag
Reported-by: kernel test robot

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml   # job file is attached in this email
	bin/lkp run     job.yaml

=========================================================================================
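For orientation, the brk1 testcase exercised here is, per the test description, a tight loop over the brk() syscall. A minimal userspace sketch of such a loop is shown below; this is an assumption based on the test description, not the actual will-it-scale source, and `brk_cycle` is a hypothetical helper name:

```c
#define _DEFAULT_SOURCE 1       /* for brk()/sbrk() declarations in <unistd.h> */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical sketch of one brk1 worker's inner loop: grow the program
 * break by one page, shrink it back, and count completed cycles.  Every
 * iteration enters the __x64_sys_brk path twice, which is where the
 * profile data in this report attributes the regression. */
static long brk_cycle(long iters)
{
	uintptr_t base = (uintptr_t)sbrk(0);    /* current program break */
	long done = 0;

	for (long i = 0; i < iters; i++) {
		if (brk((void *)(base + 4096)) != 0)    /* extend by one page */
			break;
		if (brk((void *)base) != 0)             /* shrink back */
			break;
		done++;
	}
	return done;
}
```

will-it-scale runs a loop like this in 1 through n parallel processes or threads and reports iterations per unit time; will-it-scale.per_process_ops is that per-process rate.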
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-skl-fpga01/brk1/will-it-scale/0x2006a08

commit:
  ddfa740e9c ("fs/userfaultfd: complete write asynchronously")
  fec9227821 ("fs/userfaultfd: kmem-cache for wait-queue objects")

ddfa740e9caf7642 fec92278217ba01b4a3b9f9ec0f
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
  65219467            -5.5%   61607693        will-it-scale.52.processes
   1254220            -5.5%    1184763        will-it-scale.per_process_ops
  65219467            -5.5%   61607693        will-it-scale.workload
     20.00            -5.0%      19.00        vmstat.cpu.us
     34.22            -4.0%      32.85 ±  2%  boot-time.boot
      3146            -4.3%       3010 ±  2%  boot-time.idle
    654.25 ± 20%     -39.0%     399.25 ± 25%  numa-vmstat.node0.nr_active_anon
    654.25 ± 20%     -39.0%     399.25 ± 25%  numa-vmstat.node0.nr_zone_active_anon
     10140 ±  9%     +27.1%      12889 ± 10%  numa-vmstat.node1.nr_slab_reclaimable
     21388 ±  3%     +13.2%      24204 ±  7%  numa-vmstat.node1.nr_slab_unreclaimable
      1096 ±  8%    +304.6%       4434        slabinfo.dmaengine-unmap-16.active_objs
      1096 ±  8%    +304.6%       4434        slabinfo.dmaengine-unmap-16.num_objs
      4838 ±  4%     -17.0%       4018 ±  3%  slabinfo.eventpoll_pwq.active_objs
      4838 ±  4%     -17.0%       4018 ±  3%  slabinfo.eventpoll_pwq.num_objs
      2689 ± 18%     -37.7%       1675 ± 22%  numa-meminfo.node0.Active
      2617 ± 20%     -38.9%       1599 ± 25%  numa-meminfo.node0.Active(anon)
     40564 ±  9%     +27.1%      51560 ± 10%  numa-meminfo.node1.KReclaimable
     40564 ±  9%     +27.1%      51560 ± 10%  numa-meminfo.node1.SReclaimable
     85552 ±  3%     +13.2%      96818 ±  7%  numa-meminfo.node1.SUnreclaim
    126118 ±  4%     +17.7%     148380 ±  8%  numa-meminfo.node1.Slab
      7.12 ± 17%     -87.9%       0.86 ±100%  sched_debug.cfs_rq:/.removed.load_avg.avg
     33.50 ±  8%     -74.6%       8.50 ±100%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      2.76 ± 27%     -88.1%       0.33 ±102%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
     80.58 ±  9%     -60.1%      32.12 ±102%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     13.39 ± 18%     -75.9%       3.23 ±102%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
      2.76 ± 28%     -88.1%       0.33 ±102%  sched_debug.cfs_rq:/.removed.util_avg.avg
     80.58 ±  9%     -60.1%      32.12 ±102%  sched_debug.cfs_rq:/.removed.util_avg.max
     13.39 ± 18%     -75.9%       3.23 ±102%  sched_debug.cfs_rq:/.removed.util_avg.stddev
      1036 ±  8%     +14.0%       1181 ±  8%  sched_debug.cpu.nr_switches.min
    -22.25           -30.3%     -15.50        sched_debug.cpu.nr_uninterruptible.min
      2.50 ± 91%   +7990.0%     202.25 ±166%  interrupts.CPU1.TLB:TLB_shootdowns
    451.00           +12.8%     508.75 ±  5%  interrupts.CPU100.CAL:Function_call_interrupts
    457.50 ±  3%     +12.3%     514.00 ±  8%  interrupts.CPU103.CAL:Function_call_interrupts
     48.75 ±130%     -89.7%       5.00 ±122%  interrupts.CPU15.RES:Rescheduling_interrupts
      3195 ± 18%    +140.3%       7678        interrupts.CPU24.NMI:Non-maskable_interrupts
      3195 ± 18%    +140.3%       7678        interrupts.CPU24.PMI:Performance_monitoring_interrupts
      8.25 ± 41%   +1009.1%      91.50 ± 49%  interrupts.CPU24.RES:Rescheduling_interrupts
    694.25 ± 28%     +89.6%       1316 ± 24%  interrupts.CPU3.CAL:Function_call_interrupts
      3946 ± 46%     +86.3%       7352 ± 12%  interrupts.CPU30.NMI:Non-maskable_interrupts
      3946 ± 46%     +86.3%       7352 ± 12%  interrupts.CPU30.PMI:Performance_monitoring_interrupts
     30.00 ±115%    +200.8%      90.25 ± 51%  interrupts.CPU36.RES:Rescheduling_interrupts
      7.50 ± 14%   +1123.3%      91.75 ± 51%  interrupts.CPU40.RES:Rescheduling_interrupts
     10.50 ± 38%    +590.5%      72.50 ± 60%  interrupts.CPU42.RES:Rescheduling_interrupts
    449.00          +214.1%       1410 ±107%  interrupts.CPU76.CAL:Function_call_interrupts
    448.75           +99.8%     896.75 ± 51%  interrupts.CPU82.CAL:Function_call_interrupts
    453.25           +78.7%     809.75 ± 50%  interrupts.CPU86.CAL:Function_call_interrupts
    456.00          +145.0%       1117 ± 93%  interrupts.CPU90.CAL:Function_call_interrupts
     72.75 ± 82%     -89.7%       7.50 ± 33%  interrupts.CPU92.RES:Rescheduling_interrupts
      2.00 ± 79%   +1737.5%      36.75 ±146%  interrupts.CPU92.TLB:TLB_shootdowns
      5545 ± 32%     +32.6%       7353 ± 12%  interrupts.CPU93.NMI:Non-maskable_interrupts
      5545 ± 32%     +32.6%       7353 ± 12%  interrupts.CPU93.PMI:Performance_monitoring_interrupts
     10.50 ± 10%    +514.3%      64.50 ± 76%  interrupts.CPU93.RES:Rescheduling_interrupts
 2.683e+10            +3.7%  2.781e+10        perf-stat.i.branch-instructions
      0.68            -0.1        0.63        perf-stat.i.branch-miss-rate%
 1.811e+08            -5.2%  1.718e+08        perf-stat.i.branch-misses
      1.12            -4.6%       1.07        perf-stat.i.cpi
      0.17            -0.0        0.15        perf-stat.i.dTLB-load-miss-rate%
  64926279            -5.5%   61335249        perf-stat.i.dTLB-load-misses
 3.779e+10            +5.6%   3.99e+10        perf-stat.i.dTLB-loads
   2.1e+10            +2.7%  2.157e+10        perf-stat.i.dTLB-stores
 1.292e+11            +4.6%  1.352e+11        perf-stat.i.instructions
      1957            +3.7%       2029        perf-stat.i.instructions-per-iTLB-miss
      0.89            +4.8%       0.94        perf-stat.i.ipc
    823.71            +4.3%     858.87        perf-stat.i.metric.M/sec
      0.67            -0.1        0.62        perf-stat.overall.branch-miss-rate%
      1.12            -4.6%       1.07        perf-stat.overall.cpi
      0.17            -0.0        0.15        perf-stat.overall.dTLB-load-miss-rate%
      1933            +3.6%       2004        perf-stat.overall.instructions-per-iTLB-miss
      0.89            +4.8%       0.94        perf-stat.overall.ipc
     82.14            +1.7       83.85        perf-stat.overall.node-store-miss-rate%
    597331           +10.8%     662119        perf-stat.overall.path-length
 2.674e+10            +3.7%  2.772e+10        perf-stat.ps.branch-instructions
 1.804e+08            -5.2%   1.71e+08        perf-stat.ps.branch-misses
  64722645            -5.5%   61153001        perf-stat.ps.dTLB-load-misses
 3.766e+10            +5.6%  3.976e+10        perf-stat.ps.dTLB-loads
 2.093e+10            +2.7%   2.15e+10        perf-stat.ps.dTLB-stores
 1.288e+11            +4.6%  1.347e+11        perf-stat.ps.instructions
 3.896e+13            +4.7%  4.079e+13        perf-stat.total.instructions
     19290 ± 14%     -31.0%      13316 ±  5%  softirqs.CPU13.RCU
     22289 ± 79%     -44.0%      12473 ±110%  softirqs.CPU18.SCHED
     19387 ± 12%     -26.7%      14206 ±  6%  softirqs.CPU21.RCU
     14997 ±  5%     +51.6%      22739 ±  2%  softirqs.CPU24.RCU
     39995 ±  3%     -88.9%       4457        softirqs.CPU24.SCHED
     22221 ± 79%     -73.2%       5963 ± 42%  softirqs.CPU28.SCHED
     18559 ± 24%     -28.7%      13237 ±  7%  softirqs.CPU33.RCU
     16004 ± 19%     +31.9%      21107 ±  4%  softirqs.CPU34.RCU
     22675 ±  7%     -31.0%      15655 ± 18%  softirqs.CPU35.RCU
      4273 ± 17%    +620.7%      30798 ± 48%  softirqs.CPU35.SCHED
     20207 ± 16%     -23.6%      15448 ± 19%  softirqs.CPU37.RCU
     15311 ± 19%     +37.4%      21044 ±  7%  softirqs.CPU4.RCU
     30669 ± 48%     -68.4%       9687 ± 89%  softirqs.CPU40.SCHED
     20195 ± 15%     -23.5%      15442 ± 20%  softirqs.CPU41.RCU
     22191 ± 25%     -37.8%      13806 ± 10%  softirqs.CPU43.RCU
     16782 ± 14%     -21.8%      13122 ±  4%  softirqs.CPU47.RCU
     22290 ±  8%     -22.0%      17381 ± 22%  softirqs.CPU49.RCU
     22338 ± 79%     -79.7%       4526        softirqs.CPU61.SCHED
     30860 ± 49%     -85.3%       4533        softirqs.CPU65.SCHED
     24975 ± 57%     -82.2%       4447        softirqs.CPU73.SCHED
     20318 ±  6%     -39.8%      12236 ±  2%  softirqs.CPU76.RCU
      4615 ±  5%    +761.7%      39773 ±  2%  softirqs.CPU76.SCHED
     21142 ±  3%     -29.2%      14979 ±  9%  softirqs.CPU82.RCU
     13144 ±113%    +199.0%      39305 ±  3%  softirqs.CPU86.SCHED
     39713 ±  4%     -67.4%      12956 ±110%  softirqs.CPU87.SCHED
     17739 ± 16%     -22.2%      13795 ±  4%  softirqs.CPU88.RCU
     18651 ± 15%     -27.5%      13514 ± 11%  softirqs.CPU92.RCU
     30590 ± 48%     -57.5%      12998 ±111%  softirqs.CPU93.SCHED
     15264 ± 17%     +26.7%      19337 ±  5%  softirqs.CPU95.RCU
      1.33 ± 10%      -0.1        1.20 ± 10%  perf-profile.calltrace.cycles-pp.find_vma.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.80 ± 11%      -0.1        0.69 ± 11%  perf-profile.calltrace.cycles-pp.security_mmap_addr.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64
      0.00            +0.8        0.76 ±  9%  perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
      0.00            +0.9        0.94 ±  4%  perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.00            +2.3        2.29 ± 13%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +2.5        2.51 ± 12%  perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.55 ± 10%      -0.3        0.28 ± 14%  perf-profile.children.cycles-pp.vma_merge
      1.81 ± 10%      -0.2        1.59 ± 10%  perf-profile.children.cycles-pp.get_unmapped_area
      1.72 ± 10%      -0.2        1.54 ± 10%  perf-profile.children.cycles-pp.find_vma
      0.30 ±  9%      -0.1        0.15 ± 11%  perf-profile.children.cycles-pp.cap_capable
      0.82 ± 11%      -0.1        0.70 ± 11%  perf-profile.children.cycles-pp.security_mmap_addr
      0.57 ± 11%      -0.1        0.50 ±  9%  perf-profile.children.cycles-pp.obj_cgroup_charge
      0.32 ± 10%      -0.1        0.25 ± 11%  perf-profile.children.cycles-pp.__vm_enough_memory
      0.32 ± 12%      -0.1        0.26 ±  9%  perf-profile.children.cycles-pp.__x86_retpoline_rax
      0.46 ±  9%      -0.1        0.41 ± 11%  perf-profile.children.cycles-pp.vmacache_find
      0.22 ± 11%      -0.0        0.19 ± 10%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.24 ±  9%      -0.0        0.21 ± 11%  perf-profile.children.cycles-pp.free_pgd_range
      0.00            +0.1        0.08 ± 10%  perf-profile.children.cycles-pp.should_failslab
      2.83 ± 11%      +0.7        3.49 ±  7%  perf-profile.children.cycles-pp.kmem_cache_free
      0.00            +0.8        0.77 ±  9%  perf-profile.children.cycles-pp.memset_erms
      4.08 ± 11%      +1.9        6.03 ± 11%  perf-profile.children.cycles-pp.kmem_cache_alloc
      0.21 ± 10%      +2.3        2.52 ± 12%  perf-profile.children.cycles-pp.userfaultfd_unmap_complete
      0.53 ±  9%      -0.3        0.27 ± 14%  perf-profile.self.cycles-pp.vma_merge
      0.28 ± 11%      -0.1        0.14 ± 11%  perf-profile.self.cycles-pp.cap_capable
      0.99 ± 10%      -0.1        0.88 ± 11%  perf-profile.self.cycles-pp.unmap_page_range
      0.78 ± 11%      -0.1        0.69 ±  9%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.70 ± 11%      -0.1        0.62 ± 10%  perf-profile.self.cycles-pp.vm_area_alloc
      0.41 ± 11%      -0.1        0.34 ± 12%  perf-profile.self.cycles-pp.percpu_counter_add_batch
      0.55 ± 12%      -0.1        0.49 ±  9%  perf-profile.self.cycles-pp.obj_cgroup_charge
      0.44 ±  9%      -0.1        0.39 ± 11%  perf-profile.self.cycles-pp.vmacache_find
      0.25 ± 12%      -0.1        0.20 ± 10%  perf-profile.self.cycles-pp.__x86_retpoline_rax
      0.36 ± 11%      -0.0        0.31 ± 10%  perf-profile.self.cycles-pp.security_mmap_addr
      0.19 ± 11%      -0.0        0.16 ± 10%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.10 ± 12%      -0.0        0.08 ± 13%  perf-profile.self.cycles-pp.__vm_enough_memory
      0.48 ± 10%      +0.1        0.61 ±  9%  perf-profile.self.cycles-pp.cap_vm_enough_memory
      0.00            +0.7        0.73 ± 10%  perf-profile.self.cycles-pp.memset_erms
      1.86 ± 11%      +0.8        2.62 ±  7%  perf-profile.self.cycles-pp.kmem_cache_free
      1.91 ± 11%      +0.8        2.74 ± 12%  perf-profile.self.cycles-pp.kmem_cache_alloc

                            will-it-scale.52.processes

  [collapsed ASCII trend chart: base-commit samples (+) around 6.5e+07, patched samples (O) around 6.15e+07]

                           will-it-scale.per_process_ops

  [collapsed ASCII trend chart: base-commit samples (+) around 1.25e+06, patched samples (O) around 1.18e+06]

                              will-it-scale.workload

  [collapsed ASCII trend chart: base-commit samples (+) around 6.5e+07, patched samples (O) around 6.15e+07]

[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-csl-2ap2: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/brk1/will-it-scale/0x5003003

commit:
  ddfa740e9c ("fs/userfaultfd: complete write asynchronously")
  fec9227821 ("fs/userfaultfd: kmem-cache for wait-queue objects")

ddfa740e9caf7642 fec92278217ba01b4a3b9f9ec0f
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
  46606610           -11.0%   41486565        will-it-scale.16.processes
   2912912           -11.0%    2592909        will-it-scale.per_process_ops
  46606610           -11.0%   41486565        will-it-scale.workload
      0.72            -0.1        0.65        mpstat.cpu.all.usr%
     17741            -4.4%      16964        proc-vmstat.nr_shmem
   -116535          -515.3%     484006 ± 50%  sched_debug.cfs_rq:/.spread0.avg
      1380 ±  6%    +495.6%       8222        slabinfo.dmaengine-unmap-16.active_objs
     32.50 ±  7%    +500.0%     195.00        slabinfo.dmaengine-unmap-16.active_slabs
      1380 ±  6%    +495.6%       8222        slabinfo.dmaengine-unmap-16.num_objs
     32.50 ±  7%    +500.0%     195.00        slabinfo.dmaengine-unmap-16.num_slabs
     11962 ±  7%     -17.3%       9891 ± 12%  softirqs.CPU10.RCU
     10075 ± 23%     +28.9%      12985 ±  4%  softirqs.CPU110.RCU
     42801 ±  4%      -5.8%      40327 ±  2%  softirqs.CPU136.SCHED
     42633 ±  4%     -15.2%      36169 ± 18%  softirqs.CPU137.SCHED
     42786 ±  4%      -6.8%      39864        softirqs.CPU156.SCHED
     11795 ±  8%     -16.6%       9835 ± 11%  softirqs.CPU2.RCU
     42004 ±  4%      -5.9%      39537 ±  3%  softirqs.CPU25.SCHED
     39956 ±  4%     -65.4%      13836 ±110%  softirqs.CPU5.SCHED
      9734 ±  8%     -13.2%       8450 ±  8%  softirqs.CPU68.RCU
     41424 ±  4%     -14.7%      35347 ± 19%  softirqs.CPU87.SCHED
 1.935e+10            -2.0%  1.895e+10        perf-stat.i.branch-instructions
      0.61            +2.3%       0.62        perf-stat.i.cpi
 1.494e+10            -2.9%  1.451e+10        perf-stat.i.dTLB-stores
 9.271e+10            -1.1%   9.17e+10        perf-stat.i.instructions
      1.64            -2.2%       1.61        perf-stat.i.ipc
    320.23            -1.4%     315.65        perf-stat.i.metric.M/sec
      0.61            +2.3%       0.62        perf-stat.overall.cpi
      1.65            -2.2%       1.61        perf-stat.overall.ipc
    601140           +10.9%     666775        perf-stat.overall.path-length
 1.928e+10            -2.0%  1.889e+10        perf-stat.ps.branch-instructions
 1.489e+10            -2.9%  1.446e+10        perf-stat.ps.dTLB-stores
  9.24e+10            -1.1%  9.139e+10        perf-stat.ps.instructions
 2.802e+13            -1.3%  2.766e+13        perf-stat.total.instructions
      0.01 ± 25%    +188.2%       0.02 ± 57%  perf-sched.sch_delay.avg.ms.do_syslog.part.0.kmsg_read.vfs_read
      0.01 ± 15%     -46.6%       0.01 ± 42%  perf-sched.sch_delay.avg.ms.schedule_timeout.wait_for_completion.__flush_work.lru_add_drain_all
      0.01 ± 22%    +324.4%       0.05 ± 67%  perf-sched.sch_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
      0.01 ± 15%     -43.1%       0.01 ± 41%  perf-sched.sch_delay.max.ms.schedule_timeout.wait_for_completion.__flush_work.lru_add_drain_all
      0.03 ± 23%     -78.0%       0.01 ±173%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
    605.16 ±  7%     +13.3%     685.54 ±  5%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
      4.35 ± 10%     +19.2%       5.19 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
     54.50 ±  9%     -18.8%      44.25 ±  5%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
      2295 ± 10%     -17.2%       1900 ±  4%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
      0.43 ±143%     -92.6%       0.03 ±173%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
     85.77 ± 63%    +111.9%     181.78 ± 16%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
      0.03 ± 23%     -25.0%       0.02 ± 11%  perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
    605.15 ±  7%     +13.3%     685.54 ±  5%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
      4.34 ± 10%     +19.1%       5.17 ±  4%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
      4.24 ± 10%     +64.4%       6.97 ± 49%  perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork
     85.73 ± 63%    +112.0%     181.73 ± 16%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
      8753           -54.6%       3974 ± 70%  interrupts.CPU101.NMI:Non-maskable_interrupts
      8753           -54.6%       3974 ± 70%  interrupts.CPU101.PMI:Performance_monitoring_interrupts
      1.75 ± 47%   +8342.9%     147.75 ±168%  interrupts.CPU137.RES:Rescheduling_interrupts
    112.75 ±  8%     +40.1%     158.00 ± 19%  interrupts.CPU145.NMI:Non-maskable_interrupts
    112.75 ±  8%     +40.1%     158.00 ± 19%  interrupts.CPU145.PMI:Performance_monitoring_interrupts
      1251 ± 31%    +151.4%       3145 ± 43%  interrupts.CPU149.CAL:Function_call_interrupts
    117.50 ±  7%     +27.7%     150.00 ±  9%  interrupts.CPU159.NMI:Non-maskable_interrupts
    117.50 ±  7%     +27.7%     150.00 ±  9%  interrupts.CPU159.PMI:Performance_monitoring_interrupts
    115.25 ±  9%     -26.7%      84.50 ± 20%  interrupts.CPU161.NMI:Non-maskable_interrupts
    115.25 ±  9%     -26.7%      84.50 ± 20%  interrupts.CPU161.PMI:Performance_monitoring_interrupts
      8756           -50.5%       4334 ± 58%  interrupts.CPU2.NMI:Non-maskable_interrupts
      8756           -50.5%       4334 ± 58%  interrupts.CPU2.PMI:Performance_monitoring_interrupts
    113.75 ±  8%     +26.6%     144.00 ±  8%  interrupts.CPU49.NMI:Non-maskable_interrupts
    113.75 ±  8%     +26.6%     144.00 ±  8%  interrupts.CPU49.PMI:Performance_monitoring_interrupts
     98.75 ± 22%     +44.3%     142.50 ± 19%  interrupts.CPU66.NMI:Non-maskable_interrupts
     98.75 ± 22%     +44.3%     142.50 ± 19%  interrupts.CPU66.PMI:Performance_monitoring_interrupts
      1.50 ±110%   +4266.7%      65.50 ±129%  interrupts.CPU98.RES:Rescheduling_interrupts
    228023 ±  7%     -16.3%     190922 ±  7%  interrupts.NMI:Non-maskable_interrupts
    228023 ±  7%     -16.3%     190922 ±  7%  interrupts.PMI:Performance_monitoring_interrupts
      0.66 ± 31%      +0.2        0.90 ± 30%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt
      0.87 ±  9%      +0.3        1.18 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.brk
      1.06 ± 16%      +0.4        1.42 ± 21%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      1.08 ± 16%      +0.4        1.46 ± 22%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      1.09 ± 16%      +0.4        1.47 ± 23%  perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.00            +0.6        0.58 ±  3%  perf-profile.calltrace.cycles-pp.___might_sleep.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
      0.00            +1.7        1.67 ±  3%  perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
      0.00            +1.8        1.79 ±  4%  perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.00            +4.5        4.46 ±  5%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +5.0        4.96 ±  4%  perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     47.85 ±  9%      +7.2       55.00 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     49.25 ±  9%      +7.4       56.63 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     51.01 ±  9%      +7.5       58.48 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk
      0.44 ± 11%      -0.2        0.27 ±  4%  perf-profile.children.cycles-pp.cap_capable
      0.05 ±  8%      +0.0        0.07 ±  5%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      0.08            +0.0        0.10 ± 10%  perf-profile.children.cycles-pp.sched_clock
      0.08 ±  6%      +0.0        0.10 ± 10%  perf-profile.children.cycles-pp.native_sched_clock
      0.09 ±  4%      +0.0        0.11 ± 17%  perf-profile.children.cycles-pp.read_tsc
      0.10 ± 14%      +0.0        0.13 ±  8%  perf-profile.children.cycles-pp.lapic_next_deadline
      0.09            +0.0        0.12 ± 10%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.04 ± 57%      +0.0        0.07 ± 17%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.00            +0.1        0.05 ±  9%  perf-profile.children.cycles-pp.memset
      0.04 ±115%      +0.1        0.10 ± 31%  perf-profile.children.cycles-pp.tick_nohz_irq_exit
      0.26 ± 18%      +0.1        0.33 ± 11%  perf-profile.children.cycles-pp.clockevents_program_event
      0.04 ± 58%      +0.1        0.16 ±  2%  perf-profile.children.cycles-pp.should_failslab
      0.14 ± 42%      +0.1        0.28 ± 19%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.21 ± 31%      +0.2        0.36 ±  9%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.54 ± 21%      +0.2        0.72 ± 17%  perf-profile.children.cycles-pp.update_process_times
      0.54 ± 10%      +0.2        0.72 ±  4%  perf-profile.children.cycles-pp.rcu_all_qs
      0.65 ± 20%      +0.2        0.84 ± 17%  perf-profile.children.cycles-pp.tick_sched_timer
      0.56 ± 24%      +0.2        0.76 ± 20%  perf-profile.children.cycles-pp.tick_sched_handle
      0.93 ± 23%      +0.3        1.20 ± 22%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.87 ± 10%      +0.3        1.15 ±  2%  perf-profile.children.cycles-pp.__might_sleep
      1.09 ± 11%      +0.4        1.46 ±  5%  perf-profile.children.cycles-pp._cond_resched
      1.39 ± 13%      +0.4        1.79 ± 16%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.43 ± 13%      +0.4        1.83 ± 17%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      1.68 ± 13%      +0.5        2.15 ± 19%  perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
      1.94 ± 10%      +0.6        2.54 ±  3%  perf-profile.children.cycles-pp.___might_sleep
      0.00            +1.7        1.67 ±  3%  perf-profile.children.cycles-pp.memset_erms
      4.88 ±  8%      +1.7        6.63 ±  4%  perf-profile.children.cycles-pp.kmem_cache_free
      6.56 ± 10%      +4.6       11.13 ±  3%  perf-profile.children.cycles-pp.kmem_cache_alloc
      0.37 ±  9%      +4.6        4.99 ±  4%  perf-profile.children.cycles-pp.userfaultfd_unmap_complete
     48.02 ±  9%      +7.1       55.14 ±  3%  perf-profile.children.cycles-pp.__x64_sys_brk
     49.49 ±  9%      +7.3       56.81 ±  3%  perf-profile.children.cycles-pp.do_syscall_64
     51.22 ±  9%      +7.4       58.66 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.42 ± 11%      -0.2        0.24 ±  6%  perf-profile.self.cycles-pp.cap_capable
      0.07 ±  5%      +0.0        0.09 ± 14%  perf-profile.self.cycles-pp.native_sched_clock
      0.04 ± 57%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.10 ± 14%      +0.0        0.13 ±  8%  perf-profile.self.cycles-pp.lapic_next_deadline
      0.00            +0.1        0.05 ±  9%  perf-profile.self.cycles-pp.memset
      0.01 ±173%      +0.1        0.08 ± 23%  perf-profile.self.cycles-pp.tick_nohz_next_event
      0.34 ± 10%      +0.1        0.41 ±  6%  perf-profile.self.cycles-pp.userfaultfd_unmap_complete
      0.00            +0.1        0.08 ±  5%  perf-profile.self.cycles-pp.should_failslab
      0.40 ±  9%      +0.1        0.48 ±  8%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.37 ± 12%      +0.1        0.49 ±  5%  perf-profile.self.cycles-pp.rcu_all_qs
      0.53 ± 11%      +0.2        0.70 ±  3%  perf-profile.self.cycles-pp._cond_resched
      0.21 ±  8%      +0.2        0.44 ±  6%  perf-profile.self.cycles-pp.do_syscall_64
      0.46 ±  8%      +0.2        0.70 ±  6%  perf-profile.self.cycles-pp.cap_vm_enough_memory
      0.81 ± 10%      +0.3        1.09 ±  2%  perf-profile.self.cycles-pp.__might_sleep
      1.88 ± 10%      +0.6        2.46 ±  3%  perf-profile.self.cycles-pp.___might_sleep
      0.00            +1.6        1.61 ±  3%  perf-profile.self.cycles-pp.memset_erms
      3.19 ± 10%      +1.7        4.86 ±  6%  perf-profile.self.cycles-pp.kmem_cache_alloc
      3.31 ±  7%      +1.7        5.05 ±  4%  perf-profile.self.cycles-pp.kmem_cache_free

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
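For readers skimming the profiles: on both machines the new cycles land in kmem_cache_alloc()/kmem_cache_free() and memset_erms called from userfaultfd_unmap_complete(), i.e. the patch adds a zeroed slab allocation and a free to every brk() call. A rough userspace stand-in for that cost pattern is sketched below; `fake_uwq` and `unmap_complete` are hypothetical names, with calloc()/free() standing in for the kernel's slab cache, and this is an illustration of the pattern, not the kernel code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-syscall pattern the profile flags:
 * allocate a zeroed wait-queue object, use it, free it -- on every call. */
struct fake_uwq { long msg[4]; };

static long completions;

static void unmap_complete(void)
{
	/* calloc's zeroing is the analogue of the memset_erms samples;
	 * in the kernel this would be a zeroed kmem_cache allocation. */
	struct fake_uwq *uwq = calloc(1, sizeof(*uwq));

	if (!uwq)
		return;
	completions++;          /* stand-in for filling and queueing the object */
	free(uwq);              /* analogue of kmem_cache_free() */
}
```

In a syscall issued tens of millions of times per second, even a short allocate-zero-free sequence like this is visible, which is consistent with the +10.9% path-length and -11.0% throughput above.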
Thanks,
Oliver Sang