Greetings,

FYI, we noticed a -6.3% regression of will-it-scale.per_thread_ops due to commit:


commit: 5cb13cb023c0caccfa3e0fcf5e206ea1246c476f ("[PATCH] eventfd: convert to ->write_iter()")
url: https://github.com/0day-ci/linux/commits/Michal-Kubecek/eventfd-convert-to-write_iter/20201118-172416
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 0fa8ee0d9ab95c9350b8b84574824d9a384a9f7d

in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

        nr_task: 50%
        mode: thread
        test: eventfd1
        cpufreq_governor: performance
        ucode: 0x5003003

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


If you fix the issue, kindly add the following tag
Reported-by: kernel test robot

Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/eventfd1/will-it-scale/0x5003003

commit:
  0fa8ee0d9a ("Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input")
  5cb13cb023 ("eventfd: convert to ->write_iter()")

0fa8ee0d9ab95c93 5cb13cb023c0caccfa3e0fcf5e2
---------------- ---------------------------
       %stddev     %change         %stddev
           \          |                \
   2598249            -6.3%    2434898        will-it-scale.per_thread_ops
 2.494e+08            -6.3%  2.338e+08        will-it-scale.workload
      5624 ±  3%     -11.8%       4961 ±  7%  boot-time.idle
    195795 ± 30%    +122.8%     436164 ± 53%  numa-numastat.node2.numa_hit
      6231 ± 22%     -34.0%       4112 ± 36%  sched_debug.cpu.ttwu_local.max
 1.017e+10 ±107%    +131.5%  2.354e+10 ± 37%  cpuidle.C1E.time
  30461075 ± 67%     +70.3%   51869908 ± 22%  cpuidle.C1E.usage
      1274 ±  3%     -12.1%       1120 ±  8%  slabinfo.file_lock_cache.active_objs
      1274 ±  3%     -12.1%       1120 ±  8%  slabinfo.file_lock_cache.num_objs
     79975 ±  8%     +12.6%      90014 ± 12%  meminfo.Active
     79975 ±  8%     +12.6%      90014 ± 12%  meminfo.Active(anon)
     93866 ±  7%     +11.2%     104419 ± 11%  meminfo.Shmem
     19966 ±  8%     +12.6%      22476 ± 12%  proc-vmstat.nr_active_anon
     88671            -2.2%      86683        proc-vmstat.nr_anon_pages
     92029            -2.0%      90156        proc-vmstat.nr_inactive_anon
     19966 ±  8%     +12.6%      22476 ± 12%  proc-vmstat.nr_zone_active_anon
     92029            -2.0%      90156        proc-vmstat.nr_zone_inactive_anon
    564.00 ± 27%     -58.8%     232.50 ± 39%  numa-vmstat.node0.nr_page_table_pages
     11326 ± 20%     -48.1%       5878 ±  5%  numa-vmstat.node0.nr_slab_reclaimable
     23195 ±  6%     -23.3%      17788 ±  9%  numa-vmstat.node0.nr_slab_unreclaimable
    687301 ± 22%     -27.9%     495794 ±  4%  numa-vmstat.node0.numa_hit
     13611           +32.0%      17965 ± 11%  numa-vmstat.node1.nr_slab_unreclaimable
     19094 ±  9%     +13.0%      21572 ± 11%  numa-vmstat.node3.nr_active_anon
      7278 ± 32%     +54.1%      11218 ± 27%  numa-vmstat.node3.nr_slab_reclaimable
     19094 ±  9%     +13.0%      21572 ± 11%  numa-vmstat.node3.nr_zone_active_anon
     45304 ± 20%     -48.1%      23515 ±  5%  numa-meminfo.node0.KReclaimable
      2253 ± 27%     -58.1%     943.75 ± 37%  numa-meminfo.node0.PageTables
     45304 ± 20%     -48.1%      23515 ±  5%  numa-meminfo.node0.SReclaimable
     92781 ±  6%     -23.3%      71159 ±  9%  numa-meminfo.node0.SUnreclaim
    138086 ±  6%     -31.4%      94675 ±  8%  numa-meminfo.node0.Slab
     54446           +32.0%      71864 ± 11%  numa-meminfo.node1.SUnreclaim
     76276 ± 10%     +13.1%      86270 ± 11%  numa-meminfo.node3.Active
     76276 ± 10%     +13.1%      86270 ± 11%  numa-meminfo.node3.Active(anon)
     29121 ± 32%     +54.1%      44886 ± 27%  numa-meminfo.node3.KReclaimable
     29121 ± 32%     +54.1%      44886 ± 27%  numa-meminfo.node3.SReclaimable
      8.01 ±  9%       -8.0       0.00        perf-profile.calltrace.cycles-pp.eventfd_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.91 ±  9%       +0.3       1.19 ±  9%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
      0.00             +0.8       0.84 ± 11%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter_full.eventfd_write.new_sync_write
      0.00             +1.1       1.05 ± 11%  perf-profile.calltrace.cycles-pp.__might_fault._copy_from_iter_full.eventfd_write.new_sync_write.vfs_write
      0.00             +1.1       1.13 ± 11%  perf-profile.calltrace.cycles-pp.copy_user_generic_unrolled.copyin._copy_from_iter_full.eventfd_write.new_sync_write
      0.00             +1.1       1.15 ± 11%  perf-profile.calltrace.cycles-pp.iov_iter_advance._copy_from_iter_full.eventfd_write.new_sync_write.vfs_write
      0.00             +1.3       1.26 ± 12%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.eventfd_write.new_sync_write.vfs_write.ksys_write
      0.00             +2.5       2.46 ± 11%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter_full.eventfd_write.new_sync_write.vfs_write
      0.00             +5.5       5.50 ± 11%  perf-profile.calltrace.cycles-pp._copy_from_iter_full.eventfd_write.new_sync_write.vfs_write.ksys_write
      0.00             +8.7       8.70 ± 11%  perf-profile.calltrace.cycles-pp.eventfd_write.new_sync_write.vfs_write.ksys_write.do_syscall_64
      0.00            +10.1      10.11 ± 11%  perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.16 ± 11%       +0.2       0.32 ± 11%  perf-profile.children.cycles-pp.iov_iter_init
      0.00             +1.2       1.15 ± 11%  perf-profile.children.cycles-pp.iov_iter_advance
      0.00             +2.5       2.51 ± 11%  perf-profile.children.cycles-pp.copyin
      0.00             +5.7       5.66 ± 11%  perf-profile.children.cycles-pp._copy_from_iter_full
      0.00            +10.2      10.22 ± 11%  perf-profile.children.cycles-pp.new_sync_write
      0.12 ±  6%       +0.1       0.23 ± 10%  perf-profile.self.cycles-pp.iov_iter_init
      0.00             +0.3       0.33 ± 12%  perf-profile.self.cycles-pp.copyin
      0.00             +0.9       0.91 ± 12%  perf-profile.self.cycles-pp._copy_from_iter_full
      0.00             +1.1       1.08 ± 11%  perf-profile.self.cycles-pp.iov_iter_advance
      0.00             +1.2       1.24 ± 11%  perf-profile.self.cycles-pp.new_sync_write
   9.3e+10            +3.7%  9.647e+10        perf-stat.i.branch-instructions
      1.02             -0.1       0.92        perf-stat.i.branch-miss-rate%
 9.438e+08            -6.5%  8.828e+08        perf-stat.i.branch-misses
      2364            -1.9%       2319        perf-stat.i.context-switches
      0.65            -3.9%       0.63        perf-stat.i.cpi
 1.288e+11            +2.4%  1.319e+11        perf-stat.i.dTLB-loads
     82066            -2.0%      80394        perf-stat.i.dTLB-store-misses
 8.791e+10            +3.2%  9.076e+10        perf-stat.i.dTLB-stores
 9.303e+08            -3.5%   8.98e+08        perf-stat.i.iTLB-load-misses
 4.486e+11            +4.1%  4.672e+11        perf-stat.i.instructions
    482.90            +8.1%     522.07        perf-stat.i.instructions-per-iTLB-miss
      1.53            +4.2%       1.59        perf-stat.i.ipc
      1612            +3.1%       1662        perf-stat.i.metric.M/sec
      0.02 ±  3%      -6.9%       0.02 ±  2%  perf-stat.overall.MPKI
      1.01             -0.1       0.92        perf-stat.overall.branch-miss-rate%
      0.65            -4.1%       0.63        perf-stat.overall.cpi
      0.00             -0.0       0.00        perf-stat.overall.dTLB-store-miss-rate%
    482.17            +7.9%     520.31        perf-stat.overall.instructions-per-iTLB-miss
      1.53            +4.2%       1.60        perf-stat.overall.ipc
    542437           +11.1%     602836        perf-stat.overall.path-length
 9.268e+10            +3.7%  9.614e+10        perf-stat.ps.branch-instructions
 9.404e+08            -6.5%  8.798e+08        perf-stat.ps.branch-misses
      2357            -2.1%       2308        perf-stat.ps.context-switches
 1.283e+11            +2.5%  1.315e+11        perf-stat.ps.dTLB-loads
     82039            -2.0%      80437        perf-stat.ps.dTLB-store-misses
  8.76e+10            +3.3%  9.045e+10        perf-stat.ps.dTLB-stores
 9.271e+08            -3.5%   8.95e+08        perf-stat.ps.iTLB-load-misses
  4.47e+11            +4.2%  4.656e+11        perf-stat.ps.instructions
 1.353e+14            +4.1%  1.409e+14        perf-stat.total.instructions
     17923 ± 57%     +68.7%      30238 ± 23%  softirqs.CPU118.SCHED
     13351 ±  5%     -24.1%      10129 ±  7%  softirqs.CPU120.RCU
     14869 ± 19%    +128.3%      33952 ±  5%  softirqs.CPU120.SCHED
     10695 ±  7%     +21.5%      12999 ±  7%  softirqs.CPU122.RCU
     32252 ± 13%     -62.5%      12110 ± 36%  softirqs.CPU122.SCHED
     12027 ±  2%     -12.6%      10510        softirqs.CPU129.RCU
     22702 ± 20%     +35.5%      30756 ± 16%  softirqs.CPU129.SCHED
     12806 ±  8%     -17.5%      10564 ±  7%  softirqs.CPU13.RCU
     13637 ±  6%     -23.8%      10394 ± 13%  softirqs.CPU130.RCU
     13342 ± 33%    +139.9%      32014 ± 15%  softirqs.CPU130.SCHED
     22307 ± 29%     -41.9%      12958 ± 37%  softirqs.CPU133.SCHED
     12617 ± 10%     -11.6%      11152 ±  3%  softirqs.CPU135.RCU
     36856 ± 41%     -47.8%      19244 ± 50%  softirqs.CPU15.SCHED
     14467 ± 11%     -21.6%      11339 ±  3%  softirqs.CPU167.RCU
     20118 ± 40%     +54.6%      31101 ± 31%  softirqs.CPU167.SCHED
     11933 ±  7%     -11.5%      10560        softirqs.CPU183.RCU
     16156 ± 34%     +55.0%      25048 ± 35%  softirqs.CPU184.SCHED
     11912 ±  4%     -12.2%      10456 ±  5%  softirqs.CPU188.RCU
     19513 ± 14%     +35.2%      26375 ±  2%  softirqs.CPU188.SCHED
     11382 ±  5%     +24.0%      14114 ±  8%  softirqs.CPU24.RCU
     29259 ±  8%     -60.3%      11611 ± 21%  softirqs.CPU24.SCHED
     14178 ±  3%     -26.6%      10407 ±  7%  softirqs.CPU26.RCU
     12171 ± 30%    +171.3%      33015 ± 11%  softirqs.CPU26.SCHED
     11088 ±  6%     +24.4%      13789 ±  3%  softirqs.CPU34.RCU
     31141 ± 12%     -57.6%      13193 ± 42%  softirqs.CPU34.SCHED
     12838 ±  7%     -17.6%      10584 ±  7%  softirqs.CPU37.RCU
     20138 ± 28%     +51.2%      30444 ± 17%  softirqs.CPU38.SCHED
     13059 ±  3%     -13.7%      11272 ± 10%  softirqs.CPU46.RCU
     13511 ±  5%     -10.1%      12147 ±  4%  softirqs.CPU68.RCU
     19446 ± 16%     +22.9%      23906 ± 14%  softirqs.CPU68.SCHED
     13235 ±  4%     -11.3%      11745 ±  8%  softirqs.CPU76.RCU
     12459 ± 30%     +50.0%      18685 ± 26%  softirqs.CPU76.SCHED
     25126 ±  9%     -25.7%      18676 ±  5%  softirqs.CPU92.SCHED
    598522 ± 13%     -13.2%     519288 ±  6%  interrupts.CAL:Function_call_interrupts
      3490 ± 39%     -38.2%       2155 ± 11%  interrupts.CPU100.CAL:Function_call_interrupts
      2778 ±  4%     -18.0%       2278 ± 17%  interrupts.CPU101.CAL:Function_call_interrupts
    877.50 ±  6%     -32.5%     592.25 ± 28%  interrupts.CPU101.TLB:TLB_shootdowns
      5050 ± 34%     +58.5%       8005 ±  7%  interrupts.CPU105.NMI:Non-maskable_interrupts
      5050 ± 34%     +58.5%       8005 ±  7%  interrupts.CPU105.PMI:Performance_monitoring_interrupts
    250.25 ± 27%     -56.1%     109.75 ± 41%  interrupts.CPU107.RES:Rescheduling_interrupts
      3223 ± 48%    +100.3%       6455 ± 22%  interrupts.CPU108.NMI:Non-maskable_interrupts
      3223 ± 48%    +100.3%       6455 ± 22%  interrupts.CPU108.PMI:Performance_monitoring_interrupts
    917.50 ± 27%     +38.4%       1269 ±  8%  interrupts.CPU109.TLB:TLB_shootdowns
      3957 ± 48%     +79.9%       7120 ± 34%  interrupts.CPU112.NMI:Non-maskable_interrupts
      3957 ± 48%     +79.9%       7120 ± 34%  interrupts.CPU112.PMI:Performance_monitoring_interrupts
      4300 ± 34%     +70.2%       7318 ± 16%  interrupts.CPU117.NMI:Non-maskable_interrupts
      4300 ± 34%     +70.2%       7318 ± 16%  interrupts.CPU117.PMI:Performance_monitoring_interrupts
      2802 ± 10%     -29.6%       1972 ± 15%  interrupts.CPU120.CAL:Function_call_interrupts
    197.00 ± 10%     -78.0%      43.25 ± 49%  interrupts.CPU120.RES:Rescheduling_interrupts
    912.75 ± 37%     -65.5%     314.50 ± 46%  interrupts.CPU120.TLB:TLB_shootdowns
      2411 ± 15%     +22.4%       2951 ±  3%  interrupts.CPU122.CAL:Function_call_interrupts
     57.50 ± 68%    +264.8%     209.75 ± 25%  interrupts.CPU122.RES:Rescheduling_interrupts
    537.50 ± 77%    +151.3%       1351 ± 17%  interrupts.CPU122.TLB:TLB_shootdowns
      3344 ± 83%    +114.7%       7182 ± 18%  interrupts.CPU126.NMI:Non-maskable_interrupts
      3344 ± 83%    +114.7%       7182 ± 18%  interrupts.CPU126.PMI:Performance_monitoring_interrupts
      2849 ±  4%     -18.3%       2327 ±  8%  interrupts.CPU13.CAL:Function_call_interrupts
    185.25 ± 22%     -53.4%      86.25 ± 45%  interrupts.CPU13.RES:Rescheduling_interrupts
    823.00 ± 35%     -44.4%     457.50 ± 26%  interrupts.CPU13.TLB:TLB_shootdowns
      3044 ±  6%     -32.5%       2055 ± 23%  interrupts.CPU130.CAL:Function_call_interrupts
    208.25 ± 20%     -74.5%      53.00 ± 77%  interrupts.CPU130.RES:Rescheduling_interrupts
      1140 ± 21%     -67.4%     371.75 ± 78%  interrupts.CPU130.TLB:TLB_shootdowns
      2652 ±  5%     +13.7%       3014 ±  6%  interrupts.CPU133.CAL:Function_call_interrupts
    856.50 ± 20%     +66.1%       1422 ± 18%  interrupts.CPU133.TLB:TLB_shootdowns
    113.50 ± 40%     +70.9%     194.00 ± 30%  interrupts.CPU134.RES:Rescheduling_interrupts
      3169 ±  7%     -20.6%       2515 ± 16%  interrupts.CPU135.CAL:Function_call_interrupts
      1282 ± 20%     -30.9%     886.00 ± 23%  interrupts.CPU135.TLB:TLB_shootdowns
      2802 ±  8%     -16.8%       2331 ± 10%  interrupts.CPU136.CAL:Function_call_interrupts
      3251 ±  8%     -14.8%       2770 ±  6%  interrupts.CPU139.CAL:Function_call_interrupts
      1169 ± 23%     -41.3%     686.00 ± 25%  interrupts.CPU149.TLB:TLB_shootdowns
      9462 ±112%     -72.9%       2565 ± 30%  interrupts.CPU16.CAL:Function_call_interrupts
      8144 ±  7%     -38.6%       4998 ± 58%  interrupts.CPU160.NMI:Non-maskable_interrupts
      8144 ±  7%     -38.6%       4998 ± 58%  interrupts.CPU160.PMI:Performance_monitoring_interrupts
      4721 ± 48%     +63.2%       7707 ± 14%  interrupts.CPU173.NMI:Non-maskable_interrupts
      4721 ± 48%     +63.2%       7707 ± 14%  interrupts.CPU173.PMI:Performance_monitoring_interrupts
      4629 ± 41%     +61.3%       7468 ± 14%  interrupts.CPU18.NMI:Non-maskable_interrupts
      4629 ± 41%     +61.3%       7468 ± 14%  interrupts.CPU18.PMI:Performance_monitoring_interrupts
      2802 ± 33%     +97.6%       5537 ± 28%  interrupts.CPU181.NMI:Non-maskable_interrupts
      2802 ± 33%     +97.6%       5537 ± 28%  interrupts.CPU181.PMI:Performance_monitoring_interrupts
      2775 ±  3%     -10.8%       2475 ±  9%  interrupts.CPU183.CAL:Function_call_interrupts
      5696 ± 34%     +40.8%       8018 ±  6%  interrupts.CPU186.NMI:Non-maskable_interrupts
      5696 ± 34%     +40.8%       8018 ±  6%  interrupts.CPU186.PMI:Performance_monitoring_interrupts
      1044 ± 25%     -49.6%     526.50 ± 56%  interrupts.CPU189.TLB:TLB_shootdowns
      3090 ±  6%     -13.2%       2683 ±  2%  interrupts.CPU191.CAL:Function_call_interrupts
      3039 ±  6%     -19.8%       2437 ±  6%  interrupts.CPU21.CAL:Function_call_interrupts
     66.00 ± 40%    +216.3%     208.75 ± 21%  interrupts.CPU24.RES:Rescheduling_interrupts
    881.50 ± 47%     +64.3%       1448 ±  7%  interrupts.CPU24.TLB:TLB_shootdowns
      3371 ±  9%     -36.5%       2142 ± 20%  interrupts.CPU26.CAL:Function_call_interrupts
    212.50 ± 11%     -79.6%      43.25 ± 83%  interrupts.CPU26.RES:Rescheduling_interrupts
      1282 ± 29%     -63.5%     468.50 ± 48%  interrupts.CPU26.TLB:TLB_shootdowns
     50.50 ± 57%    +294.6%     199.25 ± 31%  interrupts.CPU34.RES:Rescheduling_interrupts
    649.25 ± 30%    +126.1%       1467 ± 21%  interrupts.CPU34.TLB:TLB_shootdowns
      5367 ± 69%     -49.7%       2697 ±  5%  interrupts.CPU36.CAL:Function_call_interrupts
      2874 ±  6%     -29.0%       2041 ± 20%  interrupts.CPU37.CAL:Function_call_interrupts
    129.00 ± 42%     -62.0%      49.00 ± 78%  interrupts.CPU37.RES:Rescheduling_interrupts
    977.00 ± 11%     -62.2%     369.75 ± 61%  interrupts.CPU37.TLB:TLB_shootdowns
    824.00 ± 25%     +32.7%       1093 ±  8%  interrupts.CPU40.TLB:TLB_shootdowns
      4495 ± 59%     +77.6%       7984 ±  7%  interrupts.CPU44.NMI:Non-maskable_interrupts
      4495 ± 59%     +77.6%       7984 ±  7%  interrupts.CPU44.PMI:Performance_monitoring_interrupts
    806.50 ± 12%     +42.4%       1148 ± 17%  interrupts.CPU5.TLB:TLB_shootdowns
      3147 ± 10%     -21.0%       2485 ± 13%  interrupts.CPU51.CAL:Function_call_interrupts
    691.50 ± 50%     +62.4%       1122 ± 11%  interrupts.CPU53.TLB:TLB_shootdowns
      3478 ± 11%     -21.8%       2721 ± 15%  interrupts.CPU60.CAL:Function_call_interrupts
      2938 ± 23%    +157.7%       7571 ± 15%  interrupts.CPU67.NMI:Non-maskable_interrupts
      2938 ± 23%    +157.7%       7571 ± 15%  interrupts.CPU67.PMI:Performance_monitoring_interrupts
      2923 ±  6%     -12.1%       2568 ±  6%  interrupts.CPU68.CAL:Function_call_interrupts
      3390 ±  6%     -14.5%       2897 ±  6%  interrupts.CPU70.CAL:Function_call_interrupts
      1192 ± 13%     -21.6%     935.50 ± 10%  interrupts.CPU84.TLB:TLB_shootdowns
    105.25 ± 27%     +43.9%     151.50 ± 18%  interrupts.CPU92.RES:Rescheduling_interrupts
    794.75 ± 42%     +59.1%       1264 ± 28%  interrupts.CPU93.TLB:TLB_shootdowns
      2841 ± 20%     -28.3%       2037 ± 23%  interrupts.CPU98.CAL:Function_call_interrupts
      3327 ± 18%     -26.3%       2451 ± 12%  interrupts.CPU99.CAL:Function_call_interrupts


                          will-it-scale.per_thread_ops

  [ASCII plot not recoverable from this copy. Y axis: 2.4e+06 to 2.62e+06.
   bisect-good samples sit at roughly 2.56e+06 to 2.6e+06, bisect-bad samples
   at roughly 2.42e+06 to 2.44e+06.]


                             will-it-scale.workload

  [ASCII plot not recoverable from this copy. Y axis: 2.32e+08 to 2.5e+08.
   bisect-good samples sit at roughly 2.46e+08 to 2.5e+08, bisect-bad samples
   at roughly 2.32e+08 to 2.36e+08.]


[*] bisect-good sample
[O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang
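
P.S. As a sanity check on the headline figure, the %change column can be recomputed from the two per-commit means in the table. The snippet below is a minimal sketch, assuming %change is defined as (patched - base) / base; the helper name pct_change is hypothetical and is not part of lkp-tests.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched kernel versus the base kernel, in percent."""
    return (patched - base) / base * 100.0


# Means copied from the comparison table in this report
# (base commit 0fa8ee0d9a, patched commit 5cb13cb023).
per_thread_ops = pct_change(2598249, 2434898)
workload = pct_change(2.494e+08, 2.338e+08)

print(f"will-it-scale.per_thread_ops: {per_thread_ops:+.1f}%")
print(f"will-it-scale.workload:       {workload:+.1f}%")
```

Both values round to the -6.3% shown in the table. Note that the perf-profile and miss-rate rows report an absolute difference in percentage points (for example -8.0 or +10.1) rather than a relative change, so this formula does not apply to them.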