Greeting,

FYI, we noticed a -3.3% regression of will-it-scale.per_thread_ops due to commit:


commit: 4bad58ebc8bc4f20d89cff95417c9b4674769709 ("signal: Allow tasks to cache one sigqueue struct")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

        nr_task: 100%
        mode: thread
        test: futex3
        cpufreq_governor: performance
        ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


If you fix the issue, kindly add the following tag
Reported-by: kernel test robot


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/futex3/will-it-scale/0x5003006

commit:
  69995ebbb9 ("signal: Hand SIGQUEUE_PREALLOC flag to __sigqueue_alloc()")
  4bad58ebc8 ("signal: Allow tasks to cache one sigqueue struct")

69995ebbb9d37173 4bad58ebc8bc4f20d89cff95417
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
 1.273e+09           -3.3%  1.231e+09        will-it-scale.192.threads
   6630224           -3.3%     6409738        will-it-scale.per_thread_ops
 1.273e+09           -3.3%  1.231e+09        will-it-scale.workload
      1638 ±  3%     -7.8%       1510 ±  5%  sched_debug.cfs_rq:/.runnable_avg.max
    297.83 ± 68%  +1747.6%       5502 ±152%  interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
    297.83 ± 68%  +1747.6%       5502 ±152%  interrupts.CPU12.33:PCI-MSI.524291-edge.eth0-TxRx-2
      8200          -33.4%       5459 ± 35%  interrupts.CPU27.NMI:Non-maskable_interrupts
      8200          -33.4%       5459 ± 35%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
      8199          -33.4%       5459 ± 35%  interrupts.CPU28.NMI:Non-maskable_interrupts
      8199          -33.4%       5459 ± 35%  interrupts.CPU28.PMI:Performance_monitoring_interrupts
      6148 ± 33%    -11.2%       5459 ± 35%  interrupts.CPU29.NMI:Non-maskable_interrupts
      6148 ± 33%    -11.2%       5459 ± 35%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
      4287 ±  8%    +33.6%       5730 ± 15%  interrupts.CPU49.CAL:Function_call_interrupts
      6356 ± 19%    +49.6%       9509 ± 19%  interrupts.CPU97.CAL:Function_call_interrupts
 9.163e+10           -3.3%  8.857e+10        perf-stat.i.branch-instructions
 3.211e+08           -2.9%  3.118e+08        perf-stat.i.branch-misses
      0.94           +3.2%       0.97        perf-stat.i.cpi
    407730 ±  8%    +37.5%     560565 ±  7%  perf-stat.i.dTLB-load-misses
 1.551e+11           -3.3%  1.499e+11        perf-stat.i.dTLB-loads
    274320           -8.4%     251354 ± 18%  perf-stat.i.dTLB-store-misses
 1.169e+11           -3.3%   1.13e+11        perf-stat.i.dTLB-stores
 5.952e+11           -3.3%  5.754e+11        perf-stat.i.instructions
      1900           -4.9%       1807        perf-stat.i.instructions-per-iTLB-miss
      1.07           -3.2%       1.03        perf-stat.i.ipc
      1893           -3.3%       1830        perf-stat.i.metric.M/sec
      0.93           +3.3%       0.97        perf-stat.overall.cpi
      0.00 ±  8%     +0.0        0.00 ±  7%  perf-stat.overall.dTLB-load-miss-rate%
      1896           -5.1%       1800        perf-stat.overall.instructions-per-iTLB-miss
      1.07           -3.2%       1.04        perf-stat.overall.ipc
 9.131e+10           -3.3%  8.827e+10        perf-stat.ps.branch-instructions
   3.2e+08           -2.9%  3.107e+08        perf-stat.ps.branch-misses
    415959 ±  8%    +40.4%     583928 ±  7%  perf-stat.ps.dTLB-load-misses
 1.545e+11           -3.3%  1.494e+11        perf-stat.ps.dTLB-loads
    274020           -8.4%     250940 ± 18%  perf-stat.ps.dTLB-store-misses
 1.165e+11           -3.3%  1.126e+11        perf-stat.ps.dTLB-stores
 5.932e+11           -3.3%  5.734e+11        perf-stat.ps.instructions
 1.793e+14           -3.3%  1.733e+14        perf-stat.total.instructions
     32.73           -1.0       31.71        perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
      8.37           -0.2        8.20        perf-profile.calltrace.cycles-pp.hash_futex.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      1.52           -0.1        1.38        perf-profile.calltrace.cycles-pp.rcu_nocb_flush_deferred_wakeup.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      2.27           -0.1        2.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.17           -0.1        2.08        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.32           -0.1        1.26        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      5.45           +0.3        5.71        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      7.55           +0.4        7.98        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
      5.07           +0.5        5.58        perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
     28.26           +0.9       29.19        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     37.41           +1.1       38.50        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     33.56           +1.2       34.78        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     52.14           +1.3       53.40        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     23.03           +1.4       24.44        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.10           -0.7       20.44        perf-profile.children.cycles-pp.__entry_text_start
     17.77           -0.5       17.31        perf-profile.children.cycles-pp.syscall_return_via_sysret
      8.48           -0.2        8.28        perf-profile.children.cycles-pp.hash_futex
      1.58           -0.1        1.44        perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
      2.43           -0.1        2.33        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.20           -0.1        2.11        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.42 ±  6%     -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.tick_sched_handle
      0.42 ±  6%     -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.update_process_times
      1.34           -0.1        1.29        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.52 ±  2%     -0.0        0.48 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.47 ±  2%     -0.0        0.43 ±  2%  perf-profile.children.cycles-pp.tick_sched_timer
      0.23 ±  4%     -0.0        0.20 ±  2%  perf-profile.children.cycles-pp.update_curr
      0.18 ±  4%     -0.0        0.16 ±  3%  perf-profile.children.cycles-pp.perf_prepare_sample
      5.60           +0.3        5.89        perf-profile.children.cycles-pp.get_futex_key
      8.20           +0.4        8.59        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      5.36           +0.5        5.86        perf-profile.children.cycles-pp.exit_to_user_mode_prepare
     23.57           +0.8       24.36        perf-profile.children.cycles-pp.futex_wake
     37.58           +1.1       38.68        perf-profile.children.cycles-pp.do_syscall_64
     52.56           +1.2       53.80        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     33.87           +1.2       35.11        perf-profile.children.cycles-pp.__x64_sys_futex
     28.60           +1.3       29.89        perf-profile.children.cycles-pp.do_futex
     17.64           -0.4       17.20        perf-profile.self.cycles-pp.syscall_return_via_sysret
      9.47           -0.3        9.15 ±  2%  perf-profile.self.cycles-pp.__entry_text_start
      6.88           -0.3        6.61        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      8.18           -0.2        7.98        perf-profile.self.cycles-pp.hash_futex
      1.33           -0.1        1.22        perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
      2.42           -0.1        2.32        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.85           -0.1        1.77        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.88           -0.1        1.81        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      1.25           -0.0        1.21        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.27           -0.0        1.23        perf-profile.self.cycles-pp.do_syscall_64
      1.69           +0.0        1.71        perf-profile.self.cycles-pp.testcase
      5.26           +0.2        5.48        perf-profile.self.cycles-pp.get_futex_key
      9.61           +0.4       10.02        perf-profile.self.cycles-pp.futex_wake
      3.74           +0.6        4.37        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      5.08           +0.8        5.90        perf-profile.self.cycles-pp.do_futex


[ASCII plot: will-it-scale.per_thread_ops per sample; y-axis from 6.25e+06 to 6.65e+06; point layout not recoverable]

[*] bisect-good sample
[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang
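
For anyone cross-checking the numbers in the comparison table: the sketch below recomputes the %change column for a few rows, assuming it is the plain relative difference (new - base) / base between the two commits. That formula is an assumption here, not something the report states, and `pct_change` is a hypothetical helper name, not part of lkp-tests. Only rows printed at full precision are used, since rows shown in rounded form (for example ipc) will not reproduce exactly.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# (base commit 69995ebbb9, new commit 4bad58ebc8) values copied from the table.
rows = {
    "will-it-scale.per_thread_ops": (6630224, 6409738),
    "perf-stat.i.dTLB-load-misses": (407730, 560565),
    "perf-stat.ps.dTLB-load-misses": (415959, 583928),
}

for metric, (base, new) in rows.items():
    # Print in a layout similar to the report's %change column.
    print(f"{pct_change(base, new):+8.1f}%  {metric}")
```

Under that assumption the three rows come out at -3.3%, +37.5% and +40.4%, matching the values reported above.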