Greetings,

FYI, we noticed a -5.9% regression of will-it-scale.per_thread_ops due to commit:

commit: 60f7ed8c7c4d06aeda448c6da74621552ee739aa ("fsnotify: send path type events to group with super block marks")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with following parameters:

	nr_task: 16
	mode: thread
	test: unlink2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/thread/16/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3d/unlink2/will-it-scale

commit:
  1e6cb72399 ("fsnotify: add super block object type")
  60f7ed8c7c ("fsnotify: send path type events to group with super block marks")

1e6cb72399fd58b3 60f7ed8c7c4d06aeda448c6da7
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
     54483            -5.9%      51256        will-it-scale.per_thread_ops
     46266 ±  2%      -4.3%      44270 ±  2%  will-it-scale.time.involuntary_context_switches
    103.21            -7.8%      95.17        will-it-scale.time.user_time
    871749            -5.9%     820115        will-it-scale.workload
     10888           +22.2%      13303 ± 17%  numa-meminfo.node0.Mapped
      2001 ± 12%     -16.8%       1665 ± 16%  numa-meminfo.node0.PageTables
    865.75 ± 32%     +42.7%       1235 ± 22%  slabinfo.dmaengine-unmap-16.active_objs
    865.75 ± 32%     +42.7%       1235 ± 22%  slabinfo.dmaengine-unmap-16.num_objs
     10974 ± 34%     +60.2%      17584 ± 13%  numa-vmstat.node0
      2826 ±  3%     +24.6%       3523 ± 16%  numa-vmstat.node0.nr_mapped
    500.00 ± 12%     -16.9%     415.75 ± 16%  numa-vmstat.node0.nr_page_table_pages
  20375718            -6.3%   19092155        proc-vmstat.numa_hit
  20370933            -6.3%   19087362        proc-vmstat.numa_local
  69383484            -6.3%   65029677        proc-vmstat.pgalloc_normal
  69362606            -6.3%   65008530        proc-vmstat.pgfree
      8.39 ±109%  +7.9e+05%      66298 ±140%  sched_debug.cfs_rq:/.MIN_vruntime.avg
    201.35 ±109%  +2.2e+05%     436949 ± 61%  sched_debug.cfs_rq:/.MIN_vruntime.max
     40.24 ±109%  +3.4e+05%     135845 ± 97%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
      8.39 ±109%  +7.9e+05%      66298 ±140%  sched_debug.cfs_rq:/.max_vruntime.avg
    201.35 ±109%  +2.2e+05%     436949 ± 61%  sched_debug.cfs_rq:/.max_vruntime.max
     40.24 ±109%  +3.4e+05%     135845 ± 97%  sched_debug.cfs_rq:/.max_vruntime.stddev
     43805 ±  6%     +35.5%      59365 ± 31%  sched_debug.cpu.load.avg
    108694 ± 62%    +154.7%     276883 ± 25%  sched_debug.cpu.load.max
     33945 ± 37%    +119.6%      74560 ± 42%  sched_debug.cpu.load.stddev
     34287 ±  3%     +10.1%      37761 ±  4%  sched_debug.cpu.nr_switches.max
     15993 ±  2%     +10.8%      17727 ±  4%  sched_debug.cpu.sched_goidle.max
      1.36 ±  2%      -0.1        1.24        perf-stat.branch-miss-rate%
 1.546e+10 ±  2%     -10.3%  1.387e+10        perf-stat.branch-misses
 3.025e+08 ±  8%     -17.7%  2.489e+08 ± 14%  perf-stat.dTLB-load-misses
 1.603e+12 ±  2%      -2.6%  1.561e+12        perf-stat.dTLB-loads
      0.01 ±  7%      -0.0        0.01 ±  6%  perf-stat.dTLB-store-miss-rate%
  1.02e+08 ±  5%     -34.7%   66552058 ±  6%  perf-stat.dTLB-store-misses
 9.269e+11            -5.8%  8.729e+11        perf-stat.dTLB-stores
 4.885e+08 ± 33%     -25.3%  3.649e+08 ±  9%  perf-stat.iTLB-load-misses
  6.92e+08 ±  5%      -9.7%  6.251e+08 ±  2%  perf-stat.node-loads
  3.66e+09 ±  2%      -8.5%  3.347e+09 ±  2%  perf-stat.node-store-misses
 2.464e+09 ±  3%     -11.4%  2.184e+09 ±  2%  perf-stat.node-stores
   6419017            +3.9%    6671008        perf-stat.path-length
     11.13 ± 16%      -7.2        3.93 ±  6%  perf-profile.calltrace.cycles-pp.d_instantiate.shmem_mknod.path_openat.do_filp_open.do_sys_open
     11.80 ± 14%      -7.0        4.76 ±  4%  perf-profile.calltrace.cycles-pp.__destroy_inode.destroy_inode.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11.96 ± 14%      -7.0        4.94 ±  5%  perf-profile.calltrace.cycles-pp.destroy_inode.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     10.89 ± 15%      -7.0        3.88 ±  4%  perf-profile.calltrace.cycles-pp.security_inode_free.__destroy_inode.destroy_inode.do_unlinkat.do_syscall_64
     10.03 ± 17%      -6.7        3.33 ±  7%  perf-profile.calltrace.cycles-pp.inode_doinit_with_dentry.security_d_instantiate.d_instantiate.shmem_mknod.path_openat
     10.07 ± 16%      -6.7        3.37 ±  6%  perf-profile.calltrace.cycles-pp.security_d_instantiate.d_instantiate.shmem_mknod.path_openat.do_filp_open
      9.91 ± 16%      -6.7        3.23 ±  5%  perf-profile.calltrace.cycles-pp.selinux_inode_free_security.security_inode_free.__destroy_inode.destroy_inode.do_unlinkat
      9.17 ± 17%      -6.5        2.66 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock.selinux_inode_free_security.security_inode_free.__destroy_inode.destroy_inode
      9.24 ± 18%      -6.4        2.81 ±  8%  perf-profile.calltrace.cycles-pp._raw_spin_lock.inode_doinit_with_dentry.security_d_instantiate.d_instantiate.shmem_mknod
      8.56 ± 19%      -6.3        2.31 ± 10%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.inode_doinit_with_dentry.security_d_instantiate.d_instantiate
      8.57 ± 18%      -6.2        2.33 ±  7%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.selinux_inode_free_security.security_inode_free.__destroy_inode
      1.89 ± 16%      -0.6        1.28 ± 14%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.shmem_reserve_inode.shmem_get_inode.shmem_mknod
      3.08 ± 11%      -0.6        2.48 ± 10%  perf-profile.calltrace.cycles-pp.shmem_evict_inode.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.80 ± 16%      -0.5        1.26 ± 16%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.shmem_free_inode.shmem_evict_inode.evict
      0.96 ± 11%      -0.3        0.62 ±  5%  perf-profile.calltrace.cycles-pp.__call_rcu.security_inode_free.__destroy_inode.destroy_inode.do_unlinkat
      0.86 ± 10%      -0.3        0.55 ±  5%  perf-profile.calltrace.cycles-pp.rcu_segcblist_enqueue.__call_rcu.security_inode_free.__destroy_inode.destroy_inode
      0.75 ±  8%      -0.3        0.47 ± 59%  perf-profile.calltrace.cycles-pp.security_inode_init_security.shmem_mknod.path_openat.do_filp_open.do_sys_open
      0.70 ±  8%      -0.3        0.43 ± 58%  perf-profile.calltrace.cycles-pp.selinux_inode_init_security.security_inode_init_security.shmem_mknod.path_openat.do_filp_open
      1.07 ±  6%      -0.2        0.82 ±  7%  perf-profile.calltrace.cycles-pp.security_inode_create.path_openat.do_filp_open.do_sys_open.do_syscall_64
      1.01 ±  6%      -0.2        0.77 ±  8%  perf-profile.calltrace.cycles-pp.may_create.security_inode_create.path_openat.do_filp_open.do_sys_open
      0.26 ±100%      +0.4        0.63 ±  7%  perf-profile.calltrace.cycles-pp._raw_spin_lock.new_inode_pseudo.new_inode.shmem_get_inode.shmem_mknod
      0.77 ±  7%      +0.4        1.16 ±  2%  perf-profile.calltrace.cycles-pp.inode_wait_for_writeback.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.74 ±  7%      +0.4        1.14 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.inode_wait_for_writeback.evict.do_unlinkat.do_syscall_64
      1.30 ±  9%      +0.7        2.00 ±  8%  perf-profile.calltrace.cycles-pp.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64
      0.00            +0.8        0.78 ±  6%  perf-profile.calltrace.cycles-pp.fsnotify.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.94 ±  6%      +0.8        1.72 ±  3%  perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.86 ±  6%      +0.8        1.64 ±  3%  perf-profile.calltrace.cycles-pp.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.8        0.79 ± 17%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.evict
      0.00            +0.8        0.79 ±  4%  perf-profile.calltrace.cycles-pp.fsnotify.do_dentry_open.path_openat.do_filp_open.do_sys_open
      0.00            +0.8        0.79 ± 17%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.evict.do_unlinkat
      0.00            +0.8        0.80 ± 18%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.inode_sb_list_add
      0.00            +0.8        0.81 ± 18%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock.inode_sb_list_add.new_inode
      0.00            +0.9        0.85        perf-profile.calltrace.cycles-pp.fsnotify.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64
      0.27 ±100%      +1.0        1.29 ±  2%  perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.27 ±173%      +1.2        1.50 ± 19%  perf-profile.calltrace.cycles-pp.rcu_process_callbacks.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
      0.27 ±173%      +1.2        1.51 ± 18%  perf-profile.calltrace.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath._raw_spin_lock
      0.27 ±173%      +1.2        1.51 ± 18%  perf-profile.calltrace.cycles-pp.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.native_queued_spin_lock_slowpath
      6.36 ±  9%      +8.3       14.64 ±  8%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.inode_sb_list_add.new_inode.shmem_get_inode
     14.67 ±  8%      +8.4       23.08 ±  5%  perf-profile.calltrace.cycles-pp.shmem_get_inode.shmem_mknod.path_openat.do_filp_open.do_sys_open
      6.72 ±  9%      +8.4       15.14 ±  8%  perf-profile.calltrace.cycles-pp._raw_spin_lock.inode_sb_list_add.new_inode.shmem_get_inode.shmem_mknod
     11.87 ±  7%      +8.4       20.30 ±  5%  perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.19 ±  7%      +8.5       14.67 ±  7%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.evict.do_unlinkat.do_syscall_64
      7.41 ±  9%      +8.6       15.96 ±  7%  perf-profile.calltrace.cycles-pp.inode_sb_list_add.new_inode.shmem_get_inode.shmem_mknod.path_openat
     10.33 ±  8%      +8.6       18.95 ±  6%  perf-profile.calltrace.cycles-pp.new_inode.shmem_get_inode.shmem_mknod.path_openat.do_filp_open
      6.93 ±  7%      +8.8       15.71 ±  7%  perf-profile.calltrace.cycles-pp._raw_spin_lock.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11.14 ± 15%      -7.2        3.93 ±  6%  perf-profile.children.cycles-pp.d_instantiate
     11.81 ± 14%      -7.0        4.76 ±  4%  perf-profile.children.cycles-pp.__destroy_inode
     11.97 ± 14%      -7.0        4.95 ±  5%  perf-profile.children.cycles-pp.destroy_inode
     10.89 ± 15%      -7.0        3.88 ±  4%  perf-profile.children.cycles-pp.security_inode_free
     10.04 ± 17%      -6.7        3.33 ±  7%  perf-profile.children.cycles-pp.inode_doinit_with_dentry
     10.07 ± 17%      -6.7        3.37 ±  6%  perf-profile.children.cycles-pp.security_d_instantiate
      9.91 ± 16%      -6.7        3.24 ±  5%  perf-profile.children.cycles-pp.selinux_inode_free_security
      3.09 ± 11%      -0.6        2.49 ± 11%  perf-profile.children.cycles-pp.shmem_evict_inode
      0.84 ±  9%      -0.4        0.46 ± 46%  perf-profile.children.cycles-pp.selinux_determine_inode_label
      1.32 ±  9%      -0.3        0.99 ±  4%  perf-profile.children.cycles-pp.__call_rcu
      0.98 ±  9%      -0.3        0.66 ±  5%  perf-profile.children.cycles-pp.rcu_segcblist_enqueue
      1.07 ±  6%      -0.2        0.83 ±  8%  perf-profile.children.cycles-pp.security_inode_create
      0.94 ±  7%      -0.2        0.69 ±  9%  perf-profile.children.cycles-pp.__list_del_entry_valid
      1.01 ±  7%      -0.2        0.77 ±  8%  perf-profile.children.cycles-pp.may_create
      0.37 ±  6%      -0.2        0.16 ±  6%  perf-profile.children.cycles-pp.__fd_install
      0.75 ±  8%      -0.2        0.58 ± 16%  perf-profile.children.cycles-pp.security_inode_init_security
      0.70 ±  8%      -0.2        0.54 ± 17%  perf-profile.children.cycles-pp.selinux_inode_init_security
      0.42 ±  8%      -0.1        0.36 ±  4%  perf-profile.children.cycles-pp.d_delete
      0.21 ±  8%      -0.1        0.16 ± 13%  perf-profile.children.cycles-pp._atomic_dec_and_lock
      0.34 ±  6%      -0.1        0.29 ±  5%  perf-profile.children.cycles-pp.fsnotify_destroy_marks
      0.24 ±  8%      -0.0        0.21 ±  7%  perf-profile.children.cycles-pp.down_write
      0.06 ± 14%      +0.0        0.08 ± 15%  perf-profile.children.cycles-pp.prandom_u32_state
      0.12 ±  8%      +0.0        0.16 ±  7%  perf-profile.children.cycles-pp.__d_instantiate
      0.77 ±  7%      +0.4        1.17 ±  2%  perf-profile.children.cycles-pp.inode_wait_for_writeback
      1.30 ±  9%      +0.7        2.00 ±  8%  perf-profile.children.cycles-pp.do_dentry_open
      0.94 ±  6%      +0.8        1.73 ±  3%  perf-profile.children.cycles-pp.exit_to_usermode_loop
      0.86 ±  6%      +0.8        1.65 ±  2%  perf-profile.children.cycles-pp.task_work_run
      0.51 ±  8%      +0.8        1.30 ±  2%  perf-profile.children.cycles-pp.__fput
      0.23 ± 13%      +2.3        2.52 ±  4%  perf-profile.children.cycles-pp.fsnotify
     14.69 ±  8%      +8.4       23.11 ±  5%  perf-profile.children.cycles-pp.shmem_get_inode
     11.88 ±  7%      +8.4       20.30 ±  5%  perf-profile.children.cycles-pp.evict
      7.42 ±  9%      +8.6       15.97 ±  7%  perf-profile.children.cycles-pp.inode_sb_list_add
     10.33 ±  8%      +8.6       18.95 ±  6%  perf-profile.children.cycles-pp.new_inode
      0.74 ±  9%      -0.4        0.37 ± 59%  perf-profile.self.cycles-pp.selinux_determine_inode_label
      0.97 ±  9%      -0.3        0.66 ±  5%  perf-profile.self.cycles-pp.rcu_segcblist_enqueue
      0.92 ±  7%      -0.2        0.68 ±  9%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.36 ±  5%      -0.2        0.16 ±  7%  perf-profile.self.cycles-pp.__fd_install
      0.41 ± 18%      -0.2        0.22 ± 14%  perf-profile.self.cycles-pp.inode_doinit_with_dentry
      0.15 ± 12%      -0.0        0.10 ± 10%  perf-profile.self.cycles-pp._atomic_dec_and_lock
      0.14 ±  8%      -0.0        0.11 ±  6%  perf-profile.self.cycles-pp.down_write
      0.08 ± 11%      +0.0        0.11 ±  4%  perf-profile.self.cycles-pp.__d_instantiate
      0.10 ± 10%      +0.0        0.14 ±  7%  perf-profile.self.cycles-pp.inode_sb_list_add
      0.22 ± 12%      +2.2        2.47 ±  3%  perf-profile.self.cycles-pp.fsnotify


                            will-it-scale.workload

  [trend plot: bisect-good samples ([*]) range roughly 860000-940000;
   bisect-bad samples ([O]) range roughly 790000-840000]

 [*] bisect-good sample
 [O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Rong Chen