Greeting,

FYI, we noticed a -5.0% regression of will-it-scale.per_process_ops due to commit:


commit: e7ae097b0bda3e7dfd224e2a960346c37aa42394 ("[PATCH 5/6] mm/gup: /proc/vmstat support for get/put user pages")
url: https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/RFC-v2-mm-gup-dma-tracking/20190205-001101


in testcase: will-it-scale
on test machine: 192 threads Skylake-4S with 704G memory
with following parameters:

        nr_task: 100%
        mode: process
        test: futex1
        cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml   # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-4sp1/futex1/will-it-scale

commit:
  cdaa813278 ("mm/gup: track gup-pinned pages")
  e7ae097b0b ("mm/gup: /proc/vmstat support for get/put user pages")

cdaa813278ddc616 e7ae097b0bda3e7dfd224e2a96
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
   1808394            -5.0%    1717524        will-it-scale.per_process_ops
 3.472e+08            -5.0%  3.298e+08        will-it-scale.workload
      7936            +3.0%       8174        vmstat.system.cs
    546083 ± 30%     -41.7%     318102 ± 68%  numa-numastat.node2.local_node
    576076 ± 27%     -40.3%     343781 ± 60%  numa-numastat.node2.numa_hit
     27067            -4.5%      25852 ±  2%  proc-vmstat.nr_shmem
     30283            -6.3%      28378 ±  3%  proc-vmstat.pgactivate
     18458 ±  5%      +9.0%      20120 ±  5%  slabinfo.kmalloc-96.active_objs
     19055 ±  4%      +7.7%      20521 ±  5%  slabinfo.kmalloc-96.num_objs
      1654 ±  5%     +13.5%       1877 ±  3%  slabinfo.pool_workqueue.active_objs
      1747 ±  5%     +10.9%       1937 ±  3%  slabinfo.pool_workqueue.num_objs
     17.13 ±  3%      +2.3       19.45 ±  2%  perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      0.00            +2.4        2.40 ±  9%  perf-profile.calltrace.cycles-pp.mod_node_page_state.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake
     19.62 ±  3%      +2.4       22.06 ±  2%  perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
     26.50 ±  3%      +2.8       29.33 ±  2%  perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     32.48 ±  3%      +3.0       35.52 ±  2%  perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.45 ±100%     -67.5%      15.08 ± 22%  sched_debug.cfs_rq:/.load_avg.stddev
      0.04 ±  5%     -11.4%       0.04 ±  6%  sched_debug.cfs_rq:/.nr_running.stddev
     13.08 ±  4%     +18.8%      15.54 ±  5%  sched_debug.cfs_rq:/.runnable_load_avg.max
      1.14 ±  4%     +18.6%       1.35 ±  5%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     13543 ±  7%      +9.5%      14825 ±  4%  sched_debug.cfs_rq:/.runnable_weight.max
      1267 ±  5%     +13.4%       1436 ±  6%  sched_debug.cfs_rq:/.runnable_weight.stddev
      1089 ±  3%     +14.4%       1246 ±  7%  sched_debug.cfs_rq:/.util_avg.max
    533.25 ±  7%      +7.7%     574.21 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
      1.17 ±  6%     +18.1%       1.39 ±  6%  sched_debug.cpu.cpu_load[0].stddev
      3321            +9.1%       3622 ±  4%  sched_debug.cpu.curr->pid.min
      1355 ±  5%     +10.1%       1493 ±  6%  sched_debug.cpu.load.stddev
      1399 ±  6%     -11.5%       1239 ±  3%  sched_debug.cpu.nr_load_updates.stddev
    270.33 ±  8%     +52.7%     412.92 ± 14%  sched_debug.cpu.nr_switches.min
    113.58 ±  3%     +75.1%     198.88 ±  6%  sched_debug.cpu.sched_count.min
      6331 ± 17%     +41.5%       8956 ±  9%  sched_debug.cpu.sched_goidle.max
    545.75 ±  5%     +24.9%     681.80 ±  9%  sched_debug.cpu.sched_goidle.stddev
     89.38           +48.7%     132.92 ±  3%  sched_debug.cpu.ttwu_count.min
     48.21 ±  2%     +82.6%      88.04 ±  6%  sched_debug.cpu.ttwu_local.min
     47917 ± 29%     -71.3%      13757 ± 83%  numa-vmstat.node0.nr_active_anon
     30393 ± 45%     -71.0%       8807 ±111%  numa-vmstat.node0.nr_anon_pages
     78776 ± 10%     -24.6%      59392 ±  7%  numa-vmstat.node0.nr_file_pages
      9592 ±  8%     -10.3%       8602 ±  7%  numa-vmstat.node0.nr_kernel_stack
      2706 ± 37%     -32.6%       1824 ± 38%  numa-vmstat.node0.nr_mapped
     18243 ± 45%     -90.1%       1803 ±118%  numa-vmstat.node0.nr_shmem
      9147 ± 19%     -42.9%       5227 ± 42%  numa-vmstat.node0.nr_slab_reclaimable
     18959 ±  7%     -17.8%      15594 ± 14%  numa-vmstat.node0.nr_slab_unreclaimable
     47917 ± 29%     -71.3%      13758 ± 83%  numa-vmstat.node0.nr_zone_active_anon
     34497 ± 41%     +54.6%      53318 ± 19%  numa-vmstat.node1.nr_active_anon
     60479 ±  6%      +9.6%      66283 ±  9%  numa-vmstat.node1.nr_file_pages
      1419 ±122%    +162.7%       3729 ± 45%  numa-vmstat.node1.nr_inactive_anon
      1571 ±112%    +441.7%       8510 ± 73%  numa-vmstat.node1.nr_shmem
     34497 ± 41%     +54.6%      53318 ± 19%  numa-vmstat.node1.nr_zone_active_anon
      1419 ±122%    +162.7%       3729 ± 45%  numa-vmstat.node1.nr_zone_inactive_anon
      9145 ±104%    +188.9%      26423 ± 48%  numa-vmstat.node3.nr_active_anon
      1476 ± 52%    +432.0%       7853 ±101%  numa-vmstat.node3.nr_anon_pages
      3918 ± 20%     +78.8%       7005 ± 20%  numa-vmstat.node3.nr_slab_reclaimable
     12499 ±  7%     +22.7%      15336 ± 12%  numa-vmstat.node3.nr_slab_unreclaimable
      9145 ±104%    +188.9%      26423 ± 48%  numa-vmstat.node3.nr_zone_active_anon
    194801 ± 28%     -71.7%      55131 ± 83%  numa-meminfo.node0.Active
    191699 ± 28%     -71.2%      55131 ± 83%  numa-meminfo.node0.Active(anon)
     38555 ± 84%    -100.0%       0.00        numa-meminfo.node0.AnonHugePages
    121677 ± 45%     -70.9%      35356 ±111%  numa-meminfo.node0.AnonPages
    315090 ± 10%     -24.6%     237568 ±  7%  numa-meminfo.node0.FilePages
     36584 ± 19%     -42.8%      20912 ± 42%  numa-meminfo.node0.KReclaimable
      9594 ±  8%     -10.3%       8603 ±  7%  numa-meminfo.node0.KernelStack
     10625 ± 34%     -32.6%       7165 ± 36%  numa-meminfo.node0.Mapped
    892127 ± 11%     -25.0%     668837 ± 13%  numa-meminfo.node0.MemUsed
     36584 ± 19%     -42.8%      20912 ± 42%  numa-meminfo.node0.SReclaimable
     75840 ±  7%     -17.8%      62378 ± 14%  numa-meminfo.node0.SUnreclaim
     72956 ± 45%     -90.1%       7212 ±118%  numa-meminfo.node0.Shmem
    112426 ± 11%     -25.9%      83290 ± 21%  numa-meminfo.node0.Slab
    138083 ± 41%     +54.5%     213397 ± 19%  numa-meminfo.node1.Active(anon)
    241919 ±  6%      +9.6%     265074 ±  9%  numa-meminfo.node1.FilePages
      5679 ±122%    +162.6%      14916 ± 45%  numa-meminfo.node1.Inactive(anon)
      6284 ±112%    +440.7%      33980 ± 73%  numa-meminfo.node1.Shmem
     36547 ±104%    +189.0%     105608 ± 48%  numa-meminfo.node3.Active
     36547 ±104%    +189.0%     105608 ± 48%  numa-meminfo.node3.Active(anon)
      5876 ± 52%    +433.8%      31369 ±101%  numa-meminfo.node3.AnonPages
     15671 ± 20%     +78.8%      28021 ± 20%  numa-meminfo.node3.KReclaimable
    592511 ±  8%     +15.3%     683394 ±  8%  numa-meminfo.node3.MemUsed
     15671 ± 20%     +78.8%      28021 ± 20%  numa-meminfo.node3.SReclaimable
     49997 ±  7%     +22.7%      61344 ± 12%  numa-meminfo.node3.SUnreclaim
     65670 ±  6%     +36.1%      89366 ± 14%  numa-meminfo.node3.Slab
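A note on why get_user_pages_fast() dominates the call traces above: judging from the test name and the calltraces, the futex1 microbenchmark (see the test-url above) boils down to a tight loop of shared (non-PRIVATE) FUTEX_WAKE calls, and for a shared futex get_futex_key() pins the backing page via get_user_pages_fast(), which is where the extra mod_node_page_state() accounting from the patch shows up. Below is a minimal, hedged sketch of that hot loop, not the actual will-it-scale harness; the iteration counter and printout are illustrative additions.

    /*
     * Minimal sketch of the hot loop, in the spirit of will-it-scale's futex1
     * testcase (the real harness lives at the test-url above).  Each iteration
     * issues a shared FUTEX_WAKE on an address with no waiters, so the syscall
     * returns quickly after get_futex_key() -> get_user_pages_fast() has run --
     * the path where mod_node_page_state() appears in the profile above.
     */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/futex.h>

    int main(void)
    {
            unsigned int futex_word = 0;           /* nobody ever waits on this */
            unsigned long long iterations = 0;

            for (;;) {
                    /* Shared (non-PRIVATE) wake: forces the gup-based key lookup. */
                    syscall(SYS_futex, &futex_word, FUTEX_WAKE, 1, NULL, NULL, 0);

                    /* Illustrative progress output, not part of the original testcase. */
                    if (++iterations % 100000000ULL == 0)
                            printf("%llu wake calls\n", iterations);
            }
            return 0;
    }

Built with something like "gcc -O2 futex1_sketch.c" (file name arbitrary) and run as one copy per CPU, this approximates the 100% nr_task, process-mode configuration measured above.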
will-it-scale.per_process_ops

  [ASCII trend plot omitted: bisect-good samples hold at roughly 1.80e+06 - 1.82e+06 ops, while bisect-bad samples sit around 1.68e+06 - 1.72e+06 ops]

will-it-scale.workload

  [ASCII trend plot omitted: bisect-good samples hold at roughly 3.45e+08 - 3.50e+08, while bisect-bad samples sit around 3.25e+08 - 3.30e+08]

[*] bisect-good sample
[O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen