FYI, we noticed a -23.8% regression of vm-scalability.throughput due to commit:

commit 23047a96d7cfcfca1a6d026ecaec526ea4803e9e ("mm: workingset: per-cgroup cache thrash detection")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: vm-scalability
on test machine: lkp-hsw01: 56 threads Grantley Haswell-EP with 64G memory
with the following conditions: cpufreq_governor=performance/runtime=300s/test=lru-file-readtwice

Details are as below:
-------------------------------------------------------------------------------------------------->

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/300s/lkp-hsw01/lru-file-readtwice

commit:
  612e44939c3c77245ac80843c0c7876c8cf97282
  23047a96d7cfcfca1a6d026ecaec526ea4803e9e

612e44939c3c7724 23047a96d7cfcfca1a6d026eca
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
  28384711 ±  0%     -23.8%   21621405 ±  0%  vm-scalability.throughput
   1854112 ±  0%      -7.7%    1711141 ±  0%  vm-scalability.time.involuntary_context_switches
    176.03 ±  0%     -22.2%     136.95 ±  1%  vm-scalability.time.user_time
    302905 ±  2%     -31.2%     208386 ±  0%  vm-scalability.time.voluntary_context_switches
      0.92 ±  2%     +51.0%       1.38 ±  2%  perf-profile.cycles-pp.kswapd
    754212 ±  1%     -29.2%     533832 ±  2%  softirqs.RCU
     20518 ±  2%      -8.1%      18866 ±  2%  vmstat.system.cs
     10574 ± 19%     +29.9%      13737 ±  8%  numa-meminfo.node0.Mapped
     13490 ± 13%     -36.6%       8549 ± 17%  numa-meminfo.node1.Mapped
    583.00 ±  8%     +18.8%     692.50 ±  5%  slabinfo.avc_xperms_node.active_objs
    583.00 ±  8%     +18.8%     692.50 ±  5%  slabinfo.avc_xperms_node.num_objs
    176.03 ±  0%     -22.2%     136.95 ±  1%  time.user_time
    302905 ±  2%     -31.2%     208386 ±  0%  time.voluntary_context_switches
    263.42 ±  0%      -3.0%     255.52 ±  0%  turbostat.PkgWatt
     61.05 ±  0%     -12.7%      53.26 ±  0%  turbostat.RAMWatt
      1868 ± 16%     -43.7%       1052 ± 13%  cpuidle.C1-HSW.usage
      1499 ±  9%     -30.3%       1045 ± 12%  cpuidle.C3-HSW.usage
     16071 ±  4%     -15.0%      13664 ±  3%  cpuidle.C6-HSW.usage
     17572 ± 27%     -59.1%       7179 ±  5%  cpuidle.POLL.usage
 4.896e+08 ±  0%     -20.7%  3.884e+08 ±  0%  numa-numastat.node0.local_node
  71305376 ±  2%     -19.7%   57223573 ±  4%  numa-numastat.node0.numa_foreign
 4.896e+08 ±  0%     -20.7%  3.884e+08 ±  0%  numa-numastat.node0.numa_hit
  43760475 ±  3%     -22.1%   34074417 ±  5%  numa-numastat.node0.numa_miss
  43765010 ±  3%     -22.1%   34078937 ±  5%  numa-numastat.node0.other_node
 4.586e+08 ±  0%     -25.7%  3.408e+08 ±  1%  numa-numastat.node1.local_node
  43760472 ±  3%     -22.1%   34074417 ±  5%  numa-numastat.node1.numa_foreign
 4.586e+08 ±  0%     -25.7%  3.408e+08 ±  1%  numa-numastat.node1.numa_hit
  71305376 ±  2%     -19.7%   57223573 ±  4%  numa-numastat.node1.numa_miss
  71311721 ±  2%     -19.7%   57229904 ±  4%  numa-numastat.node1.other_node
    543.25 ±  3%     -15.0%     461.50 ±  3%  numa-vmstat.node0.nr_isolated_file
      2651 ± 19%     +30.2%       3451 ±  8%  numa-vmstat.node0.nr_mapped
      1226 ±  6%     -31.7%     837.25 ±  9%  numa-vmstat.node0.nr_pages_scanned
  37111278 ±  1%     -20.6%   29474561 ±  3%  numa-vmstat.node0.numa_foreign
 2.568e+08 ±  0%     -21.0%  2.028e+08 ±  0%  numa-vmstat.node0.numa_hit
 2.567e+08 ±  0%     -21.0%  2.027e+08 ±  0%  numa-vmstat.node0.numa_local
  22595209 ±  2%     -22.9%   17420980 ±  4%  numa-vmstat.node0.numa_miss
  22665391 ±  2%     -22.8%   17490378 ±  4%  numa-vmstat.node0.numa_other
     88.25 ±173%   +1029.7%     997.00 ± 63%  numa-vmstat.node0.workingset_activate
   3965715 ±  0%     -24.9%    2977998 ±  0%  numa-vmstat.node0.workingset_nodereclaim
     90.25 ±170%   +1006.4%     998.50 ± 63%  numa-vmstat.node0.workingset_refault
    612.50 ±  3%      -9.4%     554.75 ±  4%  numa-vmstat.node1.nr_alloc_batch
      3279 ± 14%     -34.1%       2161 ± 17%  numa-vmstat.node1.nr_mapped
  22597658 ±  2%     -22.9%   17423271 ±  4%  numa-vmstat.node1.numa_foreign
 2.403e+08 ±  0%     -25.9%  1.781e+08 ±  1%  numa-vmstat.node1.numa_hit
 2.403e+08 ±  0%     -25.9%  1.781e+08 ±  1%  numa-vmstat.node1.numa_local
  37115261 ±  1%     -20.6%   29478460 ±  3%  numa-vmstat.node1.numa_miss
  37136533 ±  1%     -20.6%   29500409 ±  3%  numa-vmstat.node1.numa_other
      6137 ±173%    +257.3%      21927 ± 60%  numa-vmstat.node1.workingset_activate
   3237162 ±  0%     -30.6%    2246385 ±  1%  numa-vmstat.node1.workingset_nodereclaim
      6139 ±173%    +257.2%      21930 ± 60%  numa-vmstat.node1.workingset_refault
    501243 ±  0%     -26.9%     366510 ±  1%  proc-vmstat.allocstall
     28483 ±  0%     -50.7%      14047 ±  3%  proc-vmstat.kswapd_low_wmark_hit_quickly
 1.151e+08 ±  0%     -20.7%   91297990 ±  0%  proc-vmstat.numa_foreign
 9.482e+08 ±  0%     -23.1%  7.293e+08 ±  0%  proc-vmstat.numa_hit
 9.482e+08 ±  0%     -23.1%  7.293e+08 ±  0%  proc-vmstat.numa_local
 1.151e+08 ±  0%     -20.7%   91297990 ±  0%  proc-vmstat.numa_miss
 1.151e+08 ±  0%     -20.7%   91308842 ±  0%  proc-vmstat.numa_other
     31562 ±  0%     -47.1%      16687 ±  2%  proc-vmstat.pageoutrun
 1.048e+09 ±  0%     -22.8%  8.088e+08 ±  0%  proc-vmstat.pgactivate
  28481000 ±  0%     -21.3%   22422907 ±  0%  proc-vmstat.pgalloc_dma32
 1.035e+09 ±  0%     -22.9%  7.984e+08 ±  0%  proc-vmstat.pgalloc_normal
 1.041e+09 ±  0%     -23.0%  8.024e+08 ±  0%  proc-vmstat.pgdeactivate
 1.063e+09 ±  0%     -22.8%    8.2e+08 ±  0%  proc-vmstat.pgfree
      2458 ± 91%     -93.5%     160.75 ± 29%  proc-vmstat.pgmigrate_success
  27571690 ±  0%     -20.6%   21889554 ±  0%  proc-vmstat.pgrefill_dma32
 1.014e+09 ±  0%     -23.0%  7.805e+08 ±  0%  proc-vmstat.pgrefill_normal
  25263166 ±  0%     -27.4%   18337251 ±  1%  proc-vmstat.pgscan_direct_dma32
 9.377e+08 ±  0%     -26.9%  6.852e+08 ±  1%  proc-vmstat.pgscan_direct_normal
   2134103 ±  1%     +57.6%    3363418 ±  6%  proc-vmstat.pgscan_kswapd_dma32
  69594167 ±  0%     +26.7%   88192786 ±  2%  proc-vmstat.pgscan_kswapd_normal
  25260851 ±  0%     -27.4%   18335464 ±  1%  proc-vmstat.pgsteal_direct_dma32
 9.376e+08 ±  0%     -26.9%  6.852e+08 ±  1%  proc-vmstat.pgsteal_direct_normal
   2133563 ±  1%     +57.6%    3362346 ±  6%  proc-vmstat.pgsteal_kswapd_dma32
  69585316 ±  0%     +26.7%   88176045 ±  2%  proc-vmstat.pgsteal_kswapd_normal
  17530080 ±  0%     -23.3%   13440416 ±  0%  proc-vmstat.slabs_scanned
      6226 ±173%    +268.2%      22924 ± 58%  proc-vmstat.workingset_activate
   7202139 ±  0%     -27.5%    5223203 ±  0%  proc-vmstat.workingset_nodereclaim
      6230 ±173%    +268.0%      22929 ± 58%  proc-vmstat.workingset_refault
    123.70 ± 12%     +26.7%     156.79 ± 11%  sched_debug.cfs_rq:/.load.stddev
     42.08 ±  1%     +23.3%      51.90 ±  8%  sched_debug.cfs_rq:/.load_avg.avg
    779.50 ±  2%     +20.7%     940.83 ±  5%  sched_debug.cfs_rq:/.load_avg.max
      9.46 ±  8%     -13.7%       8.17 ±  1%  sched_debug.cfs_rq:/.load_avg.min
    123.38 ±  2%     +31.4%     162.10 ±  6%  sched_debug.cfs_rq:/.load_avg.stddev
    304497 ± 22%     +65.6%     504169 ±  7%  sched_debug.cfs_rq:/.min_vruntime.stddev
     25.74 ±  8%     +33.9%      34.46 ±  8%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    481.33 ± 11%     +50.5%     724.54 ± 11%  sched_debug.cfs_rq:/.runnable_load_avg.max
     69.65 ± 15%     +62.2%     112.95 ± 12%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
  -1363122 ±-14%     +52.6%   -2080627 ±-10%  sched_debug.cfs_rq:/.spread0.min
    304448 ± 22%     +65.6%     504111 ±  7%  sched_debug.cfs_rq:/.spread0.stddev
    733220 ±  5%     +13.0%     828548 ±  1%  sched_debug.cpu.avg_idle.avg
    123344 ± 11%     +73.4%     213827 ± 27%  sched_debug.cpu.avg_idle.min
    233732 ±  5%     -13.5%     202264 ±  6%  sched_debug.cpu.avg_idle.stddev
     26.93 ±  9%     +27.8%      34.42 ±  8%  sched_debug.cpu.cpu_load[0].avg
     78.79 ± 19%     +43.7%     113.20 ± 12%  sched_debug.cpu.cpu_load[0].stddev
     26.23 ±  8%     +30.5%      34.23 ±  7%  sched_debug.cpu.cpu_load[1].avg
    513.17 ± 12%     +38.6%     711.12 ± 11%  sched_debug.cpu.cpu_load[1].max
     73.34 ± 15%     +50.7%     110.55 ± 11%  sched_debug.cpu.cpu_load[1].stddev
     25.93 ±  6%     +32.6%      34.40 ±  6%  sched_debug.cpu.cpu_load[2].avg
    488.38 ±  8%     +44.8%     706.96 ± 10%  sched_debug.cpu.cpu_load[2].max
     69.79 ± 10%     +56.9%     109.52 ± 10%  sched_debug.cpu.cpu_load[2].stddev
     25.89 ±  4%     +35.1%      34.97 ±  4%  sched_debug.cpu.cpu_load[3].avg
    467.83 ±  7%     +50.2%     702.71 ±  9%  sched_debug.cpu.cpu_load[3].max
     67.27 ±  9%     +63.6%     110.03 ±  8%  sched_debug.cpu.cpu_load[3].stddev
     25.83 ±  4%     +37.2%      35.44 ±  3%  sched_debug.cpu.cpu_load[4].avg
    445.29 ±  9%     +56.7%     697.88 ±  8%  sched_debug.cpu.cpu_load[4].max
     64.41 ±  9%     +72.4%     111.02 ±  6%  sched_debug.cpu.cpu_load[4].stddev
    123.66 ± 12%     +28.2%     158.54 ± 11%  sched_debug.cpu.load.stddev
      1.56 ±  1%      +9.8%       1.71 ±  0%  sched_debug.cpu.nr_running.avg
      0.46 ± 12%     +28.4%       0.59 ±  6%  sched_debug.cpu.nr_running.stddev
     57967 ±  3%      -9.8%      52290 ±  2%  sched_debug.cpu.nr_switches.avg
    270099 ±  9%     -16.4%     225748 ±  7%  sched_debug.cpu.nr_switches.max
     27370 ±  1%     -13.3%      23723 ±  0%  sched_debug.cpu.nr_switches.min
     55749 ±  7%     -14.3%      47767 ±  5%  sched_debug.cpu.nr_switches.stddev
    -55.33 ±-19%     -40.4%     -32.96 ± -2%  sched_debug.cpu.nr_uninterruptible.min

=========================================================================================
compiler/kconfig/rootfs/sleep/tbox_group/testcase:
  gcc-5/x86_64-randconfig-a0-04240012/yocto-minimal-i386.cgz/1/vm-kbuild-yocto-ia32/boot

commit:
  612e44939c3c77245ac80843c0c7876c8cf97282
  23047a96d7cfcfca1a6d026ecaec526ea4803e9e

612e44939c3c7724 23047a96d7cfcfca1a6d026eca
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
            :50            2%           1:180   kmsg.augmented_rbtree_testing
            :50          216%         108:180   last_state.is_incomplete_run

                        vm-scalability.time.user_time

  [ASCII time-series plot: bisect-good (*) samples hold at roughly 170-180,
   bisect-bad (O) samples drop to roughly 130-148]

                          vm-scalability.throughput

  [ASCII time-series plot: bisect-good (*) samples hold at roughly 2.7e+07-2.9e+07,
   bisect-bad (O) samples drop to roughly 2.1e+07-2.3e+07]

	[*] bisect-good sample
	[O] bisect-bad sample

To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Xiaolong
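
P.S. To compare the two kernels by hand rather than through the bisect robot,
a rough sketch follows (untested; the kernel build/boot steps are indicated
only as comments, and it assumes the attached job.yaml is used unchanged on
the test machine):

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        cd linux

        # baseline: parent commit (bisect-good)
        git checkout 612e44939c3c77245ac80843c0c7876c8cf97282
        # build, install, and boot this kernel, then from lkp-tests:
        #   bin/lkp run job.yaml

        # first bad commit (bisect-bad)
        git checkout 23047a96d7cfcfca1a6d026ecaec526ea4803e9e
        # build, install, and boot this kernel, then rerun:
        #   bin/lkp run job.yaml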