Greetings,

FYI, we noticed a -61.3% regression of vm-scalability.throughput due to commit:

commit: ac5b2c18911ffe95c08d69273917f90212cf5659 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: vm-scalability
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with following parameters:

	runtime: 300
	thp_enabled: always
	thp_defrag: always
	nr_task: 32
	nr_ssd: 1
	test: swap-w-seq
	ucode: 0x3d
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_ssd/nr_task/rootfs/runtime/tbox_group/test/testcase/thp_defrag/thp_enabled/ucode:
  gcc-7/performance/x86_64-rhel-7.2/1/32/debian-x86_64-2018-04-03.cgz/300/lkp-hsw-ep4/swap-w-seq/vm-scalability/always/always/0x3d

commit:
  94e297c50b ("include/linux/notifier.h: SRCU: fix ctags")
  ac5b2c1891 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")

94e297c50b529f5d ac5b2c18911ffe95c08d692739
---------------- --------------------------
  %stddev  %change  %stddev
      \       |        \
0.57 ± 35%  +258.8%  2.05 ± 4%  vm-scalability.free_time
146022 ± 14%  -40.5%  86833 ± 2%  vm-scalability.median
29.29 ± 40%  -89.6%  3.06 ± 26%  vm-scalability.stddev
7454656 ± 9%  -61.3%  2885836 ± 3%  vm-scalability.throughput
189.21 ± 10%  +52.4%  288.34 ± 2%  vm-scalability.time.elapsed_time
189.21 ± 10%  +52.4%  288.34 ± 2%  vm-scalability.time.elapsed_time.max
8768 ± 3%  +11.6%  9781 ± 5%  vm-scalability.time.involuntary_context_switches
20320196 ± 2%  -33.4%  13531732 ± 3%  vm-scalability.time.maximum_resident_set_size
425945 ± 9%  +17.4%  499908 ± 4%  vm-scalability.time.minor_page_faults
253.79 ± 6%  +62.0%  411.07 ± 4%  vm-scalability.time.system_time
322.52  +8.0%  348.18  vm-scalability.time.user_time
246150 ± 12%  +50.3%  370019 ± 4%  vm-scalability.time.voluntary_context_switches
7746519 ± 11%  +49.0%  11538799 ± 4%  cpuidle.C6.usage
192240 ± 10%  +44.3%  277460 ± 8%  interrupts.CAL:Function_call_interrupts
22.45 ± 85%  -80.6%  4.36 ± 173%  sched_debug.cfs_rq:/.MIN_vruntime.avg
22.45 ± 85%  -80.6%  4.36 ± 173%  sched_debug.cfs_rq:/.max_vruntime.avg
29.36 ± 13%  +10.8%  32.52 ± 14%  boot-time.boot
24.28 ± 16%  +12.4%  27.30 ± 16%  boot-time.dhcp
1597 ± 15%  +10.2%  1760 ± 15%  boot-time.idle
68.25  -9.8  58.48 ± 2%  mpstat.cpu.idle%
27.12 ± 5%  +10.3  37.42 ± 3%  mpstat.cpu.iowait%
2.48 ± 11%  -0.8  1.73 ± 2%  mpstat.cpu.usr%
3422396 ± 14%  +46.6%  5018492 ± 7%  softirqs.RCU
1776561 ± 10%  +52.4%  2707785 ± 2%  softirqs.SCHED
5685772 ± 7%  +54.3%  8774055 ± 3%  softirqs.TIMER
7742519 ± 11%  +49.0%  11534924 ± 4%  turbostat.C6
29922317 ± 10%  +55.0%  46366941  turbostat.IRQ
9.49 ± 64%  -83.0%  1.62 ± 54%  turbostat.Pkg%pc2
36878790 ± 27%  -84.9%  5570259 ± 47%  vmstat.memory.free
1.117e+08 ± 15%  +73.5%  1.939e+08 ± 10%  vmstat.memory.swpd
25.25 ± 4%  +34.7%  34.00  vmstat.procs.b
513725 ± 7%  +47.9%  759561 ± 12%  numa-numastat.node0.local_node
35753 ± 160%  +264.7%  130386 ± 39%  numa-numastat.node0.numa_foreign
519403 ± 6%  +48.7%  772182 ± 12%  numa-numastat.node0.numa_hit
35753 ± 160%  +264.7%  130386 ± 39%  numa-numastat.node1.numa_miss
44315 ± 138%  +198.5%  132279 ± 40%  numa-numastat.node1.other_node
32798032 ± 46%  +80.6%  59228165 ± 2%  numa-meminfo.node0.Active
32798009 ± 46%  +80.6%  59228160 ± 2%  numa-meminfo.node0.Active(anon)
33430762 ± 46%  +80.8%  60429537 ± 2%  numa-meminfo.node0.AnonHugePages
33572119 ± 46%  +80.2%  60512777 ± 2%  numa-meminfo.node0.AnonPages
1310559 ± 64%  +86.4%  2442244 ± 2%  numa-meminfo.node0.Inactive
1309969 ± 64%  +86.4%  2442208 ± 2%  numa-meminfo.node0.Inactive(anon)
30385359 ± 53%  -90.7%  2821023 ± 44%  numa-meminfo.node0.MemFree
35505047 ± 45%  +77.6%  63055165 ± 2%  numa-meminfo.node0.MemUsed
166560 ± 42%  +130.2%  383345 ± 6%  numa-meminfo.node0.PageTables
23702 ± 105%  -89.0%  2617 ± 44%  numa-meminfo.node0.Shmem
1212 ± 65%  +402.8%  6093 ± 57%  numa-meminfo.node1.Shmem
8354144 ± 44%  +77.1%  14798964 ± 2%  numa-vmstat.node0.nr_active_anon
8552787 ± 44%  +76.8%  15122222 ± 2%  numa-vmstat.node0.nr_anon_pages
16648 ± 44%  +77.1%  29492 ± 2%  numa-vmstat.node0.nr_anon_transparent_hugepages
7436650 ± 53%  -90.4%  712936 ± 44%  numa-vmstat.node0.nr_free_pages
332106 ± 63%  +83.8%  610268 ± 2%  numa-vmstat.node0.nr_inactive_anon
41929 ± 41%  +130.6%  96703 ± 6%  numa-vmstat.node0.nr_page_table_pages
5900 ± 106%  -89.0%  650.75 ± 45%  numa-vmstat.node0.nr_shmem
43336 ± 92%  +151.2%  108840 ± 7%  numa-vmstat.node0.nr_vmscan_write
43110 ± 92%  +150.8%  108110 ± 7%  numa-vmstat.node0.nr_written
8354142 ± 44%  +77.1%  14798956 ± 2%  numa-vmstat.node0.nr_zone_active_anon
332105 ± 63%  +83.8%  610269 ± 2%  numa-vmstat.node0.nr_zone_inactive_anon
321.50 ± 66%  +384.9%  1559 ± 59%  numa-vmstat.node1.nr_shmem
88815743 ± 10%  +33.8%  1.188e+08 ± 2%  meminfo.Active
88815702 ± 10%  +33.8%  1.188e+08 ± 2%  meminfo.Active(anon)
90446011 ± 11%  +34.0%  1.212e+08 ± 2%  meminfo.AnonHugePages
90613587 ± 11%  +34.0%  1.214e+08 ± 2%  meminfo.AnonPages
5.15e+08 ± 3%  +22.2%  6.293e+08 ± 2%  meminfo.Committed_AS
187419 ± 10%  -19.6%  150730 ± 7%  meminfo.DirectMap4k
3620693 ± 18%  +35.3%  4897093 ± 2%  meminfo.Inactive
3620054 ± 18%  +35.3%  4896961 ± 2%  meminfo.Inactive(anon)
36144979 ± 28%  -87.0%  4715681 ± 57%  meminfo.MemAvailable
36723121 ± 27%  -85.9%  5179468 ± 52%  meminfo.MemFree
395801 ± 2%  +56.9%  620816 ± 4%  meminfo.PageTables
178672  +15.1%  205668 ± 2%  meminfo.SUnreclaim
249496  +12.6%  280897  meminfo.Slab
1751813 ± 2%  +34.7%  2360437 ± 3%  meminfo.SwapCached
6.716e+08  -12.4%  5.88e+08 ± 2%  meminfo.SwapFree
3926 ± 17%  +72.9%  6788 ± 13%  meminfo.Writeback
1076 ± 3%  +42.9%  1538 ± 4%  slabinfo.biovec-max.active_objs
275.75 ± 2%  +41.1%  389.00 ± 4%  slabinfo.biovec-max.active_slabs
1104 ± 2%  +41.0%  1557 ± 4%  slabinfo.biovec-max.num_objs
275.75 ± 2%  +41.1%  389.00 ± 4%  slabinfo.biovec-max.num_slabs
588.25 ± 7%  +17.9%  693.75 ± 7%  slabinfo.file_lock_cache.active_objs
588.25 ± 7%  +17.9%  693.75 ± 7%  slabinfo.file_lock_cache.num_objs
13852 ± 3%  +37.5%  19050 ± 3%  slabinfo.kmalloc-4k.active_objs
1776 ± 3%  +37.7%  2446 ± 3%  slabinfo.kmalloc-4k.active_slabs
14217 ± 3%  +37.7%  19577 ± 3%  slabinfo.kmalloc-4k.num_objs
1776 ± 3%  +37.7%  2446 ± 3%  slabinfo.kmalloc-4k.num_slabs
158.25 ± 15%  +54.0%  243.75 ± 18%  slabinfo.nfs_read_data.active_objs
158.25 ± 15%  +54.0%  243.75 ± 18%  slabinfo.nfs_read_data.num_objs
17762 ± 4%  +44.3%  25638 ± 4%  slabinfo.pool_workqueue.active_objs
563.25 ± 3%  +43.7%  809.25 ± 4%  slabinfo.pool_workqueue.active_slabs
18048 ± 3%  +43.5%  25906 ± 4%  slabinfo.pool_workqueue.num_objs
563.25 ± 3%  +43.7%  809.25 ± 4%  slabinfo.pool_workqueue.num_slabs
34631 ± 3%  +21.0%  41905 ± 2%  slabinfo.radix_tree_node.active_objs
624.50 ± 3%  +20.7%  753.75 ± 2%  slabinfo.radix_tree_node.active_slabs
34998 ± 3%  +20.7%  42228 ± 2%  slabinfo.radix_tree_node.num_objs
624.50 ± 3%  +20.7%  753.75 ± 2%  slabinfo.radix_tree_node.num_slabs
9.727e+11 ± 8%  +50.4%  1.463e+12 ± 12%  perf-stat.branch-instructions
1.11 ± 12%  +1.2  2.31 ± 8%  perf-stat.branch-miss-rate%
1.078e+10 ± 13%  +214.9%  3.395e+10 ± 17%  perf-stat.branch-misses
3.17 ± 11%  -1.5  1.65 ± 9%  perf-stat.cache-miss-rate%
8.206e+08 ± 7%  +49.4%  1.226e+09 ± 11%  perf-stat.cache-misses
2.624e+10 ± 14%  +187.0%  7.532e+10 ± 17%  perf-stat.cache-references
1174249 ± 9%  +52.3%  1788442 ± 3%  perf-stat.context-switches
2.921e+12 ± 8%  +71.1%  4.998e+12 ± 9%  perf-stat.cpu-cycles
1437 ± 14%  +85.1%  2661 ± 20%  perf-stat.cpu-migrations
7.586e+08 ± 21%  +134.7%  1.78e+09 ± 30%  perf-stat.dTLB-load-misses
7.943e+11 ± 11%  +84.3%  1.464e+12 ± 14%  perf-stat.dTLB-loads
93963731 ± 22%  +40.9%  1.324e+08 ± 10%  perf-stat.dTLB-store-misses
3.394e+11 ± 6%  +60.5%  5.449e+11 ± 10%  perf-stat.dTLB-stores
1.531e+08 ± 22%  +44.0%  2.204e+08 ± 11%  perf-stat.iTLB-load-misses
1.688e+08 ± 23%  +71.1%  2.888e+08 ± 12%  perf-stat.iTLB-loads
3.267e+12 ± 7%  +58.5%  5.177e+12 ± 13%  perf-stat.instructions
3988 ± 43%  +123.9%  8930 ± 22%  perf-stat.major-faults
901474 ± 5%  +34.2%  1209877 ± 2%  perf-stat.minor-faults
31.24 ± 10%  +21.7  52.91 ± 2%  perf-stat.node-load-miss-rate%
1.135e+08 ± 16%  +187.6%  3.264e+08 ± 13%  perf-stat.node-load-misses
6.27 ± 17%  +26.9  33.19 ± 4%  perf-stat.node-store-miss-rate%
27354489 ± 15%  +601.2%  1.918e+08 ± 13%  perf-stat.node-store-misses
905482 ± 5%  +34.6%  1218833 ± 2%  perf-stat.page-faults
4254 ± 7%  +58.5%  6741 ± 13%  perf-stat.path-length
6364 ± 25%  +84.1%  11715 ± 14%  proc-vmstat.allocstall_movable
46439 ± 12%  +100.4%  93049 ± 21%  proc-vmstat.compact_migrate_scanned
22425696 ± 10%  +29.0%  28932634 ± 6%  proc-vmstat.nr_active_anon
22875703 ± 11%  +29.2%  29560082 ± 6%  proc-vmstat.nr_anon_pages
44620 ± 11%  +29.2%  57643 ± 6%  proc-vmstat.nr_anon_transparent_hugepages
879436 ± 28%  -77.3%  199768 ± 98%  proc-vmstat.nr_dirty_background_threshold
1761029 ± 28%  -77.3%  400034 ± 98%  proc-vmstat.nr_dirty_threshold
715724  +17.6%  841386 ± 3%  proc-vmstat.nr_file_pages
8960545 ± 28%  -76.4%  2111248 ± 93%  proc-vmstat.nr_free_pages
904330 ± 18%  +31.2%  1186458 ± 7%  proc-vmstat.nr_inactive_anon
11137 ± 2%  +27.1%  14154 ± 9%  proc-vmstat.nr_isolated_anon
12566  +3.5%  13012  proc-vmstat.nr_kernel_stack
97491 ± 2%  +52.6%  148790 ± 10%  proc-vmstat.nr_page_table_pages
17674  +6.2%  18763 ± 2%  proc-vmstat.nr_slab_reclaimable
44820  +13.0%  50645 ± 2%  proc-vmstat.nr_slab_unreclaimable
135763 ± 9%  +68.4%  228600 ± 6%  proc-vmstat.nr_vmscan_write
1017 ± 10%  +54.7%  1573 ± 14%  proc-vmstat.nr_writeback
220023 ± 5%  +73.5%  381732 ± 6%  proc-vmstat.nr_written
22425696 ± 10%  +29.0%  28932635 ± 6%  proc-vmstat.nr_zone_active_anon
904330 ± 18%  +31.2%  1186457 ± 7%  proc-vmstat.nr_zone_inactive_anon
1018 ± 10%  +55.3%  1581 ± 13%  proc-vmstat.nr_zone_write_pending
145368 ± 48%  +63.1%  237050 ± 17%  proc-vmstat.numa_foreign
671.50 ± 96%  +479.4%  3890 ± 71%  proc-vmstat.numa_hint_faults
1122389 ± 9%  +17.2%  1315380 ± 4%  proc-vmstat.numa_hit
214722 ± 5%  +21.6%  261076 ± 3%  proc-vmstat.numa_huge_pte_updates
1108142 ± 9%  +17.4%  1300857 ± 4%  proc-vmstat.numa_local
145368 ± 48%  +63.1%  237050 ± 17%  proc-vmstat.numa_miss
159615 ± 44%  +57.6%  251573 ± 16%  proc-vmstat.numa_other
185.50 ± 81%  +8278.6%  15542 ± 40%  proc-vmstat.numa_pages_migrated
1.1e+08 ± 5%  +21.6%  1.337e+08 ± 3%  proc-vmstat.numa_pte_updates
688332 ± 106%  +177.9%  1913062 ± 3%  proc-vmstat.pgalloc_dma32
72593045 ± 10%  +51.1%  1.097e+08 ± 3%  proc-vmstat.pgdeactivate
919059 ± 4%  +35.1%  1241472 ± 2%  proc-vmstat.pgfault
3716 ± 45%  +120.3%  8186 ± 25%  proc-vmstat.pgmajfault
7.25 ± 26%  +4.2e+05%  30239 ± 25%  proc-vmstat.pgmigrate_fail
5340 ± 106%  +264.0%  19438 ± 33%  proc-vmstat.pgmigrate_success
2.837e+08 ± 10%  +51.7%  4.303e+08 ± 3%  proc-vmstat.pgpgout
211428 ± 6%  +74.1%  368188 ± 4%  proc-vmstat.pgrefill
219051 ± 5%  +73.7%  380419 ± 5%  proc-vmstat.pgrotated
559397 ± 8%  +43.0%  800110 ± 11%  proc-vmstat.pgscan_direct
32894 ± 59%  +158.3%  84981 ± 23%  proc-vmstat.pgscan_kswapd
207042 ± 8%  +71.5%  355174 ± 5%  proc-vmstat.pgsteal_direct
14745 ± 65%  +104.3%  30121 ± 18%  proc-vmstat.pgsteal_kswapd
70934968 ± 10%  +51.7%  1.076e+08 ± 3%  proc-vmstat.pswpout
5852284 ± 12%  +145.8%  14382881 ± 5%  proc-vmstat.slabs_scanned
13453 ± 24%  +204.9%  41023 ± 8%  proc-vmstat.thp_split_page_failed
138385 ± 10%  +51.6%  209783 ± 3%  proc-vmstat.thp_split_pmd
138385 ± 10%  +51.6%  209782 ± 3%  proc-vmstat.thp_swpout
4.61 ± 24%  -1.2  3.37 ± 10%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle
2.90 ± 18%  -0.7  2.19 ± 8%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
1.86 ± 32%  -0.6  1.22 ± 7%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
1.60 ± 31%  -0.5  1.08 ± 10%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
2.98 ± 8%  -0.5  2.48 ± 10%  perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
1.46 ± 32%  -0.5  0.96 ± 9%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
0.74 ± 27%  -0.5  0.28 ± 100%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
1.03 ± 52%  -0.4  0.63 ± 15%  perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
1.17 ± 19%  -0.3  0.87 ± 11%  perf-profile.calltrace.cycles-pp.tick_nohz_next_event.tick_nohz_get_sleep_length.menu_select.do_idle.cpu_startup_entry
0.80 ± 17%  +0.4  1.22 ± 7%  perf-profile.calltrace.cycles-pp.nvme_queue_rq.__blk_mq_try_issue_directly.blk_mq_try_issue_directly.blk_mq_make_request.generic_make_request
0.81 ± 16%  +0.4  1.23 ± 8%  perf-profile.calltrace.cycles-pp.blk_mq_try_issue_directly.blk_mq_make_request.generic_make_request.submit_bio.__swap_writepage
0.81 ± 16%  +0.4  1.23 ± 8%  perf-profile.calltrace.cycles-pp.__blk_mq_try_issue_directly.blk_mq_try_issue_directly.blk_mq_make_request.generic_make_request.submit_bio
0.52 ± 60%  +0.5  1.03 ± 6%  perf-profile.calltrace.cycles-pp.dma_pool_alloc.nvme_queue_rq.__blk_mq_try_issue_directly.blk_mq_try_issue_directly.blk_mq_make_request
1.64 ± 16%  +0.6  2.21 ± 15%  perf-profile.calltrace.cycles-pp.find_next_bit.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats.blk_account_io_done
0.51 ± 64%  +0.8  1.31 ± 17%  perf-profile.calltrace.cycles-pp.bt_iter.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats.blk_account_io_start
0.80 ± 25%  +0.9  1.67 ± 19%  perf-profile.calltrace.cycles-pp.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats.blk_account_io_start.blk_mq_make_request
0.82 ± 24%  +0.9  1.71 ± 19%  perf-profile.calltrace.cycles-pp.blk_mq_in_flight.part_round_stats.blk_account_io_start.blk_mq_make_request.generic_make_request
0.82 ± 25%  +0.9  1.73 ± 19%  perf-profile.calltrace.cycles-pp.part_round_stats.blk_account_io_start.blk_mq_make_request.generic_make_request.submit_bio
0.87 ± 25%  +1.0  1.87 ± 14%  perf-profile.calltrace.cycles-pp.blk_account_io_start.blk_mq_make_request.generic_make_request.submit_bio.__swap_writepage
2.05 ± 15%  +1.4  3.48 ± 7%  perf-profile.calltrace.cycles-pp.generic_make_request.submit_bio.__swap_writepage.pageout.shrink_page_list
2.09 ± 15%  +1.4  3.53 ± 7%  perf-profile.calltrace.cycles-pp.__swap_writepage.pageout.shrink_page_list.shrink_inactive_list.shrink_node_memcg
2.05 ± 15%  +1.4  3.49 ± 7%  perf-profile.calltrace.cycles-pp.blk_mq_make_request.generic_make_request.submit_bio.__swap_writepage.pageout
2.06 ± 15%  +1.4  3.50 ± 7%  perf-profile.calltrace.cycles-pp.submit_bio.__swap_writepage.pageout.shrink_page_list.shrink_inactive_list
2.10 ± 15%  +1.4  3.54 ± 6%  perf-profile.calltrace.cycles-pp.pageout.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node
3.31 ± 12%  +1.5  4.83 ± 5%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
3.33 ± 12%  +1.5  4.86 ± 5%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
3.40 ± 12%  +1.6  4.97 ± 6%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
3.57 ± 12%  +1.6  5.19 ± 7%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
3.48 ± 12%  +1.6  5.13 ± 6%  perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
3.48 ± 12%  +1.6  5.13 ± 6%  perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.do_huge_pmd_anonymous_page
3.49 ± 12%  +1.6  5.13 ± 6%  perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.do_huge_pmd_anonymous_page.__handle_mm_fault
3.51 ± 12%  +1.7  5.17 ± 6%  perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
3.34 ± 14%  +2.2  5.57 ± 21%  perf-profile.calltrace.cycles-pp.blk_mq_check_inflight.bt_iter.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats
9.14 ± 17%  +5.5  14.66 ± 22%  perf-profile.calltrace.cycles-pp.bt_iter.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats.blk_account_io_done
12.10 ± 17%  +6.8  18.89 ± 20%  perf-profile.calltrace.cycles-pp.blk_mq_queue_tag_busy_iter.blk_mq_in_flight.part_round_stats.blk_account_io_done.blk_mq_end_request
13.88 ± 15%  +7.1  21.01 ± 20%  perf-profile.calltrace.cycles-pp.handle_irq.do_IRQ.ret_from_intr.cpuidle_enter_state.do_idle
13.87 ± 15%  +7.1  21.00 ± 20%  perf-profile.calltrace.cycles-pp.handle_edge_irq.handle_irq.do_IRQ.ret_from_intr.cpuidle_enter_state
13.97 ± 15%  +7.1  21.10 ± 20%  perf-profile.calltrace.cycles-pp.ret_from_intr.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
13.94 ± 15%  +7.1  21.07 ± 20%  perf-profile.calltrace.cycles-pp.do_IRQ.ret_from_intr.cpuidle_enter_state.do_idle.cpu_startup_entry
12.50 ± 17%  +7.2  19.65 ± 20%  perf-profile.calltrace.cycles-pp.blk_account_io_done.blk_mq_end_request.blk_mq_complete_request.nvme_irq.__handle_irq_event_percpu
12.70 ± 17%  +7.2  19.86 ± 21%  perf-profile.calltrace.cycles-pp.blk_mq_end_request.blk_mq_complete_request.nvme_irq.__handle_irq_event_percpu.handle_irq_event_percpu
12.48 ± 17%  +7.2  19.65 ± 20%  perf-profile.calltrace.cycles-pp.part_round_stats.blk_account_io_done.blk_mq_end_request.blk_mq_complete_request.nvme_irq
12.46 ± 17%  +7.2  19.63 ± 21%  perf-profile.calltrace.cycles-pp.blk_mq_in_flight.part_round_stats.blk_account_io_done.blk_mq_end_request.blk_mq_complete_request
14.78 ± 18%  +8.1  22.83 ± 20%  perf-profile.calltrace.cycles-pp.blk_mq_complete_request.nvme_irq.__handle_irq_event_percpu.handle_irq_event_percpu.handle_irq_event
14.87 ± 18%  +8.1  22.95 ± 21%  perf-profile.calltrace.cycles-pp.nvme_irq.__handle_irq_event_percpu.handle_irq_event_percpu.handle_irq_event.handle_edge_irq
14.89 ± 18%  +8.1  22.98 ± 21%  perf-profile.calltrace.cycles-pp.handle_irq_event_percpu.handle_irq_event.handle_edge_irq.handle_irq.do_IRQ
14.90 ± 18%  +8.1  22.99 ± 21%  perf-profile.calltrace.cycles-pp.handle_irq_event.handle_edge_irq.handle_irq.do_IRQ.ret_from_intr
14.88 ± 18%  +8.1  22.97 ± 21%  perf-profile.calltrace.cycles-pp.__handle_irq_event_percpu.handle_irq_event_percpu.handle_irq_event.handle_edge_irq.handle_irq
4.79 ± 22%  -1.3  3.52 ± 9%  perf-profile.children.cycles-pp.hrtimer_interrupt
3.04 ± 16%  -0.7  2.30 ± 8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
1.98 ± 29%  -0.7  1.29 ± 7%  perf-profile.children.cycles-pp.tick_sched_timer
1.70 ± 27%  -0.6  1.14 ± 9%  perf-profile.children.cycles-pp.tick_sched_handle
1.57 ± 29%  -0.5  1.03 ± 10%  perf-profile.children.cycles-pp.update_process_times
3.02 ± 8%  -0.5  2.52 ± 10%  perf-profile.children.cycles-pp.menu_select
1.19 ± 19%  -0.3  0.89 ± 10%  perf-profile.children.cycles-pp.tick_nohz_next_event
0.81 ± 25%  -0.2  0.56 ± 11%  perf-profile.children.cycles-pp.scheduler_tick
0.42 ± 19%  -0.1  0.30 ± 14%  perf-profile.children.cycles-pp._raw_spin_lock
0.27 ± 13%  -0.1  0.21 ± 15%  perf-profile.children.cycles-pp.hrtimer_next_event_without
0.11 ± 35%  -0.0  0.07 ± 19%  perf-profile.children.cycles-pp.run_local_timers
0.10 ± 15%  -0.0  0.07 ± 28%  perf-profile.children.cycles-pp.cpu_load_update
0.14 ± 9%  -0.0  0.11 ± 15%  perf-profile.children.cycles-pp.perf_event_task_tick
0.07 ± 17%  +0.0  0.10 ± 12%  perf-profile.children.cycles-pp.blk_flush_plug_list
0.07 ± 17%  +0.0  0.10 ± 12%  perf-profile.children.cycles-pp.blk_mq_flush_plug_list
0.06 ± 26%  +0.0  0.11 ± 17%  perf-profile.children.cycles-pp.read
0.07 ± 17%  +0.0  0.11 ± 36%  perf-profile.children.cycles-pp.deferred_split_scan
0.15 ± 14%  +0.1  0.21 ± 13%  perf-profile.children.cycles-pp.blk_mq_sched_dispatch_requests
0.15 ± 16%  +0.1  0.21 ± 15%  perf-profile.children.cycles-pp.blk_mq_dispatch_rq_list
0.15 ± 14%  +0.1  0.22 ± 14%  perf-profile.children.cycles-pp.__blk_mq_run_hw_queue
0.08 ± 23%  +0.1  0.14 ± 34%  perf-profile.children.cycles-pp.shrink_slab
0.08 ± 23%  +0.1  0.14 ± 34%  perf-profile.children.cycles-pp.do_shrink_slab
0.72 ± 19%  +0.3  1.00 ± 8%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.44 ± 18%  +0.3  0.74 ± 14%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.86 ± 18%  +0.4  1.30 ± 9%  perf-profile.children.cycles-pp.blk_mq_try_issue_directly
0.88 ± 18%  +0.5  1.34 ± 9%  perf-profile.children.cycles-pp.__blk_mq_try_issue_directly
0.82 ± 19%  +0.5  1.30 ± 9%  perf-profile.children.cycles-pp.dma_pool_alloc
1.02 ± 15%  +0.5  1.55 ± 9%  perf-profile.children.cycles-pp.nvme_queue_rq
1.07 ± 15%  +0.7  1.76 ± 21%  perf-profile.children.cycles-pp.__indirect_thunk_start
2.42 ± 16%  +0.7  3.16 ± 13%  perf-profile.children.cycles-pp.find_next_bit
0.95 ± 26%  +1.1  2.00 ± 15%  perf-profile.children.cycles-pp.blk_account_io_start
2.19 ± 17%  +1.5  3.72 ± 8%  perf-profile.children.cycles-pp.blk_mq_make_request
2.20 ± 17%  +1.5  3.73 ± 8%  perf-profile.children.cycles-pp.submit_bio
2.20 ± 17%  +1.5  3.73 ± 8%  perf-profile.children.cycles-pp.generic_make_request
2.21 ± 17%  +1.5  3.76 ± 8%  perf-profile.children.cycles-pp.__swap_writepage
2.23 ± 17%  +1.5  3.77 ± 7%  perf-profile.children.cycles-pp.pageout
3.60 ± 13%  +1.6  5.22 ± 7%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
3.51 ± 15%  +1.6  5.15 ± 6%  perf-profile.children.cycles-pp.shrink_page_list
3.48 ± 12%  +1.6  5.13 ± 6%  perf-profile.children.cycles-pp.do_try_to_free_pages
3.53 ± 14%  +1.6  5.17 ± 6%  perf-profile.children.cycles-pp.shrink_inactive_list
3.49 ± 12%  +1.6  5.13 ± 6%  perf-profile.children.cycles-pp.try_to_free_pages
3.51 ± 12%  +1.7  5.19 ± 6%  perf-profile.children.cycles-pp.__alloc_pages_slowpath
3.61 ± 14%  +1.7  5.30 ± 6%  perf-profile.children.cycles-pp.shrink_node_memcg
3.69 ± 15%  +1.8  5.45 ± 7%  perf-profile.children.cycles-pp.shrink_node
4.10 ± 17%  +2.4  6.47 ± 18%  perf-profile.children.cycles-pp.blk_mq_check_inflight
10.64 ± 16%  +6.8  17.39 ± 17%  perf-profile.children.cycles-pp.bt_iter
13.17 ± 16%  +7.4  20.59 ± 18%  perf-profile.children.cycles-pp.blk_account_io_done
13.39 ± 16%  +7.4  20.81 ± 19%  perf-profile.children.cycles-pp.blk_mq_end_request
15.67 ± 17%  +8.2  23.84 ± 20%  perf-profile.children.cycles-pp.handle_irq
15.66 ± 17%  +8.2  23.83 ± 20%  perf-profile.children.cycles-pp.handle_edge_irq
15.57 ± 17%  +8.2  23.73 ± 20%  perf-profile.children.cycles-pp.nvme_irq
15.60 ± 17%  +8.2  23.77 ± 20%  perf-profile.children.cycles-pp.handle_irq_event
15.58 ± 17%  +8.2  23.75 ± 20%  perf-profile.children.cycles-pp.__handle_irq_event_percpu
15.59 ± 17%  +8.2  23.77 ± 20%  perf-profile.children.cycles-pp.handle_irq_event_percpu
15.75 ± 17%  +8.2  23.93 ± 20%  perf-profile.children.cycles-pp.ret_from_intr
15.73 ± 17%  +8.2  23.91 ± 20%  perf-profile.children.cycles-pp.do_IRQ
15.54 ± 17%  +8.2  23.73 ± 19%  perf-profile.children.cycles-pp.blk_mq_complete_request
14.07 ± 16%  +8.4  22.45 ± 16%  perf-profile.children.cycles-pp.part_round_stats
14.05 ± 16%  +8.4  22.45 ± 16%  perf-profile.children.cycles-pp.blk_mq_queue_tag_busy_iter
14.05 ± 16%  +8.4  22.45 ± 16%  perf-profile.children.cycles-pp.blk_mq_in_flight
0.38 ± 20%  -0.1  0.28 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock
0.10 ± 13%  -0.0  0.07 ± 17%  perf-profile.self.cycles-pp.idle_cpu
0.12 ± 14%  -0.0  0.09 ± 7%  perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
0.10 ± 15%  -0.0  0.07 ± 28%  perf-profile.self.cycles-pp.cpu_load_update
0.14 ± 9%  -0.0  0.11 ± 15%  perf-profile.self.cycles-pp.perf_event_task_tick
0.62 ± 16%  +0.3  0.89 ± 12%  perf-profile.self.cycles-pp.dma_pool_alloc
0.44 ± 18%  +0.3  0.74 ± 14%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.81 ± 16%  +0.5  1.32 ± 25%  perf-profile.self.cycles-pp.__indirect_thunk_start
2.07 ± 16%  +0.6  2.69 ± 12%  perf-profile.self.cycles-pp.find_next_bit
1.77 ± 16%  +1.1  2.91 ± 15%  perf-profile.self.cycles-pp.blk_mq_queue_tag_busy_iter
3.82 ± 16%  +2.2  6.00 ± 18%  perf-profile.self.cycles-pp.blk_mq_check_inflight
6.43 ± 15%  +4.1  10.56 ± 16%  perf-profile.self.cycles-pp.bt_iter

                       vm-scalability.time.system_time

  [ ASCII trend plot garbled in transit; y-axis 0-600 seconds: bisect-bad
    (O) samples cluster around 400-500 while bisect-good (+) samples stay
    around 200-300 ]

                 vm-scalability.time.maximum_resident_set_size

  [ ASCII trend plot garbled in transit; y-axis 0-2.5e+07: bisect-good (+)
    samples sit near 2e+07 while bisect-bad (O) samples drop to roughly
    5e+06-1.5e+07 ]

[*] bisect-good sample
[O] bisect-bad sample

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Rong Chen