Greeting,

FYI, we noticed an 18.6% improvement of vm-scalability.median due to commit:

commit: dd12385915f4f83b738467e599b053c33dffbd48 ("[PATCH] mm: fix COW faults after mlock()")
url: https://github.com/0day-ci/linux/commits/Yury-Norov/mm-fix-COW-faults-after-mlock/20180925-174527
base: https://github.com/thesofproject/linux master

in testcase: vm-scalability
on test machine: 80 threads Skylake with 64G memory
with following parameters:

	runtime: 300s
	size: 1T
	test: msync
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/300s/1T/lkp-ivb-d02/msync/vm-scalability/0x20

commit:
  dd52cb8790 (" platform-drivers-x86 for v4.17-4")
  dd12385915 ("mm: fix COW faults after mlock()")

dd52cb879063ca54 dd12385915f4f83b738467e599
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_error_entry/0x
0:4 18% 1:4 perf-profile.calltrace.cycles-pp.error_entry
2:4 -18% 2:4 perf-profile.children.cycles-pp.error_entry
0:4 0% 0:4 perf-profile.children.cycles-pp.schedule_timeout
2:4 -17% 1:4 perf-profile.self.cycles-pp.error_entry

%stddev %change %stddev
\ | \
277824 ± 4% +18.6% 329586 ± 6% vm-scalability.median
0.46 ± 6% -46.9% 0.24 ± 27% vm-scalability.median_stddev
2817 ± 26% +5.7e+06% 1.608e+08 ± 6%
vm-scalability.time.file_system_inputs
6509 ± 28% -73.7% 1714 ± 3% vm-scalability.time.major_page_faults
142.50 ± 5% -40.0% 85.50 ± 4% vm-scalability.time.percent_of_cpu_this_job_got
281.87 ± 6% -35.2% 182.76 ± 5% vm-scalability.time.system_time
152.54 ± 6% -46.3% 81.85 ± 5% vm-scalability.time.user_time
448928 ± 6% -31.1% 309105 ± 8% vm-scalability.time.voluntary_context_switches
2.093e+08 ± 6% -44.9% 1.154e+08 ± 5% vm-scalability.workload
18807972 ± 5% -27.3% 13664118 ± 6% interrupts.CAL:Function_call_interrupts
88890 ± 2% -10.4% 79618 ± 2% softirqs.RCU
100860 ± 2% +31.5% 132623 ± 2% softirqs.SCHED
9.00 +2.8e+06% 256051 ± 5% vmstat.io.bi
6.00 -50.0% 3.00 vmstat.memory.buff
67870 ± 4% -25.0% 50893 ± 5% vmstat.system.in
14.04 ± 8% +2.4 16.47 ± 7% mpstat.cpu.idle%
29.68 ± 7% +13.5 43.20 ± 2% mpstat.cpu.iowait%
42.68 ± 5% -10.1 32.62 ± 5% mpstat.cpu.sys%
13.58 ± 4% -5.9 7.70 ± 4% mpstat.cpu.usr%
294613 +88.5% 555281 slabinfo.buffer_head.active_objs
7559 +88.5% 14248 slabinfo.buffer_head.active_slabs
294837 +88.5% 555706 slabinfo.buffer_head.num_objs
7559 +88.5% 14248 slabinfo.buffer_head.num_slabs
817.75 ± 6% -9.7% 738.25 ± 7% slabinfo.proc_inode_cache.num_objs
454.00 ± 8% +24.4% 565.00 ± 18% slabinfo.skbuff_head_cache.active_objs
13651918 ± 29% +56.9% 21419299 ± 8% cpuidle.C1.time
1.931e+08 ± 6% +51.1% 2.919e+08 ± 2% cpuidle.C1E.time
508755 ± 6% +33.4% 678785 ± 2% cpuidle.C1E.usage
16723882 ± 4% +73.1% 28943328 cpuidle.C3.time
21421 ± 3% +78.8% 38298 cpuidle.C3.usage
3.002e+08 ± 5% +29.2% 3.879e+08 ± 4% cpuidle.C6.time
312454 ± 4% +28.8% 402526 ± 4% cpuidle.C6.usage
107432 ± 20% +61.4% 173441 ± 12% cpuidle.POLL.time
18727 ± 8% -25.9% 13876 ± 6% cpuidle.POLL.usage
1073224 ± 2% +17.7% 1263701 meminfo.Active(file)
308786 ± 3% +18.0% 364452 ± 2% meminfo.Dirty
1342759 ± 2% -20.9% 1061562 ± 3% meminfo.Inactive
172701 ± 16% -39.3% 104801 ± 3% meminfo.Inactive(anon)
1170057 -18.2% 956761 ± 3% meminfo.Inactive(file)
13762 -20.3% 10968 ± 2% meminfo.PageTables
85607 +29.8%
111114 meminfo.SReclaimable
107046 +23.8% 132534 meminfo.Slab
37561 ± 4% +23.5% 46379 ± 4% meminfo.Writeback
84144 ± 7% -26.0% 62266 ± 5% sched_debug.cfs_rq:/.exec_clock.avg
85958 ± 6% -24.6% 64822 ± 5% sched_debug.cfs_rq:/.exec_clock.max
82818 ± 7% -26.6% 60788 ± 5% sched_debug.cfs_rq:/.exec_clock.min
209438 ± 7% -41.9% 121723 ± 4% sched_debug.cfs_rq:/.min_vruntime.avg
218172 ± 8% -40.0% 130873 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
199461 ± 6% -43.1% 113416 ± 7% sched_debug.cfs_rq:/.min_vruntime.min
1399 ± 2% -16.2% 1172 ± 11% sched_debug.cpu.curr->pid.stddev
15.87 ± 25% +104.2% 32.42 ± 35% sched_debug.cpu.nr_uninterruptible.max
-14.25 +91.2% -27.25 sched_debug.cpu.nr_uninterruptible.min
11.61 ± 28% +98.0% 22.99 ± 28% sched_debug.cpu.nr_uninterruptible.stddev
127503 ± 14% +26.0% 160648 ± 4% sched_debug.cpu.sched_goidle.min
0.00 ± 94% -91.3% 0.00 ±158% sched_debug.rt_rq:/.rt_time.stddev
1879 ± 4% -27.7% 1358 ± 5% turbostat.Avg_MHz
57.42 ± 4% -15.7 41.70 ± 4% turbostat.Busy%
1.11 ± 30% +0.6 1.71 ± 7% turbostat.C1%
508750 ± 6% +33.4% 678785 ± 2% turbostat.C1E
15.71 ± 7% +7.6 23.32 ± 2% turbostat.C1E%
21419 ± 3% +78.8% 38298 turbostat.C3
1.36 ± 5% +1.0 2.31 turbostat.C3%
312453 ± 4% +28.8% 402527 ± 4% turbostat.C6
24.42 ± 6% +6.6 31.00 ± 5% turbostat.C6%
30.15 ± 8% +39.3% 42.01 ± 3% turbostat.CPU%c1
1.09 ± 5% +60.0% 1.75 ± 6% turbostat.CPU%c3
11.33 ± 9% +28.3% 14.54 ± 6% turbostat.CPU%c6
12.77 ± 3% -18.3% 10.43 ± 3% turbostat.CorWatt
39798103 ± 5% -25.5% 29652069 ± 6% turbostat.IRQ
30.09 -8.0% 27.69 turbostat.PkgWatt
4.902e+11 ± 5% -41.5% 2.869e+11 ± 28% perf-stat.branch-instructions
3.636e+09 ± 8% -32.3% 2.461e+09 ± 28% perf-stat.branch-misses
66.28 -7.3 58.95 perf-stat.cache-miss-rate%
9.761e+09 ± 7% -26.2% 7.2e+09 ± 28% perf-stat.cache-misses
1.11 +5.9% 1.17 perf-stat.cpi
2.298e+12 ± 6% -35.6% 1.48e+12 ± 27% perf-stat.cpu-cycles
19198 ± 5% +14.7% 22028 ± 5% perf-stat.cpu-migrations
3.858e+09 ± 15% -42.9% 2.205e+09 ± 16% perf-stat.dTLB-load-misses
5.213e+11 ±
5% -36.1% 3.332e+11 ± 28% perf-stat.dTLB-loads
0.12 ± 9% -0.0 0.09 ± 18% perf-stat.dTLB-store-miss-rate%
3.446e+08 ± 10% -52.5% 1.638e+08 ± 17% perf-stat.dTLB-store-misses
2.839e+11 ± 5% -35.2% 1.839e+11 ± 28% perf-stat.dTLB-stores
74.39 +7.5 81.93 perf-stat.iTLB-load-miss-rate%
3.687e+08 ± 8% -45.1% 2.024e+08 ± 27% perf-stat.iTLB-load-misses
1.272e+08 ± 10% -65.7% 43583231 ± 21% perf-stat.iTLB-loads
2.077e+12 ± 5% -39.0% 1.266e+12 ± 28% perf-stat.instructions
5646 ± 3% +10.3% 6228 ± 4% perf-stat.instructions-per-iTLB-miss
0.90 -5.5% 0.85 perf-stat.ipc
47044927 ± 6% -44.5% 26119814 ± 5% perf-stat.minor-faults
47046188 ± 6% -44.5% 26120941 ± 5% perf-stat.page-faults
22566 ± 39% -78.5% 4857 ± 4% proc-vmstat.allocstall_movable
120471 ± 14% -86.8% 15866 ± 5% proc-vmstat.allocstall_normal
7661 ± 28% -78.7% 1631 ± 11% proc-vmstat.compact_fail
7682 ± 28% -78.4% 1659 ± 11% proc-vmstat.compact_stall
2752 ± 5% -37.9% 1710 ± 4% proc-vmstat.kswapd_low_wmark_hit_quickly
267103 ± 2% +17.7% 314329 ± 2% proc-vmstat.nr_active_file
78076 ± 3% +17.3% 91559 proc-vmstat.nr_dirty
7811 ± 6% -21.7% 6118 ± 28% proc-vmstat.nr_free_cma
43289 ± 16% -39.4% 26248 ± 3% proc-vmstat.nr_inactive_anon
293783 -17.7% 241780 ± 4% proc-vmstat.nr_inactive_file
3445 -20.5% 2740 proc-vmstat.nr_page_table_pages
21416 +29.8% 27796 proc-vmstat.nr_slab_reclaimable
9153 ± 2% +26.1% 11542 ± 2% proc-vmstat.nr_writeback
267072 ± 2% +17.7% 314320 ± 2% proc-vmstat.nr_zone_active_file
43289 ± 16% -39.4% 26251 ± 3% proc-vmstat.nr_zone_inactive_anon
293694 -17.7% 241712 ± 4% proc-vmstat.nr_zone_inactive_file
87180 ± 3% +18.2% 103074 proc-vmstat.nr_zone_write_pending
85522443 ± 6% -43.5% 48347251 ± 5% proc-vmstat.numa_hit
85522443 ± 6% -43.5% 48347251 ± 5% proc-vmstat.numa_local
2958 ± 5% -35.3% 1913 ± 3% proc-vmstat.pageoutrun
51287873 ± 6% -43.2% 29135174 ± 5% proc-vmstat.pgactivate
28504604 ± 8% -18.2% 23322955 ± 5% proc-vmstat.pgalloc_dma32
57168735 ± 6% -56.0% 25151136 ± 8% proc-vmstat.pgalloc_normal
41153300 ± 7%
-46.5% 22026100 ± 6% proc-vmstat.pgdeactivate
85685675 ± 6% -43.4% 48485032 ± 5% proc-vmstat.pgfree
6509 ± 28% -73.7% 1714 ± 3% proc-vmstat.pgmajfault
2976 +2.7e+06% 80419921 ± 6% proc-vmstat.pgpgin
41153300 ± 7% -46.5% 22026099 ± 6% proc-vmstat.pgrefill
477249 ± 6% +72.3% 822527 ± 11% proc-vmstat.pgrotated
43943485 ± 5% -52.5% 20878181 ± 5% proc-vmstat.pgscan_direct
84942346 ± 6% -41.7% 49530834 ± 5% proc-vmstat.pgscan_kswapd
8102736 ± 7% -85.2% 1196177 ± 5% proc-vmstat.pgsteal_direct
47558626 ± 6% -39.2% 28921712 ± 6% proc-vmstat.pgsteal_kswapd
14421 -13.1% 12538 ± 3% proc-vmstat.slabs_scanned
2020457 ± 15% -66.2% 682455 ± 8% proc-vmstat.workingset_activate
17434531 ± 8% -44.0% 9754788 ± 5% proc-vmstat.workingset_refault
19.94 ± 25% -17.5 2.42 ±110% perf-profile.calltrace.cycles-pp.do_access
12.24 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.page_fault.do_access
12.22 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.do_access
12.21 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.do_access
11.34 ± 26% -10.0 1.37 ±109% perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.do_access
16.17 ± 23% -8.6 7.58 ± 25% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
9.39 ± 27% -8.1 1.29 ±110% perf-profile.calltrace.cycles-pp.do_rw_once
7.59 ± 23% -4.5 3.05 ± 30% perf-profile.calltrace.cycles-pp.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
7.53 ± 23% -4.5 3.00 ± 30% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault
6.69 ± 24% -3.3 3.44 ± 23% perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
6.55 ± 24% -3.2 3.38 ± 23% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
14.30 ± 5% -3.0 11.27 ± 7% perf-profile.calltrace.cycles-pp.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault.__do_fault
14.31 ± 5% -3.0 11.29 ± 8% perf-profile.calltrace.cycles-pp.ondemand_readahead.filemap_fault.__xfs_filemap_fault.__do_fault.__handle_mm_fault
15.40 ± 3% -2.7 12.73 ± 6% perf-profile.calltrace.cycles-pp.filemap_fault.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault
2.67 ± 41% -2.1 0.56 ±116% perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.filemap_fault
4.15 ± 27% -2.1 2.10 ± 12% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault
2.46 ± 19% -1.6 0.89 ± 68% perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
3.60 ± 24% -1.5 2.06 ± 40% perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
9.38 ± 3% -1.4 7.95 ± 14% perf-profile.calltrace.cycles-pp.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault
3.54 ± 3% -0.6 2.95 ± 15% perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.filemap_fault
0.31 ±100% +0.5 0.81 ± 17% perf-profile.calltrace.cycles-pp.xfs_bmap_add_extent_hole_delay.xfs_bmapi_reserve_delalloc.xfs_file_iomap_begin.iomap_apply.iomap_page_mkwrite
0.00 +0.6 0.63 ± 22% perf-profile.calltrace.cycles-pp.follow_page_pte.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff
0.34 ±102% +0.8 1.10 ± 40% perf-profile.calltrace.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.block_commit_write.iomap_page_mkwrite_actor
0.59 ± 66% +1.0 1.59 ± 26% perf-profile.calltrace.cycles-pp.mark_buffer_dirty.__block_commit_write.block_commit_write.iomap_page_mkwrite_actor.iomap_apply
0.00
+1.1 1.06 ± 36% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__get_user_pages
0.00 +1.1 1.11 ± 34% perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range
2.14 ± 34% +1.2 3.35 ± 19% perf-profile.calltrace.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite
1.14 ± 34% +1.2 2.35 ± 47% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.alloc_buffer_head.alloc_page_buffers.create_empty_buffers.create_page_buffers
1.24 ± 34% +1.2 2.46 ± 45% perf-profile.calltrace.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor
1.19 ± 33% +1.2 2.43 ± 46% perf-profile.calltrace.cycles-pp.alloc_buffer_head.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int
1.57 ± 33% +1.4 2.98 ± 35% perf-profile.calltrace.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply
1.60 ± 34% +1.4 3.04 ± 33% perf-profile.calltrace.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite
1.90 ± 32% +2.0 3.86 ± 38% perf-profile.calltrace.cycles-pp.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault
0.00 +2.1 2.09 ± 50% perf-profile.calltrace.cycles-pp.memcpy_erms.memcpy_to_page._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
0.00 +2.1 2.11 ± 50% perf-profile.calltrace.cycles-pp.memcpy_to_page._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev
0.00 +2.2 2.19 ± 50% perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev.do_iter_read
0.00 +2.2 2.22 ± 50% perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev.do_iter_read.loop_queue_work
3.25 ± 30% +2.9 6.13 ± 26%
perf-profile.calltrace.cycles-pp.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite
0.00 +3.0 3.00 ± 51% perf-profile.calltrace.cycles-pp.shmem_file_read_iter.do_iter_readv_writev.do_iter_read.loop_queue_work.kthread_worker_fn
0.00 +3.1 3.06 ± 51% perf-profile.calltrace.cycles-pp.do_iter_readv_writev.do_iter_read.loop_queue_work.kthread_worker_fn.kthread
0.00 +3.3 3.26 ± 52% perf-profile.calltrace.cycles-pp.do_iter_read.loop_queue_work.kthread_worker_fn.kthread.ret_from_fork
6.10 ± 23% +4.1 10.16 ± 21% perf-profile.calltrace.cycles-pp.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault
6.60 ± 23% +4.4 11.01 ± 19% perf-profile.calltrace.cycles-pp.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.calltrace.cycles-pp.secondary_startup_64
0.00 +9.5 9.53 ± 27% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__get_user_pages
0.00 +9.6 9.59 ± 27% perf-profile.calltrace.cycles-pp.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range
11.83 ± 21% +10.1 21.96 ± 18% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range.__mm_populate
11.85 ± 21% +10.4 22.23 ± 18% perf-profile.calltrace.cycles-pp.handle_mm_fault.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.populate_vma_page_range.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
19.94 ± 25% -17.5 2.42 ±110% perf-profile.children.cycles-pp.do_access
17.94 ± 23% -9.2 8.71 ± 25% perf-profile.children.cycles-pp.__do_page_fault
17.89 ± 23% -9.2 8.69 ± 25% perf-profile.children.cycles-pp.do_page_fault
17.91 ± 23% -9.2 8.73 ± 25% perf-profile.children.cycles-pp.page_fault
9.39 ± 27% -8.1 1.29 ±110% perf-profile.children.cycles-pp.do_rw_once
14.54 ± 6% -3.3 11.29 ± 8% perf-profile.children.cycles-pp.__do_page_cache_readahead
14.52 ± 5% -3.2 11.29 ± 7% perf-profile.children.cycles-pp.ondemand_readahead
15.55 ± 4% -2.8 12.78 ± 7% perf-profile.children.cycles-pp.filemap_fault
16.04 ± 4% -2.6 13.46 ± 6% perf-profile.children.cycles-pp.__do_fault
7.32 ± 17% -1.6 5.70 ± 14% perf-profile.children.cycles-pp.__alloc_pages_nodemask
5.75 ± 23% -1.6 4.13 ± 17% perf-profile.children.cycles-pp.__alloc_pages_slowpath
5.38 ± 22% -1.5 3.85 ± 16% perf-profile.children.cycles-pp.try_to_free_pages
5.38 ± 22% -1.5 3.85 ± 16% perf-profile.children.cycles-pp.do_try_to_free_pages
4.18 ± 7% -1.4 2.76 ± 29% perf-profile.children.cycles-pp.page_vma_mapped_walk
9.57 -1.4 8.19 ± 12% perf-profile.children.cycles-pp.mpage_readpages
1.76 ± 30% -1.3 0.48 ± 69% perf-profile.children.cycles-pp.pte_alloc_one
2.91 ± 17% -1.1 1.77 ± 35% perf-profile.children.cycles-pp.page_referenced_one
0.79 ± 22% -0.7 0.07 ±113% perf-profile.children.cycles-pp.filemap_map_pages
0.65 ± 39% -0.6 0.08 ± 70% perf-profile.children.cycles-pp.shrink_slab
1.12 ± 32% -0.5 0.66 ± 29% perf-profile.children.cycles-pp.free_pcppages_bulk
1.27 ± 36% -0.5 0.82 ± 38% perf-profile.children.cycles-pp.free_unref_page_list
3.68 ± 2% -0.4 3.24 ± 9% perf-profile.children.cycles-pp.add_to_page_cache_lru
0.75 ± 22% -0.3 0.46 ± 16% perf-profile.children.cycles-pp.__radix_tree_replace
0.30 ± 24% -0.1 0.15 ± 53% perf-profile.children.cycles-pp.find_vma
1.26 ± 3% -0.1 1.12 ± 6% perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.29 ± 9% -0.1 0.15 ± 23% perf-profile.children.cycles-pp.replace_slot
1.31 ± 3% -0.1 1.17 ± 8% perf-profile.children.cycles-pp.__lru_cache_add
0.20 ± 27% -0.1 0.06 ± 65%
perf-profile.children.cycles-pp.drain_local_pages_wq
0.20 ± 27% -0.1 0.06 ± 65% perf-profile.children.cycles-pp.drain_pages
0.20 ± 27% -0.1 0.06 ± 65% perf-profile.children.cycles-pp.drain_pages_zone
0.83 ± 11% -0.1 0.69 ± 16% perf-profile.children.cycles-pp.page_cache_tree_insert
0.59 ± 7% -0.1 0.46 ± 20% perf-profile.children.cycles-pp.unmap_page_range
0.58 ± 5% -0.1 0.46 ± 19% perf-profile.children.cycles-pp.unmap_vmas
0.21 ± 18% -0.1 0.10 ± 50% perf-profile.children.cycles-pp.page_mapped
0.13 ± 38% -0.1 0.04 ± 58% perf-profile.children.cycles-pp.super_cache_count
0.16 ± 19% -0.1 0.09 ± 27% perf-profile.children.cycles-pp.ptep_clear_flush_young
0.09 ± 27% -0.1 0.03 ±100% perf-profile.children.cycles-pp.down_read_trylock
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.find_next_and_bit
0.08 ± 32% +0.1 0.14 ± 34% perf-profile.children.cycles-pp.task_tick_fair
0.01 ±173% +0.1 0.08 ± 24% perf-profile.children.cycles-pp.cpumask_next_and
0.07 ± 59% +0.1 0.13 ± 30% perf-profile.children.cycles-pp.pmd_devmap_trans_unstable
0.02 ±173% +0.1 0.09 ± 21% perf-profile.children.cycles-pp.xfs_fsb_to_db
0.04 ±101% +0.1 0.14 ± 18% perf-profile.children.cycles-pp.kthread_queue_work
0.06 ±101% +0.1 0.15 ± 11% perf-profile.children.cycles-pp.loop_queue_rq
0.09 ± 28% +0.1 0.19 ± 27% perf-profile.children.cycles-pp.memset_erms
0.11 ± 38% +0.1 0.22 ± 15% perf-profile.children.cycles-pp.current_kernel_time64
0.07 ± 70% +0.1 0.19 ± 7% perf-profile.children.cycles-pp.blk_mq_dispatch_rq_list
0.00 +0.1 0.12 ± 65% perf-profile.children.cycles-pp.touch_atime
0.14 ± 18% +0.1 0.27 ± 31% perf-profile.children.cycles-pp.xfs_file_iomap_end
0.11 ± 70% +0.1 0.23 ± 8% perf-profile.children.cycles-pp.blk_mq_run_hw_queue
0.10 ± 69% +0.1 0.23 ± 6% perf-profile.children.cycles-pp.__blk_mq_run_hw_queue
0.10 ± 69% +0.1 0.23 ± 6% perf-profile.children.cycles-pp.blk_mq_sched_dispatch_requests
0.10 ± 65% +0.1 0.23 ± 7% perf-profile.children.cycles-pp.blk_mq_do_dispatch_sched
0.19 ± 15% +0.1 0.34 ± 18%
perf-profile.children.cycles-pp.current_time
0.35 ± 25% +0.2 0.51 ± 14% perf-profile.children.cycles-pp.try_to_wake_up
0.48 ± 19% +0.2 0.63 ± 5% perf-profile.children.cycles-pp.set_page_dirty
0.28 ± 19% +0.2 0.44 ± 24% perf-profile.children.cycles-pp.xfs_vn_update_time
0.07 ±101% +0.2 0.23 ± 7% perf-profile.children.cycles-pp.__blk_mq_delay_run_hw_queue
0.09 ± 43% +0.2 0.25 ± 46% perf-profile.children.cycles-pp.update_load_avg
0.80 ± 9% +0.2 0.98 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.55 ± 17% +0.2 0.73 ± 3% perf-profile.children.cycles-pp.radix_tree_tag_clear
0.46 ± 20% +0.2 0.65 ± 5% perf-profile.children.cycles-pp.xfs_iunlock
0.09 ±102% +0.2 0.28 ± 15% perf-profile.children.cycles-pp.blk_mq_flush_plug_list
0.10 ± 79% +0.2 0.30 ± 11% perf-profile.children.cycles-pp.blk_flush_plug_list
0.42 ± 20% +0.2 0.61 ± 19% perf-profile.children.cycles-pp.tick_sched_timer
0.49 ± 13% +0.2 0.69 ± 16% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.00 +0.2 0.20 ± 19% perf-profile.children.cycles-pp.blk_finish_plug
0.00 +0.2 0.23 ± 61% perf-profile.children.cycles-pp.mpage_end_io
1.46 ± 13% +0.2 1.68 ± 4% perf-profile.children.cycles-pp.__radix_tree_lookup
0.43 ± 14% +0.3 0.70 ± 18% perf-profile.children.cycles-pp.follow_page_pte
0.37 ± 28% +0.3 0.64 ± 18% perf-profile.children.cycles-pp.blk_mq_make_request
0.15 ± 39% +0.3 0.42 ± 83% perf-profile.children.cycles-pp.menu_select
0.65 ± 8% +0.3 0.94 ± 20% perf-profile.children.cycles-pp.file_update_time
0.45 ± 12% +0.3 0.76 ± 28% perf-profile.children.cycles-pp.radix_tree_tag_set
0.39 ± 28% +0.3 0.70 ± 17% perf-profile.children.cycles-pp.submit_bio
0.39 ± 28% +0.3 0.70 ± 17% perf-profile.children.cycles-pp.generic_make_request
0.73 ± 10% +0.4 1.11 ± 8% perf-profile.children.cycles-pp.radix_tree_lookup_slot
0.07 ±119% +0.4 0.46 ± 67% perf-profile.children.cycles-pp.__pte_alloc
0.65 ± 14% +0.4 1.06 ± 36% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.22 ± 37% +0.4 0.64 ± 12%
perf-profile.children.cycles-pp.follow_page_mask
0.92 ± 6% +0.5 1.46 ± 21% perf-profile.children.cycles-pp.alloc_set_pte
1.04 ± 8% +0.5 1.59 ± 19% perf-profile.children.cycles-pp.xfs_vm_set_page_dirty
1.18 ± 8% +0.7 1.83 ± 10% perf-profile.children.cycles-pp.find_get_entry
0.69 ± 11% +0.7 1.42 ± 19% perf-profile.children.cycles-pp.finish_fault
0.00 +0.8 0.77 ± 70% perf-profile.children.cycles-pp.mempool_alloc
0.00 +0.8 0.77 ± 71% perf-profile.children.cycles-pp.bio_alloc_bioset
1.22 ± 16% +0.8 2.06 ± 50% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
1.00 ± 28% +0.9 1.86 ± 20% perf-profile.children.cycles-pp.mark_buffer_dirty
1.22 ± 15% +0.9 2.09 ± 50% perf-profile.children.cycles-pp.apic_timer_interrupt
1.22 ± 25% +1.0 2.23 ± 16% perf-profile.children.cycles-pp.__block_commit_write
1.18 ± 26% +1.0 2.22 ± 16% perf-profile.children.cycles-pp.block_commit_write
2.56 ± 24% +1.1 3.69 ± 16% perf-profile.children.cycles-pp.xfs_file_iomap_begin
1.32 ± 27% +1.4 2.77 ± 41% perf-profile.children.cycles-pp.alloc_buffer_head
1.41 ± 26% +1.5 2.87 ± 41% perf-profile.children.cycles-pp.alloc_page_buffers
1.75 ± 26% +1.5 3.29 ± 36% perf-profile.children.cycles-pp.create_empty_buffers
1.78 ± 27% +1.6 3.34 ± 34% perf-profile.children.cycles-pp.create_page_buffers
2.17 ± 24% +1.8 4.00 ± 35% perf-profile.children.cycles-pp.__block_write_begin_int
0.84 ± 36% +1.9 2.71 ± 31% perf-profile.children.cycles-pp.new_slab
0.88 ± 35% +1.9 2.76 ± 30% perf-profile.children.cycles-pp.___slab_alloc
0.88 ± 35% +1.9 2.76 ± 30% perf-profile.children.cycles-pp.__slab_alloc
0.00 +2.1 2.11 ± 50% perf-profile.children.cycles-pp.memcpy_to_page
0.00 +2.2 2.20 ± 50% perf-profile.children.cycles-pp._copy_to_iter
1.41 ± 25% +2.2 3.61 ± 26% perf-profile.children.cycles-pp.kmem_cache_alloc
0.00 +2.3 2.25 ± 50% perf-profile.children.cycles-pp.copy_page_to_iter
3.42 ± 23% +3.0 6.38 ± 26% perf-profile.children.cycles-pp.iomap_page_mkwrite_actor
0.00 +3.0 3.01 ± 51%
perf-profile.children.cycles-pp.shmem_file_read_iter
0.00 +3.3 3.30 ± 53% perf-profile.children.cycles-pp.do_iter_read
6.12 ± 23% +4.2 10.27 ± 21% perf-profile.children.cycles-pp.iomap_apply
2.14 ± 22% +4.3 6.46 ± 22% perf-profile.children.cycles-pp.intel_idle
6.65 ± 24% +4.4 11.08 ± 19% perf-profile.children.cycles-pp.iomap_page_mkwrite
7.61 ± 24% +5.0 12.65 ± 17% perf-profile.children.cycles-pp.do_page_mkwrite
2.75 ± 26% +5.5 8.29 ± 35% perf-profile.children.cycles-pp.cpuidle_enter_state
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.children.cycles-pp.secondary_startup_64
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.children.cycles-pp.cpu_startup_entry
3.01 ± 25% +6.0 8.97 ± 38% perf-profile.children.cycles-pp.do_idle
16.28 ± 22% +10.9 27.21 ± 15% perf-profile.children.cycles-pp.do_syscall_64
16.28 ± 22% +11.0 27.24 ± 15% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
12.56 ± 21% +11.2 23.80 ± 17% perf-profile.children.cycles-pp.vm_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.children.cycles-pp.__mm_populate
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.children.cycles-pp.populate_vma_page_range
12.55 ± 21% +11.3 23.80 ± 17% perf-profile.children.cycles-pp.ksys_mmap_pgoff
12.54 ± 21% +11.3 23.81 ± 17% perf-profile.children.cycles-pp.__get_user_pages
9.01 ± 26% -7.8 1.25 ±110% perf-profile.self.cycles-pp.do_rw_once
3.84 ± 24% -3.4 0.42 ±108% perf-profile.self.cycles-pp.do_access
2.89 ± 9% -1.0 1.88 ± 30% perf-profile.self.cycles-pp.page_vma_mapped_walk
4.27 ± 3% -0.9 3.35 ± 18% perf-profile.self.cycles-pp.do_mpage_readpage
1.14 ± 16% -0.4 0.70 ± 41% perf-profile.self.cycles-pp.page_referenced_one
0.39 ± 11% -0.1 0.24 ± 28% perf-profile.self.cycles-pp.xfs_get_blocks
0.29 ± 9% -0.1 0.15 ± 23% perf-profile.self.cycles-pp.replace_slot
0.40 ± 20% -0.1 0.26 ± 33% perf-profile.self.cycles-pp.page_referenced
0.21 ± 20% -0.1 0.10 ± 50% perf-profile.self.cycles-pp.page_mapped
0.16 ± 19% -0.1 0.09 ± 27% perf-profile.self.cycles-pp.ptep_clear_flush_young
0.09 ± 28%
-0.1 0.03 ±100% perf-profile.self.cycles-pp.down_read_trylock
0.17 ± 3% -0.0 0.14 ± 12% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.07 ± 14% +0.0 0.12 ± 21% perf-profile.self.cycles-pp.kmem_cache_free
0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.find_next_and_bit
0.09 ± 27% +0.1 0.15 ± 19% perf-profile.self.cycles-pp.iomap_page_mkwrite_actor
0.01 ±173% +0.1 0.08 ± 23% perf-profile.self.cycles-pp.radix_tree_lookup_slot
0.02 ±173% +0.1 0.09 ± 21% perf-profile.self.cycles-pp.xfs_fsb_to_db
0.09 ± 36% +0.1 0.16 ± 14% perf-profile.self.cycles-pp.__mark_inode_dirty
0.09 ± 28% +0.1 0.16 ± 31% perf-profile.self.cycles-pp.memset_erms
0.01 ±173% +0.1 0.08 ± 48% perf-profile.self.cycles-pp.memcpy_from_page
0.21 ± 23% +0.1 0.28 ± 12% perf-profile.self.cycles-pp.lock_page_memcg
0.13 ± 32% +0.1 0.22 ± 30% perf-profile.self.cycles-pp.mark_buffer_dirty
0.16 ± 5% +0.1 0.25 ± 12% perf-profile.self.cycles-pp.__block_commit_write
0.15 ± 33% +0.1 0.25 ± 32% perf-profile.self.cycles-pp.xfs_add_to_ioend
0.18 ± 23% +0.1 0.28 ± 15% perf-profile.self.cycles-pp.set_page_dirty
0.10 ± 36% +0.1 0.22 ± 15% perf-profile.self.cycles-pp.current_kernel_time64
0.07 ± 69% +0.1 0.19 ± 47% perf-profile.self.cycles-pp.security_file_permission
0.05 ± 63% +0.1 0.18 ± 71% perf-profile.self.cycles-pp.update_load_avg
0.13 ± 27% +0.1 0.27 ± 24% perf-profile.self.cycles-pp.__xfs_filemap_fault
0.23 ± 27% +0.1 0.38 ± 26% perf-profile.self.cycles-pp.xfs_ilock
0.44 ± 20% +0.1 0.59 ± 5% perf-profile.self.cycles-pp.radix_tree_tag_clear
0.17 ± 29% +0.2 0.34 ± 15% perf-profile.self.cycles-pp.follow_page_pte
0.00 +0.2 0.18 ± 43% perf-profile.self.cycles-pp.shmem_file_read_iter
1.43 ± 12% +0.2 1.62 ± 3% perf-profile.self.cycles-pp.__radix_tree_lookup
0.16 ± 42% +0.2 0.36 ± 33% perf-profile.self.cycles-pp.xfs_bmapi_reserve_delalloc
0.33 ± 19% +0.2 0.53 ± 29% perf-profile.self.cycles-pp.kmem_cache_alloc
0.53 ± 9% +0.2 0.76 ± 10% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.37 ± 21% +0.2 0.61 ± 23%
perf-profile.self.cycles-pp.__block_write_begin_int
0.20 ± 38% +0.3 0.48 ± 13% perf-profile.self.cycles-pp.follow_page_mask
0.05 ±112% +0.3 0.34 ± 51% perf-profile.self.cycles-pp.__get_user_pages
0.45 ± 12% +0.3 0.76 ± 28% perf-profile.self.cycles-pp.radix_tree_tag_set
0.46 ± 15% +0.3 0.79 ± 11% perf-profile.self.cycles-pp.find_get_entry
0.46 ± 33% +0.4 0.82 ± 18% perf-profile.self.cycles-pp.xfs_file_iomap_begin
0.61 ± 14% +0.4 0.98 ± 16% perf-profile.self.cycles-pp.__handle_mm_fault
2.14 ± 21% +4.3 6.46 ± 22% perf-profile.self.cycles-pp.intel_idle

[ASCII plots not reproduced. Panels: vm-scalability.time.user_time, vm-scalability.time.system_time, vm-scalability.time.percent_of_cpu_this_job_got, vm-scalability.time.major_page_faults, vm-scalability.time.file_system_inputs, vm-scalability.time.file_system_outputs, vm-scalability.workload]

[*] bisect-good sample
[O] bisect-bad sample

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Rong Chen