From: kernel test robot <oliver.sang@intel.com>
To: Huang Ying <ying.huang@intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Mel Gorman <mgorman@techsingularity.net>,
Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
"David Hildenbrand" <david@redhat.com>,
Johannes Weiner <jweiner@redhat.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Matthew Wilcox <willy@infradead.org>,
"Christoph Lameter" <cl@linux.com>, <linux-mm@kvack.org>,
<ying.huang@intel.com>, <feng.tang@intel.com>,
<fengwei.yin@intel.com>, <linux-kernel@vger.kernel.org>,
Arjan Van De Ven <arjan@linux.intel.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH -V3 7/9] mm: tune PCP high automatically
Date: Tue, 31 Oct 2023 10:50:33 +0800 [thread overview]
Message-ID: <202310311001.edbc5817-oliver.sang@intel.com> (raw)
In-Reply-To: <20231016053002.756205-8-ying.huang@intel.com>
Hello,
kernel test robot noticed an 8.4% improvement of will-it-scale.per_process_ops on:
commit: ba6149e96007edcdb01284c1531ebd49b4720f72 ("[PATCH -V3 7/9] mm: tune PCP high automatically")
url: https://github.com/intel-lab-lkp/linux/commits/Huang-Ying/mm-pcp-avoid-to-drain-PCP-when-process-exit/20231017-143633
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git 36b2d7dd5a8ac95c8c1e69bdc93c4a6e2dc28a23
patch link: https://lore.kernel.org/all/20231016053002.756205-8-ying.huang@intel.com/
patch subject: [PATCH -V3 7/9] mm: tune PCP high automatically
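For context, the patch under test makes the kernel size each CPU's PCP (per-CPU pageset) "high" threshold automatically from recent allocation demand instead of using a static value, so pages are flushed to the zone free list (under the zone lock) less often. The toy model below is purely illustrative: the function name, bounds, and decay policy are invented for this sketch and are not taken from the patch.

```python
# Toy model of PCP "high" auto-tuning (illustrative only, not kernel code):
# grow the per-CPU threshold toward recent allocation demand, decay it
# when the CPU goes idle, and clamp it to static bounds.

def tune_pcp_high(high, alloc_count, high_min=64, high_max=4096):
    """Return the new 'high' after one periodic tuning pass."""
    if alloc_count:
        # Demand seen: raise high so subsequent frees stay on the
        # per-CPU list instead of being flushed under the zone lock.
        high = max(high, alloc_count)
    else:
        # Idle: decay toward the minimum to return memory to the zone.
        high -= high // 8
    return max(high_min, min(high, high_max))

# A burst of page faults grows high; idle periods shrink it back.
high = 64
high = tune_pcp_high(high, alloc_count=1000)   # -> 1000
for _ in range(50):
    high = tune_pcp_high(high, alloc_count=0)
# high has decayed back to high_min (64)
```

The benchmark result below is consistent with this kind of behavior: less time in the zone-lock slow path during bulk alloc/free, at the cost of more pressure elsewhere (see the lruvec lock in the profile).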
testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:
nr_task: 16
mode: process
test: page_fault2
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231031/202310311001.edbc5817-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/page_fault2/will-it-scale
commit:
9f9d0b0869 ("mm: add framework for PCP high auto-tuning")
ba6149e960 ("mm: tune PCP high automatically")
9f9d0b08696fb316            ba6149e96007edcdb01284c1531
----------------            ---------------------------
         %stddev     %change         %stddev
             \          |                \
0.29 +0.0 0.32 mpstat.cpu.all.usr%
1434135 ± 2% +15.8% 1660688 ± 4% numa-meminfo.node0.AnonPages.max
22.97 +2.0% 23.43 turbostat.RAMWatt
213121 ± 5% -19.5% 171478 ± 7% meminfo.DirectMap4k
8031428 +12.0% 8998346 meminfo.Memused
9777522 +14.3% 11178004 meminfo.max_used_kB
4913700 +8.4% 5326025 will-it-scale.16.processes
307105 +8.4% 332876 will-it-scale.per_process_ops
4913700 +8.4% 5326025 will-it-scale.workload
1.488e+09 +8.5% 1.614e+09 proc-vmstat.numa_hit
1.487e+09 +8.4% 1.612e+09 proc-vmstat.numa_local
1.486e+09 +8.3% 1.609e+09 proc-vmstat.pgalloc_normal
1.482e+09 +8.3% 1.604e+09 proc-vmstat.pgfault
1.486e+09 +8.3% 1.609e+09 proc-vmstat.pgfree
2535424 ± 2% +6.2% 2693888 ± 2% proc-vmstat.unevictable_pgs_scanned
0.04 ± 9% +62.2% 0.06 ± 20% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
85.33 ± 7% +36.1% 116.17 ± 8% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
475.33 ± 3% +24.8% 593.33 ± 4% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.16 ± 17% +449.1% 0.87 ± 39% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
0.03 ± 10% +94.1% 0.07 ± 26% perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
0.04 ± 9% +62.2% 0.06 ± 20% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
0.16 ± 17% +449.1% 0.87 ± 39% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
14.01 +6.0% 14.85 perf-stat.i.MPKI
5.79e+09 +3.6% 6.001e+09 perf-stat.i.branch-instructions
0.20 ± 2% +0.0 0.21 ± 2% perf-stat.i.branch-miss-rate%
12098037 ± 2% +8.5% 13122446 ± 2% perf-stat.i.branch-misses
82.90 +2.1 85.03 perf-stat.i.cache-miss-rate%
4.005e+08 +9.8% 4.399e+08 perf-stat.i.cache-misses
4.83e+08 +7.1% 5.174e+08 perf-stat.i.cache-references
2.29 -3.2% 2.22 perf-stat.i.cpi
164.08 -9.0% 149.33 perf-stat.i.cycles-between-cache-misses
7.091e+09 +4.2% 7.392e+09 perf-stat.i.dTLB-loads
0.97 +0.0 1.01 perf-stat.i.dTLB-store-miss-rate%
40301594 +8.8% 43829422 perf-stat.i.dTLB-store-misses
4.121e+09 +4.4% 4.302e+09 perf-stat.i.dTLB-stores
83.96 +2.6 86.59 perf-stat.i.iTLB-load-miss-rate%
10268085 ± 3% +23.0% 12628681 ± 3% perf-stat.i.iTLB-load-misses
2.861e+10 +3.7% 2.966e+10 perf-stat.i.instructions
2796 ± 3% -15.7% 2356 ± 3% perf-stat.i.instructions-per-iTLB-miss
0.44 +3.3% 0.45 perf-stat.i.ipc
984.67 +9.6% 1078 perf-stat.i.metric.K/sec
78.05 +4.2% 81.29 perf-stat.i.metric.M/sec
4913856 +8.4% 5329060 perf-stat.i.minor-faults
1.356e+08 +10.6% 1.499e+08 perf-stat.i.node-loads
32443508 +7.6% 34908277 perf-stat.i.node-stores
4913858 +8.4% 5329062 perf-stat.i.page-faults
14.00 +6.0% 14.83 perf-stat.overall.MPKI
0.21 ± 2% +0.0 0.22 ± 2% perf-stat.overall.branch-miss-rate%
82.92 +2.1 85.02 perf-stat.overall.cache-miss-rate%
2.29 -3.1% 2.21 perf-stat.overall.cpi
163.33 -8.6% 149.29 perf-stat.overall.cycles-between-cache-misses
0.97 +0.0 1.01 perf-stat.overall.dTLB-store-miss-rate%
84.00 +2.6 86.61 perf-stat.overall.iTLB-load-miss-rate%
2789 ± 3% -15.7% 2350 ± 3% perf-stat.overall.instructions-per-iTLB-miss
0.44 +3.2% 0.45 perf-stat.overall.ipc
1754985 -4.7% 1673375 perf-stat.overall.path-length
5.771e+09 +3.6% 5.981e+09 perf-stat.ps.branch-instructions
12074113 ± 2% +8.4% 13094204 ± 2% perf-stat.ps.branch-misses
3.992e+08 +9.8% 4.384e+08 perf-stat.ps.cache-misses
4.814e+08 +7.1% 5.157e+08 perf-stat.ps.cache-references
7.068e+09 +4.2% 7.367e+09 perf-stat.ps.dTLB-loads
40167519 +8.7% 43680173 perf-stat.ps.dTLB-store-misses
4.107e+09 +4.4% 4.288e+09 perf-stat.ps.dTLB-stores
10234325 ± 3% +23.0% 12587000 ± 3% perf-stat.ps.iTLB-load-misses
2.852e+10 +3.6% 2.956e+10 perf-stat.ps.instructions
4897507 +8.4% 5310921 perf-stat.ps.minor-faults
1.351e+08 +10.5% 1.494e+08 perf-stat.ps.node-loads
32335421 +7.6% 34789913 perf-stat.ps.node-stores
4897509 +8.4% 5310923 perf-stat.ps.page-faults
8.623e+12 +3.4% 8.912e+12 perf-stat.total.instructions
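The derived metrics above can be cross-checked from the raw counters: lkp's path-length appears to be total instructions divided by workload operations, and MPKI is cache misses per thousand instructions. Recomputing both from the values as printed (which are rounded, so expect small residuals):

```python
# Consistency check of derived metrics, recomputed from the raw values
# printed in the tables above (base -> patched).

base_ops,   new_ops   = 4_913_700, 5_326_025   # will-it-scale.workload
base_instr, new_instr = 8.623e12,  8.912e12    # perf-stat.total.instructions
base_miss,  new_miss  = 3.992e8,   4.384e8     # perf-stat.ps.cache-misses
base_ips,   new_ips   = 2.852e10,  2.956e10    # perf-stat.ps.instructions

# Headline improvement in operations.
print(f"{new_ops / base_ops - 1:+.1%}")        # -> +8.4%

# path-length = total instructions / workload operations;
# close to the reported 1754985 -> 1673375 (inputs are rounded).
print(round(base_instr / base_ops), round(new_instr / new_ops))

# MPKI = cache misses per 1000 instructions.
print(f"{1000 * base_miss / base_ips:.2f} -> {1000 * new_miss / new_ips:.2f}")  # 14.00 -> 14.83
```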
9.86 ± 3% -8.4 1.49 ± 5% perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
8.11 ± 3% -7.5 0.58 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
8.10 ± 3% -7.5 0.58 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
7.52 ± 3% -6.4 1.15 ± 5% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
7.90 ± 4% -6.4 1.55 ± 4% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
5.78 ± 4% -5.8 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
5.78 ± 4% -5.8 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages
10.90 ± 3% -5.3 5.59 ± 2% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
10.57 ± 3% -5.3 5.26 ± 3% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
10.21 ± 3% -5.3 4.94 ± 3% perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
11.18 ± 3% -5.3 5.91 ± 2% perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
11.15 ± 3% -5.3 5.88 ± 2% perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
11.56 ± 3% -5.2 6.37 ± 2% perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
9.76 ± 3% -4.3 5.50 ± 6% perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
10.18 ± 3% -4.2 5.95 ± 5% perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.__munmap
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
15.40 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
15.39 ± 3% -3.7 11.70 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
14.08 ± 3% -3.6 10.49 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
14.10 ± 3% -3.6 10.52 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
14.10 ± 3% -3.6 10.52 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
14.10 ± 3% -3.6 10.52 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
1.60 ± 2% -0.7 0.86 ± 6% perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
0.96 ± 3% -0.4 0.56 ± 3% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.tlb_finish_mmu
1.00 ± 4% -0.4 0.62 ± 4% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region
1.26 ± 4% -0.1 1.11 ± 2% perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
1.28 ± 3% -0.1 1.16 ± 3% perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
1.28 ± 4% -0.1 1.17 ± 2% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
0.60 ± 3% -0.0 0.57 perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.55 ± 3% +0.0 0.60 perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.73 ± 3% +0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.68 ± 3% +0.1 0.78 ± 3% perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.57 ± 7% +0.1 0.71 ± 8% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault
1.41 ± 3% +0.1 1.55 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.77 ± 4% +0.2 0.93 ± 5% perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
0.94 ± 3% +0.2 1.12 ± 3% perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
0.36 ± 70% +0.2 0.57 perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
1.26 ± 5% +0.2 1.47 ± 3% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
1.61 ± 5% +0.3 1.87 ± 3% perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
1.75 ± 5% +0.3 2.05 ± 3% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
1.86 ± 4% +0.3 2.17 ± 2% perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.17 ±141% +0.4 0.58 ± 3% perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault
2.60 ± 3% +0.5 3.14 ± 5% perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
4.51 ± 3% +0.7 5.16 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
4.65 ± 3% +0.7 5.32 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
1.61 ± 3% +1.9 3.52 ± 6% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
0.85 ± 2% +1.9 2.77 ± 13% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range
0.84 ± 2% +1.9 2.76 ± 13% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush
0.85 ± 2% +1.9 2.78 ± 12% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
1.71 ± 3% +1.9 3.64 ± 6% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
1.70 ± 2% +1.9 3.63 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
3.31 ± 2% +2.2 5.52 ± 5% perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
3.46 ± 2% +2.2 5.71 ± 5% perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
4.47 ± 2% +2.4 6.90 ± 4% perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
9.22 ± 2% +3.1 12.33 ± 2% perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
44.13 ± 3% +3.2 47.34 perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
44.27 ± 3% +3.2 47.49 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
45.63 ± 2% +3.3 48.95 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.00 +3.4 3.37 ± 2% perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
46.88 ± 3% +3.4 50.29 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
49.40 ± 2% +3.6 53.03 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
49.59 ± 2% +3.7 53.24 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
59.06 ± 2% +4.5 63.60 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
56.32 ± 3% +4.6 60.89 perf-profile.calltrace.cycles-pp.testcase
20.16 ± 3% +4.9 25.10 perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
16.66 ± 3% -8.8 7.83 ± 8% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
16.48 ± 3% -8.8 7.66 ± 8% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
9.90 ± 3% -8.4 1.50 ± 5% perf-profile.children.cycles-pp.rmqueue_bulk
8.92 ± 3% -6.7 2.18 ± 2% perf-profile.children.cycles-pp.free_unref_page_list
8.47 ± 3% -6.7 1.74 ± 4% perf-profile.children.cycles-pp.free_pcppages_bulk
10.96 ± 3% -5.3 5.64 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
10.62 ± 3% -5.3 5.30 ± 2% perf-profile.children.cycles-pp.rmqueue
10.26 ± 3% -5.3 4.97 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist
11.24 ± 3% -5.3 5.96 ± 2% perf-profile.children.cycles-pp.__alloc_pages
11.18 ± 3% -5.3 5.92 ± 2% perf-profile.children.cycles-pp.__folio_alloc
11.57 ± 3% -5.2 6.37 ± 2% perf-profile.children.cycles-pp.vma_alloc_folio
11.19 ± 3% -4.4 6.82 ± 5% perf-profile.children.cycles-pp.release_pages
11.46 ± 3% -4.3 7.12 ± 5% perf-profile.children.cycles-pp.tlb_batch_pages_flush
15.52 ± 3% -3.7 11.81 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
15.52 ± 3% -3.7 11.81 perf-profile.children.cycles-pp.do_syscall_64
15.41 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.__munmap
15.40 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.do_vmi_munmap
15.40 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.do_vmi_align_munmap
15.40 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.__x64_sys_munmap
15.40 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.__vm_munmap
15.39 ± 3% -3.7 11.70 perf-profile.children.cycles-pp.unmap_region
14.10 ± 3% -3.6 10.52 perf-profile.children.cycles-pp.unmap_vmas
14.10 ± 3% -3.6 10.52 perf-profile.children.cycles-pp.unmap_page_range
14.10 ± 3% -3.6 10.52 perf-profile.children.cycles-pp.zap_pmd_range
14.10 ± 3% -3.6 10.52 perf-profile.children.cycles-pp.zap_pte_range
2.60 ± 3% -2.0 0.56 ± 4% perf-profile.children.cycles-pp.__free_one_page
1.28 ± 3% -0.1 1.17 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
0.15 ± 19% -0.1 0.08 ± 14% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
0.61 ± 3% -0.0 0.58 ± 2% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.11 ± 6% -0.0 0.08 ± 7% perf-profile.children.cycles-pp.__mod_zone_page_state
0.25 ± 4% +0.0 0.26 perf-profile.children.cycles-pp.error_entry
0.15 ± 3% +0.0 0.17 ± 4% perf-profile.children.cycles-pp.free_unref_page_commit
0.12 ± 8% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
0.18 ± 3% +0.0 0.20 ± 4% perf-profile.children.cycles-pp.access_error
0.07 ± 5% +0.0 0.09 ± 7% perf-profile.children.cycles-pp.task_tick_fair
0.04 ± 45% +0.0 0.06 ± 7% perf-profile.children.cycles-pp.page_counter_try_charge
0.30 ± 4% +0.0 0.32 perf-profile.children.cycles-pp.down_read_trylock
0.27 ± 3% +0.0 0.30 ± 2% perf-profile.children.cycles-pp.up_read
0.15 ± 8% +0.0 0.18 ± 3% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.02 ±142% +0.1 0.07 ± 29% perf-profile.children.cycles-pp.ret_from_fork_asm
0.44 ± 2% +0.1 0.49 ± 3% perf-profile.children.cycles-pp.mas_walk
0.46 ± 4% +0.1 0.52 ± 2% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.67 ± 3% +0.1 0.73 ± 4% perf-profile.children.cycles-pp.lock_mm_and_find_vma
0.42 ± 3% +0.1 0.48 ± 2% perf-profile.children.cycles-pp.free_swap_cache
0.43 ± 4% +0.1 0.49 ± 2% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.30 ± 5% +0.1 0.37 ± 3% perf-profile.children.cycles-pp.xas_descend
0.86 ± 3% +0.1 0.92 perf-profile.children.cycles-pp.___perf_sw_event
0.73 ± 3% +0.1 0.80 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.40 ± 2% +0.1 0.47 perf-profile.children.cycles-pp.__mod_node_page_state
0.01 ±223% +0.1 0.09 ± 12% perf-profile.children.cycles-pp.shmem_get_policy
0.53 ± 2% +0.1 0.62 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_state
1.09 ± 3% +0.1 1.18 perf-profile.children.cycles-pp.__perf_sw_event
0.50 ± 5% +0.1 0.60 ± 3% perf-profile.children.cycles-pp.xas_load
0.68 ± 3% +0.1 0.78 ± 3% perf-profile.children.cycles-pp.page_remove_rmap
1.45 ± 3% +0.1 1.60 perf-profile.children.cycles-pp.sync_regs
0.77 ± 4% +0.2 0.93 ± 5% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.84 ± 5% +0.2 1.02 ± 7% perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.96 ± 4% +0.2 1.15 ± 3% perf-profile.children.cycles-pp.lru_add_fn
1.27 ± 5% +0.2 1.48 ± 3% perf-profile.children.cycles-pp.filemap_get_entry
1.62 ± 4% +0.3 1.88 ± 3% perf-profile.children.cycles-pp.shmem_get_folio_gfp
1.75 ± 5% +0.3 2.06 ± 3% perf-profile.children.cycles-pp.shmem_fault
1.87 ± 4% +0.3 2.18 ± 2% perf-profile.children.cycles-pp.__do_fault
2.19 ± 2% +0.3 2.51 perf-profile.children.cycles-pp.native_irq_return_iret
2.64 ± 4% +0.5 3.18 ± 6% perf-profile.children.cycles-pp._compound_head
4.62 ± 3% +0.6 5.26 perf-profile.children.cycles-pp._raw_spin_lock
4.67 ± 3% +0.7 5.34 perf-profile.children.cycles-pp.__pte_offset_map_lock
3.32 ± 2% +2.2 5.54 ± 5% perf-profile.children.cycles-pp.folio_batch_move_lru
3.47 ± 2% +2.2 5.72 ± 5% perf-profile.children.cycles-pp.folio_add_lru_vma
4.49 ± 2% +2.4 6.92 ± 4% perf-profile.children.cycles-pp.set_pte_range
9.25 ± 2% +3.1 12.36 ± 2% perf-profile.children.cycles-pp.finish_fault
2.25 ± 2% +3.1 5.36 ± 2% perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
44.16 ± 3% +3.2 47.37 perf-profile.children.cycles-pp.do_cow_fault
44.28 ± 3% +3.2 47.50 perf-profile.children.cycles-pp.do_fault
45.66 ± 2% +3.3 48.98 perf-profile.children.cycles-pp.__handle_mm_fault
46.91 ± 2% +3.4 50.33 perf-profile.children.cycles-pp.handle_mm_fault
49.44 ± 2% +3.6 53.08 perf-profile.children.cycles-pp.do_user_addr_fault
49.62 ± 2% +3.6 53.27 perf-profile.children.cycles-pp.exc_page_fault
2.70 ± 3% +4.1 6.75 ± 8% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
55.26 ± 2% +4.2 59.44 perf-profile.children.cycles-pp.asm_exc_page_fault
58.13 ± 3% +4.6 62.72 perf-profile.children.cycles-pp.testcase
20.19 ± 3% +4.9 25.14 perf-profile.children.cycles-pp.copy_page
16.48 ± 3% -8.8 7.66 ± 8% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
2.53 ± 3% -2.0 0.54 ± 3% perf-profile.self.cycles-pp.__free_one_page
0.12 ± 4% -0.1 0.05 ± 46% perf-profile.self.cycles-pp.rmqueue_bulk
0.14 ± 19% -0.1 0.08 ± 14% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.10 ± 3% -0.0 0.08 ± 10% perf-profile.self.cycles-pp.__mod_zone_page_state
0.13 ± 5% +0.0 0.14 ± 2% perf-profile.self.cycles-pp.free_unref_page_commit
0.13 ± 3% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.exc_page_fault
0.15 ± 5% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.__pte_offset_map
0.04 ± 44% +0.0 0.06 ± 6% perf-profile.self.cycles-pp.page_counter_try_charge
0.18 ± 3% +0.0 0.20 ± 4% perf-profile.self.cycles-pp.access_error
0.30 ± 3% +0.0 0.32 ± 2% perf-profile.self.cycles-pp.down_read_trylock
0.16 ± 6% +0.0 0.18 perf-profile.self.cycles-pp.set_pte_range
0.26 ± 2% +0.0 0.29 ± 3% perf-profile.self.cycles-pp.up_read
0.15 ± 8% +0.0 0.18 ± 4% perf-profile.self.cycles-pp.folio_add_lru_vma
0.15 ± 8% +0.0 0.18 ± 3% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.22 ± 6% +0.0 0.26 ± 5% perf-profile.self.cycles-pp.__alloc_pages
0.32 ± 6% +0.0 0.36 ± 3% perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.28 ± 5% +0.0 0.32 ± 4% perf-profile.self.cycles-pp.do_cow_fault
0.14 ± 7% +0.0 0.18 ± 6% perf-profile.self.cycles-pp.shmem_fault
0.34 ± 5% +0.0 0.38 ± 4% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.00 +0.1 0.05 perf-profile.self.cycles-pp.__cond_resched
0.44 ± 3% +0.1 0.49 ± 4% perf-profile.self.cycles-pp.page_remove_rmap
0.41 ± 3% +0.1 0.47 ± 3% perf-profile.self.cycles-pp.free_swap_cache
0.75 ± 3% +0.1 0.81 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
0.91 ± 2% +0.1 0.98 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.29 ± 6% +0.1 0.36 ± 3% perf-profile.self.cycles-pp.xas_descend
0.38 ± 2% +0.1 0.45 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state
0.01 ±223% +0.1 0.09 ± 8% perf-profile.self.cycles-pp.shmem_get_policy
0.58 ± 3% +0.1 0.66 ± 2% perf-profile.self.cycles-pp.release_pages
0.44 ± 4% +0.1 0.54 ± 3% perf-profile.self.cycles-pp.lru_add_fn
1.44 ± 3% +0.1 1.59 perf-profile.self.cycles-pp.sync_regs
2.18 ± 2% +0.3 2.50 perf-profile.self.cycles-pp.native_irq_return_iret
4.36 ± 3% +0.4 4.76 perf-profile.self.cycles-pp.testcase
2.61 ± 4% +0.5 3.14 ± 5% perf-profile.self.cycles-pp._compound_head
4.60 ± 3% +0.6 5.23 perf-profile.self.cycles-pp._raw_spin_lock
2.23 ± 2% +3.1 5.34 ± 2% perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
20.10 ± 3% +4.9 25.02 perf-profile.self.cycles-pp.copy_page
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview: 20+ messages
2023-10-16 5:29 [PATCH -V3 0/9] mm: PCP high auto-tuning Huang Ying
2023-10-16 5:29 ` [PATCH -V3 1/9] mm, pcp: avoid to drain PCP when process exit Huang Ying
2023-10-16 5:29 ` [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice Huang Ying
2023-10-19 12:11 ` Mel Gorman
2023-10-16 5:29 ` [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages Huang Ying
2023-10-27 6:23 ` kernel test robot
2023-11-06 6:22 ` kernel test robot
2023-11-06 6:38 ` Huang, Ying
2023-10-16 5:29 ` [PATCH -V3 4/9] mm: restrict the pcp batch scale factor to avoid too long latency Huang Ying
2023-10-19 12:12 ` Mel Gorman
2023-10-16 5:29 ` [PATCH -V3 5/9] mm, page_alloc: scale the number of pages that are batch allocated Huang Ying
2023-10-16 5:29 ` [PATCH -V3 6/9] mm: add framework for PCP high auto-tuning Huang Ying
2023-10-19 12:16 ` Mel Gorman
2023-10-16 5:30 ` [PATCH -V3 7/9] mm: tune PCP high automatically Huang Ying
2023-10-31 2:50 ` kernel test robot [this message]
2023-10-16 5:30 ` [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark Huang Ying
2023-10-19 12:33 ` Mel Gorman
2023-10-20 3:30 ` Huang, Ying
2023-10-23 9:26 ` Mel Gorman
2023-10-16 5:30 ` [PATCH -V3 9/9] mm, pcp: reduce detecting time of consecutive high order page freeing Huang Ying