From: kernel test robot <oliver.sang@intel.com>
To: Huang Ying <ying.huang@intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	"David Hildenbrand" <david@redhat.com>,
	Johannes Weiner <jweiner@redhat.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Christoph Lameter" <cl@linux.com>, <linux-mm@kvack.org>,
	<ying.huang@intel.com>, <feng.tang@intel.com>,
	<fengwei.yin@intel.com>, <linux-kernel@vger.kernel.org>,
	Arjan Van De Ven <arjan@linux.intel.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH -V3 7/9] mm: tune PCP high automatically
Date: Tue, 31 Oct 2023 10:50:33 +0800	[thread overview]
Message-ID: <202310311001.edbc5817-oliver.sang@intel.com> (raw)
In-Reply-To: <20231016053002.756205-8-ying.huang@intel.com>



Hello,

kernel test robot noticed an 8.4% improvement of will-it-scale.per_process_ops on:


commit: ba6149e96007edcdb01284c1531ebd49b4720f72 ("[PATCH -V3 7/9] mm: tune PCP high automatically")
url: https://github.com/intel-lab-lkp/linux/commits/Huang-Ying/mm-pcp-avoid-to-drain-PCP-when-process-exit/20231017-143633
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git 36b2d7dd5a8ac95c8c1e69bdc93c4a6e2dc28a23
patch link: https://lore.kernel.org/all/20231016053002.756205-8-ying.huang@intel.com/
patch subject: [PATCH -V3 7/9] mm: tune PCP high automatically

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 16
	mode: process
	test: page_fault2
	cpufreq_governor: performance
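The cpufreq_governor parameter above means every CPU is switched to the performance governor before the run. A minimal sketch of that step, assuming the standard cpufreq sysfs layout (the helper name set_governor and the root-path parameter are illustrative, not part of the lkp job):

```python
# Write a governor name into each cpu*/cpufreq/scaling_governor file,
# mirroring the cpufreq_governor: performance parameter above.
from pathlib import Path

def set_governor(gov: str, root: str = "/sys/devices/system/cpu") -> int:
    """Set `gov` on every CPU found under `root`; returns CPUs touched."""
    n = 0
    for f in Path(root).glob("cpu*/cpufreq/scaling_governor"):
        f.write_text(gov + "\n")
        n += 1
    return n

# On a real machine this needs root privileges:
# set_governor("performance")
```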

Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231031/202310311001.edbc5817-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/page_fault2/will-it-scale

commit: 
  9f9d0b0869 ("mm: add framework for PCP high auto-tuning")
  ba6149e960 ("mm: tune PCP high automatically")

9f9d0b08696fb316 ba6149e96007edcdb01284c1531 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.29            +0.0        0.32        mpstat.cpu.all.usr%
   1434135 ±  2%     +15.8%    1660688 ±  4%  numa-meminfo.node0.AnonPages.max
     22.97            +2.0%      23.43        turbostat.RAMWatt
    213121 ±  5%     -19.5%     171478 ±  7%  meminfo.DirectMap4k
   8031428           +12.0%    8998346        meminfo.Memused
   9777522           +14.3%   11178004        meminfo.max_used_kB
   4913700            +8.4%    5326025        will-it-scale.16.processes
    307105            +8.4%     332876        will-it-scale.per_process_ops
   4913700            +8.4%    5326025        will-it-scale.workload
 1.488e+09            +8.5%  1.614e+09        proc-vmstat.numa_hit
 1.487e+09            +8.4%  1.612e+09        proc-vmstat.numa_local
 1.486e+09            +8.3%  1.609e+09        proc-vmstat.pgalloc_normal
 1.482e+09            +8.3%  1.604e+09        proc-vmstat.pgfault
 1.486e+09            +8.3%  1.609e+09        proc-vmstat.pgfree
   2535424 ±  2%      +6.2%    2693888 ±  2%  proc-vmstat.unevictable_pgs_scanned
      0.04 ±  9%     +62.2%       0.06 ± 20%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
     85.33 ±  7%     +36.1%     116.17 ±  8%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
    475.33 ±  3%     +24.8%     593.33 ±  4%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.16 ± 17%    +449.1%       0.87 ± 39%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
      0.03 ± 10%     +94.1%       0.07 ± 26%  perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
      0.04 ±  9%     +62.2%       0.06 ± 20%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
      0.16 ± 17%    +449.1%       0.87 ± 39%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
     14.01            +6.0%      14.85        perf-stat.i.MPKI
  5.79e+09            +3.6%  6.001e+09        perf-stat.i.branch-instructions
      0.20 ±  2%      +0.0        0.21 ±  2%  perf-stat.i.branch-miss-rate%
  12098037 ±  2%      +8.5%   13122446 ±  2%  perf-stat.i.branch-misses
     82.90            +2.1       85.03        perf-stat.i.cache-miss-rate%
 4.005e+08            +9.8%  4.399e+08        perf-stat.i.cache-misses
  4.83e+08            +7.1%  5.174e+08        perf-stat.i.cache-references
      2.29            -3.2%       2.22        perf-stat.i.cpi
    164.08            -9.0%     149.33        perf-stat.i.cycles-between-cache-misses
 7.091e+09            +4.2%  7.392e+09        perf-stat.i.dTLB-loads
      0.97            +0.0        1.01        perf-stat.i.dTLB-store-miss-rate%
  40301594            +8.8%   43829422        perf-stat.i.dTLB-store-misses
 4.121e+09            +4.4%  4.302e+09        perf-stat.i.dTLB-stores
     83.96            +2.6       86.59        perf-stat.i.iTLB-load-miss-rate%
  10268085 ±  3%     +23.0%   12628681 ±  3%  perf-stat.i.iTLB-load-misses
 2.861e+10            +3.7%  2.966e+10        perf-stat.i.instructions
      2796 ±  3%     -15.7%       2356 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      0.44            +3.3%       0.45        perf-stat.i.ipc
    984.67            +9.6%       1078        perf-stat.i.metric.K/sec
     78.05            +4.2%      81.29        perf-stat.i.metric.M/sec
   4913856            +8.4%    5329060        perf-stat.i.minor-faults
 1.356e+08           +10.6%  1.499e+08        perf-stat.i.node-loads
  32443508            +7.6%   34908277        perf-stat.i.node-stores
   4913858            +8.4%    5329062        perf-stat.i.page-faults
     14.00            +6.0%      14.83        perf-stat.overall.MPKI
      0.21 ±  2%      +0.0        0.22 ±  2%  perf-stat.overall.branch-miss-rate%
     82.92            +2.1       85.02        perf-stat.overall.cache-miss-rate%
      2.29            -3.1%       2.21        perf-stat.overall.cpi
    163.33            -8.6%     149.29        perf-stat.overall.cycles-between-cache-misses
      0.97            +0.0        1.01        perf-stat.overall.dTLB-store-miss-rate%
     84.00            +2.6       86.61        perf-stat.overall.iTLB-load-miss-rate%
      2789 ±  3%     -15.7%       2350 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      0.44            +3.2%       0.45        perf-stat.overall.ipc
   1754985            -4.7%    1673375        perf-stat.overall.path-length
 5.771e+09            +3.6%  5.981e+09        perf-stat.ps.branch-instructions
  12074113 ±  2%      +8.4%   13094204 ±  2%  perf-stat.ps.branch-misses
 3.992e+08            +9.8%  4.384e+08        perf-stat.ps.cache-misses
 4.814e+08            +7.1%  5.157e+08        perf-stat.ps.cache-references
 7.068e+09            +4.2%  7.367e+09        perf-stat.ps.dTLB-loads
  40167519            +8.7%   43680173        perf-stat.ps.dTLB-store-misses
 4.107e+09            +4.4%  4.288e+09        perf-stat.ps.dTLB-stores
  10234325 ±  3%     +23.0%   12587000 ±  3%  perf-stat.ps.iTLB-load-misses
 2.852e+10            +3.6%  2.956e+10        perf-stat.ps.instructions
   4897507            +8.4%    5310921        perf-stat.ps.minor-faults
 1.351e+08           +10.5%  1.494e+08        perf-stat.ps.node-loads
  32335421            +7.6%   34789913        perf-stat.ps.node-stores
   4897509            +8.4%    5310923        perf-stat.ps.page-faults
 8.623e+12            +3.4%  8.912e+12        perf-stat.total.instructions
      9.86 ±  3%      -8.4        1.49 ±  5%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
      8.11 ±  3%      -7.5        0.58 ±  8%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      8.10 ±  3%      -7.5        0.58 ±  8%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
      7.52 ±  3%      -6.4        1.15 ±  5%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
      7.90 ±  4%      -6.4        1.55 ±  4%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      5.78 ±  4%      -5.8        0.00        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
      5.78 ±  4%      -5.8        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages
     10.90 ±  3%      -5.3        5.59 ±  2%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
     10.57 ±  3%      -5.3        5.26 ±  3%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
     10.21 ±  3%      -5.3        4.94 ±  3%  perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
     11.18 ±  3%      -5.3        5.91 ±  2%  perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
     11.15 ±  3%      -5.3        5.88 ±  2%  perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
     11.56 ±  3%      -5.2        6.37 ±  2%  perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      9.76 ±  3%      -4.3        5.50 ±  6%  perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
     10.18 ±  3%      -4.2        5.95 ±  5%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.__munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     15.39 ±  3%      -3.7       11.70        perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
     14.08 ±  3%      -3.6       10.49        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
     14.10 ±  3%      -3.6       10.52        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
     14.10 ±  3%      -3.6       10.52        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
     14.10 ±  3%      -3.6       10.52        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
      1.60 ±  2%      -0.7        0.86 ±  6%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.96 ±  3%      -0.4        0.56 ±  3%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.tlb_finish_mmu
      1.00 ±  4%      -0.4        0.62 ±  4%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region
      1.26 ±  4%      -0.1        1.11 ±  2%  perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
      1.28 ±  3%      -0.1        1.16 ±  3%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
      1.28 ±  4%      -0.1        1.17 ±  2%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      0.60 ±  3%      -0.0        0.57        perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.55 ±  3%      +0.0        0.60        perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.73 ±  3%      +0.1        0.79 ±  2%  perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.68 ±  3%      +0.1        0.78 ±  3%  perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.57 ±  7%      +0.1        0.71 ±  8%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault
      1.41 ±  3%      +0.1        1.55        perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
      0.77 ±  4%      +0.2        0.93 ±  5%  perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
      0.94 ±  3%      +0.2        1.12 ±  3%  perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      0.36 ± 70%      +0.2        0.57        perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      1.26 ±  5%      +0.2        1.47 ±  3%  perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
      1.61 ±  5%      +0.3        1.87 ±  3%  perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
      1.75 ±  5%      +0.3        2.05 ±  3%  perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.86 ±  4%      +0.3        2.17 ±  2%  perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.17 ±141%      +0.4        0.58 ±  3%  perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault
      2.60 ±  3%      +0.5        3.14 ±  5%  perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      4.51 ±  3%      +0.7        5.16        perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
      4.65 ±  3%      +0.7        5.32        perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.61 ±  3%      +1.9        3.52 ±  6%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
      0.85 ±  2%      +1.9        2.77 ± 13%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range
      0.84 ±  2%      +1.9        2.76 ± 13%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush
      0.85 ±  2%      +1.9        2.78 ± 12%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      1.71 ±  3%      +1.9        3.64 ±  6%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      1.70 ±  2%      +1.9        3.63 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
      3.31 ±  2%      +2.2        5.52 ±  5%  perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
      3.46 ±  2%      +2.2        5.71 ±  5%  perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
      4.47 ±  2%      +2.4        6.90 ±  4%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
      9.22 ±  2%      +3.1       12.33 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
     44.13 ±  3%      +3.2       47.34        perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     44.27 ±  3%      +3.2       47.49        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     45.63 ±  2%      +3.3       48.95        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.00            +3.4        3.37 ±  2%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
     46.88 ±  3%      +3.4       50.29        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     49.40 ±  2%      +3.6       53.03        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     49.59 ±  2%      +3.7       53.24        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
     59.06 ±  2%      +4.5       63.60        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
     56.32 ±  3%      +4.6       60.89        perf-profile.calltrace.cycles-pp.testcase
     20.16 ±  3%      +4.9       25.10        perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
     16.66 ±  3%      -8.8        7.83 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     16.48 ±  3%      -8.8        7.66 ±  8%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      9.90 ±  3%      -8.4        1.50 ±  5%  perf-profile.children.cycles-pp.rmqueue_bulk
      8.92 ±  3%      -6.7        2.18 ±  2%  perf-profile.children.cycles-pp.free_unref_page_list
      8.47 ±  3%      -6.7        1.74 ±  4%  perf-profile.children.cycles-pp.free_pcppages_bulk
     10.96 ±  3%      -5.3        5.64 ±  2%  perf-profile.children.cycles-pp.get_page_from_freelist
     10.62 ±  3%      -5.3        5.30 ±  2%  perf-profile.children.cycles-pp.rmqueue
     10.26 ±  3%      -5.3        4.97 ±  3%  perf-profile.children.cycles-pp.__rmqueue_pcplist
     11.24 ±  3%      -5.3        5.96 ±  2%  perf-profile.children.cycles-pp.__alloc_pages
     11.18 ±  3%      -5.3        5.92 ±  2%  perf-profile.children.cycles-pp.__folio_alloc
     11.57 ±  3%      -5.2        6.37 ±  2%  perf-profile.children.cycles-pp.vma_alloc_folio
     11.19 ±  3%      -4.4        6.82 ±  5%  perf-profile.children.cycles-pp.release_pages
     11.46 ±  3%      -4.3        7.12 ±  5%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
     15.52 ±  3%      -3.7       11.81        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     15.52 ±  3%      -3.7       11.81        perf-profile.children.cycles-pp.do_syscall_64
     15.41 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.__munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.do_vmi_munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.do_vmi_align_munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.__x64_sys_munmap
     15.40 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.__vm_munmap
     15.39 ±  3%      -3.7       11.70        perf-profile.children.cycles-pp.unmap_region
     14.10 ±  3%      -3.6       10.52        perf-profile.children.cycles-pp.unmap_vmas
     14.10 ±  3%      -3.6       10.52        perf-profile.children.cycles-pp.unmap_page_range
     14.10 ±  3%      -3.6       10.52        perf-profile.children.cycles-pp.zap_pmd_range
     14.10 ±  3%      -3.6       10.52        perf-profile.children.cycles-pp.zap_pte_range
      2.60 ±  3%      -2.0        0.56 ±  4%  perf-profile.children.cycles-pp.__free_one_page
      1.28 ±  3%      -0.1        1.17 ±  2%  perf-profile.children.cycles-pp.tlb_finish_mmu
      0.15 ± 19%      -0.1        0.08 ± 14%  perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
      0.61 ±  3%      -0.0        0.58 ±  2%  perf-profile.children.cycles-pp.__mem_cgroup_charge
      0.11 ±  6%      -0.0        0.08 ±  7%  perf-profile.children.cycles-pp.__mod_zone_page_state
      0.25 ±  4%      +0.0        0.26        perf-profile.children.cycles-pp.error_entry
      0.15 ±  3%      +0.0        0.17 ±  4%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.12 ±  8%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
      0.18 ±  3%      +0.0        0.20 ±  4%  perf-profile.children.cycles-pp.access_error
      0.07 ±  5%      +0.0        0.09 ±  7%  perf-profile.children.cycles-pp.task_tick_fair
      0.04 ± 45%      +0.0        0.06 ±  7%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.30 ±  4%      +0.0        0.32        perf-profile.children.cycles-pp.down_read_trylock
      0.27 ±  3%      +0.0        0.30 ±  2%  perf-profile.children.cycles-pp.up_read
      0.15 ±  8%      +0.0        0.18 ±  3%  perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
      0.02 ±142%      +0.1        0.07 ± 29%  perf-profile.children.cycles-pp.ret_from_fork_asm
      0.44 ±  2%      +0.1        0.49 ±  3%  perf-profile.children.cycles-pp.mas_walk
      0.46 ±  4%      +0.1        0.52 ±  2%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      0.67 ±  3%      +0.1        0.73 ±  4%  perf-profile.children.cycles-pp.lock_mm_and_find_vma
      0.42 ±  3%      +0.1        0.48 ±  2%  perf-profile.children.cycles-pp.free_swap_cache
      0.43 ±  4%      +0.1        0.49 ±  2%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
      0.30 ±  5%      +0.1        0.37 ±  3%  perf-profile.children.cycles-pp.xas_descend
      0.86 ±  3%      +0.1        0.92        perf-profile.children.cycles-pp.___perf_sw_event
      0.73 ±  3%      +0.1        0.80        perf-profile.children.cycles-pp.lock_vma_under_rcu
      0.40 ±  2%      +0.1        0.47        perf-profile.children.cycles-pp.__mod_node_page_state
      0.01 ±223%      +0.1        0.09 ± 12%  perf-profile.children.cycles-pp.shmem_get_policy
      0.53 ±  2%      +0.1        0.62 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_state
      1.09 ±  3%      +0.1        1.18        perf-profile.children.cycles-pp.__perf_sw_event
      0.50 ±  5%      +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.xas_load
      0.68 ±  3%      +0.1        0.78 ±  3%  perf-profile.children.cycles-pp.page_remove_rmap
      1.45 ±  3%      +0.1        1.60        perf-profile.children.cycles-pp.sync_regs
      0.77 ±  4%      +0.2        0.93 ±  5%  perf-profile.children.cycles-pp.folio_add_new_anon_rmap
      0.84 ±  5%      +0.2        1.02 ±  7%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      0.96 ±  4%      +0.2        1.15 ±  3%  perf-profile.children.cycles-pp.lru_add_fn
      1.27 ±  5%      +0.2        1.48 ±  3%  perf-profile.children.cycles-pp.filemap_get_entry
      1.62 ±  4%      +0.3        1.88 ±  3%  perf-profile.children.cycles-pp.shmem_get_folio_gfp
      1.75 ±  5%      +0.3        2.06 ±  3%  perf-profile.children.cycles-pp.shmem_fault
      1.87 ±  4%      +0.3        2.18 ±  2%  perf-profile.children.cycles-pp.__do_fault
      2.19 ±  2%      +0.3        2.51        perf-profile.children.cycles-pp.native_irq_return_iret
      2.64 ±  4%      +0.5        3.18 ±  6%  perf-profile.children.cycles-pp._compound_head
      4.62 ±  3%      +0.6        5.26        perf-profile.children.cycles-pp._raw_spin_lock
      4.67 ±  3%      +0.7        5.34        perf-profile.children.cycles-pp.__pte_offset_map_lock
      3.32 ±  2%      +2.2        5.54 ±  5%  perf-profile.children.cycles-pp.folio_batch_move_lru
      3.47 ±  2%      +2.2        5.72 ±  5%  perf-profile.children.cycles-pp.folio_add_lru_vma
      4.49 ±  2%      +2.4        6.92 ±  4%  perf-profile.children.cycles-pp.set_pte_range
      9.25 ±  2%      +3.1       12.36 ±  2%  perf-profile.children.cycles-pp.finish_fault
      2.25 ±  2%      +3.1        5.36 ±  2%  perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
     44.16 ±  3%      +3.2       47.37        perf-profile.children.cycles-pp.do_cow_fault
     44.28 ±  3%      +3.2       47.50        perf-profile.children.cycles-pp.do_fault
     45.66 ±  2%      +3.3       48.98        perf-profile.children.cycles-pp.__handle_mm_fault
     46.91 ±  2%      +3.4       50.33        perf-profile.children.cycles-pp.handle_mm_fault
     49.44 ±  2%      +3.6       53.08        perf-profile.children.cycles-pp.do_user_addr_fault
     49.62 ±  2%      +3.6       53.27        perf-profile.children.cycles-pp.exc_page_fault
      2.70 ±  3%      +4.1        6.75 ±  8%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
     55.26 ±  2%      +4.2       59.44        perf-profile.children.cycles-pp.asm_exc_page_fault
     58.13 ±  3%      +4.6       62.72        perf-profile.children.cycles-pp.testcase
     20.19 ±  3%      +4.9       25.14        perf-profile.children.cycles-pp.copy_page
     16.48 ±  3%      -8.8        7.66 ±  8%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      2.53 ±  3%      -2.0        0.54 ±  3%  perf-profile.self.cycles-pp.__free_one_page
      0.12 ±  4%      -0.1        0.05 ± 46%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.14 ± 19%      -0.1        0.08 ± 14%  perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
      0.10 ±  3%      -0.0        0.08 ± 10%  perf-profile.self.cycles-pp.__mod_zone_page_state
      0.13 ±  5%      +0.0        0.14 ±  2%  perf-profile.self.cycles-pp.free_unref_page_commit
      0.13 ±  3%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.exc_page_fault
      0.15 ±  5%      +0.0        0.17 ±  4%  perf-profile.self.cycles-pp.__pte_offset_map
      0.04 ± 44%      +0.0        0.06 ±  6%  perf-profile.self.cycles-pp.page_counter_try_charge
      0.18 ±  3%      +0.0        0.20 ±  4%  perf-profile.self.cycles-pp.access_error
      0.30 ±  3%      +0.0        0.32 ±  2%  perf-profile.self.cycles-pp.down_read_trylock
      0.16 ±  6%      +0.0        0.18        perf-profile.self.cycles-pp.set_pte_range
      0.26 ±  2%      +0.0        0.29 ±  3%  perf-profile.self.cycles-pp.up_read
      0.15 ±  8%      +0.0        0.18 ±  4%  perf-profile.self.cycles-pp.folio_add_lru_vma
      0.15 ±  8%      +0.0        0.18 ±  3%  perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
      0.22 ±  6%      +0.0        0.26 ±  5%  perf-profile.self.cycles-pp.__alloc_pages
      0.32 ±  6%      +0.0        0.36 ±  3%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.28 ±  5%      +0.0        0.32 ±  4%  perf-profile.self.cycles-pp.do_cow_fault
      0.14 ±  7%      +0.0        0.18 ±  6%  perf-profile.self.cycles-pp.shmem_fault
      0.34 ±  5%      +0.0        0.38 ±  4%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.__cond_resched
      0.44 ±  3%      +0.1        0.49 ±  4%  perf-profile.self.cycles-pp.page_remove_rmap
      0.41 ±  3%      +0.1        0.47 ±  3%  perf-profile.self.cycles-pp.free_swap_cache
      0.75 ±  3%      +0.1        0.81 ±  2%  perf-profile.self.cycles-pp.___perf_sw_event
      0.91 ±  2%      +0.1        0.98 ±  2%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.29 ±  6%      +0.1        0.36 ±  3%  perf-profile.self.cycles-pp.xas_descend
      0.38 ±  2%      +0.1        0.45 ±  2%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.01 ±223%      +0.1        0.09 ±  8%  perf-profile.self.cycles-pp.shmem_get_policy
      0.58 ±  3%      +0.1        0.66 ±  2%  perf-profile.self.cycles-pp.release_pages
      0.44 ±  4%      +0.1        0.54 ±  3%  perf-profile.self.cycles-pp.lru_add_fn
      1.44 ±  3%      +0.1        1.59        perf-profile.self.cycles-pp.sync_regs
      2.18 ±  2%      +0.3        2.50        perf-profile.self.cycles-pp.native_irq_return_iret
      4.36 ±  3%      +0.4        4.76        perf-profile.self.cycles-pp.testcase
      2.61 ±  4%      +0.5        3.14 ±  5%  perf-profile.self.cycles-pp._compound_head
      4.60 ±  3%      +0.6        5.23        perf-profile.self.cycles-pp._raw_spin_lock
      2.23 ±  2%      +3.1        5.34 ±  2%  perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
     20.10 ±  3%      +4.9       25.02        perf-profile.self.cycles-pp.copy_page
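The headline numbers can be cross-checked from the raw values in the tables above. Note that the perf-stat rows report a relative %change, while the perf-profile rows report an absolute percentage-point delta of cycles; a quick check of one row of each kind:

```python
# will-it-scale.per_process_ops: relative %change between the two commits.
base, patched = 307105, 332876
rel = (patched - base) / base * 100
print(f"per_process_ops: {rel:+.1f}%")  # the reported +8.4% improvement

# perf-profile deltas are absolute percentage points of cycles, e.g.
# native_queued_spin_lock_slowpath (children): 16.48% -> 7.66%.
before, after = 16.48, 7.66
print(f"spinlock slowpath: {after - before:+.1f} pp "
      f"({(after - before) / before * 100:+.0f}% relative)")
```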




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Thread overview: 20+ messages
2023-10-16  5:29 [PATCH -V3 0/9] mm: PCP high auto-tuning Huang Ying
2023-10-16  5:29 ` [PATCH -V3 1/9] mm, pcp: avoid to drain PCP when process exit Huang Ying
2023-10-16  5:29 ` [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice Huang Ying
2023-10-19 12:11   ` Mel Gorman
2023-10-16  5:29 ` [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages Huang Ying
2023-10-27  6:23   ` kernel test robot
2023-11-06  6:22   ` kernel test robot
2023-11-06  6:38     ` Huang, Ying
2023-10-16  5:29 ` [PATCH -V3 4/9] mm: restrict the pcp batch scale factor to avoid too long latency Huang Ying
2023-10-19 12:12   ` Mel Gorman
2023-10-16  5:29 ` [PATCH -V3 5/9] mm, page_alloc: scale the number of pages that are batch allocated Huang Ying
2023-10-16  5:29 ` [PATCH -V3 6/9] mm: add framework for PCP high auto-tuning Huang Ying
2023-10-19 12:16   ` Mel Gorman
2023-10-16  5:30 ` [PATCH -V3 7/9] mm: tune PCP high automatically Huang Ying
2023-10-31  2:50   ` kernel test robot [this message]
2023-10-16  5:30 ` [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark Huang Ying
2023-10-19 12:33   ` Mel Gorman
2023-10-20  3:30     ` Huang, Ying
2023-10-23  9:26       ` Mel Gorman
2023-10-16  5:30 ` [PATCH -V3 9/9] mm, pcp: reduce detecting time of consecutive high order page freeing Huang Ying
