Greetings,

FYI, we noticed a 3.8% improvement in will-it-scale.per_process_ops due to commit:


commit: 180c117d46be304a08b14fb080010773faf50788 ("[PATCH] maple_tree: Remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()")
url: https://github.com/intel-lab-lkp/linux/commits/Liam-Howlett/maple_tree-Remove-GFP_ZERO-from-kmem_cache_alloc-and-kmem_cache_alloc_bulk/20230106-000849
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 41c03ba9beea760bd2d2ac9250b09a2e192da2dc
patch link: https://lore.kernel.org/all/20230105160427.2988454-1-Liam.Howlett@oracle.com/
patch subject: [PATCH] maple_tree: Remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()

in testcase: will-it-scale
on test machine: 104 threads 2 sockets (Skylake) with 192G memory
with the following parameters:

	nr_task: 16
	mode: process
	test: mmap1
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
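
For readers unfamiliar with the workload, the following is a minimal sketch of the kind of loop the mmap1 test is understood to run in each process. It is an illustration only: the 128 MiB mapping size, the function name mmap_munmap_loop and the fixed iteration count are assumptions that do not come from this report, and the real testcase is written in C.

```python
import mmap

# Assumed mapping size; not stated anywhere in this report.
MEMSIZE = 128 * 1024 * 1024


def mmap_munmap_loop(iterations, size=MEMSIZE):
    """Map and immediately unmap an anonymous private region.

    Returns the number of completed map/unmap pairs, which stands in
    for the per-process operation counter.
    """
    done = 0
    for _ in range(iterations):
        # fileno=-1 requests an anonymous mapping (no backing file).
        region = mmap.mmap(-1, size,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                           prot=mmap.PROT_READ | mmap.PROT_WRITE)
        # close() releases the mapping, i.e. the munmap() half of the pair.
        region.close()
        done += 1
    return done


if __name__ == "__main__":
    print(mmap_munmap_loop(1000))
```

Each iteration issues one mmap and one munmap, which is why both the mmap_region and __vm_munmap call paths appear in the profile data of this report.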


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/mmap1/will-it-scale

commit: 
  41c03ba9be ("Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost")
  180c117d46 ("maple_tree: Remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()")

41c03ba9beea760b 180c117d46be304a08b14fb0800 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   2604393            +3.8%    2702107        will-it-scale.16.processes
    162774            +3.8%     168881        will-it-scale.per_process_ops
   2604393            +3.8%    2702107        will-it-scale.workload
 1.254e+10            +2.6%  1.286e+10        perf-stat.i.branch-instructions
 1.251e+10            +2.2%  1.279e+10        perf-stat.i.dTLB-loads
 6.283e+09            +1.3%  6.367e+09        perf-stat.i.dTLB-stores
 5.838e+10            +2.5%  5.987e+10        perf-stat.i.instructions
    301.29            +2.2%     307.85        perf-stat.i.metric.M/sec
      0.81            -2.1%       0.79        perf-stat.overall.cpi
   6756492            -1.0%    6686622        perf-stat.overall.path-length
  1.25e+10            +2.6%  1.282e+10        perf-stat.ps.branch-instructions
 1.247e+10            +2.2%  1.275e+10        perf-stat.ps.dTLB-loads
 6.263e+09            +1.3%  6.346e+09        perf-stat.ps.dTLB-stores
 5.819e+10            +2.5%  5.967e+10        perf-stat.ps.instructions
  1.76e+13            +2.7%  1.807e+13        perf-stat.total.instructions
      2.31 ± 10%      -1.0        1.29 ±  7%  perf-profile.calltrace.cycles-pp.mas_preallocate.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      2.24 ± 10%      -1.0        1.23 ±  7%  perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap
      2.28 ± 11%      -1.0        1.27 ±  8%  perf-profile.calltrace.cycles-pp.mas_preallocate.mmap_region.do_mmap.vm_mmap_pgoff.do_syscall_64
      2.24 ± 10%      -1.0        1.23 ±  8%  perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.mmap_region.do_mmap.vm_mmap_pgoff
      1.52 ± 11%      -0.8        0.67 ±  8%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_bulk.mas_alloc_nodes.mas_preallocate.mmap_region.do_mmap
      1.49 ± 11%      -0.8        0.66 ±  7%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_bulk.mas_alloc_nodes.mas_preallocate.do_mas_align_munmap.__vm_munmap
      4.94 ± 10%      -2.1        2.83 ±  8%  perf-profile.children.cycles-pp.mas_alloc_nodes
      4.61 ± 10%      -2.0        2.57 ±  8%  perf-profile.children.cycles-pp.mas_preallocate
      1.96 ± 10%      -1.8        0.21 ±  7%  perf-profile.children.cycles-pp.memset_erms
      3.09 ± 10%      -1.7        1.36 ±  7%  perf-profile.children.cycles-pp.kmem_cache_alloc_bulk
      0.09 ± 13%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.vm_area_free
      0.07 ± 10%      +0.0        0.09 ± 11%  perf-profile.children.cycles-pp.mas_wr_modify
      0.22 ± 10%      +0.1        0.28 ±  7%  perf-profile.children.cycles-pp.__might_sleep
      0.00            +0.2        0.22 ± 10%  perf-profile.children.cycles-pp.mas_pop_node
      1.89 ± 11%      -1.7        0.21 ±  8%  perf-profile.self.cycles-pp.memset_erms
      0.72 ± 11%      -0.3        0.37 ±  7%  perf-profile.self.cycles-pp.kmem_cache_alloc_bulk
      0.09 ± 13%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.vm_area_free
      0.11 ±  7%      +0.1        0.17 ± 11%  perf-profile.self.cycles-pp.do_mas_munmap
      0.00            +0.2        0.21 ±  9%  perf-profile.self.cycles-pp.mas_pop_node
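
As a reading aid for the table above, here is a small sketch of how the change column can be reproduced from the printed values. It rests on two assumptions inferred from the numbers rather than stated in the report: that per_process_ops is the workload divided by nr_task (integer part), and that the perf-profile rows show an absolute difference in percentage points rather than a relative change. The helper name pct_change is hypothetical.

```python
def pct_change(base, patched):
    """Relative change of `patched` over `base`, in percent."""
    return (patched - base) / base * 100.0


# Values copied from the will-it-scale rows of the table.
NR_TASK = 16
base_workload, patched_workload = 2604393, 2702107
base_per_process, patched_per_process = 162774, 168881

workload_change = pct_change(base_workload, patched_workload)
per_process_change = pct_change(base_per_process, patched_per_process)

# Profile rows: share of sampled cycles (in percent) before and after.
memset_before, memset_after = 1.96, 0.21
# Assumed to be a difference in percentage points, not a relative change.
memset_delta = memset_after - memset_before

if __name__ == "__main__":
    print(f"workload:           {workload_change:+.2f}%")
    print(f"per_process_ops:    {per_process_change:+.2f}%")
    print(f"workload // nr_task: {base_workload // NR_TASK}"
          f" -> {patched_workload // NR_TASK}")
    print(f"memset_erms delta:  {memset_delta:+.2f} points")
```

Both will-it-scale rows round to the reported +3.8%. The memset_erms difference computed from the printed values is about -1.75 points, while the table reports -1.8, presumably because the table is derived from unrounded inputs.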


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests