Subject: [linus:master] [readahead]  ab4443fe3c:  vm-scalability.throughput -21.4% regression
Date: 2024-02-20  8:25 UTC
From: kernel test robot <oliver.sang@intel.com>
  To: Jan Kara
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Matthew Wilcox,
	Guo Xuenan, linux-fsdevel, ying.huang, feng.tang, fengwei.yin,
	oliver.sang



Hello,

kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:


commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: vm-scalability
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	runtime: 300s
	test: lru-file-readtwice
	cpufreq_governor: performance
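
For context, lru-file-readtwice streams large files through the page
cache so that each page is read twice, which exercises the readahead
path the bisected commit touches. Below is a minimal, single-threaded
approximation of that access pattern (not the vm-scalability source,
which runs many concurrent readers): a plain sequential read(2) loop
over one file, done twice.

/*
 * Hedged sketch: approximate the lru-file-readtwice access pattern by
 * streaming a file through the page cache twice.
 *
 *   cc -O2 readtwice.c -o readtwice && ./readtwice <big-file>
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void stream_file(const char *path)
{
	static char buf[1 << 20];	/* 1 MiB read chunks */
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0) {
		perror("open");
		exit(1);
	}
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;			/* data discarded; only the I/O pattern matters */
	close(fd);
}

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	stream_file(argv[1]);		/* first pass fills the page cache */
	stream_file(argv[1]);		/* second pass re-reads the same pages */
	return 0;
}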



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202402201642.c8d6bbc3-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240220/202402201642.c8d6bbc3-oliver.sang@intel.com
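
To reproduce (the usual lkp-tests flow for these reports; job.yaml below
refers to the job file from the download link above):

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	sudo bin/lkp install job.yaml           # job file from the archive above
	bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
	sudo bin/lkp run generated-yaml-file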

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/lkp-spr-2sp4/lru-file-readtwice/vm-scalability

commit: 
  f0b7a0d1d4 ("Merge branch 'master' into mm-hotfixes-stable")
  ab4443fe3c ("readahead: avoid multiple marked readahead pages")

f0b7a0d1d46625db ab4443fe3ca6298663a55c4a70e 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     12.33 ±  8%     +74.7%      21.54 ±  3%  vmstat.procs.b
 6.641e+09 ±  7%     +43.4%  9.522e+09 ±  3%  cpuidle..time
   7219825 ±  7%     +40.7%   10156643 ±  3%  cpuidle..usage
     87356 ± 44%    +130.7%     201564 ± 12%  meminfo.Active(anon)
    711730           +26.7%     901680        meminfo.SUnreclaim
    198.25           +23.7%     245.26        uptime.boot
     18890 ±  2%     +14.7%      21667 ±  2%  uptime.idle
      0.17 ± 62%      +0.5        0.70 ± 34%  mpstat.cpu.all.iowait%
      0.03 ±  5%      -0.0        0.02 ±  2%  mpstat.cpu.all.soft%
      0.83 ±  3%      -0.2        0.65 ±  4%  mpstat.cpu.all.usr%
    347214 ± 10%     +19.9%     416202 ±  2%  numa-meminfo.node0.SUnreclaim
 1.525e+08 ±  4%     +13.4%  1.728e+08 ±  4%  numa-meminfo.node1.Active
 1.524e+08 ±  4%     +13.3%  1.727e+08 ±  4%  numa-meminfo.node1.Active(file)
  71750516 ± 10%     -24.9%   53877171 ± 13%  numa-meminfo.node1.Inactive
  71127836 ± 10%     -25.1%   53268721 ± 13%  numa-meminfo.node1.Inactive(file)
    364797 ± 10%     +33.0%     485106 ±  2%  numa-meminfo.node1.SUnreclaim
   3610954 ±  6%     +40.2%    5062891 ±  3%  turbostat.C1E
   3627684 ±  7%     +40.9%    5111624 ±  3%  turbostat.C6
     12.35 ± 55%     -61.0%       4.82 ± 50%  turbostat.IPC
  31624764 ±  2%     +33.5%   42205318        turbostat.IRQ
      3.60 ± 24%      -1.7        1.94 ± 28%  turbostat.PKG_%
     12438 ±  4%     +90.4%      23687 ± 23%  turbostat.POLL
     48.81           -12.6%      42.65        turbostat.RAMWatt
  24934637 ±  9%     +83.8%   45836252 ±  5%  numa-numastat.node0.local_node
   3271697 ± 22%     +70.7%    5586210 ± 22%  numa-numastat.node0.numa_foreign
  25077126 ±  9%     +83.3%   45969061 ±  5%  numa-numastat.node0.numa_hit
   4703977 ± 10%    +159.8%   12220561 ±  7%  numa-numastat.node0.numa_miss
   4847049 ±  9%    +154.8%   12350702 ±  7%  numa-numastat.node0.other_node
  26364328 ±  5%    +111.3%   55706473 ±  3%  numa-numastat.node1.local_node
   4704476 ± 10%    +159.7%   12219530 ±  7%  numa-numastat.node1.numa_foreign
  26458496 ±  5%    +110.9%   55813309 ±  3%  numa-numastat.node1.numa_hit
   3271887 ± 22%     +70.7%    5586065 ± 22%  numa-numastat.node1.numa_miss
   3363897 ± 20%     +69.2%    5691334 ± 22%  numa-numastat.node1.other_node
    186286 ±  2%     -24.3%     140930 ±  2%  vm-scalability.median
      6476 ± 20%   +2723.0        9199 ± 11%  vm-scalability.stddev%
  88930342 ±  5%     -21.4%   69899439 ±  3%  vm-scalability.throughput
    135.95 ±  2%     +35.0%     183.51        vm-scalability.time.elapsed_time
    135.95 ±  2%     +35.0%     183.51        vm-scalability.time.elapsed_time.max
   3898231 ±  7%     +22.7%    4784231 ±  7%  vm-scalability.time.involuntary_context_switches
    246538            +1.2%     249586        vm-scalability.time.minor_page_faults
     17484            -3.0%      16967        vm-scalability.time.percent_of_cpu_this_job_got
     23546 ±  2%     +31.3%      30915        vm-scalability.time.system_time
    125622 ±  7%    +232.5%     417746 ±  7%  vm-scalability.time.voluntary_context_switches
      7.10 ± 31%     -26.9%       5.19 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__alloc_pages.alloc_pages_mpol.folio_alloc.page_cache_ra_order
     14.80 ± 42%     -42.0%       8.58 ± 11%  perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_update_page.filemap_get_pages
      6.01 ± 27%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     11652 ± 37%    +480.5%      67637 ± 21%  perf-sched.wait_and_delay.count.__cond_resched.__alloc_pages.alloc_pages_mpol.folio_alloc.page_cache_ra_order
      1328 ± 86%    +760.2%      11431 ± 31%  perf-sched.wait_and_delay.count.__cond_resched.__kmalloc.ifs_alloc.isra.0
     10417 ± 30%    +223.8%      33728 ± 30%  perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.filemap_update_page.filemap_get_pages
      2529 ± 36%    -100.0%       0.00        perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      1336 ±133%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      6.74 ± 26%     -24.9%       5.06 ±  3%  perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.alloc_pages_mpol.folio_alloc.page_cache_ra_order
      3.12 ± 31%     -48.8%       1.60 ± 14%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc.ifs_alloc.isra.0
      1.68 ± 23%     -70.2%       0.50 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.down_read.page_cache_ra_unbounded.filemap_get_pages.filemap_read
      0.54 ±133%    +441.1%       2.94 ± 33%  perf-sched.wait_time.avg.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
     13.13 ± 40%     -42.8%       7.51 ± 12%  perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.filemap_update_page.filemap_get_pages
      1.47 ±122%    +359.5%       6.78 ± 22%  perf-sched.wait_time.max.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.47 ± 50%     -75.4%       1.10 ±134%  perf-sched.wait_time.max.ms.__cond_resched.unmap_vmas.exit_mmap.__mmput.exit_mm
     86841 ± 10%     +19.8%     104069 ±  2%  numa-vmstat.node0.nr_slab_unreclaimable
   3271697 ± 22%     +70.7%    5586210 ± 22%  numa-vmstat.node0.numa_foreign
  25076787 ±  9%     +83.3%   45969300 ±  5%  numa-vmstat.node0.numa_hit
  24934299 ±  9%     +83.8%   45836491 ±  5%  numa-vmstat.node0.numa_local
   4703977 ± 10%    +159.8%   12220561 ±  7%  numa-vmstat.node0.numa_miss
   4847048 ±  9%    +154.8%   12350702 ±  7%  numa-vmstat.node0.numa_other
  38159902 ±  4%     +13.2%   43207654 ±  4%  numa-vmstat.node1.nr_active_file
  17768992 ± 10%     -25.1%   13307850 ± 13%  numa-vmstat.node1.nr_inactive_file
     91228 ± 10%     +33.0%     121288 ±  2%  numa-vmstat.node1.nr_slab_unreclaimable
  38159860 ±  4%     +13.2%   43207611 ±  4%  numa-vmstat.node1.nr_zone_active_file
  17768981 ± 10%     -25.1%   13307832 ± 13%  numa-vmstat.node1.nr_zone_inactive_file
   4704476 ± 10%    +159.7%   12219530 ±  7%  numa-vmstat.node1.numa_foreign
  26458450 ±  5%    +110.9%   55813002 ±  3%  numa-vmstat.node1.numa_hit
  26364282 ±  5%    +111.3%   55706167 ±  3%  numa-vmstat.node1.numa_local
   3271887 ± 22%     +70.7%    5586065 ± 22%  numa-vmstat.node1.numa_miss
   3363897 ± 20%     +69.2%    5691333 ± 22%  numa-vmstat.node1.numa_other
  90826607 ±109%     -65.8%   31040624 ± 32%  proc-vmstat.compact_daemon_free_scanned
  96602657 ±103%     -65.4%   33447362 ± 32%  proc-vmstat.compact_free_scanned
      1184 ± 92%     -95.5%      52.75 ± 29%  proc-vmstat.kswapd_low_wmark_hit_quickly
     21460 ± 47%    +137.3%      50924 ± 12%  proc-vmstat.nr_active_anon
      3576 ±  3%     -29.3%       2528 ±  3%  proc-vmstat.nr_isolated_file
    178094           +26.5%     225368        proc-vmstat.nr_slab_unreclaimable
     21460 ± 47%    +137.3%      50924 ± 12%  proc-vmstat.nr_zone_active_anon
   7976174 ±  8%    +123.2%   17805741 ±  6%  proc-vmstat.numa_foreign
  51538988 ±  3%     +97.5%  1.018e+08        proc-vmstat.numa_hit
  51302328 ±  3%     +97.9%  1.015e+08        proc-vmstat.numa_local
   7975865 ±  8%    +123.3%   17806626 ±  6%  proc-vmstat.numa_miss
   8210948 ±  7%    +119.7%   18042039 ±  6%  proc-vmstat.numa_other
      1208 ± 92%     -93.1%      83.38 ± 24%  proc-vmstat.pageoutrun
      2270            +4.9%       2381        proc-vmstat.pgpgin
     51647 ±  9%     +24.2%      64144 ± 19%  proc-vmstat.pgreuse
  12722105 ± 16%     +51.8%   19317724 ± 27%  proc-vmstat.workingset_activate_file
   8714025          +122.7%   19406236 ± 13%  sched_debug.cfs_rq:/.avg_vruntime.avg
  14306847 ±  4%    +105.2%   29360984 ± 10%  sched_debug.cfs_rq:/.avg_vruntime.max
    909251 ± 71%    +426.4%    4786321 ± 57%  sched_debug.cfs_rq:/.avg_vruntime.min
   2239402 ± 12%     +77.5%    3975146 ± 11%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      7790 ±  9%     +27.6%       9939 ±  6%  sched_debug.cfs_rq:/.load.avg
    536737           +35.6%     727628 ± 20%  sched_debug.cfs_rq:/.load.max
     52392 ±  6%     +31.7%      68975 ± 15%  sched_debug.cfs_rq:/.load.stddev
   8714025          +122.7%   19406236 ± 13%  sched_debug.cfs_rq:/.min_vruntime.avg
  14306847 ±  4%    +105.2%   29360984 ± 10%  sched_debug.cfs_rq:/.min_vruntime.max
    909251 ± 71%    +426.4%    4786321 ± 57%  sched_debug.cfs_rq:/.min_vruntime.min
   2239402 ± 12%     +77.5%    3975147 ± 11%  sched_debug.cfs_rq:/.min_vruntime.stddev
    263.62           -37.2%     165.56 ± 16%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     34.46 ± 20%     -36.9%      21.75 ± 27%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
    263.62           -37.2%     165.56 ± 16%  sched_debug.cfs_rq:/.removed.util_avg.max
     34.46 ± 20%     -36.9%      21.75 ± 27%  sched_debug.cfs_rq:/.removed.util_avg.stddev
     24033 ± 20%    +132.7%      55928 ± 31%  sched_debug.cpu.avg_idle.min
     90800           +46.2%     132766 ± 10%  sched_debug.cpu.clock.avg
     90862           +46.2%     132821 ± 10%  sched_debug.cpu.clock.max
     90743           +46.2%     132681 ± 10%  sched_debug.cpu.clock.min
     90382           +46.0%     131999 ± 10%  sched_debug.cpu.clock_task.avg
     90564           +46.0%     132188 ± 10%  sched_debug.cpu.clock_task.max
     75846 ±  2%     +54.3%     117026 ± 11%  sched_debug.cpu.clock_task.min
      8008           +20.9%       9683 ±  8%  sched_debug.cpu.curr->pid.max
      7262           +96.3%      14257 ± 11%  sched_debug.cpu.nr_switches.avg
      1335 ± 25%    +147.2%       3301 ± 46%  sched_debug.cpu.nr_switches.min
      0.04 ± 51%     +99.6%       0.07 ± 16%  sched_debug.cpu.nr_uninterruptible.avg
      6.94 ± 10%     +28.9%       8.94 ±  8%  sched_debug.cpu.nr_uninterruptible.stddev
     90747           +46.2%     132684 ± 10%  sched_debug.cpu_clk
     89537           +46.8%     131475 ± 10%  sched_debug.ktime
     91651           +45.8%     133586 ± 10%  sched_debug.sched_clk
     12.03           -19.4%       9.70        perf-stat.i.MPKI
 1.752e+10 ±  2%     -11.2%  1.556e+10        perf-stat.i.branch-instructions
     78.57            -3.0       75.62        perf-stat.i.cache-miss-rate%
 1.081e+09 ±  2%     -27.8%  7.811e+08        perf-stat.i.cache-misses
  1.28e+09 ±  2%     -23.4%    9.8e+08        perf-stat.i.cache-references
      5.60            +6.0%       5.94        perf-stat.i.cpi
 5.076e+11 ±  2%      -3.6%  4.895e+11        perf-stat.i.cpu-cycles
    505.00 ±  3%     +14.2%     576.55 ±  2%  perf-stat.i.cpu-migrations
 2.087e+10 ±  2%     -12.9%  1.818e+10        perf-stat.i.dTLB-loads
      0.04 ±  2%      +0.0        0.06 ±  3%  perf-stat.i.dTLB-store-miss-rate%
   1787964 ±  3%     +30.8%    2339432        perf-stat.i.dTLB-store-misses
 6.896e+09 ±  2%     -21.0%  5.448e+09        perf-stat.i.dTLB-stores
 7.872e+10 ±  2%     -12.2%   6.91e+10        perf-stat.i.instructions
      0.27 ±  3%      +5.9%       0.28 ±  2%  perf-stat.i.ipc
      0.12 ± 27%     -52.0%       0.06 ± 20%  perf-stat.i.major-faults
    646.66 ±  8%     +31.4%     849.88 ±  2%  perf-stat.i.metric.K/sec
    201.93 ±  2%     -13.2%     175.35        perf-stat.i.metric.M/sec
      8279 ± 10%     -19.5%       6667 ±  8%  perf-stat.i.minor-faults
  76148688 ±  6%     -25.0%   57102562 ±  4%  perf-stat.i.node-load-misses
 1.996e+08 ±  6%     -27.3%  1.451e+08 ±  4%  perf-stat.i.node-loads
      8279 ± 10%     -19.5%       6667 ±  8%  perf-stat.i.page-faults
     13.78           -17.3%      11.40        perf-stat.overall.MPKI
      0.11 ±  2%      +0.0        0.12        perf-stat.overall.branch-miss-rate%
     84.62            -4.7       79.91        perf-stat.overall.cache-miss-rate%
      6.47            +9.8%       7.10        perf-stat.overall.cpi
    469.58 ±  2%     +32.7%     623.15        perf-stat.overall.cycles-between-cache-misses
      0.03 ±  2%      +0.0        0.04        perf-stat.overall.dTLB-store-miss-rate%
      0.15            -8.9%       0.14        perf-stat.overall.ipc
      1265 ±  2%     +19.2%       1507        perf-stat.overall.path-length
 1.757e+10 ±  2%     -10.1%   1.58e+10        perf-stat.ps.branch-instructions
 1.088e+09 ±  2%     -26.5%  7.996e+08        perf-stat.ps.cache-misses
 1.285e+09 ±  2%     -22.1%  1.001e+09        perf-stat.ps.cache-references
    490.77 ±  4%     +15.5%     566.82 ±  3%  perf-stat.ps.cpu-migrations
 2.094e+10 ±  2%     -11.8%  1.847e+10        perf-stat.ps.dTLB-loads
   1746391 ±  2%     +32.3%    2310550        perf-stat.ps.dTLB-store-misses
  6.91e+09 ±  2%     -19.9%  5.536e+09        perf-stat.ps.dTLB-stores
 7.892e+10 ±  2%     -11.1%  7.017e+10        perf-stat.ps.instructions
      0.12 ± 28%     -55.6%       0.05 ± 21%  perf-stat.ps.major-faults
      7608 ±  9%     -18.1%       6231 ±  7%  perf-stat.ps.minor-faults
  76550152 ±  5%     -24.5%   57810813 ±  4%  perf-stat.ps.node-load-misses
 2.022e+08 ±  5%     -26.0%  1.495e+08 ±  4%  perf-stat.ps.node-loads
      7608 ±  9%     -18.1%       6231 ±  7%  perf-stat.ps.page-faults
 1.087e+13 ±  2%     +19.2%  1.295e+13        perf-stat.total.instructions
     19.35 ± 18%     -19.3        0.00        perf-profile.calltrace.cycles-pp.__libc_start_main
     19.35 ± 18%     -19.3        0.00        perf-profile.calltrace.cycles-pp.main.__libc_start_main
     19.35 ± 18%     -19.3        0.00        perf-profile.calltrace.cycles-pp.run_builtin.main.__libc_start_main
     18.24 ± 41%     -18.2        0.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     18.24 ± 41%     -18.2        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     16.85 ± 12%     -16.8        0.00        perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.run_builtin.main.__libc_start_main
     16.85 ± 12%     -16.8        0.00        perf-profile.calltrace.cycles-pp.cmd_record.run_builtin.main.__libc_start_main
     16.79 ± 20%     -16.8        0.00        perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record
     16.61 ± 16%     -16.6        0.00        perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.main
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.__libc_write.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.writen.record__pushfn
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write.writen.record__pushfn.perf_mmap__push
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write.writen
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
     16.60 ± 18%     -16.6        0.00        perf-profile.calltrace.cycles-pp.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
     16.55 ± 20%     -16.6        0.00        perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
     16.27 ± 17%     -16.3        0.00        perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
     16.59 ± 42%     -15.9        0.66 ±126%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     16.59 ± 42%     -15.9        0.66 ±126%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     16.59 ± 42%     -15.9        0.73 ±111%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     16.59 ± 42%     -15.9        0.73 ±111%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
     16.59 ± 42%     -15.8        0.76 ±112%  perf-profile.calltrace.cycles-pp.read
      9.60 ± 77%      -9.6        0.00        perf-profile.calltrace.cycles-pp.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.60 ± 77%      -9.6        0.00        perf-profile.calltrace.cycles-pp.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64
      9.47 ± 75%      -9.5        0.00        perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.47 ± 75%      -9.5        0.00        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.syscall_exit_to_user_mode
      9.47 ± 75%      -9.5        0.00        perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.syscall_exit_to_user_mode.do_syscall_64
      9.47 ± 75%      -9.5        0.00        perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.47 ± 75%      -9.5        0.00        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.40 ± 37%      -9.4        0.00        perf-profile.calltrace.cycles-pp.task_work_run.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
      9.20 ± 38%      -9.2        0.00        perf-profile.calltrace.cycles-pp.__fput.task_work_run.do_exit.do_group_exit.get_signal
      8.39 ± 40%      -8.4        0.00        perf-profile.calltrace.cycles-pp.perf_event_release_kernel.perf_release.__fput.task_work_run.do_exit
      8.39 ± 40%      -8.4        0.00        perf-profile.calltrace.cycles-pp.perf_release.__fput.task_work_run.do_exit.do_group_exit
      8.24 ± 36%      -8.2        0.00        perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.24 ± 36%      -8.2        0.00        perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      7.29 ± 87%      -7.3        0.00        perf-profile.calltrace.cycles-pp.show_interrupts.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read
      7.01 ± 49%      -7.0        0.00        perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
      6.40 ± 57%      -6.4        0.00        perf-profile.calltrace.cycles-pp.open64
      6.23 ± 44%      -6.2        0.00        perf-profile.calltrace.cycles-pp.proc_pid_status.proc_single_show.seq_read_iter.seq_read.vfs_read
      6.23 ± 44%      -6.2        0.00        perf-profile.calltrace.cycles-pp.proc_single_show.seq_read_iter.seq_read.vfs_read.ksys_read
      6.23 ± 44%      -6.2        0.00        perf-profile.calltrace.cycles-pp.seq_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.23 ± 44%      -6.2        0.00        perf-profile.calltrace.cycles-pp.seq_read_iter.seq_read.vfs_read.ksys_read.do_syscall_64
      6.08 ± 31%      -6.1        0.00        perf-profile.calltrace.cycles-pp.event_function_call.perf_event_release_kernel.perf_release.__fput.task_work_run
      6.05 ± 63%      -6.0        0.00        perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      6.05 ± 63%      -6.0        0.00        perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      6.05 ± 63%      -6.0        0.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      6.05 ± 63%      -6.0        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.open64
      5.86 ± 28%      -5.9        0.00        perf-profile.calltrace.cycles-pp.smp_call_function_single.event_function_call.perf_event_release_kernel.perf_release.__fput
      5.82 ± 48%      -5.8        0.00        perf-profile.calltrace.cycles-pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
      5.82 ± 48%      -5.8        0.00        perf-profile.calltrace.cycles-pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
      5.82 ± 48%      -5.8        0.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
      5.82 ± 48%      -5.8        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.execve
      5.82 ± 48%      -5.8        0.00        perf-profile.calltrace.cycles-pp.execve
      5.79 ± 47%      -5.8        0.00        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      5.75 ± 66%      -5.7        0.00        perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
      5.60 ± 55%      -5.6        0.00        perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      4.97 ± 49%      -5.0        0.00        perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      4.30 ± 60%      -4.3        0.00        perf-profile.calltrace.cycles-pp.asm_exc_page_fault
      3.87 ± 70%      -3.9        0.00        perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.get_signal
      3.87 ± 70%      -3.9        0.00        perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart
      0.00            +0.7        0.71 ± 23%  perf-profile.calltrace.cycles-pp.__free_pages_ok.release_pages.__folio_batch_release.truncate_inode_pages_range.evict
      0.00            +0.7        0.72 ± 21%  perf-profile.calltrace.cycles-pp.delete_from_page_cache_batch.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat
      0.00            +0.7        0.73 ± 22%  perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge.destroy_large_folio.release_pages.__folio_batch_release.truncate_inode_pages_range
      0.00            +0.7        0.73 ± 23%  perf-profile.calltrace.cycles-pp.free_unref_page_prepare.free_unref_page.release_pages.__folio_batch_release.truncate_inode_pages_range
      0.00            +0.8        0.80 ± 17%  perf-profile.calltrace.cycles-pp.xas_load.truncate_folio_batch_exceptionals.truncate_inode_pages_range.evict.do_unlinkat
      0.00            +0.9        0.86 ± 13%  perf-profile.calltrace.cycles-pp._raw_spin_trylock.rebalance_domains.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt
      0.00            +0.9        0.86 ± 21%  perf-profile.calltrace.cycles-pp.truncate_cleanup_folio.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat
      0.00            +0.9        0.89 ± 18%  perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.__do_softirq.irq_exit_rcu
      0.00            +0.9        0.89 ± 14%  perf-profile.calltrace.cycles-pp.workingset_update_node.xas_store.truncate_folio_batch_exceptionals.truncate_inode_pages_range.evict
      0.00            +0.9        0.94 ± 63%  perf-profile.calltrace.cycles-pp.fast_imageblit.sys_imageblit.drm_fbdev_generic_defio_imageblit.bit_putcs.fbcon_putcs
      0.00            +1.0        0.95 ± 63%  perf-profile.calltrace.cycles-pp.sys_imageblit.drm_fbdev_generic_defio_imageblit.bit_putcs.fbcon_putcs.fbcon_redraw
      0.00            +1.0        0.95 ± 63%  perf-profile.calltrace.cycles-pp.drm_fbdev_generic_defio_imageblit.bit_putcs.fbcon_putcs.fbcon_redraw.fbcon_scroll
      0.00            +1.1        1.06 ± 31%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.memcpy_toio.drm_fb_memcpy.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes
      0.00            +1.1        1.14 ± 39%  perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.13 ±264%      +1.3        1.38 ± 45%  perf-profile.calltrace.cycles-pp.tick_nohz_get_sleep_length.menu_select.cpuidle_idle_call.do_idle.cpu_startup_entry
      0.00            +1.3        1.26 ± 58%  perf-profile.calltrace.cycles-pp.bit_putcs.fbcon_putcs.fbcon_redraw.fbcon_scroll.con_scroll
      0.00            +1.3        1.31 ± 59%  perf-profile.calltrace.cycles-pp.fbcon_putcs.fbcon_redraw.fbcon_scroll.con_scroll.lf
      0.00            +1.5        1.45 ± 24%  perf-profile.calltrace.cycles-pp.destroy_large_folio.release_pages.__folio_batch_release.truncate_inode_pages_range.evict
      0.00            +1.5        1.49 ± 60%  perf-profile.calltrace.cycles-pp.fbcon_redraw.fbcon_scroll.con_scroll.lf.vt_console_print
      0.00            +1.5        1.52 ± 22%  perf-profile.calltrace.cycles-pp.free_unref_page.release_pages.__folio_batch_release.truncate_inode_pages_range.evict
      0.00            +1.5        1.55 ± 58%  perf-profile.calltrace.cycles-pp.con_scroll.lf.vt_console_print.console_flush_all.console_unlock
      0.00            +1.5        1.55 ± 58%  perf-profile.calltrace.cycles-pp.fbcon_scroll.con_scroll.lf.vt_console_print.console_flush_all
      0.00            +1.5        1.55 ± 58%  perf-profile.calltrace.cycles-pp.lf.vt_console_print.console_flush_all.console_unlock.vprintk_emit
      0.00            +1.6        1.58 ± 57%  perf-profile.calltrace.cycles-pp.vt_console_print.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit
      0.00            +1.8        1.77 ± 41%  perf-profile.calltrace.cycles-pp.perf_mux_hrtimer_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      0.00            +1.8        1.81 ± 27%  perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.__do_softirq
      0.13 ±264%      +2.0        2.08 ± 30%  perf-profile.calltrace.cycles-pp.menu_select.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      0.00            +2.0        1.97 ± 17%  perf-profile.calltrace.cycles-pp.rcu_segcblist_enqueue.__call_rcu_common.xas_store.truncate_folio_batch_exceptionals.truncate_inode_pages_range
      0.00            +2.0        2.04 ± 19%  perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.__do_softirq.run_ksoftirqd
      0.00            +2.2        2.17 ± 17%  perf-profile.calltrace.cycles-pp.__call_rcu_common.xas_store.truncate_folio_batch_exceptionals.truncate_inode_pages_range.evict
      0.55 ±134%      +2.2        2.78 ± 18%  perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.75 ±132%      +2.5        3.22 ± 10%  perf-profile.calltrace.cycles-pp.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.00            +2.6        2.64 ± 19%  perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn
      0.75 ±132%      +2.7        3.40 ± 10%  perf-profile.calltrace.cycles-pp.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.00            +2.7        2.66 ± 19%  perf-profile.calltrace.cycles-pp.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread
      0.00            +2.7        2.67 ± 19%  perf-profile.calltrace.cycles-pp.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
      0.00            +2.7        2.67 ± 19%  perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.00            +2.8        2.76 ± 13%  perf-profile.calltrace.cycles-pp.io_serial_in.wait_for_lsr.wait_for_xmitr.serial8250_console_write.console_flush_all
      0.17 ±264%      +2.8        2.97 ± 10%  perf-profile.calltrace.cycles-pp.__intel_pmu_enable_all.perf_adjust_freq_unthr_context.perf_event_task_tick.scheduler_tick.update_process_times
      0.00            +2.8        2.83 ± 13%  perf-profile.calltrace.cycles-pp.wait_for_lsr.wait_for_xmitr.serial8250_console_write.console_flush_all.console_unlock
      0.00            +2.8        2.83 ± 13%  perf-profile.calltrace.cycles-pp.wait_for_xmitr.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit
      0.00            +3.7        3.67 ± 15%  perf-profile.calltrace.cycles-pp.xas_find.find_lock_entries.truncate_inode_pages_range.evict.do_unlinkat
      0.30 ±175%      +4.2        4.48 ±  8%  perf-profile.calltrace.cycles-pp.perf_adjust_freq_unthr_context.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle
      0.30 ±175%      +4.3        4.60 ±  8%  perf-profile.calltrace.cycles-pp.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler
      0.00            +4.4        4.40 ± 22%  perf-profile.calltrace.cycles-pp.release_pages.__folio_batch_release.truncate_inode_pages_range.evict.do_unlinkat
      0.50 ±132%      +4.4        4.90 ±  8%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.00            +4.4        4.41 ± 21%  perf-profile.calltrace.cycles-pp.__folio_batch_release.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat
      0.00            +4.7        4.72 ± 33%  perf-profile.calltrace.cycles-pp.io_serial_out.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit
      0.00            +4.8        4.76 ± 15%  perf-profile.calltrace.cycles-pp.find_lock_entries.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat
      1.06 ±125%      +4.9        5.96 ± 10%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      0.00            +5.3        5.34 ±  8%  perf-profile.calltrace.cycles-pp.intel_idle_xstate.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.28 ±100%      +5.5        6.82 ± 10%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
      1.28 ±100%      +5.6        6.87 ± 10%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
      1.28 ±100%      +6.1        7.36 ± 10%  perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      0.00            +6.9        6.86 ± 15%  perf-profile.calltrace.cycles-pp.xas_store.truncate_folio_batch_exceptionals.truncate_inode_pages_range.evict.do_unlinkat
      0.00            +7.7        7.69 ± 12%  perf-profile.calltrace.cycles-pp.io_serial_in.wait_for_lsr.serial8250_console_write.console_flush_all.console_unlock
      0.00            +7.9        7.90 ± 12%  perf-profile.calltrace.cycles-pp.wait_for_lsr.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit
      0.00            +8.3        8.34 ± 15%  perf-profile.calltrace.cycles-pp.truncate_folio_batch_exceptionals.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat
      1.28 ±100%      +8.4        9.71 ±  7%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.26 ±264%     +11.4       11.67 ±  8%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.26 ±264%     +11.8       12.06 ±  8%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      1.00 ±102%     +15.9       16.89 ±  8%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.00           +16.1       16.11 ± 13%  perf-profile.calltrace.cycles-pp.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit
      0.17 ±264%     +17.4       17.59 ± 12%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.17 ±264%     +17.4       17.59 ± 12%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.17 ±264%     +17.4       17.60 ± 12%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.17 ±264%     +17.4       17.60 ± 12%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
      0.17 ±264%     +17.4       17.62 ± 12%  perf-profile.calltrace.cycles-pp.write
      0.00           +17.5       17.47 ± 13%  perf-profile.calltrace.cycles-pp.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write
      0.00           +17.5       17.48 ± 13%  perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write
      1.60 ± 88%     +17.5       19.11 ±  8%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.00           +17.5       17.53 ± 12%  perf-profile.calltrace.cycles-pp.memcpy_toio.drm_fb_memcpy.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm
      0.00           +17.5       17.55 ± 13%  perf-profile.calltrace.cycles-pp.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write.ksys_write
      0.00           +17.5       17.55 ± 13%  perf-profile.calltrace.cycles-pp.devkmsg_emit.devkmsg_write.vfs_write.ksys_write.do_syscall_64
      0.00           +17.6       17.55 ± 13%  perf-profile.calltrace.cycles-pp.devkmsg_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +18.0       17.98 ± 12%  perf-profile.calltrace.cycles-pp.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail
      0.00           +18.0       17.98 ± 12%  perf-profile.calltrace.cycles-pp.drm_fb_memcpy.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail
      0.00           +18.0       18.02 ± 12%  perf-profile.calltrace.cycles-pp.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb
      0.00           +18.0       18.02 ± 12%  perf-profile.calltrace.cycles-pp.commit_tail.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty
      0.00           +18.0       18.02 ± 12%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit
      0.00           +18.0       18.02 ± 12%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit.drm_atomic_commit
      0.00           +18.0       18.03 ± 12%  perf-profile.calltrace.cycles-pp.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work
      0.00           +18.0       18.03 ± 12%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work
      0.00           +18.0       18.03 ± 12%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work.worker_thread
      0.00           +18.7       18.67 ± 12%  perf-profile.calltrace.cycles-pp.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread.ret_from_fork
      0.00           +18.7       18.67 ± 12%  perf-profile.calltrace.cycles-pp.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread
      0.00           +18.7       18.72 ± 12%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.00           +18.8       18.77 ± 12%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.00           +19.3       19.26 ± 15%  perf-profile.calltrace.cycles-pp.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat.do_syscall_64
      0.00           +19.3       19.28 ± 15%  perf-profile.calltrace.cycles-pp.evict.do_unlinkat.__x64_sys_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +19.3       19.29 ± 15%  perf-profile.calltrace.cycles-pp.__x64_sys_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.calltrace.cycles-pp.do_unlinkat.__x64_sys_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.calltrace.cycles-pp.unlinkat
      0.89 ±100%     +22.5       23.36 ± 15%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
      0.89 ±100%     +22.5       23.36 ± 15%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
      0.89 ±100%     +22.5       23.36 ± 15%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
      3.26 ±101%     +26.4       29.63 ±  7%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      6.41 ± 88%     +26.4       32.84 ±  7%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.41 ± 88%     +26.4       32.85 ±  7%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.41 ± 88%     +26.4       32.85 ±  7%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      3.56 ± 99%     +26.6       30.12 ±  7%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      6.41 ± 88%     +26.6       33.04 ±  7%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      3.68 ± 96%     +28.8       32.52 ±  8%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     75.17 ± 12%     -36.3       38.83 ± 10%  perf-profile.children.cycles-pp.do_syscall_64
     75.17 ± 12%     -36.3       38.84 ± 10%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     19.35 ± 18%     -19.2        0.10 ± 84%  perf-profile.children.cycles-pp.__libc_start_main
     19.35 ± 18%     -19.2        0.10 ± 84%  perf-profile.children.cycles-pp.main
     19.35 ± 18%     -19.2        0.10 ± 84%  perf-profile.children.cycles-pp.run_builtin
     19.35 ± 18%     -19.2        0.10 ± 90%  perf-profile.children.cycles-pp.cmd_record
     18.76 ± 18%     -18.8        0.00        perf-profile.children.cycles-pp.perf_mmap__push
     18.76 ± 18%     -18.8        0.00        perf-profile.children.cycles-pp.record__mmap_read_evlist
     18.63 ± 28%     -18.4        0.18 ± 46%  perf-profile.children.cycles-pp.do_exit
     18.63 ± 28%     -18.4        0.18 ± 46%  perf-profile.children.cycles-pp.do_group_exit
     16.81 ± 19%     -16.8        0.00        perf-profile.children.cycles-pp.__libc_write
     16.80 ± 20%     -16.8        0.00        perf-profile.children.cycles-pp.record__pushfn
     16.60 ± 18%     -16.6        0.00        perf-profile.children.cycles-pp.writen
     16.60 ± 18%     -16.6        0.01 ±264%  perf-profile.children.cycles-pp.shmem_file_write_iter
     16.42 ± 19%     -16.4        0.00        perf-profile.children.cycles-pp.generic_perform_write
     16.42 ± 41%     -16.1        0.29 ± 13%  perf-profile.children.cycles-pp.seq_read_iter
     16.81 ± 41%     -15.9        0.94 ± 79%  perf-profile.children.cycles-pp.read
     16.59 ± 42%     -15.6        0.96 ± 69%  perf-profile.children.cycles-pp.vfs_read
     16.59 ± 42%     -15.6        0.96 ± 69%  perf-profile.children.cycles-pp.ksys_read
     19.35 ± 18%     -15.4        3.98 ±173%  perf-profile.children.cycles-pp.__cmd_record
     14.11 ± 37%     -14.1        0.00        perf-profile.children.cycles-pp.arch_do_signal_or_restart
     14.11 ± 37%     -14.1        0.00        perf-profile.children.cycles-pp.get_signal
     10.35 ± 37%     -10.1        0.27 ± 64%  perf-profile.children.cycles-pp.asm_exc_page_fault
     10.10 ± 32%     -10.0        0.07 ±100%  perf-profile.children.cycles-pp.__fput
      9.93 ± 74%      -9.9        0.01 ±264%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      9.60 ± 77%      -9.6        0.00        perf-profile.children.cycles-pp.proc_reg_read_iter
      9.58 ± 36%      -9.6        0.00        perf-profile.children.cycles-pp.task_work_run
      9.10 ± 54%      -8.9        0.16 ± 53%  perf-profile.children.cycles-pp.exit_mmap
      9.10 ± 54%      -8.9        0.16 ± 53%  perf-profile.children.cycles-pp.__mmput
      8.57 ± 35%      -8.5        0.09 ± 78%  perf-profile.children.cycles-pp.do_sys_openat2
      8.57 ± 35%      -8.5        0.10 ± 78%  perf-profile.children.cycles-pp.__x64_sys_openat
      8.39 ± 40%      -8.3        0.06 ±101%  perf-profile.children.cycles-pp.perf_event_release_kernel
      8.39 ± 40%      -8.3        0.06 ±101%  perf-profile.children.cycles-pp.perf_release
      8.24 ± 36%      -8.2        0.09 ± 78%  perf-profile.children.cycles-pp.do_filp_open
      8.24 ± 36%      -8.2        0.09 ± 78%  perf-profile.children.cycles-pp.path_openat
      8.36 ± 29%      -8.1        0.23 ± 62%  perf-profile.children.cycles-pp.exc_page_fault
      8.16 ± 33%      -7.9        0.23 ± 62%  perf-profile.children.cycles-pp.do_user_addr_fault
      7.12 ± 84%      -7.1        0.00        perf-profile.children.cycles-pp.show_interrupts
      6.98 ± 79%      -7.0        0.00        perf-profile.children.cycles-pp.seq_printf
      7.01 ± 49%      -6.9        0.15 ± 54%  perf-profile.children.cycles-pp.exit_mm
      6.95 ± 36%      -6.7        0.22 ± 61%  perf-profile.children.cycles-pp.handle_mm_fault
      6.23 ± 44%      -6.2        0.00        perf-profile.children.cycles-pp.proc_pid_status
      6.23 ± 44%      -6.2        0.00        perf-profile.children.cycles-pp.proc_single_show
      6.22 ± 60%      -6.2        0.00        perf-profile.children.cycles-pp.open64
      6.08 ± 31%      -6.1        0.00        perf-profile.children.cycles-pp.event_function_call
      6.23 ± 42%      -6.0        0.21 ± 62%  perf-profile.children.cycles-pp.__handle_mm_fault
      6.23 ± 44%      -6.0        0.25 ± 15%  perf-profile.children.cycles-pp.seq_read
      5.86 ± 28%      -5.9        0.00        perf-profile.children.cycles-pp.smp_call_function_single
      5.72 ± 54%      -5.7        0.00        perf-profile.children.cycles-pp.fault_in_iov_iter_readable
      5.82 ± 48%      -5.6        0.20 ± 46%  perf-profile.children.cycles-pp.do_execveat_common
      5.82 ± 48%      -5.6        0.20 ± 46%  perf-profile.children.cycles-pp.execve
      5.82 ± 48%      -5.6        0.20 ± 46%  perf-profile.children.cycles-pp.__x64_sys_execve
      5.60 ± 55%      -5.6        0.00        perf-profile.children.cycles-pp.fault_in_readable
      5.83 ± 57%      -5.0        0.83 ± 18%  perf-profile.children.cycles-pp._raw_spin_lock
      4.97 ± 49%      -5.0        0.00        perf-profile.children.cycles-pp.copy_page_from_iter_atomic
      4.65 ± 60%      -4.5        0.13 ± 47%  perf-profile.children.cycles-pp.bprm_execve
      4.47 ± 64%      -4.4        0.10 ± 54%  perf-profile.children.cycles-pp.load_elf_binary
      4.47 ± 64%      -4.4        0.10 ± 54%  perf-profile.children.cycles-pp.exec_binprm
      4.47 ± 64%      -4.4        0.10 ± 54%  perf-profile.children.cycles-pp.search_binary_handler
      4.52 ± 42%      -4.3        0.18 ± 46%  perf-profile.children.cycles-pp.__x64_sys_exit_group
      3.52 ± 57%      -3.5        0.06 ± 83%  perf-profile.children.cycles-pp.kernel_clone
      3.28 ± 47%      -3.2        0.09 ± 80%  perf-profile.children.cycles-pp.do_fault
      3.12 ± 46%      -3.0        0.08 ± 80%  perf-profile.children.cycles-pp.do_read_fault
      2.33 ± 42%      -2.3        0.07 ± 80%  perf-profile.children.cycles-pp.filemap_map_pages
      1.99 ± 34%      -1.9        0.05 ± 78%  perf-profile.children.cycles-pp.link_path_walk
      0.00            +0.1        0.07 ± 15%  perf-profile.children.cycles-pp.__update_blocked_fair
      0.00            +0.1        0.07 ± 23%  perf-profile.children.cycles-pp.uncharge_folio
      0.00            +0.1        0.07 ± 19%  perf-profile.children.cycles-pp.rcu_nocb_try_bypass
      0.00            +0.1        0.07 ± 11%  perf-profile.children.cycles-pp.hrtimer_update_next_event
      0.00            +0.1        0.08 ± 26%  perf-profile.children.cycles-pp.memcg_account_kmem
      0.00            +0.1        0.08 ± 22%  perf-profile.children.cycles-pp.free_tail_page_prepare
      0.00            +0.1        0.08 ± 22%  perf-profile.children.cycles-pp.note_gp_changes
      0.00            +0.1        0.08 ± 38%  perf-profile.children.cycles-pp.console_conditional_schedule
      0.00            +0.1        0.08 ± 10%  perf-profile.children.cycles-pp.call_cpuidle
      0.00            +0.1        0.08 ± 15%  perf-profile.children.cycles-pp.cpuidle_governor_latency_req
      0.00            +0.1        0.08 ± 13%  perf-profile.children.cycles-pp.error_entry
      0.00            +0.1        0.09 ± 15%  perf-profile.children.cycles-pp.__libc_read
      0.00            +0.1        0.09 ± 23%  perf-profile.children.cycles-pp.read_counters
      0.00            +0.1        0.09 ± 14%  perf-profile.children.cycles-pp.xa_load
      0.00            +0.1        0.09 ± 16%  perf-profile.children.cycles-pp.hrtimer_get_next_event
      0.00            +0.1        0.09 ± 13%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      0.00            +0.1        0.09 ± 22%  perf-profile.children.cycles-pp.cmd_stat
      0.00            +0.1        0.09 ± 22%  perf-profile.children.cycles-pp.dispatch_events
      0.00            +0.1        0.09 ± 22%  perf-profile.children.cycles-pp.process_interval
      0.00            +0.1        0.10 ± 46%  perf-profile.children.cycles-pp.__sysvec_irq_work
      0.00            +0.1        0.10 ± 46%  perf-profile.children.cycles-pp._printk
      0.00            +0.1        0.10 ± 46%  perf-profile.children.cycles-pp.asm_sysvec_irq_work
      0.00            +0.1        0.10 ± 46%  perf-profile.children.cycles-pp.irq_work_run
      0.00            +0.1        0.10 ± 46%  perf-profile.children.cycles-pp.sysvec_irq_work
      0.00            +0.1        0.10 ± 15%  perf-profile.children.cycles-pp.timerqueue_add
      0.00            +0.1        0.11 ± 13%  perf-profile.children.cycles-pp.x86_pmu_disable
      0.00            +0.1        0.11 ± 39%  perf-profile.children.cycles-pp.irq_work_single
      0.00            +0.1        0.11 ± 14%  perf-profile.children.cycles-pp.timerqueue_del
      0.00            +0.1        0.11 ± 54%  perf-profile.children.cycles-pp.drm_fb_helper_damage_area
      0.00            +0.1        0.11 ± 11%  perf-profile.children.cycles-pp.__hrtimer_next_event_base
      0.00            +0.1        0.12 ± 35%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
      0.00            +0.1        0.12 ± 36%  perf-profile.children.cycles-pp.irq_work_run_list
      0.00            +0.1        0.12 ± 13%  perf-profile.children.cycles-pp.perf_pmu_nop_void
      0.00            +0.1        0.13 ± 13%  perf-profile.children.cycles-pp.enqueue_hrtimer
      0.00            +0.1        0.13 ± 12%  perf-profile.children.cycles-pp.irqtime_account_process_tick
      0.00            +0.1        0.14 ± 23%  perf-profile.children.cycles-pp.__put_partials
      0.00            +0.1        0.14 ± 31%  perf-profile.children.cycles-pp.check_cpu_stall
      0.00            +0.1        0.14 ± 15%  perf-profile.children.cycles-pp.update_rq_clock
      0.00            +0.1        0.14 ± 16%  perf-profile.children.cycles-pp.hrtimer_next_event_without
      0.00            +0.2        0.16 ± 67%  perf-profile.children.cycles-pp.calc_global_load_tick
      0.00            +0.2        0.16 ± 24%  perf-profile.children.cycles-pp.filemap_unaccount_folio
      0.00            +0.2        0.16 ± 12%  perf-profile.children.cycles-pp.local_clock_noinstr
      0.00            +0.2        0.17 ± 14%  perf-profile.children.cycles-pp.should_we_balance
      0.00            +0.2        0.17 ± 25%  perf-profile.children.cycles-pp.delay_halt_tpause
      0.00            +0.2        0.19 ± 35%  perf-profile.children.cycles-pp.arch_call_rest_init
      0.00            +0.2        0.19 ± 35%  perf-profile.children.cycles-pp.rest_init
      0.00            +0.2        0.19 ± 35%  perf-profile.children.cycles-pp.start_kernel
      0.00            +0.2        0.19 ± 35%  perf-profile.children.cycles-pp.x86_64_start_kernel
      0.00            +0.2        0.19 ± 35%  perf-profile.children.cycles-pp.x86_64_start_reservations
      0.00            +0.2        0.21 ± 19%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.00            +0.2        0.22 ± 25%  perf-profile.children.cycles-pp.free_one_page
      0.00            +0.2        0.22 ± 14%  perf-profile.children.cycles-pp.trigger_load_balance
      0.00            +0.2        0.23 ± 14%  perf-profile.children.cycles-pp.update_irq_load_avg
      0.00            +0.2        0.23 ± 15%  perf-profile.children.cycles-pp.update_blocked_averages
      0.00            +0.2        0.24 ± 17%  perf-profile.children.cycles-pp.run_rebalance_domains
      0.00            +0.2        0.24 ± 17%  perf-profile.children.cycles-pp.list_lru_del
      0.00            +0.2        0.24 ± 19%  perf-profile.children.cycles-pp.radix_tree_node_rcu_free
      0.00            +0.2        0.25 ± 15%  perf-profile.children.cycles-pp.get_slabinfo
      0.00            +0.2        0.25 ± 15%  perf-profile.children.cycles-pp.slab_show
      0.00            +0.3        0.25 ± 21%  perf-profile.children.cycles-pp.ct_kernel_exit_state
      0.00            +0.3        0.28 ± 24%  perf-profile.children.cycles-pp.delay_halt
      0.00            +0.3        0.29 ± 18%  perf-profile.children.cycles-pp.ct_kernel_enter
      0.00            +0.3        0.30 ±  9%  perf-profile.children.cycles-pp.irqtime_account_irq
      0.00            +0.3        0.31 ± 18%  perf-profile.children.cycles-pp.ct_idle_exit
      0.00            +0.3        0.31 ± 18%  perf-profile.children.cycles-pp.tick_sched_do_timer
      0.00            +0.3        0.32 ± 16%  perf-profile.children.cycles-pp.__mod_lruvec_kmem_state
      0.00            +0.3        0.32 ± 15%  perf-profile.children.cycles-pp.xas_start
      0.00            +0.3        0.32 ± 16%  perf-profile.children.cycles-pp.rcu_pending
      0.00            +0.3        0.32 ± 10%  perf-profile.children.cycles-pp.sched_clock
      0.00            +0.4        0.37 ± 11%  perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.00            +0.4        0.38 ±  9%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.00            +0.4        0.40 ± 17%  perf-profile.children.cycles-pp.ifs_free
      0.00            +0.4        0.40 ± 14%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
      0.00            +0.4        0.40 ± 18%  perf-profile.children.cycles-pp.__page_cache_release
      0.00            +0.4        0.42 ± 10%  perf-profile.children.cycles-pp.native_sched_clock
      0.00            +0.4        0.43 ± 13%  perf-profile.children.cycles-pp.lapic_next_deadline
      0.00            +0.4        0.45 ± 13%  perf-profile.children.cycles-pp.mem_cgroup_from_slab_obj
      0.00            +0.5        0.46 ± 12%  perf-profile.children.cycles-pp.read_tsc
      0.00            +0.5        0.48 ±  7%  perf-profile.children.cycles-pp.perf_rotate_context
      0.00            +0.5        0.52 ± 23%  perf-profile.children.cycles-pp.page_counter_uncharge
      0.00            +0.5        0.52 ± 58%  perf-profile.children.cycles-pp.tick_nohz_irq_exit
      0.00            +0.5        0.55 ± 25%  perf-profile.children.cycles-pp.rcu_cblist_dequeue
      0.00            +0.6        0.56 ± 14%  perf-profile.children.cycles-pp.list_lru_del_obj
      0.00            +0.6        0.59 ± 21%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.00            +0.6        0.64 ± 24%  perf-profile.children.cycles-pp.drm_fbdev_generic_damage_blit_real
      0.00            +0.6        0.64 ± 93%  perf-profile.children.cycles-pp.tick_irq_enter
      0.00            +0.7        0.66 ± 89%  perf-profile.children.cycles-pp.irq_enter_rcu
      0.00            +0.7        0.69 ± 40%  perf-profile.children.cycles-pp.xas_descend
      0.00            +0.7        0.69 ± 22%  perf-profile.children.cycles-pp.uncharge_batch
      0.00            +0.7        0.70 ± 25%  perf-profile.children.cycles-pp.folio_undo_large_rmappable
      0.00            +0.7        0.73 ± 22%  perf-profile.children.cycles-pp.__free_pages_ok
      0.00            +0.7        0.73 ± 21%  perf-profile.children.cycles-pp.delete_from_page_cache_batch
      0.00            +0.7        0.74 ± 22%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge
      0.00            +0.8        0.78 ± 14%  perf-profile.children.cycles-pp.xas_clear_mark
      0.00            +0.9        0.87 ± 21%  perf-profile.children.cycles-pp.truncate_cleanup_folio
      0.00            +0.9        0.95 ± 13%  perf-profile.children.cycles-pp.workingset_update_node
      0.00            +1.0        0.96 ± 62%  perf-profile.children.cycles-pp.fast_imageblit
      0.00            +1.0        0.98 ± 61%  perf-profile.children.cycles-pp.sys_imageblit
      0.00            +1.0        0.98 ± 61%  perf-profile.children.cycles-pp.drm_fbdev_generic_defio_imageblit
      0.13 ±264%      +1.1        1.20 ± 54%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.00            +1.2        1.16 ± 38%  perf-profile.children.cycles-pp.clockevents_program_event
      0.13 ±264%      +1.3        1.40 ± 45%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.00            +1.3        1.29 ± 57%  perf-profile.children.cycles-pp.bit_putcs
      0.00            +1.3        1.34 ± 57%  perf-profile.children.cycles-pp.fbcon_putcs
      0.00            +1.5        1.46 ± 23%  perf-profile.children.cycles-pp.destroy_large_folio
      0.00            +1.5        1.52 ± 59%  perf-profile.children.cycles-pp.fbcon_redraw
      0.00            +1.5        1.55 ± 58%  perf-profile.children.cycles-pp.con_scroll
      0.00            +1.5        1.55 ± 58%  perf-profile.children.cycles-pp.fbcon_scroll
      0.00            +1.5        1.55 ± 58%  perf-profile.children.cycles-pp.lf
      0.00            +1.6        1.58 ± 57%  perf-profile.children.cycles-pp.vt_console_print
      0.00            +1.7        1.67 ± 20%  perf-profile.children.cycles-pp.free_unref_page
      0.00            +1.9        1.86 ± 39%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      0.13 ±264%      +2.0        2.12 ± 29%  perf-profile.children.cycles-pp.menu_select
      0.55 ±134%      +2.2        2.78 ± 18%  perf-profile.children.cycles-pp.smpboot_thread_fn
      0.00            +2.7        2.67 ± 19%  perf-profile.children.cycles-pp.run_ksoftirqd
      0.00            +2.8        2.83 ± 13%  perf-profile.children.cycles-pp.wait_for_xmitr
      0.75 ±132%      +2.9        3.64 ± 10%  perf-profile.children.cycles-pp.irq_exit_rcu
      0.30 ±175%      +3.1        3.40 ±  9%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
      0.00            +3.2        3.24 ± 30%  perf-profile.children.cycles-pp.ktime_get
      0.44 ±173%      +3.7        4.11 ± 15%  perf-profile.children.cycles-pp.rcu_core
      0.00            +3.7        3.72 ± 14%  perf-profile.children.cycles-pp.xas_find
      0.22 ±264%      +3.8        4.01 ± 16%  perf-profile.children.cycles-pp.rcu_do_batch
      0.30 ±175%      +4.4        4.68 ±  8%  perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
      0.30 ±175%      +4.4        4.70 ±  8%  perf-profile.children.cycles-pp.perf_event_task_tick
      0.00            +4.4        4.41 ± 21%  perf-profile.children.cycles-pp.__folio_batch_release
      0.50 ±132%      +4.4        4.93 ±  8%  perf-profile.children.cycles-pp.intel_idle
      0.00            +4.8        4.79 ± 15%  perf-profile.children.cycles-pp.find_lock_entries
      0.00            +5.0        5.01 ± 20%  perf-profile.children.cycles-pp.io_serial_out
      1.06 ±125%      +5.2        6.25 ± 10%  perf-profile.children.cycles-pp.scheduler_tick
      0.00            +5.4        5.37 ±  8%  perf-profile.children.cycles-pp.intel_idle_xstate
      0.75 ±132%      +5.4        6.12 ± 13%  perf-profile.children.cycles-pp.__do_softirq
      1.28 ±100%      +5.9        7.18 ± 10%  perf-profile.children.cycles-pp.update_process_times
      1.28 ±100%      +5.9        7.22 ± 10%  perf-profile.children.cycles-pp.tick_sched_handle
      1.28 ±100%      +6.5        7.78 ±  9%  perf-profile.children.cycles-pp.tick_nohz_highres_handler
      0.22 ±264%      +7.4        7.62 ± 12%  perf-profile.children.cycles-pp.xas_store
      0.00            +8.4        8.39 ± 15%  perf-profile.children.cycles-pp.truncate_folio_batch_exceptionals
      1.28 ±100%      +9.0       10.25 ±  6%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.00           +10.6       10.65 ± 12%  perf-profile.children.cycles-pp.io_serial_in
      0.00           +10.8       10.80 ± 12%  perf-profile.children.cycles-pp.wait_for_lsr
      1.28 ±100%     +11.0       12.30 ±  7%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.28 ±100%     +11.4       12.71 ±  7%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.02 ± 85%     +15.8       17.80 ±  7%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.00           +15.9       15.94 ± 14%  perf-profile.children.cycles-pp.serial8250_console_write
      0.17 ±264%     +17.5       17.64 ± 12%  perf-profile.children.cycles-pp.write
      0.00           +17.5       17.55 ± 13%  perf-profile.children.cycles-pp.devkmsg_emit
      0.00           +17.6       17.55 ± 13%  perf-profile.children.cycles-pp.devkmsg_write
      0.00           +17.6       17.56 ± 12%  perf-profile.children.cycles-pp.console_flush_all
      0.00           +17.6       17.56 ± 12%  perf-profile.children.cycles-pp.console_unlock
      0.00           +17.6       17.65 ± 12%  perf-profile.children.cycles-pp.vprintk_emit
      0.00           +18.0       17.98 ± 12%  perf-profile.children.cycles-pp.ast_primary_plane_helper_atomic_update
      0.00           +18.0       17.98 ± 12%  perf-profile.children.cycles-pp.drm_fb_memcpy
      0.00           +18.0       17.98 ± 12%  perf-profile.children.cycles-pp.memcpy_toio
      0.00           +18.0       18.02 ± 12%  perf-profile.children.cycles-pp.ast_mode_config_helper_atomic_commit_tail
      0.00           +18.0       18.02 ± 12%  perf-profile.children.cycles-pp.commit_tail
      0.00           +18.0       18.02 ± 12%  perf-profile.children.cycles-pp.drm_atomic_helper_commit_planes
      0.00           +18.0       18.02 ± 12%  perf-profile.children.cycles-pp.drm_atomic_helper_commit_tail_rpm
      0.00           +18.0       18.03 ± 12%  perf-profile.children.cycles-pp.drm_atomic_commit
      0.00           +18.0       18.03 ± 12%  perf-profile.children.cycles-pp.drm_atomic_helper_commit
      0.00           +18.0       18.03 ± 12%  perf-profile.children.cycles-pp.drm_atomic_helper_dirtyfb
      0.00           +18.7       18.67 ± 12%  perf-profile.children.cycles-pp.drm_fb_helper_damage_work
      0.00           +18.7       18.67 ± 12%  perf-profile.children.cycles-pp.drm_fbdev_generic_helper_fb_dirty
      0.00           +18.7       18.72 ± 12%  perf-profile.children.cycles-pp.process_one_work
      0.00           +18.8       18.77 ± 12%  perf-profile.children.cycles-pp.worker_thread
      0.00           +19.3       19.28 ± 15%  perf-profile.children.cycles-pp.truncate_inode_pages_range
      0.00           +19.3       19.28 ± 15%  perf-profile.children.cycles-pp.evict
      0.00           +19.3       19.29 ± 15%  perf-profile.children.cycles-pp.__x64_sys_unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.children.cycles-pp.do_unlinkat
      0.00           +19.3       19.29 ± 15%  perf-profile.children.cycles-pp.unlinkat
      1.04 ± 79%     +22.3       23.38 ± 15%  perf-profile.children.cycles-pp.ret_from_fork_asm
      0.89 ±100%     +22.5       23.36 ± 15%  perf-profile.children.cycles-pp.kthread
      0.89 ±100%     +22.5       23.38 ± 15%  perf-profile.children.cycles-pp.ret_from_fork
      6.41 ± 88%     +26.4       32.85 ±  7%  perf-profile.children.cycles-pp.start_secondary
      6.41 ± 88%     +26.6       33.04 ±  7%  perf-profile.children.cycles-pp.cpu_startup_entry
      6.41 ± 88%     +26.6       33.04 ±  7%  perf-profile.children.cycles-pp.do_idle
      6.41 ± 88%     +26.6       33.04 ±  7%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      3.56 ± 99%     +26.7       30.28 ±  7%  perf-profile.children.cycles-pp.cpuidle_enter_state
      3.56 ± 99%     +26.7       30.30 ±  7%  perf-profile.children.cycles-pp.cpuidle_enter
      3.68 ± 96%     +29.0       32.73 ±  7%  perf-profile.children.cycles-pp.cpuidle_idle_call
      5.46 ± 30%      -5.5        0.00        perf-profile.self.cycles-pp.smp_call_function_single
      5.44 ± 65%      -4.6        0.80 ± 18%  perf-profile.self.cycles-pp._raw_spin_lock
      4.63 ± 59%      -4.6        0.00        perf-profile.self.cycles-pp.copy_page_from_iter_atomic
      0.00            +0.1        0.06 ± 32%  perf-profile.self.cycles-pp.free_unref_page_commit
      0.00            +0.1        0.06 ± 13%  perf-profile.self.cycles-pp.do_idle
      0.00            +0.1        0.06 ± 10%  perf-profile.self.cycles-pp.perf_rotate_context
      0.00            +0.1        0.06 ± 17%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.00            +0.1        0.07 ± 14%  perf-profile.self.cycles-pp.load_balance
      0.00            +0.1        0.07 ± 17%  perf-profile.self.cycles-pp.ct_kernel_enter
      0.00            +0.1        0.07 ± 25%  perf-profile.self.cycles-pp.rcu_sched_clock_irq
      0.00            +0.1        0.08 ± 15%  perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00            +0.1        0.08 ±  9%  perf-profile.self.cycles-pp.call_cpuidle
      0.00            +0.1        0.08 ± 13%  perf-profile.self.cycles-pp.error_entry
      0.00            +0.1        0.08 ± 21%  perf-profile.self.cycles-pp.__do_softirq
      0.00            +0.1        0.08 ± 24%  perf-profile.self.cycles-pp.uncharge_batch
      0.00            +0.1        0.08 ± 20%  perf-profile.self.cycles-pp.__page_cache_release
      0.00            +0.1        0.09 ±  9%  perf-profile.self.cycles-pp.hrtimer_interrupt
      0.00            +0.1        0.09 ± 15%  perf-profile.self.cycles-pp.x86_pmu_disable
      0.00            +0.1        0.09 ± 26%  perf-profile.self.cycles-pp.irqtime_account_irq
      0.00            +0.1        0.09 ± 10%  perf-profile.self.cycles-pp.__hrtimer_next_event_base
      0.00            +0.1        0.10 ± 26%  perf-profile.self.cycles-pp.delay_halt
      0.00            +0.1        0.10 ± 24%  perf-profile.self.cycles-pp.delete_from_page_cache_batch
      0.00            +0.1        0.11 ± 11%  perf-profile.self.cycles-pp.scheduler_tick
      0.00            +0.1        0.12 ± 12%  perf-profile.self.cycles-pp.workingset_update_node
      0.00            +0.1        0.13 ± 32%  perf-profile.self.cycles-pp.check_cpu_stall
      0.00            +0.1        0.13 ± 12%  perf-profile.self.cycles-pp.irqtime_account_process_tick
      0.00            +0.1        0.13 ± 12%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.00            +0.1        0.15 ± 71%  perf-profile.self.cycles-pp.fbcon_redraw
      0.00            +0.2        0.15 ± 67%  perf-profile.self.cycles-pp.calc_global_load_tick
      0.00            +0.2        0.15 ±  9%  perf-profile.self.cycles-pp.cpuidle_idle_call
      0.00            +0.2        0.16 ± 55%  perf-profile.self.cycles-pp.bit_putcs
      0.00            +0.2        0.17 ± 25%  perf-profile.self.cycles-pp.delay_halt_tpause
      0.00            +0.2        0.18 ± 13%  perf-profile.self.cycles-pp.rcu_pending
      0.00            +0.2        0.19 ± 16%  perf-profile.self.cycles-pp.list_lru_del_obj
      0.00            +0.2        0.21 ± 12%  perf-profile.self.cycles-pp.trigger_load_balance
      0.00            +0.2        0.22 ± 15%  perf-profile.self.cycles-pp.truncate_folio_batch_exceptionals
      0.00            +0.2        0.22 ± 16%  perf-profile.self.cycles-pp.update_irq_load_avg
      0.00            +0.2        0.23 ± 16%  perf-profile.self.cycles-pp.radix_tree_node_rcu_free
      0.00            +0.2        0.23 ± 30%  perf-profile.self.cycles-pp.tick_sched_do_timer
      0.00            +0.2        0.25 ± 15%  perf-profile.self.cycles-pp.get_slabinfo
      0.00            +0.2        0.25 ± 20%  perf-profile.self.cycles-pp.ct_kernel_exit_state
      0.00            +0.3        0.27 ± 16%  perf-profile.self.cycles-pp.xas_start
      0.00            +0.3        0.33 ± 54%  perf-profile.self.cycles-pp.tick_nohz_next_event
      0.00            +0.4        0.36 ± 10%  perf-profile.self.cycles-pp.native_apic_msr_eoi
      0.00            +0.4        0.38 ± 16%  perf-profile.self.cycles-pp.ifs_free
      0.00            +0.4        0.40 ± 10%  perf-profile.self.cycles-pp.native_sched_clock
      0.00            +0.4        0.43 ± 13%  perf-profile.self.cycles-pp.lapic_next_deadline
      0.00            +0.4        0.44 ± 13%  perf-profile.self.cycles-pp.mem_cgroup_from_slab_obj
      0.00            +0.4        0.45 ± 11%  perf-profile.self.cycles-pp.read_tsc
      0.00            +0.4        0.45 ± 22%  perf-profile.self.cycles-pp.__free_pages_ok
      0.00            +0.5        0.48 ± 23%  perf-profile.self.cycles-pp.page_counter_uncharge
      0.00            +0.5        0.52 ± 23%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.00            +0.5        0.52 ± 15%  perf-profile.self.cycles-pp.xas_load
      0.00            +0.5        0.53 ± 25%  perf-profile.self.cycles-pp.rcu_cblist_dequeue
      0.00            +0.6        0.59 ± 48%  perf-profile.self.cycles-pp.xas_descend
      0.00            +0.6        0.62 ±  8%  perf-profile.self.cycles-pp.menu_select
      0.00            +0.6        0.62 ± 14%  perf-profile.self.cycles-pp.xas_clear_mark
      0.00            +0.7        0.69 ± 25%  perf-profile.self.cycles-pp.folio_undo_large_rmappable
      0.00            +1.0        0.96 ± 62%  perf-profile.self.cycles-pp.fast_imageblit
      0.00            +1.1        1.10 ± 23%  perf-profile.self.cycles-pp.find_lock_entries
      0.13 ±264%      +1.2        1.38 ± 12%  perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
      0.30 ±175%      +1.3        1.62 ± 10%  perf-profile.self.cycles-pp.cpuidle_enter_state
      0.48 ±132%      +1.8        2.28 ± 16%  perf-profile.self.cycles-pp.__slab_free
      0.00            +2.8        2.85 ± 34%  perf-profile.self.cycles-pp.ktime_get
      0.30 ±175%      +3.1        3.40 ±  9%  perf-profile.self.cycles-pp.__intel_pmu_enable_all
      0.00            +3.2        3.19 ± 14%  perf-profile.self.cycles-pp.xas_store
      0.00            +3.5        3.54 ± 15%  perf-profile.self.cycles-pp.xas_find
      0.50 ±132%      +4.4        4.93 ±  8%  perf-profile.self.cycles-pp.intel_idle
      0.00            +5.0        5.01 ± 20%  perf-profile.self.cycles-pp.io_serial_out
      0.00            +5.3        5.34 ±  8%  perf-profile.self.cycles-pp.intel_idle_xstate
      0.00           +10.6       10.65 ± 12%  perf-profile.self.cycles-pp.io_serial_in
      0.00           +17.6       17.56 ± 12%  perf-profile.self.cycles-pp.memcpy_toio



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead]  ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-02-20  8:25 [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression kernel test robot
@ 2024-02-21 11:14 ` Jan Kara
  2024-02-22  1:32   ` Oliver Sang
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2024-02-21 11:14 UTC (permalink / raw)
  To: kernel test robot
  Cc: Jan Kara, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang,
	fengwei.yin

On Tue 20-02-24 16:25:37, kernel test robot wrote:
> Hello,
> 
> kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:
> 
> 
> commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> testcase: vm-scalability
> test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> parameters:
> 
> 	runtime: 300s
> 	test: lru-file-readtwice
> 	cpufreq_governor: performance

JFYI I had a look into this. What the test seems to do is that it creates
image files on tmpfs, loopmounts XFS there, and does reads over files on
XFS. But I was not able to find out what lru-file-readtwice exactly does,
nor was I able to reproduce it, because I got stuck on some missing Ruby
dependencies on my test system yesterday.

Given the workload is over tmpfs, I'm not very concerned about what
readahead does and how it performs, but I'd still like to investigate
where the regression is coming from, because it is unexpected.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead]  ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-02-21 11:14 ` Jan Kara
@ 2024-02-22  1:32   ` Oliver Sang
  2024-02-22 11:50     ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: Oliver Sang @ 2024-02-22  1:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Matthew Wilcox,
	Guo Xuenan, linux-fsdevel, ying.huang, feng.tang, fengwei.yin,
	oliver.sang

hi, Jan Kara,

On Wed, Feb 21, 2024 at 12:14:25PM +0100, Jan Kara wrote:
> On Tue 20-02-24 16:25:37, kernel test robot wrote:
> > Hello,
> > 
> > kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:
> > 
> > 
> > commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > testcase: vm-scalability
> > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> > parameters:
> > 
> > 	runtime: 300s
> > 	test: lru-file-readtwice
> > 	cpufreq_governor: performance
> 
> JFYI I had a look into this. What the test seems to do is that it creates
> image files on tmpfs, loopmounts XFS there, and does reads over files on
> XFS. But I was not able to find out what lru-file-readtwice exactly does,
> nor was I able to reproduce it, because I got stuck on some missing Ruby
> dependencies on my test system yesterday.

what's your OS?

> 
> Given the workload is over tmpfs, I'm not very concerned about what
> readahead does and how it performs, but I'd still like to investigate
> where the regression is coming from, because it is unexpected.

Thanks a lot for the information!
It was hard for me to determine the connection, so I rebuilt and reran the
tests more times, which still showed stable data.

If you have any patch you want us to try, please let us know.
It's always our great pleasure to provide support :)

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead]  ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-02-22  1:32   ` Oliver Sang
@ 2024-02-22 11:50     ` Jan Kara
  2024-02-22 18:37       ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2024-02-22 11:50 UTC (permalink / raw)
  To: Oliver Sang
  Cc: Jan Kara, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang,
	fengwei.yin

Hello,

On Thu 22-02-24 09:32:52, Oliver Sang wrote:
> On Wed, Feb 21, 2024 at 12:14:25PM +0100, Jan Kara wrote:
> > On Tue 20-02-24 16:25:37, kernel test robot wrote:
> > > kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:
> > > 
> > > commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > testcase: vm-scalability
> > > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> > > parameters:
> > > 
> > > 	runtime: 300s
> > > 	test: lru-file-readtwice
> > > 	cpufreq_governor: performance
> > 
> > JFYI I had a look into this. What the test seems to do is that it creates
> > image files on tmpfs, loopmounts XFS there, and does reads over files on
> > XFS. But I was not able to find out what lru-file-readtwice exactly does,
> > nor was I able to reproduce it, because I got stuck on some missing Ruby
> > dependencies on my test system yesterday.
> 
> what's your OS?

I have SLES15-SP4 installed in my VM. What was missing was the 'git'
rubygem, which apparently is not packaged at all, and when I manually
installed it I was still hitting other problems, so instead I went ahead
and checked the vm-scalability source and wrote my own reproducer based
on that.

I'm now able to reproduce the regression in my VM so I'm investigating...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead]  ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-02-22 11:50     ` Jan Kara
@ 2024-02-22 18:37       ` Jan Kara
  2024-03-04  4:59         ` Yujie Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2024-02-22 18:37 UTC (permalink / raw)
  To: Oliver Sang
  Cc: Jan Kara, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang,
	fengwei.yin

On Thu 22-02-24 12:50:32, Jan Kara wrote:
> On Thu 22-02-24 09:32:52, Oliver Sang wrote:
> > On Wed, Feb 21, 2024 at 12:14:25PM +0100, Jan Kara wrote:
> > > On Tue 20-02-24 16:25:37, kernel test robot wrote:
> > > > kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:
> > > > 
> > > > commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > testcase: vm-scalability
> > > > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> > > > parameters:
> > > > 
> > > > 	runtime: 300s
> > > > 	test: lru-file-readtwice
> > > > 	cpufreq_governor: performance
> > > 
> > > JFYI I had a look into this. What the test seems to do is that it creates
> > > image files on tmpfs, loopmounts XFS there, and does reads over files on
> > > XFS. But I was not able to find out what lru-file-readtwice exactly does,
> > > nor was I able to reproduce it, because I got stuck on some missing Ruby
> > > dependencies on my test system yesterday.
> > 
> > what's your OS?
> 
> I have SLES15-SP4 installed in my VM. What was missing was the 'git'
> rubygem, which apparently is not packaged at all, and when I manually
> installed it I was still hitting other problems, so instead I went ahead
> and checked the vm-scalability source and wrote my own reproducer based
> on that.
> 
> I'm now able to reproduce the regression in my VM so I'm investigating...

So I was experimenting with this. What the test does is create as many
files as there are CPUs, sized so that their total size is 8x the amount
of available RAM. For each file, two tasks are started which sequentially
read the file from start to end. A trivial repro from my VM with 8 CPUs
and 64GB of RAM looks like this:

truncate -s 60000000000 /dev/shm/xfsimg
mkfs.xfs /dev/shm/xfsimg
mount -t xfs -o loop /dev/shm/xfsimg /mnt
for (( i = 0; i < 8; i++ )); do truncate -s 60000000000 /mnt/sparse-file-$i; done
echo "Ready..."
sleep 3
echo "Running..."
for (( i = 0; i < 8; i++ )); do
	dd bs=4k if=/mnt/sparse-file-$i of=/dev/null &
	dd bs=4k if=/mnt/sparse-file-$i of=/dev/null &
done 2>&1 | grep "copied"
wait
umount /mnt

The difference between slow and fast runs seems to be in the number of
pages reclaimed with direct reclaim - after commit ab4443fe3c we reclaim
about 10% of pages with direct reclaim, while before commit ab4443fe3c
only about 1% of pages was reclaimed with direct reclaim. In both cases
we reclaim the same number of pages, corresponding to the total size of
the files, so it isn't the case that we would be rereading any page twice.
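
For reference, one way to see this split is with the standard /proc/vmstat
reclaim counters (this is just a generic observation method, not part of
the test itself), sampled before and after a run:

grep -E 'pgsteal_(kswapd|direct)' /proc/vmstat

The deltas of pgsteal_direct vs. pgsteal_kswapd across the run give the
direct/kswapd reclaim split described above.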

I suspect the reclaim difference is because after commit ab4443fe3c we
trigger readahead somewhat earlier, so our effective working set is
somewhat larger. This apparently gives kswapd a harder time and we end up
in direct reclaim more often.
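
To illustrate the "somewhat earlier" part with a toy calculation (plain
shell arithmetic with made-up numbers, assuming the fix effectively rounds
the PG_readahead mark down to a folio boundary instead of up; the real
logic lives in mm/readahead.c):

start=100 size=32 async=16 fpages=8   # window at page 100, 8-page folios
mark=$((start + size - async))
echo "unrounded mark: page $mark"
echo "rounded up:     page $(( (mark + fpages - 1) / fpages * fpages ))"
echo "rounded down:   page $(( mark / fpages * fpages ))"

With these numbers the mark moves from page 120 to page 112, one folio
earlier in the file, so a sequential reader crosses it (and kicks off the
next asynchronous readahead) sooner.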

Since this is a case of heavy overload on the system, I don't think the
throughput here matters that much, and AFAICT the readahead code does
nothing wrong here. So I don't think we need to do anything about it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead]  ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-02-22 18:37       ` Jan Kara
@ 2024-03-04  4:59         ` Yujie Liu
  2024-03-04  5:35           ` Yin, Fengwei
  0 siblings, 1 reply; 13+ messages in thread
From: Yujie Liu @ 2024-03-04  4:59 UTC (permalink / raw)
  To: Jan Kara
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang,
	fengwei.yin

Hi Honza,

On Thu, Feb 22, 2024 at 07:37:56PM +0100, Jan Kara wrote:
> On Thu 22-02-24 12:50:32, Jan Kara wrote:
> > On Thu 22-02-24 09:32:52, Oliver Sang wrote:
> > > On Wed, Feb 21, 2024 at 12:14:25PM +0100, Jan Kara wrote:
> > > > On Tue 20-02-24 16:25:37, kernel test robot wrote:
> > > > > kernel test robot noticed a -21.4% regression of vm-scalability.throughput on:
> > > > > 
> > > > > commit: ab4443fe3ca6298663a55c4a70efc6c3ce913ca6 ("readahead: avoid multiple marked readahead pages")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > > 
> > > > > testcase: vm-scalability
> > > > > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
> > > > > parameters:
> > > > > 
> > > > > 	runtime: 300s
> > > > > 	test: lru-file-readtwice
> > > > > 	cpufreq_governor: performance
> > > > 
> > > > JFYI I had a look into this. What the test seems to do is that it creates
> > > > image files on tmpfs, loopmounts XFS there, and does reads over files on
> > > > XFS. But I was not able to find out what lru-file-readtwice exactly does,
> > > > nor was I able to reproduce it, because I got stuck on some missing Ruby
> > > > dependencies on my test system yesterday.
> > > 
> > > what's your OS?
> > 
> > I have SLES15-SP4 installed in my VM. What was missing was the 'git'
> > rubygem, which apparently is not packaged at all, and when I manually
> > installed it I was still hitting other problems, so instead I went ahead
> > and checked the vm-scalability source and wrote my own reproducer based
> > on that.
> > 
> > I'm now able to reproduce the regression in my VM so I'm investigating...
> 
> So I was experimenting with this. What the test does is create as many
> files as there are CPUs, sized so that their total size is 8x the amount
> of available RAM. For each file, two tasks are started which sequentially
> read the file from start to end. A trivial repro from my VM with 8 CPUs
> and 64GB of RAM looks like this:
> 
> truncate -s 60000000000 /dev/shm/xfsimg
> mkfs.xfs /dev/shm/xfsimg
> mount -t xfs -o loop /dev/shm/xfsimg /mnt
> for (( i = 0; i < 8; i++ )); do truncate -s 60000000000 /mnt/sparse-file-$i; done
> echo "Ready..."
> sleep 3
> echo "Running..."
> for (( i = 0; i < 8; i++ )); do
> 	dd bs=4k if=/mnt/sparse-file-$i of=/dev/null &
> 	dd bs=4k if=/mnt/sparse-file-$i of=/dev/null &
> done 2>&1 | grep "copied"
> wait
> umount /mnt
> 
> The difference between slow and fast runs seems to be in the number of
> pages reclaimed with direct reclaim - after commit ab4443fe3c we reclaim
> about 10% of pages with direct reclaim, while before commit ab4443fe3c
> only about 1% of pages was reclaimed with direct reclaim. In both cases
> we reclaim the same number of pages, corresponding to the total size of
> the files, so it isn't the case that we would be rereading any page twice.
> 
> I suspect the reclaim difference is because after commit ab4443fe3c we
> trigger readahead somewhat earlier, so our effective working set is
> somewhat larger. This apparently gives kswapd a harder time and we end up
> in direct reclaim more often.
> 
> Since this is a case of heavy overload on the system, I don't think the
> throughput here matters that much, and AFAICT the readahead code does
> nothing wrong here. So I don't think we need to do anything about it.

Thanks a lot for the analysis. It seems we can abstract two factors that
may affect the throughput:

1. The benchmark itself is "dd" from a file to /dev/null, which is
basically a sequential operation, so the earlier readahead should benefit
the throughput.

2. The earlier readahead somewhat enlarges the working set and causes
direct memory reclaim more often, which may hurt the throughput.

We did another round of testing. Our machine has 512GB of RAM; this time
we set the total file size to 256GB so that all the files can be fully
loaded into memory and there is no reclaim at all. This eliminates the
impact of factor 2, but unexpectedly, we still see a -42.3% throughput
regression after commit ab4443fe3c.

From the perf profile, we can see that contention on the folio LRU lock
becomes more intense. We also did a simple one-file "dd" test. It looks
like low-order folios are more likely to be allocated after commit
ab4443fe3c (Fengwei will help provide the data soon). Therefore, the
average folio size decreases while the total number of folios increases,
which leads to taking the LRU lock more often.
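
As a back-of-the-envelope sketch of that effect (plain shell arithmetic
with made-up folio sizes; the batch size of 15 is our assumption of the
kernel's folio batch capacity), the LRU lock is taken roughly once per
batch of folios, so for the same 256GB of data the number of acquisitions
grows as the average folio size shrinks:

total_kb=$((256 * 1024 * 1024))  # 256GB of file data, in KB
batch=15                         # folios moved per lock acquisition
for kb in 4 16 64; do
	folios=$((total_kb / kb))
	echo "${kb}KB folios: $folios folios, ~$((folios / batch)) LRU lock acquisitions"
done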

Please check the detailed metrics below:

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/cpufreq_governor/debug-setup:
  lkp-spr-2sp4/vm-scalability/debian-11.1-x86_64-20220510.cgz/x86_64-rhel-8.3/gcc-12/300s/lru-file-readtwice/performance/256GB-perf

commit:
  f0b7a0d1d466 ("Merge branch 'master' into mm-hotfixes-stable")
  ab4443fe3ca6 ("readahead: avoid multiple marked readahead pages")

f0b7a0d1d46625db ab4443fe3ca6298663a55c4a70e
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.00 ± 49%      -0.0        0.00        mpstat.cpu.all.iowait%
      8.43 ±  2%      +3.6       12.06 ±  5%  mpstat.cpu.all.sys%
      0.31            -0.0        0.27 ±  2%  mpstat.cpu.all.usr%
   2289863 ±  8%     +55.6%    3563274 ±  7%  numa-numastat.node0.local_node
   2375395 ±  6%     +54.4%    3666799 ±  6%  numa-numastat.node0.numa_hit
   2311189 ±  7%     +53.8%    3554903 ±  6%  numa-numastat.node1.local_node
   2454386 ±  6%     +50.1%    3684288 ±  4%  numa-numastat.node1.numa_hit
    300.98           +25.2%     376.84 ±  4%  vmstat.memory.buff
  46333305           +27.5%   59075372 ±  3%  vmstat.memory.cache
     25.22 ±  4%     +51.4%      38.18 ±  6%  vmstat.procs.r
    303089            +8.0%     327220        vmstat.system.in
     29.30           +13.5%      33.27        time.elapsed_time
     29.30           +13.5%      33.27        time.elapsed_time.max
     33780 ± 16%     +94.4%      65660 ±  7%  time.involuntary_context_switches
      1943 ±  2%     +42.4%       2767 ±  5%  time.percent_of_cpu_this_job_got
    554.77 ±  3%     +63.5%     907.13 ±  6%  time.system_time
     14.90            -3.3%      14.40        time.user_time
     20505 ± 11%     -41.1%      12085 ±  8%  time.voluntary_context_switches
    284.00 ±  3%     +34.8%     382.75 ±  5%  turbostat.Avg_MHz
     10.08 ±  2%      +3.6       13.65 ±  4%  turbostat.Busy%
     39.50 ±  2%      -1.8       37.68        turbostat.C1E%
      0.38 ±  9%     -17.3%       0.31 ± 14%  turbostat.CPU%c6
   9577640           +22.3%   11715251 ±  2%  turbostat.IRQ
      4.88 ± 12%      -3.7        1.15 ± 48%  turbostat.PKG_%
      5558 ±  5%     +41.9%       7887 ±  6%  turbostat.POLL
    790616 ±  6%     -43.9%     443300 ±  7%  vm-scalability.median
     12060 ±  7%   +3811.3       15871 ±  4%  vm-scalability.stddev%
 3.681e+08 ±  7%     -42.3%  2.122e+08 ±  7%  vm-scalability.throughput
     33780 ± 16%     +94.4%      65660 ±  7%  vm-scalability.time.involuntary_context_switches
      1943 ±  2%     +42.4%       2767 ±  5%  vm-scalability.time.percent_of_cpu_this_job_got
    554.77 ±  3%     +63.5%     907.13 ±  6%  vm-scalability.time.system_time
     20505 ± 11%     -41.1%      12085 ±  8%  vm-scalability.time.voluntary_context_switches
  21390979 ±  4%     +31.7%   28175360 ± 19%  numa-meminfo.node0.Active
  21388266 ±  4%     +31.7%   28172516 ± 19%  numa-meminfo.node0.Active(file)
  24037883 ±  6%     +31.1%   31516721 ± 17%  numa-meminfo.node0.FilePages
    497645 ± 25%     +82.4%     907626 ± 38%  numa-meminfo.node0.Inactive(file)
  25952309 ±  6%     +29.2%   33533454 ± 16%  numa-meminfo.node0.MemUsed
     20138 ±  9%    +154.2%      51187 ± 11%  numa-meminfo.node1.Active(anon)
    704324 ± 17%     +85.4%    1306147 ± 33%  numa-meminfo.node1.Inactive
    427031 ± 22%    +141.7%    1031971 ± 41%  numa-meminfo.node1.Inactive(file)
  43712836           +27.4%   55698257 ±  2%  meminfo.Active
     22786 ±  6%    +136.6%      53907 ± 11%  meminfo.Active(anon)
  43690049           +27.4%   55644350 ±  2%  meminfo.Active(file)
  47543418           +27.4%   60583554 ±  2%  meminfo.Cached
   1454581 ± 10%     +72.8%    2513041 ± 11%  meminfo.Inactive
    929099 ± 16%    +109.5%    1946433 ± 14%  meminfo.Inactive(file)
    242993           +12.9%     274324        meminfo.KReclaimable
     79132 ±  2%     +34.8%     106631 ±  2%  meminfo.Mapped
  51363725           +25.6%   64520957 ±  2%  meminfo.Memused
      9840           +12.2%      11041 ±  2%  meminfo.PageTables
    242993           +12.9%     274324        meminfo.SReclaimable
    136679           +50.2%     205224 ±  5%  meminfo.Shmem
  72281513 ±  2%     +25.8%   90925817 ±  2%  meminfo.max_used_kB
   5346609 ±  4%     +31.7%    7042196 ± 19%  numa-vmstat.node0.nr_active_file
   6008637 ±  7%     +31.1%    7878524 ± 17%  numa-vmstat.node0.nr_file_pages
    123918 ± 25%     +83.2%     227064 ± 38%  numa-vmstat.node0.nr_inactive_file
   5346510 ±  4%     +31.7%    7042147 ± 19%  numa-vmstat.node0.nr_zone_active_file
    123908 ± 25%     +83.3%     227063 ± 38%  numa-vmstat.node0.nr_zone_inactive_file
   2375271 ±  6%     +54.4%    3666818 ±  6%  numa-vmstat.node0.numa_hit
   2289740 ±  8%     +55.6%    3563294 ±  7%  numa-vmstat.node0.numa_local
      5043 ±  9%    +153.9%      12803 ± 11%  numa-vmstat.node1.nr_active_anon
    106576 ± 22%    +141.7%     257597 ± 41%  numa-vmstat.node1.nr_inactive_file
      5043 ±  9%    +153.9%      12803 ± 11%  numa-vmstat.node1.nr_zone_active_anon
    106574 ± 22%    +141.7%     257604 ± 41%  numa-vmstat.node1.nr_zone_inactive_file
   2454493 ±  6%     +50.1%    3684201 ±  4%  numa-vmstat.node1.numa_hit
   2311296 ±  7%     +53.8%    3554816 ±  6%  numa-vmstat.node1.numa_local
      5701 ±  6%    +136.5%      13486 ± 11%  proc-vmstat.nr_active_anon
  10923519           +27.3%   13904109 ±  2%  proc-vmstat.nr_active_file
  11886157           +27.4%   15138396 ±  2%  proc-vmstat.nr_file_pages
  1.19e+08            -2.8%  1.157e+08        proc-vmstat.nr_free_pages
    131227            +8.1%     141868        proc-vmstat.nr_inactive_anon
    231610 ± 16%    +109.7%     485756 ± 14%  proc-vmstat.nr_inactive_file
     19793 ±  2%     +34.7%      26668 ±  2%  proc-vmstat.nr_mapped
      2455           +12.3%       2758 ±  2%  proc-vmstat.nr_page_table_pages
     34038 ±  2%     +51.4%      51526 ±  5%  proc-vmstat.nr_shmem
     60753           +12.9%      68588        proc-vmstat.nr_slab_reclaimable
    113209            +5.9%     119837        proc-vmstat.nr_slab_unreclaimable
      5701 ±  6%    +136.5%      13486 ± 11%  proc-vmstat.nr_zone_active_anon
  10923517           +27.3%   13904109 ±  2%  proc-vmstat.nr_zone_active_file
    131227            +8.1%     141868        proc-vmstat.nr_zone_inactive_anon
    231612 ± 16%    +109.7%     485757 ± 14%  proc-vmstat.nr_zone_inactive_file
    162.75 ± 79%    +552.8%       1062 ± 72%  proc-vmstat.numa_hint_faults
   4831171 ±  4%     +52.2%    7352661 ±  4%  proc-vmstat.numa_hit
   4602441 ±  5%     +54.7%    7119707 ±  4%  proc-vmstat.numa_local
    128.75 ± 59%    +527.5%     807.88 ± 31%  proc-vmstat.numa_pages_migrated
  69656618            -1.5%   68615309        proc-vmstat.pgalloc_normal
    672926            +3.0%     692907        proc-vmstat.pgfault
    128.75 ± 59%    +527.5%     807.88 ± 31%  proc-vmstat.pgmigrate_success
     31089            +3.7%      32235        proc-vmstat.pgreuse
      0.77 ±  2%      -0.0        0.74 ±  2%  perf-stat.i.branch-miss-rate%
     23.58 ±  6%      +3.6       27.18 ±  4%  perf-stat.i.cache-miss-rate%
      2.74            +6.0%       2.90        perf-stat.i.cpi
 5.887e+10 ±  7%     +28.6%  7.572e+10 ± 10%  perf-stat.i.cpu-cycles
     10194 ±  3%      -9.5%       9226 ±  4%  perf-stat.i.cycles-between-cache-misses
      0.44            -2.7%       0.43        perf-stat.i.ipc
      0.25 ± 11%     +29.9%       0.32 ± 11%  perf-stat.i.metric.GHz
     17995 ±  2%      -9.0%      16374 ±  3%  perf-stat.i.minor-faults
     17995 ±  2%      -9.0%      16374 ±  3%  perf-stat.i.page-faults
     17.09           -16.1%      14.34 ±  2%  perf-stat.overall.MPKI
      0.32            -0.0        0.29        perf-stat.overall.branch-miss-rate%
     82.93            -2.1       80.88        perf-stat.overall.cache-miss-rate%
      3.55 ±  2%     +28.9%       4.58 ±  3%  perf-stat.overall.cpi
    207.81 ±  2%     +53.7%     319.49 ±  5%  perf-stat.overall.cycles-between-cache-misses
      0.01 ±  4%      +0.0        0.01 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
      0.01 ±  3%      +0.0        0.01 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      0.28 ±  2%     -22.3%       0.22 ±  3%  perf-stat.overall.ipc
    967.32           +21.5%       1175 ±  2%  perf-stat.overall.path-length
 3.648e+09            +9.0%  3.976e+09        perf-stat.ps.branch-instructions
 2.987e+08           -10.6%   2.67e+08        perf-stat.ps.cache-misses
 3.602e+08            -8.4%  3.301e+08        perf-stat.ps.cache-references
 6.207e+10 ±  2%     +37.3%  8.524e+10 ±  4%  perf-stat.ps.cpu-cycles
    356765 ±  4%     +14.6%     408833 ±  4%  perf-stat.ps.dTLB-load-misses
 4.786e+09            +5.2%  5.034e+09        perf-stat.ps.dTLB-loads
    222451 ±  2%      +6.7%     237255 ±  2%  perf-stat.ps.dTLB-store-misses
 2.207e+09            -7.4%  2.043e+09        perf-stat.ps.dTLB-stores
 1.748e+10            +6.5%  1.862e+10        perf-stat.ps.instructions
     17777            -9.3%      16117 ±  2%  perf-stat.ps.minor-faults
     17778            -9.3%      16118 ±  2%  perf-stat.ps.page-faults
 5.193e+11           +21.5%   6.31e+11 ±  2%  perf-stat.total.instructions
     12.70            -7.9        4.85 ± 38%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.xfs_file_buffered_read.xfs_file_read_iter.vfs_read
     12.53            -7.8        4.76 ± 38%  perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.xfs_file_buffered_read.xfs_file_read_iter
      8.68            -5.2        3.46 ± 38%  perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_order.filemap_get_pages.filemap_read.xfs_file_buffered_read
      8.13            -4.7        3.38 ±  9%  perf-profile.calltrace.cycles-pp.zero_user_segments.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_order
      8.67            -4.7        3.93 ±  8%  perf-profile.calltrace.cycles-pp.iomap_readahead.read_pages.page_cache_ra_order.filemap_get_pages.filemap_read
      8.51            -4.7        3.81 ±  8%  perf-profile.calltrace.cycles-pp.iomap_readpage_iter.iomap_readahead.read_pages.page_cache_ra_order.filemap_get_pages
      7.84            -4.6        3.28 ±  8%  perf-profile.calltrace.cycles-pp.__memset.zero_user_segments.iomap_readpage_iter.iomap_readahead.read_pages
      6.47 ±  2%      -2.1        4.39 ±  5%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      6.44 ±  2%      -2.1        4.36 ±  5%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.44 ±  2%      -2.1        4.36 ±  5%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      6.43 ±  2%      -2.1        4.36 ±  5%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.39 ±  2%      -2.1        4.33 ±  5%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.08 ±  2%      -2.0        4.11 ±  5%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      5.85 ±  2%      -1.9        3.96 ±  5%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      3.96 ±  2%      -1.3        2.62 ±  6%  perf-profile.calltrace.cycles-pp.write
      3.50 ±  2%      -1.1        2.36 ±  6%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      3.28 ±  2%      -1.1        2.22 ±  6%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      2.76 ±  3%      -0.9        1.86 ±  6%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
      2.63 ±  3%      -0.8        1.79 ±  6%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      2.37 ±  2%      -0.8        1.57 ±  6%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      2.30 ±  2%      -0.8        1.52 ±  6%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      2.34 ±  4%      -0.7        1.61 ±  7%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      1.91 ±  3%      -0.7        1.26 ±  7%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      2.04 ±  4%      -0.6        1.43 ±  7%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      1.32 ±  2%      -0.6        0.71 ± 38%  perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.xfs_file_buffered_read.xfs_file_read_iter
      1.68 ±  4%      -0.6        1.09 ±  8%  perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      1.48 ±  3%      -0.5        0.98 ±  6%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
      1.47 ±  3%      -0.5        0.98 ±  6%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
      1.29 ±  3%      -0.4        0.87 ±  5%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.calltrace.cycles-pp.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.calltrace.cycles-pp.drm_fb_memcpy.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.calltrace.cycles-pp.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit_tail_rpm.ast_mode_config_helper_atomic_commit_tail.commit_tail.drm_atomic_helper_commit.drm_atomic_commit
      1.43 ± 11%      -0.4        1.04 ± 38%  perf-profile.calltrace.cycles-pp.memcpy_toio.drm_fb_memcpy.ast_primary_plane_helper_atomic_update.drm_atomic_helper_commit_planes.drm_atomic_helper_commit_tail_rpm
      1.13 ±  3%      -0.4        0.76 ±  6%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.02            -0.3        0.70 ±  5%  perf-profile.calltrace.cycles-pp.intel_idle_xstate.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.94 ±  3%      -0.3        0.65 ±  5%  perf-profile.calltrace.cycles-pp.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler
      0.92 ±  3%      -0.3        0.63 ±  5%  perf-profile.calltrace.cycles-pp.perf_adjust_freq_unthr_context.perf_event_task_tick.scheduler_tick.update_process_times.tick_sched_handle
      1.58 ± 12%      -0.3        1.29 ±  3%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.74 ±  4%      -0.3        0.46 ± 38%  perf-profile.calltrace.cycles-pp.__filemap_add_folio.filemap_add_folio.page_cache_ra_order.filemap_get_pages.filemap_read
      1.56 ± 12%      -0.3        1.29 ±  4%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1.55 ± 12%      -0.3        1.28 ±  3%  perf-profile.calltrace.cycles-pp.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread.ret_from_fork
      1.55 ± 12%      -0.3        1.28 ±  3%  perf-profile.calltrace.cycles-pp.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work.worker_thread.kthread
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.calltrace.cycles-pp.commit_tail.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.calltrace.cycles-pp.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_commit.drm_atomic_commit.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work
      1.47 ± 11%      -0.2        1.22 ±  3%  perf-profile.calltrace.cycles-pp.drm_atomic_helper_dirtyfb.drm_fbdev_generic_helper_fb_dirty.drm_fb_helper_damage_work.process_one_work.worker_thread
      1.03 ±  7%      -0.2        0.82 ±  8%  perf-profile.calltrace.cycles-pp.devkmsg_emit.devkmsg_write.vfs_write.ksys_write.do_syscall_64
      1.03 ±  7%      -0.2        0.82 ±  8%  perf-profile.calltrace.cycles-pp.devkmsg_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.03 ±  7%      -0.2        0.82 ±  8%  perf-profile.calltrace.cycles-pp.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write.ksys_write
      1.02 ±  7%      -0.2        0.82 ±  8%  perf-profile.calltrace.cycles-pp.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write
      1.02 ±  7%      -0.2        0.82 ±  8%  perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write
      0.61 ±  5%      -0.1        0.55 ±  4%  perf-profile.calltrace.cycles-pp.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlinkat.do_syscall_64
     86.32            +4.1       90.42        perf-profile.calltrace.cycles-pp.read
     85.08            +4.6       89.66        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
     84.96            +4.6       89.58        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     84.63            +4.7       89.38        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     84.36            +4.9       89.21        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     26.79            +9.3       36.06 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_activate
     26.94            +9.3       36.22 ±  2%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_activate.folio_mark_accessed.filemap_read
     26.87            +9.3       36.17 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_activate.folio_mark_accessed
     26.91           +10.9       37.78 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru
     27.00           +10.9       37.89 ±  2%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.filemap_add_folio.page_cache_ra_order
     26.99           +10.9       37.89 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.filemap_add_folio
     27.44           +10.9       38.36 ±  2%  perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru.filemap_add_folio.page_cache_ra_order.filemap_get_pages
     27.47           +10.9       38.39 ±  2%  perf-profile.calltrace.cycles-pp.folio_add_lru.filemap_add_folio.page_cache_ra_order.filemap_get_pages.filemap_read
     12.72            -7.2        5.56 ±  7%  perf-profile.children.cycles-pp.copy_page_to_iter
     12.56            -7.1        5.46 ±  7%  perf-profile.children.cycles-pp._copy_to_iter
      8.80            -4.9        3.95 ±  8%  perf-profile.children.cycles-pp.read_pages
      8.78            -4.8        3.94 ±  8%  perf-profile.children.cycles-pp.iomap_readahead
      8.62            -4.8        3.83 ±  8%  perf-profile.children.cycles-pp.iomap_readpage_iter
      8.15            -4.8        3.39 ±  9%  perf-profile.children.cycles-pp.zero_user_segments
      8.07            -4.7        3.36 ±  9%  perf-profile.children.cycles-pp.__memset
      6.47 ±  2%      -2.1        4.39 ±  5%  perf-profile.children.cycles-pp.cpu_startup_entry
      6.47 ±  2%      -2.1        4.39 ±  5%  perf-profile.children.cycles-pp.do_idle
      6.47 ±  2%      -2.1        4.39 ±  5%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      6.44 ±  2%      -2.1        4.36 ±  5%  perf-profile.children.cycles-pp.start_secondary
      6.42 ±  2%      -2.1        4.36 ±  5%  perf-profile.children.cycles-pp.cpuidle_idle_call
      6.10 ±  2%      -2.0        4.13 ±  5%  perf-profile.children.cycles-pp.cpuidle_enter
      6.10 ±  2%      -2.0        4.13 ±  5%  perf-profile.children.cycles-pp.cpuidle_enter_state
      4.53 ±  2%      -1.5        2.98 ±  6%  perf-profile.children.cycles-pp.write
      4.34 ±  2%      -1.4        2.90 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.93 ±  2%      -1.3        2.66 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      2.96 ±  2%      -1.0        1.99 ±  5%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.89 ±  2%      -1.0        1.94 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
      2.46 ±  3%      -0.8        1.64 ±  6%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      2.47 ±  3%      -0.8        1.72 ±  7%  perf-profile.children.cycles-pp.ksys_write
      2.18 ±  3%      -0.7        1.43 ±  7%  perf-profile.children.cycles-pp.tick_nohz_highres_handler
      1.96 ±  2%      -0.7        1.31 ±  5%  perf-profile.children.cycles-pp.tick_sched_handle
      1.96 ±  2%      -0.7        1.30 ±  5%  perf-profile.children.cycles-pp.update_process_times
      2.20 ±  3%      -0.6        1.55 ±  7%  perf-profile.children.cycles-pp.vfs_write
      1.74 ±  2%      -0.6        1.17 ±  5%  perf-profile.children.cycles-pp.scheduler_tick
      1.35 ±  2%      -0.5        0.82 ±  5%  perf-profile.children.cycles-pp.filemap_get_read_batch
      1.38 ±  2%      -0.5        0.88 ±  6%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.48 ± 17%      -0.4        0.05 ± 42%  perf-profile.children.cycles-pp.page_cache_ra_unbounded
      0.69 ±  7%      -0.4        0.27 ± 39%  perf-profile.children.cycles-pp.xfs_ilock
      0.82 ±  4%      -0.4        0.40 ±  7%  perf-profile.children.cycles-pp.touch_atime
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.children.cycles-pp.ast_primary_plane_helper_atomic_update
      1.46 ± 11%      -0.4        1.07 ± 38%  perf-profile.children.cycles-pp.ast_mode_config_helper_atomic_commit_tail
      0.77 ±  5%      -0.4        0.38 ±  7%  perf-profile.children.cycles-pp.atime_needs_update
      1.22 ±  2%      -0.4        0.84 ±  5%  perf-profile.children.cycles-pp.perf_event_task_tick
      1.21 ±  2%      -0.4        0.83 ±  5%  perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
      1.14 ±  3%      -0.4        0.76 ±  6%  perf-profile.children.cycles-pp.intel_idle
      0.65 ±  8%      -0.4        0.28 ±  9%  perf-profile.children.cycles-pp.down_read
      1.02 ±  2%      -0.3        0.70 ±  5%  perf-profile.children.cycles-pp.intel_idle_xstate
      0.79 ±  2%      -0.3        0.49 ±  5%  perf-profile.children.cycles-pp.rw_verify_area
      1.58 ± 12%      -0.3        1.29 ±  3%  perf-profile.children.cycles-pp.worker_thread
      1.56 ± 12%      -0.3        1.29 ±  4%  perf-profile.children.cycles-pp.process_one_work
      1.55 ± 12%      -0.3        1.28 ±  3%  perf-profile.children.cycles-pp.drm_fb_helper_damage_work
      1.55 ± 12%      -0.3        1.28 ±  3%  perf-profile.children.cycles-pp.drm_fbdev_generic_helper_fb_dirty
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.children.cycles-pp.ret_from_fork_asm
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.children.cycles-pp.ret_from_fork
      1.65 ± 11%      -0.3        1.38 ±  3%  perf-profile.children.cycles-pp.kthread
      0.68 ±  2%      -0.3        0.43 ±  7%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_fb_memcpy
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.memcpy_toio
      0.77 ±  4%      -0.2        0.52 ±  4%  perf-profile.children.cycles-pp.__filemap_add_folio
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.commit_tail
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_atomic_commit
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_atomic_helper_commit
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_atomic_helper_commit_planes
      1.46 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_atomic_helper_commit_tail_rpm
      1.47 ± 11%      -0.2        1.22 ±  3%  perf-profile.children.cycles-pp.drm_atomic_helper_dirtyfb
      0.62 ±  3%      -0.2        0.39 ±  5%  perf-profile.children.cycles-pp.xas_load
      0.61 ±  2%      -0.2        0.37 ±  5%  perf-profile.children.cycles-pp.security_file_permission
      0.76 ±  3%      -0.2        0.53 ±  5%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
      0.41 ±  5%      -0.2        0.20 ± 38%  perf-profile.children.cycles-pp.xfs_iunlock
      1.03 ±  7%      -0.2        0.82 ±  8%  perf-profile.children.cycles-pp.devkmsg_emit
      1.03 ±  7%      -0.2        0.82 ±  8%  perf-profile.children.cycles-pp.devkmsg_write
      1.03 ±  7%      -0.2        0.83 ±  8%  perf-profile.children.cycles-pp.console_flush_all
      1.03 ±  7%      -0.2        0.83 ±  8%  perf-profile.children.cycles-pp.console_unlock
      1.04 ±  7%      -0.2        0.84 ±  8%  perf-profile.children.cycles-pp.vprintk_emit
      0.62 ±  3%      -0.2        0.42 ±  6%  perf-profile.children.cycles-pp.irq_exit_rcu
      0.60 ±  2%      -0.2        0.41 ±  5%  perf-profile.children.cycles-pp.__do_softirq
      0.52 ±  3%      -0.2        0.33 ±  7%  perf-profile.children.cycles-pp.folio_alloc
      0.45 ±  2%      -0.2        0.27 ±  5%  perf-profile.children.cycles-pp.apparmor_file_permission
      0.33 ±  6%      -0.2        0.16 ±  5%  perf-profile.children.cycles-pp.up_read
      0.38 ±  4%      -0.1        0.23 ±  8%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.45 ±  3%      -0.1        0.31 ±  6%  perf-profile.children.cycles-pp.rebalance_domains
      0.34 ±  3%      -0.1        0.20 ±  6%  perf-profile.children.cycles-pp.__fdget_pos
      0.40 ±  4%      -0.1        0.27 ±  7%  perf-profile.children.cycles-pp.__alloc_pages
      0.33 ±  3%      -0.1        0.19 ±  6%  perf-profile.children.cycles-pp.xas_descend
      0.41 ±  3%      -0.1        0.27 ±  7%  perf-profile.children.cycles-pp.alloc_pages_mpol
      0.29 ±  6%      -0.1        0.16 ±  8%  perf-profile.children.cycles-pp.__mem_cgroup_charge
      0.38 ±  3%      -0.1        0.25 ±  7%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.34 ±  2%      -0.1        0.21 ±  6%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.22 ±  7%      -0.1        0.10 ± 12%  perf-profile.children.cycles-pp.try_charge_memcg
      0.25 ±  3%      -0.1        0.14 ±  5%  perf-profile.children.cycles-pp.xas_store
      0.31 ±  3%      -0.1        0.22 ±  6%  perf-profile.children.cycles-pp._raw_spin_trylock
      0.20 ±  4%      -0.1        0.11 ±  7%  perf-profile.children.cycles-pp.__free_pages_ok
      0.23 ±  5%      -0.1        0.14 ±  7%  perf-profile.children.cycles-pp.rmqueue
      0.22 ±  4%      -0.1        0.13 ±  8%  perf-profile.children.cycles-pp.current_time
      0.18 ±  6%      -0.1        0.10 ±  7%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.38 ± 15%      -0.1        0.29 ± 11%  perf-profile.children.cycles-pp.ktime_get
      0.16 ±  8%      -0.1        0.08 ±  9%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.25 ±  6%      -0.1        0.17 ±  9%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.18 ±  3%      -0.1        0.10 ±  6%  perf-profile.children.cycles-pp.__x64_sys_execve
      0.18 ±  3%      -0.1        0.10 ±  6%  perf-profile.children.cycles-pp.do_execveat_common
      0.18 ±  3%      -0.1        0.10 ±  6%  perf-profile.children.cycles-pp.execve
      0.28 ± 21%      -0.1        0.20 ± 13%  perf-profile.children.cycles-pp.tick_irq_enter
      0.17 ±  4%      -0.1        0.10 ±  5%  perf-profile.children.cycles-pp.__mmput
      0.17 ±  4%      -0.1        0.10 ±  5%  perf-profile.children.cycles-pp.exit_mmap
      0.25            -0.1        0.18 ±  7%  perf-profile.children.cycles-pp.menu_select
      0.18 ±  3%      -0.1        0.11 ±  6%  perf-profile.children.cycles-pp.aa_file_perm
      0.28 ± 20%      -0.1        0.21 ± 14%  perf-profile.children.cycles-pp.irq_enter_rcu
      0.13 ±  4%      -0.1        0.06 ±  8%  perf-profile.children.cycles-pp.xas_create
      0.20 ±  4%      -0.1        0.13 ±  7%  perf-profile.children.cycles-pp.__mod_node_page_state
      0.21 ±  4%      -0.1        0.14 ±  5%  perf-profile.children.cycles-pp.load_balance
      0.20 ±  6%      -0.1        0.14 ±  3%  perf-profile.children.cycles-pp.xas_start
      0.21 ±  4%      -0.1        0.14 ±  6%  perf-profile.children.cycles-pp.__mod_lruvec_state
      0.11 ±  3%      -0.1        0.04 ± 38%  perf-profile.children.cycles-pp.kmem_cache_alloc_lru
      0.11 ±  4%      -0.1        0.04 ± 38%  perf-profile.children.cycles-pp.xas_alloc
      0.12 ±  2%      -0.1        0.06 ±  8%  perf-profile.children.cycles-pp.folio_prep_large_rmappable
      0.18 ±  4%      -0.1        0.12 ±  6%  perf-profile.children.cycles-pp.__cond_resched
      0.15 ±  5%      -0.1        0.09 ±  4%  perf-profile.children.cycles-pp.bprm_execve
      0.61 ±  5%      -0.1        0.55 ±  4%  perf-profile.children.cycles-pp.truncate_inode_pages_range
      0.13 ±  5%      -0.1        0.08 ±  5%  perf-profile.children.cycles-pp.exec_binprm
      0.13 ±  5%      -0.1        0.08 ±  5%  perf-profile.children.cycles-pp.load_elf_binary
      0.13 ±  5%      -0.1        0.08 ±  5%  perf-profile.children.cycles-pp.search_binary_handler
      0.08 ±  6%      -0.1        0.02 ±100%  perf-profile.children.cycles-pp.begin_new_exec
      0.14 ± 11%      -0.1        0.08 ±  8%  perf-profile.children.cycles-pp.arch_scale_freq_tick
      0.12 ±  4%      -0.1        0.07 ±  7%  perf-profile.children.cycles-pp.lru_add_drain
      0.10 ±  5%      -0.1        0.04 ± 37%  perf-profile.children.cycles-pp.__xas_next
      0.12 ±  5%      -0.1        0.06 ±  7%  perf-profile.children.cycles-pp.lru_add_drain_cpu
      0.15 ±  6%      -0.1        0.10 ±  5%  perf-profile.children.cycles-pp.update_sd_lb_stats
      0.12 ±  2%      -0.1        0.07 ±  7%  perf-profile.children.cycles-pp.asm_exc_page_fault
      0.15 ±  4%      -0.1        0.10 ±  7%  perf-profile.children.cycles-pp.find_busiest_group
      0.31 ±  5%      -0.0        0.26 ±  6%  perf-profile.children.cycles-pp.workingset_activation
      0.12 ±  4%      -0.0        0.07 ±  4%  perf-profile.children.cycles-pp.do_exit
      0.12 ±  3%      -0.0        0.07 ±  4%  perf-profile.children.cycles-pp.__x64_sys_exit_group
      0.12 ±  3%      -0.0        0.07 ±  4%  perf-profile.children.cycles-pp.do_group_exit
      0.13 ±  5%      -0.0        0.09 ±  5%  perf-profile.children.cycles-pp.update_sg_lb_stats
      0.10 ±  5%      -0.0        0.06 ±  9%  perf-profile.children.cycles-pp.do_vmi_munmap
      0.10 ±  4%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.do_vmi_align_munmap
      0.10 ±  5%      -0.0        0.05 ± 38%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.15 ±  4%      -0.0        0.10 ±  9%  perf-profile.children.cycles-pp._raw_spin_lock
      0.35 ±  2%      -0.0        0.30 ±  3%  perf-profile.children.cycles-pp.folio_activate_fn
      0.11 ±  6%      -0.0        0.07 ±  7%  perf-profile.children.cycles-pp.__schedule
      0.11 ±  4%      -0.0        0.06 ± 10%  perf-profile.children.cycles-pp.do_user_addr_fault
      0.11 ±  4%      -0.0        0.06 ± 10%  perf-profile.children.cycles-pp.exc_page_fault
      0.08 ±  4%      -0.0        0.04 ± 57%  perf-profile.children.cycles-pp.unmap_region
      0.10 ±  4%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.exit_mm
      0.10 ±  3%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.handle_mm_fault
      0.15 ±  5%      -0.0        0.11 ± 14%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      0.15 ±  4%      -0.0        0.11 ±  6%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.10 ±  3%      -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.10 ±  6%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
      0.10 ±  5%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.vm_mmap_pgoff
      0.10 ±  4%      -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.mmap_region
      0.15 ±  6%      -0.0        0.11 ±  7%  perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
      0.08 ±  4%      -0.0        0.04 ± 57%  perf-profile.children.cycles-pp.rcu_core
      0.10 ±  4%      -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.tlb_finish_mmu
      0.10 ±  4%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.do_mmap
      0.10 ±  5%      -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.__handle_mm_fault
      0.16 ± 14%      -0.0        0.12 ± 23%  perf-profile.children.cycles-pp.vt_console_print
      0.15 ± 13%      -0.0        0.12 ± 23%  perf-profile.children.cycles-pp.con_scroll
      0.15 ± 13%      -0.0        0.11 ± 24%  perf-profile.children.cycles-pp.fbcon_redraw
      0.15 ± 13%      -0.0        0.12 ± 23%  perf-profile.children.cycles-pp.fbcon_scroll
      0.15 ± 13%      -0.0        0.12 ± 23%  perf-profile.children.cycles-pp.lf
      0.11 ±  5%      -0.0        0.08 ±  6%  perf-profile.children.cycles-pp.task_tick_fair
      0.08 ± 17%      -0.0        0.05 ± 38%  perf-profile.children.cycles-pp.calc_global_load_tick
      0.11 ±  4%      -0.0        0.07 ±  4%  perf-profile.children.cycles-pp.perf_rotate_context
      0.09 ±  6%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.schedule
      0.14 ± 13%      -0.0        0.10 ± 24%  perf-profile.children.cycles-pp.fbcon_putcs
      0.10 ±  5%      -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.rcu_all_qs
      0.07 ±  7%      -0.0        0.03 ± 77%  perf-profile.children.cycles-pp.sched_clock
      0.10 ± 18%      -0.0        0.07 ±  7%  perf-profile.children.cycles-pp.__memcpy
      0.08 ±  6%      -0.0        0.04 ± 37%  perf-profile.children.cycles-pp.asm_sysvec_call_function
      0.11 ±  4%      -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.clockevents_program_event
      0.11 ± 16%      -0.0        0.08 ± 25%  perf-profile.children.cycles-pp.fast_imageblit
      0.11 ± 16%      -0.0        0.08 ± 25%  perf-profile.children.cycles-pp.drm_fbdev_generic_defio_imageblit
      0.11 ± 16%      -0.0        0.08 ± 25%  perf-profile.children.cycles-pp.sys_imageblit
      0.12 ±  5%      -0.0        0.10 ±  7%  perf-profile.children.cycles-pp.find_lock_entries
      0.09 ±  4%      -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.native_sched_clock
      0.08 ±  6%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.08 ±  5%      -0.0        0.06 ±  5%  perf-profile.children.cycles-pp.lapic_next_deadline
      0.09 ±  4%      -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.read_tsc
      0.07 ±  6%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.07 ±  8%      -0.0        0.05 ±  6%  perf-profile.children.cycles-pp.__free_one_page
      0.07 ±  9%      +0.0        0.09        perf-profile.children.cycles-pp.__mem_cgroup_uncharge
      0.06 ±  7%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.uncharge_batch
      0.04 ± 58%      +0.0        0.07        perf-profile.children.cycles-pp.page_counter_uncharge
      0.09 ±  7%      +0.0        0.12 ±  2%  perf-profile.children.cycles-pp.destroy_large_folio
      0.08 ±  4%      +0.0        0.13 ± 13%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.free_unref_page
     89.22            +3.4       92.57        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     89.02            +3.4       92.44        perf-profile.children.cycles-pp.do_syscall_64
     86.89            +3.9       90.79        perf-profile.children.cycles-pp.read
     39.51            +4.7       44.21        perf-profile.children.cycles-pp.filemap_get_pages
     84.67            +4.7       89.40        perf-profile.children.cycles-pp.ksys_read
     84.40            +4.8       89.24        perf-profile.children.cycles-pp.vfs_read
     37.48            +5.7       43.21        perf-profile.children.cycles-pp.page_cache_ra_order
     82.04            +5.9       87.98        perf-profile.children.cycles-pp.filemap_read
     28.01            +9.2       37.19        perf-profile.children.cycles-pp.folio_mark_accessed
     27.68            +9.2       36.91 ±  2%  perf-profile.children.cycles-pp.folio_activate
     28.55           +10.4       38.96 ±  2%  perf-profile.children.cycles-pp.filemap_add_folio
     27.81           +10.7       38.48 ±  2%  perf-profile.children.cycles-pp.folio_add_lru
     54.31           +19.8       74.12 ±  2%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     54.49           +19.8       74.33 ±  2%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
     54.82           +19.8       74.67 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     55.58           +19.9       75.44 ±  2%  perf-profile.children.cycles-pp.folio_batch_move_lru
     12.46            -7.0        5.42 ±  7%  perf-profile.self.cycles-pp._copy_to_iter
      8.02            -4.7        3.34 ±  9%  perf-profile.self.cycles-pp.__memset
      1.14 ±  3%      -0.4        0.76 ±  6%  perf-profile.self.cycles-pp.intel_idle
      0.93 ±  3%      -0.3        0.58 ±  6%  perf-profile.self.cycles-pp.filemap_read
      0.56 ±  8%      -0.3        0.23 ±  9%  perf-profile.self.cycles-pp.down_read
      1.02            -0.3        0.70 ±  5%  perf-profile.self.cycles-pp.intel_idle_xstate
      0.49 ±  7%      -0.3        0.21 ±  8%  perf-profile.self.cycles-pp.atime_needs_update
      0.66 ±  2%      -0.2        0.42 ±  7%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.43 ± 11%      -0.2        1.18 ±  3%  perf-profile.self.cycles-pp.memcpy_toio
      0.66 ±  2%      -0.2        0.44 ±  5%  perf-profile.self.cycles-pp.filemap_get_read_batch
      0.76 ±  3%      -0.2        0.53 ±  5%  perf-profile.self.cycles-pp.__intel_pmu_enable_all
      0.60 ±  3%      -0.2        0.38 ±  6%  perf-profile.self.cycles-pp.write
      0.60 ±  2%      -0.2        0.38 ±  6%  perf-profile.self.cycles-pp.read
      0.53 ±  3%      -0.2        0.32 ±  6%  perf-profile.self.cycles-pp.vfs_read
      0.48            -0.2        0.31 ±  4%  perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
      0.32 ±  7%      -0.2        0.16 ±  6%  perf-profile.self.cycles-pp.up_read
      0.36 ±  4%      -0.1        0.22 ±  7%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.32 ±  4%      -0.1        0.19 ±  6%  perf-profile.self.cycles-pp.__fdget_pos
      0.30 ±  7%      -0.1        0.17 ±  9%  perf-profile.self.cycles-pp.vfs_write
      0.30 ±  3%      -0.1        0.17 ±  6%  perf-profile.self.cycles-pp.xas_descend
      0.28 ±  2%      -0.1        0.16 ±  8%  perf-profile.self.cycles-pp.do_syscall_64
      0.28 ±  3%      -0.1        0.17 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.32 ±  3%      -0.1        0.22 ±  7%  perf-profile.self.cycles-pp.cpuidle_enter_state
      0.19 ±  8%      -0.1        0.09 ± 38%  perf-profile.self.cycles-pp.xfs_file_read_iter
      0.24 ±  3%      -0.1        0.14 ±  6%  perf-profile.self.cycles-pp.apparmor_file_permission
      0.31 ±  3%      -0.1        0.22 ±  6%  perf-profile.self.cycles-pp._raw_spin_trylock
      0.18 ±  5%      -0.1        0.10 ±  8%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.22 ±  4%      -0.1        0.14 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.23 ±  6%      -0.1        0.16 ±  9%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.14 ±  9%      -0.1        0.07 ± 12%  perf-profile.self.cycles-pp.page_counter_try_charge
      0.10 ±  4%      -0.1        0.02 ±100%  perf-profile.self.cycles-pp.rmqueue
      0.20 ±  3%      -0.1        0.13 ±  7%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.18 ±  3%      -0.1        0.12 ±  7%  perf-profile.self.cycles-pp.xas_load
      0.09            -0.1        0.02 ±100%  perf-profile.self.cycles-pp.__xas_next
      0.17 ±  2%      -0.1        0.10 ±  6%  perf-profile.self.cycles-pp.rw_verify_area
      0.16 ±  3%      -0.1        0.10 ±  6%  perf-profile.self.cycles-pp.aa_file_perm
      0.16 ±  3%      -0.1        0.09 ±  9%  perf-profile.self.cycles-pp.filemap_get_pages
      0.19 ±  6%      -0.1        0.13 ±  3%  perf-profile.self.cycles-pp.xas_start
      0.18 ±  3%      -0.1        0.12 ±  5%  perf-profile.self.cycles-pp.security_file_permission
      0.17            -0.1        0.11 ±  6%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.12 ±  2%      -0.1        0.06 ±  8%  perf-profile.self.cycles-pp.folio_prep_large_rmappable
      0.16 ±  3%      -0.1        0.10 ±  6%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.14 ± 11%      -0.1        0.08 ±  8%  perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.08 ±  8%      -0.1        0.03 ± 77%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.12 ±  6%      -0.1        0.06 ± 10%  perf-profile.self.cycles-pp.__free_pages_ok
      0.08 ±  4%      -0.0        0.03 ± 77%  perf-profile.self.cycles-pp.xfs_ilock
      0.14 ±  4%      -0.0        0.09 ±  9%  perf-profile.self.cycles-pp._raw_spin_lock
      0.10 ±  4%      -0.0        0.06 ± 39%  perf-profile.self.cycles-pp.xfs_iunlock
      0.12 ±  4%      -0.0        0.08 ±  9%  perf-profile.self.cycles-pp.current_time
      0.11 ± 18%      -0.0        0.06 ± 17%  perf-profile.self.cycles-pp.iomap_set_range_uptodate
      0.12 ±  4%      -0.0        0.07 ±  7%  perf-profile.self.cycles-pp.ksys_write
      0.15 ±  4%      -0.0        0.11 ±  6%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.10 ±  3%      -0.0        0.06 ±  8%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.09 ±  5%      -0.0        0.05 ± 38%  perf-profile.self.cycles-pp.xfs_file_buffered_read
      0.10 ±  4%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.xas_store
      0.14 ±  3%      -0.0        0.10 ± 15%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.08 ± 17%      -0.0        0.04 ± 38%  perf-profile.self.cycles-pp.calc_global_load_tick
      0.11 ±  4%      -0.0        0.07 ± 10%  perf-profile.self.cycles-pp.ksys_read
      0.10 ± 18%      -0.0        0.07 ±  7%  perf-profile.self.cycles-pp.__memcpy
      0.10 ±  7%      -0.0        0.07 ±  7%  perf-profile.self.cycles-pp.update_sg_lb_stats
      0.12 ±  4%      -0.0        0.08 ±  8%  perf-profile.self.cycles-pp.menu_select
      0.09 ±  4%      -0.0        0.06 ±  9%  perf-profile.self.cycles-pp.__cond_resched
      0.11 ± 16%      -0.0        0.08 ± 25%  perf-profile.self.cycles-pp.fast_imageblit
      0.09 ±  4%      -0.0        0.06 ±  5%  perf-profile.self.cycles-pp.read_tsc
      0.08 ±  5%      -0.0        0.06 ±  5%  perf-profile.self.cycles-pp.native_sched_clock
      0.08 ±  5%      -0.0        0.06 ±  5%  perf-profile.self.cycles-pp.lapic_next_deadline
      0.08 ±  6%      -0.0        0.06 ± 11%  perf-profile.self.cycles-pp.folio_lruvec_lock_irqsave
      0.07 ±  5%      -0.0        0.05 ±  8%  perf-profile.self.cycles-pp.native_apic_msr_eoi
      0.07 ±  4%      -0.0        0.05 ±  6%  perf-profile.self.cycles-pp.__free_one_page
      0.09 ±  8%      -0.0        0.07 ±  4%  perf-profile.self.cycles-pp.find_lock_entries
      0.09            +0.0        0.10        perf-profile.self.cycles-pp.lru_add_fn
      0.14 ±  2%      +0.0        0.16        perf-profile.self.cycles-pp.folio_batch_move_lru
      0.03 ± 77%      +0.0        0.06 ±  7%  perf-profile.self.cycles-pp.page_counter_uncharge
     54.31           +19.8       74.12 ±  2%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath


Best Regards,
Yujie

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-04  4:59         ` Yujie Liu
@ 2024-03-04  5:35           ` Yin, Fengwei
  2024-03-06  5:36             ` Yin Fengwei
  2024-03-07  9:23             ` Jan Kara
  0 siblings, 2 replies; 13+ messages in thread
From: Yin, Fengwei @ 2024-03-04  5:35 UTC (permalink / raw)
  To: Yujie Liu, Jan Kara
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang

Hi Jan,

On 3/4/2024 12:59 PM, Yujie Liu wrote:
>  From the perf profile, we can see that the contention of folio lru lock
> becomes more intense. We also did a simple one-file "dd" test. Looks
> like it is more likely that low-order folios are allocated after commit
> ab4443fe3c (Fengwei will help provide the data soon). Therefore, the
> average folio size decreases while the total folio amount increases,
> which leads to touching lru lock more often.

I did the following testing:
   With an xfs image in tmpfs mounted at /mnt and a 12G test file
   (sparse-file) created on it, I used one process to read the file on
   an Ice Lake machine with 256G of system memory, so we can be sure we
   are doing a sequential file read with no page reclaim triggered.

   At the same time, I profiled the distribution of the order parameter
   of filemap_alloc_folio() calls to understand how the large folio
   orders for the page cache are generated.

Here is what we got:

- Commit f0b7a0d1d46625db:
$ dd bs=4k if=/mnt/sparse-file of=/dev/null
3145728+0 records in
3145728+0 records out
12884901888 bytes (13 GB, 12 GiB) copied, 2.52208 s, 5.01 GB/s

filemap_alloc_folio
      page order    : count     distribution
         0          : 57       |                                        |
         1          : 0        |                                        |
         2          : 20       |                                        |
         3          : 2        |                                        |
         4          : 4        |                                        |
         5          : 98300    |****************************************|

- Commit ab4443fe3ca6:
$ dd bs=4k if=/mnt/sparse-file of=/dev/null
3145728+0 records in
3145728+0 records out
12884901888 bytes (13 GB, 12 GiB) copied, 2.51469 s, 5.1 GB/s

filemap_alloc_folio
      page order    : count     distribution
         0          : 21       |                                        |
         1          : 0        |                                        |
         2          : 196615   |****************************************|
         3          : 98303    |*******************                     |
         4          : 98303    |*******************                     |


Even though the file read throughput is almost the same, the
distribution of orders looks like a regression with ab4443fe3ca6 (more
smaller-order page cache folios are generated than with the parent
commit). Thanks.
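
As a sanity check on the counts above, here is a quick
back-of-the-envelope sketch (illustrative only; it assumes 32-page,
i.e. 128KB, readahead windows):

#include <stdio.h>

int main(void)
{
	unsigned long pages = (12UL << 30) >> 12;	/* 12 GiB file in 4 KiB pages */
	unsigned long windows = pages / 32;		/* 32-page (128 KiB) windows */

	/* Parent commit: one naturally aligned order-5 folio per window,
	 * in line with the ~98300 order-5 allocations above. */
	printf("order-5 folios: %lu\n", windows);

	/* ab4443fe3ca6: each window is filled by four smaller folios (two
	 * of order 2, one of order 3, one of order 4), in line with the
	 * ~196615 / 98303 / 98303 counts above. */
	printf("order-2: %lu, order-3: %lu, order-4: %lu\n",
	       2 * windows, windows, windows);
	return 0;
}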


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-04  5:35           ` Yin, Fengwei
@ 2024-03-06  5:36             ` Yin Fengwei
  2024-03-07  9:23             ` Jan Kara
  1 sibling, 0 replies; 13+ messages in thread
From: Yin Fengwei @ 2024-03-06  5:36 UTC (permalink / raw)
  To: Yujie Liu, Jan Kara
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang



On 3/4/24 13:35, Yin, Fengwei wrote:
> Even though the file read throughput is almost the same, the
> distribution of orders looks like a regression with ab4443fe3ca6 (more
> smaller-order page cache folios are generated than with the parent
> commit). Thanks.
There may be some confusion here. Let me clarify:
I shouldn't say the folio order distribution itself is a regression.
Rather, the smaller folio orders cause more folios to be added to the
page cache for the same workload (roughly four folios per 32-page
readahead window instead of one order-5 folio), which raises the lru
lock contention that triggers the regression.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-04  5:35           ` Yin, Fengwei
  2024-03-06  5:36             ` Yin Fengwei
@ 2024-03-07  9:23             ` Jan Kara
  2024-03-07 18:19               ` Matthew Wilcox
  2024-03-10  6:40               ` Yin, Fengwei
  1 sibling, 2 replies; 13+ messages in thread
From: Jan Kara @ 2024-03-07  9:23 UTC (permalink / raw)
  To: Yin, Fengwei
  Cc: Yujie Liu, Jan Kara, Oliver Sang, oe-lkp, lkp, linux-kernel,
	Andrew Morton, Matthew Wilcox, Guo Xuenan, linux-fsdevel,
	ying.huang, feng.tang

On Mon 04-03-24 13:35:10, Yin, Fengwei wrote:
> Hi Jan,
> 
> On 3/4/2024 12:59 PM, Yujie Liu wrote:
> >  From the perf profile, we can see that the contention of folio lru lock
> > becomes more intense. We also did a simple one-file "dd" test. Looks
> > like it is more likely that low-order folios are allocated after commit
> > ab4443fe3c (Fengwei will help provide the data soon). Therefore, the
> > average folio size decreases while the total folio amount increases,
> > which leads to touching lru lock more often.
> 
> I did the following testing:
>   With an xfs image in tmpfs mounted at /mnt and a 12G test file
>   (sparse-file) created on it, I used one process to read the file on
>   an Ice Lake machine with 256G of system memory, so we can be sure we
>   are doing a sequential file read with no page reclaim triggered.
> 
>   At the same time, I profiled the distribution of the order parameter
>   of filemap_alloc_folio() calls to understand how the large folio
>   orders for the page cache are generated.
> 
> Here is what we got:
> 
> - Commit f0b7a0d1d46625db:
> $ dd bs=4k if=/mnt/sparse-file of=/dev/null
> 3145728+0 records in
> 3145728+0 records out
> 12884901888 bytes (13 GB, 12 GiB) copied, 2.52208 s, 5.01 GB/s
> 
> filemap_alloc_folio
>      page order    : count     distribution
>         0          : 57       |                                        |
>         1          : 0        |                                        |
>         2          : 20       |                                        |
>         3          : 2        |                                        |
>         4          : 4        |                                        |
>         5          : 98300    |****************************************|
> 
> - Commit ab4443fe3ca6:
> $ dd bs=4k if=/mnt/sparse-file of=/dev/null
> 3145728+0 records in
> 3145728+0 records out
> 12884901888 bytes (13 GB, 12 GiB) copied, 2.51469 s, 5.1 GB/s
> 
> filemap_alloc_folio
>      page order    : count     distribution
>         0          : 21       |                                        |
>         1          : 0        |                                        |
>         2          : 196615   |****************************************|
>         3          : 98303    |*******************                     |
>         4          : 98303    |*******************                     |
> 
> 
> Even though the file read throughput is almost the same, the
> distribution of orders looks like a regression with ab4443fe3ca6 (more
> smaller-order page cache folios are generated than with the parent
> commit). Thanks.

Thanks for testing! This is an interesting result and certainly unexpected
for me. The readahead code allocates naturally aligned pages so based on
the distribution of allocations it seems that before commit ab4443fe3ca6
readahead window was at least 32 pages (128KB) aligned and so we allocated
order 5 pages. After the commit, the readahead window somehow ended up only
aligned to 20 modulo 32. To follow natural alignment and fill 128KB
readahead window we allocated order 2 page (got us to offset 24 modulo 32),
then order 3 page (got us to offset 0 modulo 32), order 4 page (larger
would not fit in 128KB readahead window now), and order 2 page to finish
filling the readahead window.
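
To make that walk concrete, a minimal userspace sketch (illustrative
only, not the kernel code; it assumes a 32-page window starting at
offset 20 modulo 32, and uses __builtin_ctzl() in place of __ffs()):

#include <stdio.h>

int main(void)
{
	unsigned long index = 20;		/* window start: 20 modulo 32 */
	unsigned long limit = 20 + 32 - 1;	/* last page of the 128KB window */
	unsigned int new_order = 5;		/* order an aligned window would use */

	while (index <= limit) {
		unsigned int order = new_order;

		/* Keep folios naturally aligned, as the readahead code does */
		if (index & ((1UL << order) - 1))
			order = __builtin_ctzl(index);	/* __ffs() equivalent */
		/* Shrink the folio until it fits inside the window */
		while (index + (1UL << order) - 1 > limit)
			order--;
		printf("index %2lu -> order %u\n", index, order);
		index += 1UL << order;
	}
	return 0;	/* prints orders 2, 3, 4, 2 as described above */
}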

Now I'm not 100% sure why the readahead window alignment changed with
different rounding when placing readahead mark - probably that's some
artifact when readahead window is tiny in the beginning before we scale it
up (I'll verify by tracing whether everything ends up looking correctly
with the current code). So I don't expect this is a problem in ab4443fe3ca6
as such but it exposes the issue that readahead page insertion code should
perhaps strive to achieve better readahead window alignment with logical
file offset even at the cost of occasionally performing somewhat shorter
readahead. I'll look into this once I dig out of the huge heap of email
after vacation...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-07  9:23             ` Jan Kara
@ 2024-03-07 18:19               ` Matthew Wilcox
  2024-03-08  8:37                 ` Yujie Liu
  2024-03-10  6:41                 ` Yin, Fengwei
  2024-03-10  6:40               ` Yin, Fengwei
  1 sibling, 2 replies; 13+ messages in thread
From: Matthew Wilcox @ 2024-03-07 18:19 UTC (permalink / raw)
  To: Jan Kara
  Cc: Yin, Fengwei, Yujie Liu, Oliver Sang, oe-lkp, lkp, linux-kernel,
	Andrew Morton, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang

On Thu, Mar 07, 2024 at 10:23:08AM +0100, Jan Kara wrote:
> Thanks for testing! This is an interesting result and certainly unexpected
> for me. The readahead code allocates naturally aligned pages so based on
> the distribution of allocations it seems that before commit ab4443fe3ca6
> readahead window was at least 32 pages (128KB) aligned and so we allocated
> order 5 pages. After the commit, the readahead window somehow ended up only
> aligned to 20 modulo 32. To follow natural alignment and fill 128KB
> readahead window we allocated order 2 page (got us to offset 24 modulo 32),
> then order 3 page (got us to offset 0 modulo 32), order 4 page (larger
> would not fit in 128KB readahead window now), and order 2 page to finish
> filling the readahead window.
> 
> Now I'm not 100% sure why the readahead window alignment changed with
> different rounding when placing readahead mark - probably that's some
> artifact when readahead window is tiny in the beginning before we scale it
> up (I'll verify by tracing whether everything ends up looking correctly
> with the current code). So I don't expect this is a problem in ab4443fe3ca6
> as such but it exposes the issue that readahead page insertion code should
> perhaps strive to achieve better readahead window alignment with logical
> file offset even at the cost of occasionally performing somewhat shorter
> readahead. I'll look into this once I dig out of the huge heap of email
> after vacation...

I was surprised by what you said here, so I went and re-read the code
and it doesn't work the way I thought it did.  So I had a good long think
about how it _should_ work, and I looked for some more corner conditions,
and this is what I came up with.

The first thing I've done is separate out the two limits.  The EOF is
a hard limit; we will not allocate pages beyond EOF.  The ra->size is
a soft limit; we will allocate pages beyond ra->size, but not too far.

The second thing I noticed is that index + ra_size could wrap.  So add
a check for that, and set it to ULONG_MAX.  index + ra_size - async_size
could also wrap, but this is harmless.  We certainly don't want to kick
off any more readahead in this circumstance, so leaving 'mark' outside
the range [index..ULONG_MAX] is just fine.

The third thing is that we could allocate a folio which contains a page
at ULONG_MAX.  We don't really want that in the page cache; it makes
filesystems more complicated if they have to check for that, and we
don't allow an order-0 folio at ULONG_MAX, so there's no need for it.
This _should_ already be prohibited by the "Don't allocate pages past EOF"
check, but let's explicitly prohibit it.

Compile tested only.

diff --git a/mm/readahead.c b/mm/readahead.c
index 130c0e7df99f..742e1f39035b 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -488,7 +488,8 @@ void page_cache_ra_order(struct readahead_control *ractl,
 {
 	struct address_space *mapping = ractl->mapping;
 	pgoff_t index = readahead_index(ractl);
-	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
+	pgoff_t last = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
+	pgoff_t limit = index + ra->size;
 	pgoff_t mark = index + ra->size - ra->async_size;
 	int err = 0;
 	gfp_t gfp = readahead_gfp_mask(mapping);
@@ -496,23 +497,26 @@ void page_cache_ra_order(struct readahead_control *ractl,
 	if (!mapping_large_folio_support(mapping) || ra->size < 4)
 		goto fallback;
 
-	limit = min(limit, index + ra->size - 1);
-
 	if (new_order < MAX_PAGECACHE_ORDER) {
 		new_order += 2;
 		new_order = min_t(unsigned int, MAX_PAGECACHE_ORDER, new_order);
 		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
 	}
 
+	if (limit < index)
+		limit = ULONG_MAX;
 	filemap_invalidate_lock_shared(mapping);
-	while (index <= limit) {
+	while (index < limit) {
 		unsigned int order = new_order;
 
 		/* Align with smaller pages if needed */
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
+		/* Avoid wrap */
+		if (index + (1UL << order) == 0)
+			order--;
 		/* Don't allocate pages past EOF */
-		while (index + (1UL << order) - 1 > limit)
+		while (index + (1UL << order) - 1 > last)
 			order--;
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-07 18:19               ` Matthew Wilcox
@ 2024-03-08  8:37                 ` Yujie Liu
  2024-03-10  6:41                 ` Yin, Fengwei
  1 sibling, 0 replies; 13+ messages in thread
From: Yujie Liu @ 2024-03-08  8:37 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Yin, Fengwei, Oliver Sang, oe-lkp, lkp, linux-kernel,
	Andrew Morton, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang

On Thu, Mar 07, 2024 at 06:19:46PM +0000, Matthew Wilcox wrote:
> On Thu, Mar 07, 2024 at 10:23:08AM +0100, Jan Kara wrote:
> > Thanks for testing! This is an interesting result and certainly unexpected
> > for me. The readahead code allocates naturally aligned pages so based on
> > the distribution of allocations it seems that before commit ab4443fe3ca6
> > readahead window was at least 32 pages (128KB) aligned and so we allocated
> > order 5 pages. After the commit, the readahead window somehow ended up only
> > aligned to 20 modulo 32. To follow natural alignment and fill 128KB
> > readahead window we allocated order 2 page (got us to offset 24 modulo 32),
> > then order 3 page (got us to offset 0 modulo 32), order 4 page (larger
> > would not fit in 128KB readahead window now), and order 2 page to finish
> > filling the readahead window.
> > 
> > Now I'm not 100% sure why the readahead window alignment changed with
> > different rounding when placing readahead mark - probably that's some
> > artifact when readahead window is tiny in the beginning before we scale it
> > up (I'll verify by tracing whether everything ends up looking correctly
> > with the current code). So I don't expect this is a problem in ab4443fe3ca6
> > as such but it exposes the issue that readahead page insertion code should
> > perhaps strive to achieve better readahead window alignment with logical
> > file offset even at the cost of occasionally performing somewhat shorter
> > readahead. I'll look into this once I dig out of the huge heap of email
> > after vacation...
> 
> I was surprised by what you said here, so I went and re-read the code
> and it doesn't work the way I thought it did.  So I had a good long think
> about how it _should_ work, and I looked for some more corner conditions,
> and this is what I came up with.
> 
> The first thing I've done is separate out the two limits.  The EOF is
> a hard limit; we will not allocate pages beyond EOF.  The ra->size is
> a soft limit; we will allocate pages beyond ra->size, but not too far.
> 
> The second thing I noticed is that index + ra_size could wrap.  So add
> a check for that, and set it to ULONG_MAX.  index + ra_size - async_size
> could also wrap, but this is harmless.  We certainly don't want to kick
> off any more readahead in this circumstance, so leaving 'mark' outside
> the range [index..ULONG_MAX] is just fine.
> 
> The third thing is that we could allocate a folio which contains a page
> at ULONG_MAX.  We don't really want that in the page cache; it makes
> filesystems more complicated if they have to check for that, and we
> don't allow an order-0 folio at ULONG_MAX, so there's no need for it.
> This _should_ already be prohibited by the "Don't allocate pages past EOF"
> check, but let's explicitly prohibit it.
> 
> Compile tested only.

We applied the diff on top of commit ab4443fe3ca6 but got a kernel panic
when running the dd test:

[ 109.259674][ C46] watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [ dd:8616]
[ 109.268946][ C46] Modules linked in: xfs loop intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp btrfs blake2b_generic kvm_intel xor kvm irqbypass crct10dif_pclmul crc32_pclmul sd_mod raid6_pq ghash_clmulni_intel libcrc32c crc32c_intel sg sha512_ssse3 i915 nvme rapl drm_buddy nvme_core intel_gtt ahci t10_pi drm_display_helper ast intel_cstate libahci ipmi_ssif ttm drm_shmem_helper mei_me i2c_i801 crc64_rocksoft_generic video crc64_rocksoft acpi_ipmi intel_uncore megaraid_sas mei drm_kms_helper joydev libata i2c_ismt i2c_smbus dax_hmem crc64 wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter drm fuse ip_tables
[ 109.336216][ C46] CPU: 46 PID: 8616 Comm: dd Tainted: G          I        6.8.0-rc1-00005-g6c6de6e42e46 #1
[ 109.347892][ C46] Hardware name: NULL NULL/NULL, BIOS 05.02.01 05/12/2023
[ 109.356324][ C46] RIP: 0010:page_cache_ra_order (mm/readahead.c:521)
[ 109.363394][ C46] Code: cf 48 89 e8 4c 89 fa 48 d3 e0 48 01 c2 75 09 83 e9 01 48 89 e8 48 d3 e0 49 8d 77 ff 48 01 f0 49 39 c6 73 11 83 e9 01 48 89 e8 <48> d3 e0 48 01 f0 49 39 c6 72 ef 31 c0 83 f9 01 8b 3c 24 0f 44 c8
All code
========
   0:   cf                      iret
   1:   48 89 e8                mov    %rbp,%rax
   4:   4c 89 fa                mov    %r15,%rdx
   7:   48 d3 e0                shl    %cl,%rax
   a:   48 01 c2                add    %rax,%rdx
   d:   75 09                   jne    0x18
   f:   83 e9 01                sub    $0x1,%ecx
  12:   48 89 e8                mov    %rbp,%rax
  15:   48 d3 e0                shl    %cl,%rax
  18:   49 8d 77 ff             lea    -0x1(%r15),%rsi
  1c:   48 01 f0                add    %rsi,%rax
  1f:   49 39 c6                cmp    %rax,%r14
  22:   73 11                   jae    0x35
  24:   83 e9 01                sub    $0x1,%ecx
  27:   48 89 e8                mov    %rbp,%rax
  2a:*  48 d3 e0                shl    %cl,%rax         <-- trapping instruction
  2d:   48 01 f0                add    %rsi,%rax
  30:   49 39 c6                cmp    %rax,%r14
  33:   72 ef                   jb     0x24
  35:   31 c0                   xor    %eax,%eax
  37:   83 f9 01                cmp    $0x1,%ecx
  3a:   8b 3c 24                mov    (%rsp),%edi
  3d:   0f 44 c8                cmove  %eax,%ecx

Code starting with the faulting instruction
===========================================
   0:   48 d3 e0                shl    %cl,%rax
   3:   48 01 f0                add    %rsi,%rax
   6:   49 39 c6                cmp    %rax,%r14
   9:   72 ef                   jb     0xfffffffffffffffa
   b:   31 c0                   xor    %eax,%eax
   d:   83 f9 01                cmp    $0x1,%ecx
  10:   8b 3c 24                mov    (%rsp),%edi
  13:   0f 44 c8                cmove  %eax,%ecx
[ 109.385897][ C46] RSP: 0018:ffa0000012837c00 EFLAGS: 00000206
[ 109.393176][ C46] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000020159674
[ 109.402607][ C46] RDX: 000000000004924c RSI: 0000000000049249 RDI: ff11003f6f3ae7c0
[ 109.412038][ C46] RBP: 0000000000000001 R08: 0000000000038700 R09: 0000000000000013
[ 109.421447][ C46] R10: 0000000000022c04 R11: 0000000000000001 R12: ffa0000012837cb0
[ 109.430868][ C46] R13: ffd400004fee4b40 R14: 0000000000049249 R15: 000000000004924a
[ 109.440270][ C46] FS:  00007f777e884640(0000) GS:ff11003f6f380000(0000) knlGS:0000000000000000
[ 109.450756][ C46] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 109.458603][ C46] CR2: 00007f2d4d425020 CR3: 00000001b4f84005 CR4: 0000000000f71ef0
[ 109.468003][ C46] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 109.477392][ C46] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 109.486794][ C46] PKRU: 55555554
[ 109.491197][ C46] Call Trace:
[ 109.495300][ C46]  <IRQ>
[ 109.498922][ C46] ? watchdog_timer_fn (kernel/watchdog.c:548)
[ 109.505074][ C46] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:466)
[ 109.511620][ C46] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752)
[ 109.518059][ C46] ? hrtimer_interrupt (kernel/time/hrtimer.c:1817)
[ 109.524088][ C46] ? __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1065 arch/x86/kernel/apic/apic.c:1082)
[ 109.531286][ C46] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14))
[ 109.538190][ C46]  </IRQ>
[ 109.541867][ C46]  <TASK>
[ 109.545545][ C46] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:649)
[ 109.552832][ C46] ? page_cache_ra_order (mm/readahead.c:521)
[ 109.559122][ C46] filemap_get_pages (mm/filemap.c:2500)
[ 109.564935][ C46] filemap_read (mm/filemap.c:2594)
[ 109.570241][ C46] xfs_file_buffered_read (fs/xfs/xfs_file.c:315) xfs
[ 109.577202][ C46] xfs_file_read_iter (fs/xfs/xfs_file.c:341) xfs
[ 109.583749][ C46] vfs_read (include/linux/fs.h:2079 fs/read_write.c:395 fs/read_write.c:476)
[ 109.588762][ C46] ksys_read (fs/read_write.c:619)
[ 109.593660][ C46] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 109.599038][ C46] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 109.605982][ C46] RIP: 0033:0x7f777e78d3ce
[ 109.611255][ C46] Code: c0 e9 b6 fe ff ff 50 48 8d 3d 6e 08 0b 00 e8 69 01 02 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
All code
========
   0:   c0 e9 b6                shr    $0xb6,%cl
   3:   fe                      (bad)
   4:   ff                      (bad)
   5:   ff 50 48                call   *0x48(%rax)
   8:   8d 3d 6e 08 0b 00       lea    0xb086e(%rip),%edi        # 0xb087c
   e:   e8 69 01 02 00          call   0x2017c
  13:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  1a:   00 00
  1c:   64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
  23:   00
  24:   85 c0                   test   %eax,%eax
  26:   75 14                   jne    0x3c
  28:   0f 05                   syscall
  2a:*  48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax         <-- trapping instruction
  30:   77 5a                   ja     0x8c
  32:   c3                      ret
  33:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  3a:   00 00
  3c:   48 83 ec 28             sub    $0x28,%rsp

Code starting with the faulting instruction
===========================================
   0:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
   6:   77 5a                   ja     0x62
   8:   c3                      ret
   9:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  10:   00 00
  12:   48 83 ec 28             sub    $0x28,%rsp
[ 109.633619][ C46] RSP: 002b:00007ffc78ab2778 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 109.643392][ C46] RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f777e78d3ce
[ 109.652686][ C46] RDX: 0000000000001000 RSI: 00005629c0f7c000 RDI: 0000000000000000
[ 109.661976][ C46] RBP: 00005629c0f7c000 R08: 00005629c0f7bd30 R09: 00007f777e870be0
[ 109.671251][ C46] R10: 00005629c0f7c000 R11: 0000000000000246 R12: 0000000000000000
[ 109.680528][ C46] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffff
[ 109.689808][ C46]  </TASK>
[ 109.693512][ C46] Kernel panic - not syncing: softlockup: hung tasks


# mm/readahead.c

486 void page_cache_ra_order(struct readahead_control *ractl,
487                 struct file_ra_state *ra, unsigned int new_order)
488 {
489         struct address_space *mapping = ractl->mapping;
490         pgoff_t index = readahead_index(ractl);
491         pgoff_t last = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
492         pgoff_t limit = index + ra->size;
493         pgoff_t mark = index + ra->size - ra->async_size;
494         int err = 0;
495         gfp_t gfp = readahead_gfp_mask(mapping);
496
497         if (!mapping_large_folio_support(mapping) || ra->size < 4)
498                 goto fallback;
499
500         if (new_order < MAX_PAGECACHE_ORDER) {
501                 new_order += 2;
502                 if (new_order > MAX_PAGECACHE_ORDER)
503                         new_order = MAX_PAGECACHE_ORDER;
504                 while ((1 << new_order) > ra->size)
505                         new_order--;
506         }
507
508         if (limit < index)
509                 limit = ULONG_MAX;
510
511         filemap_invalidate_lock_shared(mapping);
512         while (index < limit) {
513                 unsigned int order = new_order;
514
515                 /* Align with smaller pages if needed */
516                 if (index & ((1UL << order) - 1))
517                         order = __ffs(index);
518                 if (index + (1UL << order) == 0)
519                         order--;
520                 /* Don't allocate pages past EOF */
521                 while (index + (1UL << order) - 1 > last)
522                         order--;
523                 /* THP machinery does not support order-1 */
524                 if (order == 1)
525                         order = 0;
526                 err = ra_alloc_folio(ractl, index, mark, order, gfp);
527                 if (err)
528                         break;
529                 index += 1UL << order;
530         }
531
532         if (index > limit) {
533                 ra->size += index - limit - 1;
534                 ra->async_size += index - limit - 1;
535         }
536
537         read_pages(ractl);
538         filemap_invalidate_unlock_shared(mapping);
539
540         /*
541          * If there were already pages in the page cache, then we may have
542          * left some gaps.  Let the regular readahead code take care of this
543          * situation.
544          */
545         if (!err)
546                 return;
547 fallback:
548         do_page_cache_ra(ractl, ra->size, ra->async_size);
549 }


Regards,
Yujie

> diff --git a/mm/readahead.c b/mm/readahead.c
> index 130c0e7df99f..742e1f39035b 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -488,7 +488,8 @@ void page_cache_ra_order(struct readahead_control *ractl,
>  {
>  	struct address_space *mapping = ractl->mapping;
>  	pgoff_t index = readahead_index(ractl);
> -	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
> +	pgoff_t last = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
> +	pgoff_t limit = index + ra->size;
>  	pgoff_t mark = index + ra->size - ra->async_size;
>  	int err = 0;
>  	gfp_t gfp = readahead_gfp_mask(mapping);
> @@ -496,23 +497,26 @@ void page_cache_ra_order(struct readahead_control *ractl,
>  	if (!mapping_large_folio_support(mapping) || ra->size < 4)
>  		goto fallback;
>  
> -	limit = min(limit, index + ra->size - 1);
> -
>  	if (new_order < MAX_PAGECACHE_ORDER) {
>  		new_order += 2;
>  		new_order = min_t(unsigned int, MAX_PAGECACHE_ORDER, new_order);
>  		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
>  	}
>  
> +	if (limit < index)
> +		limit = ULONG_MAX;
>  	filemap_invalidate_lock_shared(mapping);
> -	while (index <= limit) {
> +	while (index < limit) {
>  		unsigned int order = new_order;
>  
>  		/* Align with smaller pages if needed */
>  		if (index & ((1UL << order) - 1))
>  			order = __ffs(index);
> +		/* Avoid wrap */
> +		if (index + (1UL << order) == 0)
> +			order--;
>  		/* Don't allocate pages past EOF */
> -		while (index + (1UL << order) - 1 > limit)
> +		while (index + (1UL << order) - 1 > last)
>  			order--;
>  		err = ra_alloc_folio(ractl, index, mark, order, gfp);
>  		if (err)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-07  9:23             ` Jan Kara
  2024-03-07 18:19               ` Matthew Wilcox
@ 2024-03-10  6:40               ` Yin, Fengwei
  1 sibling, 0 replies; 13+ messages in thread
From: Yin, Fengwei @ 2024-03-10  6:40 UTC (permalink / raw)
  To: Jan Kara
  Cc: Yujie Liu, Oliver Sang, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Matthew Wilcox, Guo Xuenan, linux-fsdevel, ying.huang, feng.tang

On 3/7/2024 5:23 PM, Jan Kara wrote:
> Thanks for testing! This is an interesting result and certainly unexpected
> for me. The readahead code allocates naturally aligned pages so based on
> the distribution of allocations it seems that before commit ab4443fe3ca6
> readahead window was at least 32 pages (128KB) aligned and so we allocated
> order 5 pages. After the commit, the readahead window somehow ended up only
> aligned to 20 modulo 32. To follow natural alignment and fill 128KB
> readahead window we allocated order 2 page (got us to offset 24 modulo 32),
> then order 3 page (got us to offset 0 modulo 32), order 4 page (larger
> would not fit in 128KB readahead window now), and order 2 page to finish
> filling the readahead window.
> 
> Now I'm not 100% sure why the readahead window alignment changed with
> different rounding when placing readahead mark - probably that's some
> artifact when readahead window is tiny in the beginning before we scale it
> up (I'll verify by tracing whether everything ends up looking correctly
> with the current code). So I don't expect this is a problem in ab4443fe3ca6
> as such but it exposes the issue that readahead page insertion code should
> perhaps strive to achieve better readahead window alignment with logical
> file offset even at the cost of occasionally performing somewhat shorter
> readahead. I'll look into this once I dig out of the huge heap of email
> after vacation...
Hi Jan,
I am also curious about this behavior and tried adding logs to
understand it. Here are some differences with and without ab4443fe3ca6:
  - with ab4443fe3ca6:
  You are right about the folio order, as the readahead window is 0x20.
  The folio order sequence is like order 2, order 4, order 3, order 2.

  But the difference is that the first order-2 folio is always marked
  readahead, so the max order is only boosted to 4 in
  page_cache_ra_order(). The code path always hits
     if (index == expected || index == (ra->start + ra->size))
  in ondemand_readahead().

  If I just change the round_down() back to round_up() in
  ra_alloc_folio(), the dominant folio order is restored to 5 (see the
  rounding sketch below).

  - without ab4443fe3ca6:
  At the beginning, the folio order sequence is the same: 2, 4, 3, 2.
  But besides the first order-2 folio, the order-4 folio is marked as
  readahead too, so it's possible for the order to be boosted to 5.
  Also, not only the path
     if (index == expected || index == (ra->start + ra->size))
  is hit, but also
      if (folio) {
  can be hit (I didn't check other paths, as this testing is a
  sequential read).

  After some back and forth between 5 and 2, 4, 3, 2, the order
  stabilizes on 5.

  I don't fully understand the whole thing yet and will dig deeper. The
  above is just what the logs showed.
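
For illustration, a minimal userspace sketch of the rounding difference
(hypothetical index/mark values; it assumes, per the observation above,
that ra_alloc_folio() sets the readahead flag on the folio whose index
equals the rounded mark):

#include <stdio.h>

static unsigned long rdown(unsigned long x, unsigned long a) { return x & ~(a - 1); }
static unsigned long rup(unsigned long x, unsigned long a)   { return (x + a - 1) & ~(a - 1); }

int main(void)
{
	unsigned long index = 20;	/* first (order-2) folio of the window */
	unsigned long npages = 1UL << 2;
	unsigned long mark = 22;	/* hypothetical index + size - async_size */

	/* With round_down() (ab4443fe3ca6) the mark lands on the first
	 * order-2 folio, so that one is flagged for async readahead. */
	printf("round_down: mark %lu, first folio marked: %d\n",
	       rdown(mark, npages), index == rdown(mark, npages));

	/* With round_up() the mark lands on a later, larger folio. */
	printf("round_up:   mark %lu, first folio marked: %d\n",
	       rup(mark, npages), index == rup(mark, npages));
	return 0;
}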


Hi Matthew,
I noticed one thing: while the readahead folio order is being pushed
up, readahead tries several times to allocate and add folios to the
page cache but fails, because a folio covering the requested index has
already been inserted into the page cache. Once the folio order
settles, this no longer happens. I suppose this is expected.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression
  2024-03-07 18:19               ` Matthew Wilcox
  2024-03-08  8:37                 ` Yujie Liu
@ 2024-03-10  6:41                 ` Yin, Fengwei
  1 sibling, 0 replies; 13+ messages in thread
From: Yin, Fengwei @ 2024-03-10  6:41 UTC (permalink / raw)
  To: Matthew Wilcox, Jan Kara
  Cc: Yujie Liu, Oliver Sang, oe-lkp, lkp, linux-kernel, Andrew Morton,
	Guo Xuenan, linux-fsdevel, ying.huang, feng.tang

Hi Matthew,

On 3/8/2024 2:19 AM, Matthew Wilcox wrote:
>   		/* Align with smaller pages if needed */
>   		if (index & ((1UL << order) - 1))
>   			order = __ffs(index);
> +		/* Avoid wrap */
> +		if (index + (1UL << order) == 0)
> +			order--;
>   		/* Don't allocate pages past EOF */
> -		while (index + (1UL << order) - 1 > limit)
> +		while (index + (1UL << order) - 1 > last)
The lockup is related to this line: when index == (last + 1), the check
never becomes false, so this loops forever (the unsigned order wraps
past zero).
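
A bounded userspace sketch of the failure mode (illustrative values;
the kernel loop has no such bound, which is why it spins forever):

#include <stdio.h>

int main(void)
{
	unsigned long last = 0x2fffff;	/* last page of a 12 GiB file */
	unsigned long index = last + 1;	/* readahead starting just past EOF */

	/* The patched check is: index + (1UL << order) - 1 > last.
	 * Even at order 0 it reduces to index > last, which is true by
	 * construction, so the kernel's unsigned order-- wraps past zero
	 * and never exits. Bounded here so this demo actually terminates. */
	for (unsigned int order = 2; order <= 2; order--)
		printf("order %u: past-EOF check is %s\n", order,
		       index + (1UL << order) - 1 > last ? "true" : "false");
	return 0;	/* prints "true" for orders 2, 1 and 0 */
}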


Regards
Yin, Fengwei

>   			order--;

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-03-10  6:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-20  8:25 [linus:master] [readahead] ab4443fe3c: vm-scalability.throughput -21.4% regression kernel test robot
2024-02-21 11:14 ` Jan Kara
2024-02-22  1:32   ` Oliver Sang
2024-02-22 11:50     ` Jan Kara
2024-02-22 18:37       ` Jan Kara
2024-03-04  4:59         ` Yujie Liu
2024-03-04  5:35           ` Yin, Fengwei
2024-03-06  5:36             ` Yin Fengwei
2024-03-07  9:23             ` Jan Kara
2024-03-07 18:19               ` Matthew Wilcox
2024-03-08  8:37                 ` Yujie Liu
2024-03-10  6:41                 ` Yin, Fengwei
2024-03-10  6:40               ` Yin, Fengwei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).