From: kernel test robot <oliver.sang@intel.com>
To: Raghavendra K T <raghavendra.kt@amd.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Aithal Srikanth <sraithal@amd.com>,
kernel test robot <oliver.sang@intel.com>,
Mel Gorman <mgorman@techsingularity.net>,
<linux-kernel@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>, <fengwei.yin@intel.com>,
<aubrey.li@linux.intel.com>, <yu.c.chen@intel.com>,
<linux-mm@kvack.org>, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"Mel Gorman" <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
"David Hildenbrand" <david@redhat.com>, <rppt@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Bharata B Rao <bharata@amd.com>,
Raghavendra K T <raghavendra.kt@amd.com>,
Sapkal Swapnil <Swapnil.Sapkal@amd.com>,
K Prateek Nayak <kprateek.nayak@amd.com>
Subject: Re: [RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic
Date: Tue, 12 Sep 2023 15:50:45 +0800
Message-ID: <202309121417.53f44ad6-oliver.sang@intel.com>
In-Reply-To: <87e3c08bd1770dd3e6eee099c01e595f14c76fc3.1693287931.git.raghavendra.kt@amd.com>
Hello,
kernel test robot noticed a -11.9% improvement of autonuma-benchmark.numa01_THREAD_ALLOC.seconds on:
commit: 1ef5cbb92bdb320c5eb9fdee1a811d22ee9e19fe ("[RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic")
url: https://github.com/intel-lab-lkp/linux/commits/Raghavendra-K-T/sched-numa-Move-up-the-access-pid-reset-logic/20230829-141007
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2f88c8e802c8b128a155976631f4eb2ce4f3c805
patch link: https://lore.kernel.org/all/87e3c08bd1770dd3e6eee099c01e595f14c76fc3.1693287931.git.raghavendra.kt@amd.com/
patch subject: [RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic
testcase: autonuma-benchmark
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
iterations: 4x
test: numa01_THREAD_ALLOC
cpufreq_governor: performance
Hi Raghu,
This commit gets a separate report, in addition to
https://lore.kernel.org/all/202309102311.84b42068-oliver.sang@intel.com/
because of how bisection works: so far, a single auto-bisect can only capture
one commit for a performance change.
This auto-bisect ran on another test machine (Sapphire Rapids), and it
happened to choose autonuma-benchmark.numa01_THREAD_ALLOC.seconds as the indicator
for the bisect. It finally captured
"[RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional"
From
https://lore.kernel.org/all/acf254e9-0207-7030-131f-8a3f520c657b@amd.com/
I noticed that you care more about the performance impact of the whole patch set,
so a summary table follows below.
First, to recap how we applied your patches:
68cfe9439a1ba (linux-review/Raghavendra-K-T/sched-numa-Move-up-the-access-pid-reset-logic/20230829-141007) sched/numa: Allow scanning of shared VMAs
af46f3c9ca2d1 sched/numa: Allow recently accessed VMAs to be scanned
167773d1ddb5f sched/numa: Increase tasks' access history
fc769221b2306 sched/numa: Remove unconditional scan logic using mm numa_scan_seq
1ef5cbb92bdb3 sched/numa: Add disjoint vma unconditional scan logic
2a806eab1c2e1 sched/numa: Move up the access pid reset logic
2f88c8e802c8b (tip/sched/core) sched/eevdf/doc: Modify the documented knob to base_slice_ns as well
We have the data below on this test machine
(the full table would be very big; if you want it, please let me know):
=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa01_THREAD_ALLOC/autonuma-benchmark
commit:
2f88c8e802 ("(tip/sched/core) sched/eevdf/doc: Modify the documented knob to base_slice_ns as well")
2a806eab1c ("sched/numa: Move up the access pid reset logic")
1ef5cbb92b ("sched/numa: Add disjoint vma unconditional scan logic")
68cfe9439a ("sched/numa: Allow scanning of shared VMAs")
2f88c8e802c8b128 2a806eab1c2e1c9f0ae39dc0307 1ef5cbb92bdb320c5eb9fdee1a8 68cfe9439a1baa642e05883fa64
---------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
    271.01             +0.8%    273.24             -0.7%    269.00            -26.4%    199.49 ±  3%  autonuma-benchmark.numa01.seconds
     76.28             +0.2%     76.44            -11.7%     67.36 ±  6%     -46.9%     40.49 ±  5%  autonuma-benchmark.numa01_THREAD_ALLOC.seconds
      8.11             -0.9%      8.04             -0.7%      8.05             -0.1%      8.10        autonuma-benchmark.numa02.seconds
      1425             +0.7%      1434             -3.1%      1381            -30.1%    996.02 ±  2%  autonuma-benchmark.time.elapsed_time
This differs somewhat from our previous report on Ice Lake:
autonuma-benchmark.numa02.seconds seems to stay stable here,
while autonuma-benchmark.numa01.seconds changes more.
Still, on both platforms we consistently see a performance improvement
in this test across the patch set.
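For readers who want to check the arithmetic, the %change columns in the summary table above appear to be computed against the first (base) column rather than against each commit's parent. The short Python sketch below recomputes them from the rounded values printed in the table. The names SUMMARY, pct_change and changes_vs_base are made up for this illustration and are not part of the lkp tooling. autonuma-benchmark.time.elapsed_time is left out because its printed values are rounded too coarsely to reproduce the reported percentages exactly.

```python
# Values copied from the summary table (seconds, lower is better).
# Column order: base 2f88c8e802, 2a806eab1c, 1ef5cbb92b, 68cfe9439a.
SUMMARY = {
    "numa01": [271.01, 273.24, 269.00, 199.49],
    "numa01_THREAD_ALLOC": [76.28, 76.44, 67.36, 40.49],
    "numa02": [8.11, 8.04, 8.05, 8.10],
}


def pct_change(base: float, value: float) -> float:
    """Relative change of value against base, in percent."""
    return (value - base) / base * 100.0


def changes_vs_base(values):
    """Percent change of every later column against the first (base) column."""
    base = values[0]
    return [f"{pct_change(base, v):+.1f}%" for v in values[1:]]


for name, values in SUMMARY.items():
    print(f"{name:22s}", *changes_vs_base(values))
```

Under this reading, the whole series takes numa01_THREAD_ALLOC from 76.28 s to 40.49 s, which is the -46.9% shown in the last column.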
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230912/202309121417.53f44ad6-oliver.sang@intel.com
Below is the normal data we share in our performance reports, FYI.
(You won't see data for autonuma-benchmark.numa01.seconds or autonuma-benchmark.numa02.seconds,
since the deltas between 2a806eab1c and 1ef5cbb92b are small, so our tool won't
show them.)
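To make the note above concrete, here is a small Python sketch of the step from 2a806eab1c to 1ef5cbb92b alone, using the rounded values from the four-commit summary earlier in this report. The names PARENT_AND_COMMIT, step_change and STEP are hypothetical. The criteria our tool actually uses to decide which rows to show are not stated here, so the sketch only computes the deltas and does not apply any threshold.

```python
# Seconds for the two adjacent commits, taken from the four-commit summary:
# (parent 2a806eab1c, commit 1ef5cbb92b).
PARENT_AND_COMMIT = {
    "numa01": (273.24, 269.00),
    "numa01_THREAD_ALLOC": (76.44, 67.36),
    "numa02": (8.04, 8.05),
}


def step_change(parent: float, commit: float) -> float:
    """Percent change of one commit relative to its direct parent."""
    return (commit - parent) / parent * 100.0


# Formatted step deltas, one per benchmark metric.
STEP = {
    name: f"{step_change(parent, commit):+.1f}%"
    for name, (parent, commit) in PARENT_AND_COMMIT.items()
}

for name, delta in STEP.items():
    print(f"{name:22s} {delta}")
```

With these inputs, numa01 and numa02 move by less than 2% across this one commit, while numa01_THREAD_ALLOC moves by -11.9%. That matches the headline figure of this report. It differs slightly from the -11.7% in the summary table because that one is measured against the base commit (76.28 s) instead of the parent (76.44 s).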
=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa01_THREAD_ALLOC/autonuma-benchmark
commit:
2a806eab1c ("sched/numa: Move up the access pid reset logic")
1ef5cbb92b ("sched/numa: Add disjoint vma unconditional scan logic")
2a806eab1c2e1c9f 1ef5cbb92bdb320c5eb9fdee1a8
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.00 ± 79% +0.0 0.00 ± 13% mpstat.cpu.all.iowait%
357.33 ± 12% +90.4% 680.50 ± 30% perf-c2c.DRAM.remote
79.17 ± 14% +34.7% 106.67 ± 18% perf-c2c.HITM.remote
16378 ± 16% +53.9% 25200 ± 22% turbostat.POLL
50.24 +15.4% 57.99 turbostat.RAMWatt
37.04 ±199% -97.2% 1.05 ±141% perf-sched.wait_time.avg.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_exit
7.46 ± 23% -43.7% 4.20 ± 47% perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
170.20 ±218% -99.4% 1.05 ±141% perf-sched.wait_time.max.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_exit
283.88 ± 28% +49.3% 423.88 ± 16% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
189.72 ± 23% +50.9% 286.24 ± 25% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
76.44 -11.9% 67.36 ± 6% autonuma-benchmark.numa01_THREAD_ALLOC.seconds
1434 -3.7% 1381 autonuma-benchmark.time.elapsed_time
1434 -3.7% 1381 autonuma-benchmark.time.elapsed_time.max
1132634 -6.0% 1064224 ± 2% autonuma-benchmark.time.involuntary_context_switches
2532130 ± 2% +4.5% 2645367 ± 2% autonuma-benchmark.time.minor_page_faults
293184 -3.6% 282626 autonuma-benchmark.time.user_time
16101 +41.9% 22846 ± 4% autonuma-benchmark.time.voluntary_context_switches
6.41 ± 52% +3833.7% 251.97 ± 6% sched_debug.cfs_rq:/.util_est_enqueued.avg
401.88 ± 4% +179.2% 1121 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.max
39.18 ± 16% +698.0% 312.66 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.stddev
1662842 +10.5% 1838160 ± 2% sched_debug.cpu.avg_idle.avg
860266 ± 3% -22.4% 667568 ± 11% sched_debug.cpu.avg_idle.min
647306 ± 4% +13.6% 735595 ± 2% sched_debug.cpu.avg_idle.stddev
664890 +10.4% 733919 ± 2% sched_debug.cpu.max_idle_balance_cost.avg
203832 ± 4% +45.7% 296934 ± 4% sched_debug.cpu.max_idle_balance_cost.stddev
58841 ± 19% +205.6% 179845 ± 8% proc-vmstat.numa_hint_faults
47138 ± 20% +145.1% 115557 ± 8% proc-vmstat.numa_hint_faults_local
652.00 ± 27% +5217.2% 34668 ± 10% proc-vmstat.numa_huge_pte_updates
108295 ± 25% +3179.6% 3551657 ± 11% proc-vmstat.numa_pages_migrated
499336 ± 16% +3503.7% 17994636 ± 10% proc-vmstat.numa_pte_updates
108295 ± 25% +3179.6% 3551657 ± 11% proc-vmstat.pgmigrate_success
238140 +6.7% 254200 proc-vmstat.pgreuse
191.00 ± 29% +3488.8% 6854 ± 11% proc-vmstat.thp_migration_success
4331500 -4.5% 4135400 ± 2% proc-vmstat.unevictable_pgs_scanned
0.66 +0.0 0.67 perf-stat.i.branch-miss-rate%
1779997 +3.1% 1835782 perf-stat.i.branch-misses
2096 +1.6% 2128 perf-stat.i.context-switches
219.07 +2.3% 224.02 perf-stat.i.cpu-migrations
163199 -11.6% 144321 ± 2% perf-stat.i.cycles-between-cache-misses
986545 +1.0% 996780 perf-stat.i.dTLB-store-misses
4436 +4.1% 4616 perf-stat.i.minor-faults
42.56 ± 3% +3.4 45.95 perf-stat.i.node-load-miss-rate%
396254 +28.2% 507952 ± 3% perf-stat.i.node-load-misses
4436 +4.1% 4617 perf-stat.i.page-faults
38.37 ± 6% +6.3 44.69 ± 7% perf-stat.overall.node-load-miss-rate%
1734727 +2.3% 1774826 perf-stat.ps.branch-misses
216.66 +2.2% 221.40 perf-stat.ps.cpu-migrations
983143 +1.1% 993856 perf-stat.ps.dTLB-store-misses
4178 +4.3% 4357 perf-stat.ps.minor-faults
384816 +29.9% 499993 ± 4% perf-stat.ps.node-load-misses
4178 +4.3% 4357 perf-stat.ps.page-faults
47.25 ± 24% -32.1 15.11 ±142% perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
40.98 ± 34% -27.0 13.98 ±141% perf-profile.calltrace.cycles-pp.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events.record__finish_output
40.76 ± 34% -26.9 13.90 ±141% perf-profile.calltrace.cycles-pp.queue_event.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events
40.90 ± 36% -26.6 14.32 ±141% perf-profile.calltrace.cycles-pp.process_simple.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
6.07 ±101% -5.4 0.62 ±223% perf-profile.calltrace.cycles-pp.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output
5.76 ±110% -5.1 0.62 ±223% perf-profile.calltrace.cycles-pp.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
5.42 ±101% -4.9 0.48 ±223% perf-profile.calltrace.cycles-pp.perf_session__deliver_event.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events
0.58 ± 18% +0.4 0.94 ± 18% perf-profile.calltrace.cycles-pp.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.49 ± 49% +0.4 0.94 ± 17% perf-profile.calltrace.cycles-pp.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt
0.70 ± 25% +0.5 1.21 ± 22% perf-profile.calltrace.cycles-pp.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.71 ± 24% +0.5 1.22 ± 22% perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.20 ±142% +0.5 0.74 ± 18% perf-profile.calltrace.cycles-pp.sched_setaffinity.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity
0.64 ± 53% +0.5 1.18 ± 32% perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.18 ±141% +0.6 0.74 ± 19% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read.readn.perf_evsel__read.read_counters
0.18 ±141% +0.6 0.74 ± 19% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn.perf_evsel__read
0.18 ±141% +0.6 0.75 ± 19% perf-profile.calltrace.cycles-pp.__libc_read.readn.perf_evsel__read.read_counters.process_interval
0.18 ±141% +0.6 0.76 ± 19% perf-profile.calltrace.cycles-pp.readn.perf_evsel__read.read_counters.process_interval.dispatch_events
0.31 ±103% +0.6 0.89 ± 18% perf-profile.calltrace.cycles-pp.update_sd_lb_stats.find_busiest_group.load_balance.rebalance_domains.__do_softirq
0.10 ±223% +0.6 0.69 ± 18% perf-profile.calltrace.cycles-pp.__sched_setaffinity.sched_setaffinity.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.71 ± 23% +0.6 1.30 ± 26% perf-profile.calltrace.cycles-pp.seq_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.31 ±103% +0.6 0.90 ± 18% perf-profile.calltrace.cycles-pp.find_busiest_group.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu
0.22 ±142% +0.6 0.81 ± 17% perf-profile.calltrace.cycles-pp.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next
0.57 ± 60% +0.6 1.19 ± 16% perf-profile.calltrace.cycles-pp.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
0.58 ± 60% +0.6 1.21 ± 16% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
0.58 ± 60% +0.6 1.21 ± 16% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__xstat64
0.22 ±143% +0.6 0.86 ± 18% perf-profile.calltrace.cycles-pp.update_sg_lb_stats.update_sd_lb_stats.find_busiest_group.load_balance.rebalance_domains
0.58 ± 61% +0.6 1.23 ± 16% perf-profile.calltrace.cycles-pp.__xstat64
0.25 ±150% +0.6 0.90 ± 19% perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
0.24 ±142% +0.7 0.90 ± 17% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next.read_counters
0.24 ±142% +0.7 0.90 ± 18% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next.read_counters.process_interval
0.21 ±141% +0.7 0.89 ± 19% perf-profile.calltrace.cycles-pp.perf_evsel__read.read_counters.process_interval.dispatch_events.cmd_stat
0.37 ±108% +0.7 1.07 ± 17% perf-profile.calltrace.cycles-pp.evlist__id2evsel.evsel__read_counter.read_counters.process_interval.dispatch_events
0.64 ± 57% +0.7 1.33 ± 20% perf-profile.calltrace.cycles-pp.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events.cmd_stat
0.10 ±223% +0.7 0.81 ± 27% perf-profile.calltrace.cycles-pp.show_stat.seq_read_iter.vfs_read.ksys_read.do_syscall_64
0.26 ±142% +0.7 1.01 ± 19% perf-profile.calltrace.cycles-pp.sched_setaffinity.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events
0.51 ± 84% +0.7 1.25 ± 28% perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
0.09 ±223% +0.8 0.85 ± 27% perf-profile.calltrace.cycles-pp.vmstat_start.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read
0.53 ± 53% +0.8 1.30 ± 25% perf-profile.calltrace.cycles-pp.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64
0.53 ± 53% +0.8 1.30 ± 25% perf-profile.calltrace.cycles-pp.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.85 ± 20% +0.8 1.64 ± 26% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.30 ±103% +0.8 1.12 ± 30% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.20 ±143% +0.8 1.03 ± 30% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.89 ± 23% +0.8 1.72 ± 26% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.66 ± 70% +0.8 1.48 ± 38% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.27 ±155% +0.8 1.12 ± 33% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn
0.32 ±150% +0.9 1.18 ± 40% perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
0.94 ± 23% +0.9 1.83 ± 26% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.94 ± 23% +0.9 1.83 ± 26% perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.15 ±223% +1.0 1.10 ± 44% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
0.15 ±223% +1.0 1.12 ± 43% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
0.15 ±223% +1.0 1.13 ± 43% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
1.00 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork
1.00 ± 51% +1.0 1.98 ± 36% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork
1.01 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_fork
1.01 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork
1.06 ± 42% +1.0 2.05 ± 17% perf-profile.calltrace.cycles-pp.evsel__read_counter.read_counters.process_interval.dispatch_events.cmd_stat
0.17 ±223% +1.0 1.22 ± 41% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
1.07 ± 54% +1.0 2.12 ± 36% perf-profile.calltrace.cycles-pp.__libc_fork
0.55 ± 75% +1.2 1.74 ± 36% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.86 ± 59% +1.2 2.10 ± 33% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
0.87 ± 59% +1.2 2.11 ± 33% perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
0.95 ± 59% +1.3 2.29 ± 33% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
1.74 ± 46% +1.4 3.17 ± 25% perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
1.74 ± 46% +1.4 3.17 ± 25% perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
1.19 ± 60% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.19 ± 61% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.19 ± 61% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
2.15 ± 21% +2.0 4.20 ± 24% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt
1.48 ± 80% +3.4 4.89 ± 18% perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat
1.54 ± 79% +3.5 5.03 ± 18% perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat
1.54 ± 79% +3.5 5.03 ± 18% perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat
1.54 ± 79% +3.5 5.04 ± 18% perf-profile.calltrace.cycles-pp.cmd_stat
0.13 ±223% +3.5 3.67 ± 62% perf-profile.calltrace.cycles-pp.copy_page.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch
0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages
0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page
0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page
0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault
0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault
0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
3.90 ± 41% +3.9 7.77 ± 27% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
3.97 ± 41% +3.9 7.84 ± 27% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
0.14 ±223% +3.9 4.06 ± 61% perf-profile.calltrace.cycles-pp.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
4.13 ± 41% +4.0 8.15 ± 27% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
4.13 ± 41% +4.0 8.17 ± 27% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
4.18 ± 41% +4.1 8.26 ± 27% perf-profile.calltrace.cycles-pp.read
1.80 ± 50% +5.5 7.29 ± 43% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.02 ± 50% +5.6 7.64 ± 41% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.04 ± 50% +5.6 7.66 ± 41% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
2.36 ± 33% +5.6 7.99 ± 33% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
2.14 ± 50% +5.7 7.84 ± 40% perf-profile.calltrace.cycles-pp.asm_exc_page_fault
69.69 ± 16% -30.0 39.64 ± 40% perf-profile.children.cycles-pp.__cmd_record
6.08 ±101% -5.5 0.62 ±223% perf-profile.children.cycles-pp.perf_session__process_user_event
6.15 ±100% -5.4 0.72 ±190% perf-profile.children.cycles-pp.__ordered_events__flush
5.48 ±101% -4.9 0.56 ±188% perf-profile.children.cycles-pp.perf_session__deliver_event
0.06 ± 29% +0.0 0.11 ± 27% perf-profile.children.cycles-pp.path_init
0.02 ±141% +0.0 0.06 ± 33% perf-profile.children.cycles-pp.cp_new_stat
0.02 ±141% +0.1 0.07 ± 25% perf-profile.children.cycles-pp.ptep_clear_flush
0.02 ±146% +0.1 0.08 ± 34% perf-profile.children.cycles-pp.rcu_nocb_try_bypass
0.08 ± 24% +0.1 0.14 ± 32% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.02 ±141% +0.1 0.08 ± 25% perf-profile.children.cycles-pp.__legitimize_mnt
0.00 +0.1 0.06 ± 16% perf-profile.children.cycles-pp.vm_memory_committed
0.11 ± 26% +0.1 0.17 ± 19% perf-profile.children.cycles-pp.aa_file_perm
0.06 ± 50% +0.1 0.12 ± 38% perf-profile.children.cycles-pp.kcpustat_cpu_fetch
0.02 ±141% +0.1 0.08 ± 40% perf-profile.children.cycles-pp.set_next_entity
0.09 ± 39% +0.1 0.16 ± 28% perf-profile.children.cycles-pp.try_charge_memcg
0.02 ±143% +0.1 0.09 ± 38% perf-profile.children.cycles-pp.__evlist__disable
0.01 ±223% +0.1 0.08 ± 35% perf-profile.children.cycles-pp._IO_setvbuf
0.08 ± 36% +0.1 0.16 ± 29% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.02 ±223% +0.1 0.09 ± 27% perf-profile.children.cycles-pp.drm_gem_vunmap_unlocked
0.12 ± 23% +0.1 0.20 ± 35% perf-profile.children.cycles-pp.get_idle_time
0.01 ±223% +0.1 0.08 ± 19% perf-profile.children.cycles-pp.meminfo_proc_show
0.10 ± 14% +0.1 0.18 ± 33% perf-profile.children.cycles-pp.drm_atomic_helper_commit
0.12 ± 17% +0.1 0.20 ± 32% perf-profile.children.cycles-pp.xas_descend
0.05 ± 77% +0.1 0.13 ± 27% perf-profile.children.cycles-pp.fsnotify_perm
0.02 ±223% +0.1 0.10 ± 42% perf-profile.children.cycles-pp.vm_unmapped_area
0.11 ± 13% +0.1 0.19 ± 33% perf-profile.children.cycles-pp.drm_atomic_commit
0.02 ±143% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.__kmalloc
0.04 ±118% +0.1 0.13 ± 43% perf-profile.children.cycles-pp.xas_find
0.00 +0.1 0.08 ± 30% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.08 ± 33% +0.1 0.17 ± 37% perf-profile.children.cycles-pp.node_read_vmstat
0.09 ± 45% +0.1 0.18 ± 29% perf-profile.children.cycles-pp.select_task_rq
0.01 ±223% +0.1 0.10 ± 36% perf-profile.children.cycles-pp.slab_show
0.03 ±143% +0.1 0.12 ± 46% perf-profile.children.cycles-pp.acpi_ps_parse_loop
0.12 ± 36% +0.1 0.21 ± 33% perf-profile.children.cycles-pp.dequeue_entity
0.01 ±223% +0.1 0.10 ± 27% perf-profile.children.cycles-pp._IO_file_doallocate
0.08 ± 53% +0.1 0.17 ± 22% perf-profile.children.cycles-pp.apparmor_ptrace_access_check
0.04 ±105% +0.1 0.13 ± 48% perf-profile.children.cycles-pp.acpi_ps_parse_aml
0.08 ± 32% +0.1 0.18 ± 34% perf-profile.children.cycles-pp.autoremove_wake_function
0.12 ± 26% +0.1 0.22 ± 31% perf-profile.children.cycles-pp.__x64_sys_close
0.11 ± 16% +0.1 0.21 ± 36% perf-profile.children.cycles-pp.drm_atomic_helper_dirtyfb
0.04 ±107% +0.1 0.13 ± 47% perf-profile.children.cycles-pp.acpi_ns_evaluate
0.04 ±107% +0.1 0.13 ± 47% perf-profile.children.cycles-pp.acpi_ps_execute_method
0.02 ±146% +0.1 0.12 ± 32% perf-profile.children.cycles-pp.thread_group_cputime
0.12 ± 35% +0.1 0.22 ± 21% perf-profile.children.cycles-pp.atime_needs_update
0.09 ± 44% +0.1 0.18 ± 23% perf-profile.children.cycles-pp.update_rq_clock_task
0.11 ± 32% +0.1 0.21 ± 25% perf-profile.children.cycles-pp.__perf_event_read_value
0.04 ±107% +0.1 0.14 ± 45% perf-profile.children.cycles-pp.acpi_os_execute_deferred
0.04 ±107% +0.1 0.14 ± 45% perf-profile.children.cycles-pp.acpi_ev_asynch_execute_gpe_method
0.04 ±112% +0.1 0.14 ± 38% perf-profile.children.cycles-pp.get_unmapped_area
0.06 ± 58% +0.1 0.16 ± 38% perf-profile.children.cycles-pp.prepare_task_switch
0.13 ± 34% +0.1 0.23 ± 19% perf-profile.children.cycles-pp.generic_exec_single
0.10 ± 30% +0.1 0.20 ± 30% perf-profile.children.cycles-pp.__wait_for_common
0.03 ±105% +0.1 0.13 ± 29% perf-profile.children.cycles-pp.thread_group_cputime_adjusted
0.13 ± 32% +0.1 0.24 ± 19% perf-profile.children.cycles-pp.smp_call_function_single
0.12 ± 40% +0.1 0.23 ± 26% perf-profile.children.cycles-pp.ttwu_do_activate
0.06 ± 58% +0.1 0.17 ± 32% perf-profile.children.cycles-pp.kstat_irqs_usr
0.10 ± 31% +0.1 0.22 ± 33% perf-profile.children.cycles-pp.__wake_up_common_lock
0.02 ±223% +0.1 0.13 ± 38% perf-profile.children.cycles-pp.free_unref_page_prepare
0.11 ± 48% +0.1 0.22 ± 37% perf-profile.children.cycles-pp.single_release
0.15 ± 33% +0.1 0.26 ± 17% perf-profile.children.cycles-pp.perf_event_read
0.08 ± 48% +0.1 0.20 ± 27% perf-profile.children.cycles-pp.__do_set_cpus_allowed
0.10 ± 70% +0.1 0.21 ± 33% perf-profile.children.cycles-pp.vm_area_dup
0.09 ± 31% +0.1 0.21 ± 35% perf-profile.children.cycles-pp.__wake_up_common
0.20 ± 37% +0.1 0.32 ± 21% perf-profile.children.cycles-pp.update_load_avg
0.12 ± 35% +0.1 0.24 ± 28% perf-profile.children.cycles-pp.blk_mq_queue_tag_busy_iter
0.12 ± 35% +0.1 0.24 ± 28% perf-profile.children.cycles-pp.blk_mq_in_flight
0.20 ± 28% +0.1 0.32 ± 16% perf-profile.children.cycles-pp.__cond_resched
0.17 ± 37% +0.1 0.30 ± 26% perf-profile.children.cycles-pp.dequeue_task_fair
0.08 ± 51% +0.1 0.21 ± 36% perf-profile.children.cycles-pp.free_swap_cache
0.02 ±146% +0.1 0.16 ± 38% perf-profile.children.cycles-pp.flush_tlb_func
0.09 ± 48% +0.1 0.23 ± 37% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.18 ± 37% +0.1 0.32 ± 26% perf-profile.children.cycles-pp.update_curr
0.02 ±142% +0.1 0.16 ± 26% perf-profile.children.cycles-pp.__x64_sys_newfstat
0.04 ±109% +0.1 0.18 ± 53% perf-profile.children.cycles-pp.free_unref_page_list
0.12 ± 38% +0.1 0.26 ± 30% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.12 ± 59% +0.1 0.27 ± 26% perf-profile.children.cycles-pp.security_ptrace_access_check
0.13 ± 40% +0.1 0.28 ± 33% perf-profile.children.cycles-pp.user_path_at_empty
0.12 ± 31% +0.1 0.27 ± 23% perf-profile.children.cycles-pp.__set_cpus_allowed_ptr_locked
0.20 ± 31% +0.1 0.35 ± 34% perf-profile.children.cycles-pp.dev_attr_show
0.13 ± 44% +0.2 0.28 ± 31% perf-profile.children.cycles-pp.readlink
0.20 ± 30% +0.2 0.35 ± 26% perf-profile.children.cycles-pp.__memcpy
0.00 +0.2 0.15 ± 64% perf-profile.children.cycles-pp.pmdp_invalidate
0.18 ± 31% +0.2 0.34 ± 27% perf-profile.children.cycles-pp.dup_task_struct
0.13 ± 33% +0.2 0.29 ± 29% perf-profile.children.cycles-pp.switch_fpu_return
0.00 +0.2 0.16 ± 64% perf-profile.children.cycles-pp.set_pmd_migration_entry
0.19 ± 26% +0.2 0.34 ± 27% perf-profile.children.cycles-pp.__entry_text_start
0.12 ± 32% +0.2 0.29 ± 38% perf-profile.children.cycles-pp.pipe_write
0.25 ± 35% +0.2 0.42 ± 32% perf-profile.children.cycles-pp.__check_object_size
0.23 ± 35% +0.2 0.40 ± 20% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
0.22 ± 46% +0.2 0.39 ± 21% perf-profile.children.cycles-pp.enqueue_task_fair
0.00 +0.2 0.18 ± 92% perf-profile.children.cycles-pp.cpuidle_enter
0.00 +0.2 0.18 ± 92% perf-profile.children.cycles-pp.cpuidle_enter_state
0.00 +0.2 0.18 ± 59% perf-profile.children.cycles-pp.try_to_migrate
0.00 +0.2 0.18 ± 59% perf-profile.children.cycles-pp.try_to_migrate_one
0.16 ± 37% +0.2 0.34 ± 33% perf-profile.children.cycles-pp.do_readlinkat
0.19 ± 54% +0.2 0.37 ± 44% perf-profile.children.cycles-pp.rcu_cblist_dequeue
0.16 ± 37% +0.2 0.34 ± 32% perf-profile.children.cycles-pp.__x64_sys_readlink
0.00 +0.2 0.19 ± 60% perf-profile.children.cycles-pp.rmap_walk_anon
0.00 +0.2 0.19 ± 66% perf-profile.children.cycles-pp.__sysvec_call_function
0.00 +0.2 0.19 ± 95% perf-profile.children.cycles-pp.cpuidle_idle_call
0.00 +0.2 0.20 ± 59% perf-profile.children.cycles-pp.migrate_folio_unmap
0.21 ± 43% +0.2 0.42 ± 27% perf-profile.children.cycles-pp.diskstats_show
0.28 ± 36% +0.2 0.49 ± 24% perf-profile.children.cycles-pp.__kmem_cache_alloc_node
0.00 +0.2 0.21 ± 53% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.01 ±223% +0.2 0.24 ± 63% perf-profile.children.cycles-pp.sysvec_call_function
0.39 ± 15% +0.2 0.63 ± 15% perf-profile.children.cycles-pp.native_irq_return_iret
0.22 ± 13% +0.3 0.48 ± 28% perf-profile.children.cycles-pp.all_vm_events
0.21 ± 38% +0.3 0.48 ± 40% perf-profile.children.cycles-pp.write
0.30 ± 45% +0.3 0.58 ± 27% perf-profile.children.cycles-pp._raw_spin_lock
0.22 ± 40% +0.3 0.50 ± 26% perf-profile.children.cycles-pp.getname_flags
0.28 ± 54% +0.3 0.56 ± 24% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
0.30 ± 55% +0.3 0.60 ± 39% perf-profile.children.cycles-pp.dup_mmap
0.01 ±223% +0.3 0.32 ± 88% perf-profile.children.cycles-pp.start_secondary
0.20 ± 38% +0.3 0.50 ± 40% perf-profile.children.cycles-pp.release_pages
0.01 ±223% +0.3 0.32 ± 58% perf-profile.children.cycles-pp.asm_sysvec_call_function
0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.do_idle
0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.cpu_startup_entry
0.37 ± 37% +0.3 0.68 ± 30% perf-profile.children.cycles-pp.__close_nocancel
0.26 ± 24% +0.3 0.58 ± 38% perf-profile.children.cycles-pp.drm_fb_helper_damage_work
0.26 ± 24% +0.3 0.58 ± 38% perf-profile.children.cycles-pp.drm_fbdev_generic_helper_fb_dirty
0.36 ± 37% +0.3 0.69 ± 19% perf-profile.children.cycles-pp.perf_read
0.33 ± 29% +0.3 0.66 ± 30% perf-profile.children.cycles-pp.fold_vm_numa_events
0.28 ± 48% +0.3 0.62 ± 34% perf-profile.children.cycles-pp.kmem_cache_free
0.22 ± 81% +0.4 0.58 ± 55% perf-profile.children.cycles-pp.wait4
0.36 ± 32% +0.4 0.72 ± 25% perf-profile.children.cycles-pp.__set_cpus_allowed_ptr
0.40 ± 52% +0.4 0.78 ± 32% perf-profile.children.cycles-pp.__d_lookup_rcu
0.43 ± 23% +0.4 0.81 ± 27% perf-profile.children.cycles-pp.show_stat
0.42 ± 32% +0.4 0.81 ± 24% perf-profile.children.cycles-pp.__sched_setaffinity
0.24 ± 44% +0.4 0.65 ± 43% perf-profile.children.cycles-pp.tlb_batch_pages_flush
0.02 ±223% +0.4 0.45 ± 48% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
0.53 ± 48% +0.4 0.97 ± 34% perf-profile.children.cycles-pp.open_last_lookups
0.03 ±223% +0.4 0.48 ± 42% perf-profile.children.cycles-pp.smp_call_function_many_cond
0.07 ± 58% +0.5 0.52 ± 39% perf-profile.children.cycles-pp.flush_tlb_mm_range
0.43 ± 37% +0.5 0.89 ± 19% perf-profile.children.cycles-pp.perf_evsel__read
0.39 ± 18% +0.5 0.85 ± 27% perf-profile.children.cycles-pp.vmstat_start
0.16 ± 57% +0.5 0.63 ± 42% perf-profile.children.cycles-pp.pick_next_task_fair
0.49 ± 31% +0.5 0.97 ± 23% perf-profile.children.cycles-pp.__x64_sys_sched_setaffinity
0.04 ±168% +0.5 0.52 ± 51% perf-profile.children.cycles-pp.newidle_balance
0.45 ± 28% +0.5 0.96 ± 32% perf-profile.children.cycles-pp.finish_task_switch
0.61 ± 20% +0.5 1.13 ± 24% perf-profile.children.cycles-pp.rebalance_domains
0.55 ± 47% +0.5 1.08 ± 17% perf-profile.children.cycles-pp.evlist__id2evsel
0.46 ± 50% +0.5 0.98 ± 51% perf-profile.children.cycles-pp.do_vmi_munmap
0.29 ± 45% +0.5 0.82 ± 37% perf-profile.children.cycles-pp.tlb_finish_mmu
0.44 ± 53% +0.5 0.98 ± 32% perf-profile.children.cycles-pp.wp_page_copy
0.59 ± 29% +0.6 1.14 ± 30% perf-profile.children.cycles-pp.__percpu_counter_sum
0.46 ± 28% +0.6 1.03 ± 30% perf-profile.children.cycles-pp.process_one_work
0.54 ± 59% +0.6 1.12 ± 29% perf-profile.children.cycles-pp.kmem_cache_alloc
0.63 ± 32% +0.6 1.21 ± 31% perf-profile.children.cycles-pp.__mmdrop
0.76 ± 41% +0.6 1.36 ± 32% perf-profile.children.cycles-pp.walk_component
0.58 ± 53% +0.6 1.18 ± 40% perf-profile.children.cycles-pp.dup_mm
0.48 ± 30% +0.6 1.12 ± 30% perf-profile.children.cycles-pp.worker_thread
0.30 ± 63% +0.6 0.95 ± 41% perf-profile.children.cycles-pp._compound_head
0.68 ± 36% +0.6 1.32 ± 17% perf-profile.children.cycles-pp.readn
0.61 ± 27% +0.7 1.30 ± 25% perf-profile.children.cycles-pp.proc_reg_read_iter
0.99 ± 41% +0.8 1.75 ± 32% perf-profile.children.cycles-pp.lookup_fast
0.78 ± 31% +0.8 1.55 ± 22% perf-profile.children.cycles-pp.evlist_cpu_iterator__next
1.77 ± 16% +0.8 2.55 ± 21% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
0.58 ± 24% +0.8 1.42 ± 29% perf-profile.children.cycles-pp.update_sg_lb_stats
0.61 ± 24% +0.9 1.48 ± 29% perf-profile.children.cycles-pp.update_sd_lb_stats
0.61 ± 24% +0.9 1.50 ± 30% perf-profile.children.cycles-pp.find_busiest_group
1.08 ± 29% +0.9 2.00 ± 32% perf-profile.children.cycles-pp.__irq_exit_rcu
0.93 ± 48% +0.9 1.87 ± 35% perf-profile.children.cycles-pp.copy_process
0.65 ± 26% +1.0 1.62 ± 29% perf-profile.children.cycles-pp.load_balance
1.00 ± 51% +1.0 1.99 ± 36% perf-profile.children.cycles-pp.__do_sys_clone
1.06 ± 42% +1.0 2.05 ± 17% perf-profile.children.cycles-pp.evsel__read_counter
0.50 ± 62% +1.0 1.51 ± 47% perf-profile.children.cycles-pp.zap_pte_range
0.51 ± 61% +1.0 1.53 ± 46% perf-profile.children.cycles-pp.zap_pmd_range
0.53 ± 61% +1.0 1.57 ± 46% perf-profile.children.cycles-pp.unmap_page_range
1.05 ± 31% +1.1 2.10 ± 23% perf-profile.children.cycles-pp.sched_setaffinity
1.07 ± 54% +1.1 2.12 ± 36% perf-profile.children.cycles-pp.__libc_fork
1.70 ± 17% +1.1 2.80 ± 29% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.61 ± 61% +1.1 1.73 ± 43% perf-profile.children.cycles-pp.unmap_vmas
1.32 ± 29% +1.2 2.48 ± 40% perf-profile.children.cycles-pp.__do_softirq
1.30 ± 29% +1.2 2.48 ± 28% perf-profile.children.cycles-pp.do_fault
1.02 ± 32% +1.3 2.29 ± 32% perf-profile.children.cycles-pp.schedule
0.95 ± 59% +1.3 2.30 ± 33% perf-profile.children.cycles-pp.exit_mm
1.15 ± 34% +1.5 2.65 ± 31% perf-profile.children.cycles-pp.__schedule
1.18 ± 58% +1.6 2.76 ± 32% perf-profile.children.cycles-pp.exit_mmap
1.18 ± 58% +1.6 2.78 ± 32% perf-profile.children.cycles-pp.__mmput
1.23 ± 60% +1.6 2.86 ± 31% perf-profile.children.cycles-pp.do_exit
1.23 ± 60% +1.6 2.87 ± 31% perf-profile.children.cycles-pp.do_group_exit
1.23 ± 60% +1.6 2.87 ± 31% perf-profile.children.cycles-pp.__x64_sys_exit_group
1.82 ± 24% +1.6 3.46 ± 23% perf-profile.children.cycles-pp.kthread
1.83 ± 24% +1.7 3.51 ± 24% perf-profile.children.cycles-pp.ret_from_fork_asm
1.83 ± 23% +1.7 3.50 ± 24% perf-profile.children.cycles-pp.ret_from_fork
2.70 ± 16% +1.8 4.51 ± 30% perf-profile.children.cycles-pp.exit_to_user_mode_loop
2.85 ± 17% +2.0 4.83 ± 29% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
3.90 ± 12% +2.0 5.92 ± 23% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
2.51 ± 39% +2.4 4.89 ± 18% perf-profile.children.cycles-pp.read_counters
2.60 ± 38% +2.4 5.03 ± 18% perf-profile.children.cycles-pp.dispatch_events
2.60 ± 38% +2.4 5.03 ± 18% perf-profile.children.cycles-pp.process_interval
2.60 ± 38% +2.4 5.04 ± 18% perf-profile.children.cycles-pp.cmd_stat
3.87 ± 43% +3.2 7.04 ± 26% perf-profile.children.cycles-pp.seq_read_iter
0.23 ±170% +3.6 3.83 ± 58% perf-profile.children.cycles-pp.folio_copy
0.23 ±169% +3.6 3.84 ± 58% perf-profile.children.cycles-pp.migrate_folio_extra
0.23 ±169% +3.6 3.84 ± 58% perf-profile.children.cycles-pp.move_to_new_folio
0.28 ±145% +3.7 4.00 ± 56% perf-profile.children.cycles-pp.copy_page
0.24 ±171% +3.9 4.14 ± 58% perf-profile.children.cycles-pp.migrate_pages_batch
0.24 ±171% +3.9 4.14 ± 58% perf-profile.children.cycles-pp.migrate_pages
0.25 ±171% +3.9 4.15 ± 58% perf-profile.children.cycles-pp.migrate_misplaced_page
0.22 ±166% +3.9 4.13 ± 58% perf-profile.children.cycles-pp.do_huge_pmd_numa_page
4.19 ± 41% +4.1 8.29 ± 27% perf-profile.children.cycles-pp.read
4.84 ± 41% +4.1 8.96 ± 25% perf-profile.children.cycles-pp.vfs_read
5.01 ± 41% +4.3 9.29 ± 25% perf-profile.children.cycles-pp.ksys_read
3.24 ± 32% +6.3 9.52 ± 30% perf-profile.children.cycles-pp.__handle_mm_fault
3.68 ± 31% +6.5 10.18 ± 28% perf-profile.children.cycles-pp.handle_mm_fault
4.55 ± 27% +6.8 11.34 ± 24% perf-profile.children.cycles-pp.do_user_addr_fault
4.62 ± 27% +6.8 11.43 ± 24% perf-profile.children.cycles-pp.exc_page_fault
5.01 ± 26% +7.0 12.02 ± 23% perf-profile.children.cycles-pp.asm_exc_page_fault
0.02 ±141% +0.1 0.08 ± 22% perf-profile.self.cycles-pp.__legitimize_mnt
0.11 ± 26% +0.1 0.16 ± 19% perf-profile.self.cycles-pp.aa_file_perm
0.02 ±141% +0.1 0.08 ± 24% perf-profile.self.cycles-pp.perf_evsel__read
0.02 ±144% +0.1 0.08 ± 40% perf-profile.self.cycles-pp.check_heap_object
0.07 ± 30% +0.1 0.13 ± 34% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.00 +0.1 0.06 ± 13% perf-profile.self.cycles-pp._copy_to_iter
0.07 ± 52% +0.1 0.13 ± 16% perf-profile.self.cycles-pp.atime_needs_update
0.06 ± 50% +0.1 0.12 ± 38% perf-profile.self.cycles-pp.kcpustat_cpu_fetch
0.01 ±223% +0.1 0.08 ± 33% perf-profile.self.cycles-pp.wq_worker_comm
0.05 ± 80% +0.1 0.13 ± 26% perf-profile.self.cycles-pp.try_charge_memcg
0.07 ± 57% +0.1 0.14 ± 28% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.05 ± 84% +0.1 0.13 ± 26% perf-profile.self.cycles-pp.update_rq_clock_task
0.02 ±223% +0.1 0.09 ± 26% perf-profile.self.cycles-pp.enqueue_task_fair
0.01 ±223% +0.1 0.09 ± 27% perf-profile.self.cycles-pp.thread_group_cputime
0.04 ±104% +0.1 0.12 ± 23% perf-profile.self.cycles-pp.fsnotify_perm
0.05 ± 86% +0.1 0.13 ± 23% perf-profile.self.cycles-pp.perf_read
0.01 ±223% +0.1 0.09 ± 27% perf-profile.self.cycles-pp._IO_file_doallocate
0.12 ± 19% +0.1 0.20 ± 32% perf-profile.self.cycles-pp.xas_descend
0.00 +0.1 0.08 ± 30% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.10 ± 39% +0.1 0.19 ± 26% perf-profile.self.cycles-pp.update_curr
0.03 ±105% +0.1 0.12 ± 39% perf-profile.self.cycles-pp.__fput
0.03 ±150% +0.1 0.13 ± 40% perf-profile.self.cycles-pp.task_dump_owner
0.06 ± 58% +0.1 0.17 ± 33% perf-profile.self.cycles-pp.kstat_irqs_usr
0.02 ±223% +0.1 0.12 ± 35% perf-profile.self.cycles-pp.free_unref_page_prepare
0.08 ± 27% +0.1 0.20 ± 40% perf-profile.self.cycles-pp.release_pages
0.12 ± 37% +0.1 0.23 ± 27% perf-profile.self.cycles-pp.blk_mq_queue_tag_busy_iter
0.17 ± 37% +0.1 0.29 ± 28% perf-profile.self.cycles-pp.__schedule
0.08 ± 51% +0.1 0.20 ± 35% perf-profile.self.cycles-pp.free_swap_cache
0.13 ± 23% +0.1 0.26 ± 18% perf-profile.self.cycles-pp.__entry_text_start
0.13 ± 39% +0.1 0.26 ± 24% perf-profile.self.cycles-pp.evlist_cpu_iterator__next
0.02 ±142% +0.1 0.15 ± 24% perf-profile.self.cycles-pp.__x64_sys_newfstat
0.08 ± 40% +0.1 0.22 ± 17% perf-profile.self.cycles-pp.vfs_read
0.12 ± 38% +0.1 0.26 ± 30% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.20 ± 32% +0.2 0.35 ± 26% perf-profile.self.cycles-pp.__memcpy
0.22 ± 43% +0.2 0.40 ± 22% perf-profile.self.cycles-pp.do_dentry_open
0.16 ± 41% +0.2 0.34 ± 23% perf-profile.self.cycles-pp.__kmem_cache_alloc_node
0.19 ± 54% +0.2 0.37 ± 44% perf-profile.self.cycles-pp.rcu_cblist_dequeue
0.24 ± 52% +0.2 0.43 ± 35% perf-profile.self.cycles-pp.inode_permission
0.20 ± 39% +0.2 0.40 ± 17% perf-profile.self.cycles-pp.evsel__read_counter
0.14 ± 44% +0.2 0.35 ± 18% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
0.39 ± 15% +0.2 0.63 ± 15% perf-profile.self.cycles-pp.native_irq_return_iret
0.22 ± 49% +0.2 0.47 ± 31% perf-profile.self.cycles-pp.kmem_cache_free
0.02 ±223% +0.3 0.27 ± 44% perf-profile.self.cycles-pp.smp_call_function_many_cond
0.22 ± 13% +0.3 0.47 ± 28% perf-profile.self.cycles-pp.all_vm_events
0.24 ± 62% +0.3 0.50 ± 34% perf-profile.self.cycles-pp.kmem_cache_alloc
0.29 ± 42% +0.3 0.56 ± 22% perf-profile.self.cycles-pp.read_counters
0.32 ± 28% +0.3 0.65 ± 31% perf-profile.self.cycles-pp.fold_vm_numa_events
0.39 ± 52% +0.4 0.76 ± 32% perf-profile.self.cycles-pp.__d_lookup_rcu
0.37 ± 18% +0.4 0.75 ± 28% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
0.54 ± 46% +0.5 1.05 ± 17% perf-profile.self.cycles-pp.evlist__id2evsel
0.58 ± 29% +0.5 1.10 ± 31% perf-profile.self.cycles-pp.__percpu_counter_sum
0.30 ± 63% +0.6 0.92 ± 40% perf-profile.self.cycles-pp._compound_head
0.46 ± 22% +0.6 1.11 ± 28% perf-profile.self.cycles-pp.update_sg_lb_stats
0.27 ±144% +3.7 3.98 ± 57% perf-profile.self.cycles-pp.copy_page
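For anyone post-processing the rows above, here is a minimal parsing sketch. It assumes each perf-profile row reads left to right as: base-commit mean, optional base stddev in percent, absolute change, patched-commit mean, optional patched stddev in percent, and metric name. The `Row` type and `parse_row` helper are hypothetical names for illustration and are not part of lkp-tests.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    base: float                          # mean on the base commit
    base_stddev_pct: Optional[float]     # "± N%" after the base value, if present
    change: float                        # absolute delta as printed (e.g. +3.7)
    patched: float                       # mean on the patched commit
    patched_stddev_pct: Optional[float]  # "± N%" after the patched value, if present
    metric: str                          # e.g. perf-profile.self.cycles-pp.copy_page


# Assumed layout; the stddev columns are optional because some rows omit them.
ROW_RE = re.compile(
    r"^\s*(?P<base>\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<bsd>\d+)%)?"
    r"\s+(?P<change>[+-]\d+(?:\.\d+)?)"
    r"\s+(?P<patched>\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<psd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str) -> Optional[Row]:
    """Parse one perf-profile row; return None if the line does not match."""
    m = ROW_RE.match(line)
    if m is None:
        return None

    def pct(value: Optional[str]) -> Optional[float]:
        return float(value) if value is not None else None

    return Row(
        base=float(m.group("base")),
        base_stddev_pct=pct(m.group("bsd")),
        change=float(m.group("change")),
        patched=float(m.group("patched")),
        patched_stddev_pct=pct(m.group("psd")),
        metric=m.group("metric"),
    )


if __name__ == "__main__":
    sample = "0.27 ±144% +3.7 3.98 ± 57% perf-profile.self.cycles-pp.copy_page"
    print(parse_row(sample))
```

Rows with a large stddev relative to the mean (for example ±144% on the base side of copy_page) are noisy, so a consumer of this data may want to filter on the stddev fields before ranking by change.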
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview: 25+ messages
2023-08-29 6:06 [RFC PATCH V1 0/6] sched/numa: Enhance disjoint VMA scanning Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 1/6] sched/numa: Move up the access pid reset logic Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic Raghavendra K T
2023-09-12 7:50 ` kernel test robot [this message]
2023-09-13 6:21 ` Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 3/6] sched/numa: Remove unconditional scan logic using mm numa_scan_seq Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 4/6] sched/numa: Increase tasks' access history Raghavendra K T
2023-09-12 14:24 ` kernel test robot
2023-09-13 6:15 ` Raghavendra K T
2023-09-13 7:34 ` Oliver Sang
2023-08-29 6:06 ` [RFC PATCH V1 5/6] sched/numa: Allow recently accessed VMAs to be scanned Raghavendra K T
2023-09-10 15:29 ` kernel test robot
2023-09-11 11:25 ` Raghavendra K T
2023-09-12 2:22 ` Oliver Sang
2023-09-12 6:43 ` Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 6/6] sched/numa: Allow scanning of shared VMAs Raghavendra K T
2023-09-13 5:28 ` [RFC PATCH V1 0/6] sched/numa: Enhance disjoint VMA scanning Swapnil Sapkal
2023-09-13 6:24 ` Raghavendra K T
2023-09-19 6:30 ` Raghavendra K T
2023-09-19 7:15 ` Ingo Molnar
2023-09-19 8:06 ` Raghavendra K T
2023-09-19 9:28 ` Peter Zijlstra
2023-09-19 16:22 ` Mel Gorman
2023-09-19 19:11 ` Peter Zijlstra
2023-09-20 10:42 ` Raghavendra K T