* [PATCH v3] sched/fair: sanitize vruntime of entity being placed
From: Roman Kagan @ 2023-02-09 19:31 UTC
To: linux-kernel
Cc: Valentin Schneider, Zhang Qiao, Ben Segall, Vincent Guittot,
    Waiman Long, Peter Zijlstra, Steven Rostedt, Mel Gorman,
    Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

From: Zhang Qiao <zhangqiao22@huawei.com>

When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
to the base level (around cfs_rq->min_vruntime), so that the entity
doesn't gain extra boost when placed backwards.

However, if the entity being placed wasn't executed for a long time, its
vruntime may get too far behind (e.g. while cfs_rq was executing a
low-weight hog), which can invert the vruntime comparison due to s64
overflow.  This results in the entity being placed with its original
vruntime way forwards, so that it will effectively never get to the cpu.

To prevent that, ignore the vruntime of the entity being placed if it
didn't execute for longer than the time that can lead to an overflow.

Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
[rkagan: formatted, adjusted commit log, comments, cutoff value]
Co-developed-by: Roman Kagan <rkagan@amazon.de>
Signed-off-by: Roman Kagan <rkagan@amazon.de>
---
v2 -> v3:
- make cutoff less arbitrary and update comments [Vincent]

v1 -> v2:
- add Zhang Qiao's s-o-b
- fix constant promotion on 32bit

 kernel/sched/fair.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0f8736991427..3baa6b7ea860 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4656,6 +4656,7 @@ static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
 	u64 vruntime = cfs_rq->min_vruntime;
+	u64 sleep_time;
 
 	/*
 	 * The 'current' period is already promised to the current tasks,
@@ -4685,8 +4686,24 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 		vruntime -= thresh;
 	}
 
-	/* ensure we never gain time by being placed backwards. */
-	se->vruntime = max_vruntime(se->vruntime, vruntime);
+	/*
+	 * Pull vruntime of the entity being placed to the base level of
+	 * cfs_rq, to prevent boosting it if placed backwards.
+	 * However, min_vruntime can advance much faster than real time, with
+	 * the extreme being when an entity with the minimal weight always runs
+	 * on the cfs_rq. If the new entity slept for long, its vruntime
+	 * difference from min_vruntime may overflow s64 and their comparison
+	 * may get inverted, so ignore the entity's original vruntime in that
+	 * case.
+	 * The maximal vruntime speedup is given by the ratio of normal to
+	 * minimal weight: NICE_0_LOAD / MIN_SHARES, so cutting off on the
+	 * sleep time of 2^63 / NICE_0_LOAD should be safe.
+	 */
+	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
+	if ((s64)sleep_time > (1ULL << 63) / NICE_0_LOAD)
+		se->vruntime = vruntime;
+	else
+		se->vruntime = max_vruntime(se->vruntime, vruntime);
 }
 
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
-- 
2.34.1


Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
From: Vincent Guittot @ 2023-02-21  9:38 UTC
To: Roman Kagan
Cc: linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall,
    Waiman Long, Peter Zijlstra, Steven Rostedt, Mel Gorman,
    Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote:
>
> From: Zhang Qiao <zhangqiao22@huawei.com>
>
> When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
> to the base level (around cfs_rq->min_vruntime), so that the entity
> doesn't gain extra boost when placed backwards.
>
> However, if the entity being placed wasn't executed for a long time, its
> vruntime may get too far behind (e.g. while cfs_rq was executing a
> low-weight hog), which can invert the vruntime comparison due to s64
> overflow.  This results in the entity being placed with its original
> vruntime way forwards, so that it will effectively never get to the cpu.
>
> To prevent that, ignore the vruntime of the entity being placed if it
> didn't execute for longer than the time that can lead to an overflow.
>
> Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
> [rkagan: formatted, adjusted commit log, comments, cutoff value]
> Co-developed-by: Roman Kagan <rkagan@amazon.de>
> Signed-off-by: Roman Kagan <rkagan@amazon.de>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
> v2 -> v3:
> - make cutoff less arbitrary and update comments [Vincent]
>
> v1 -> v2:
> - add Zhang Qiao's s-o-b
> - fix constant promotion on 32bit
>
>  kernel/sched/fair.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0f8736991427..3baa6b7ea860 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4656,6 +4656,7 @@ static void
>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  {
>  	u64 vruntime = cfs_rq->min_vruntime;
> +	u64 sleep_time;
>
>  	/*
>  	 * The 'current' period is already promised to the current tasks,
> @@ -4685,8 +4686,24 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  		vruntime -= thresh;
>  	}
>
> -	/* ensure we never gain time by being placed backwards. */
> -	se->vruntime = max_vruntime(se->vruntime, vruntime);
> +	/*
> +	 * Pull vruntime of the entity being placed to the base level of
> +	 * cfs_rq, to prevent boosting it if placed backwards.
> +	 * However, min_vruntime can advance much faster than real time, with
> +	 * the extreme being when an entity with the minimal weight always runs
> +	 * on the cfs_rq. If the new entity slept for long, its vruntime
> +	 * difference from min_vruntime may overflow s64 and their comparison
> +	 * may get inverted, so ignore the entity's original vruntime in that
> +	 * case.
> +	 * The maximal vruntime speedup is given by the ratio of normal to
> +	 * minimal weight: NICE_0_LOAD / MIN_SHARES, so cutting off on the
> +	 * sleep time of 2^63 / NICE_0_LOAD should be safe.
> +	 */
> +	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> +	if ((s64)sleep_time > (1ULL << 63) / NICE_0_LOAD)
> +		se->vruntime = vruntime;
> +	else
> +		se->vruntime = max_vruntime(se->vruntime, vruntime);
>  }
>
>  static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
> --
> 2.34.1
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
From: Roman Kagan @ 2023-02-21 16:57 UTC
To: Vincent Guittot, Peter Zijlstra
Cc: linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall,
    Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann,
    Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On Tue, Feb 21, 2023 at 10:38:44AM +0100, Vincent Guittot wrote:
> On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote:
> >
> > From: Zhang Qiao <zhangqiao22@huawei.com>
> >
> > When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
> > to the base level (around cfs_rq->min_vruntime), so that the entity
> > doesn't gain extra boost when placed backwards.
> >
> > However, if the entity being placed wasn't executed for a long time, its
> > vruntime may get too far behind (e.g. while cfs_rq was executing a
> > low-weight hog), which can invert the vruntime comparison due to s64
> > overflow.  This results in the entity being placed with its original
> > vruntime way forwards, so that it will effectively never get to the cpu.
> >
> > To prevent that, ignore the vruntime of the entity being placed if it
> > didn't execute for longer than the time that can lead to an overflow.
> >
> > Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
> > [rkagan: formatted, adjusted commit log, comments, cutoff value]
> > Co-developed-by: Roman Kagan <rkagan@amazon.de>
> > Signed-off-by: Roman Kagan <rkagan@amazon.de>
>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> > ---
> > v2 -> v3:
> > - make cutoff less arbitrary and update comments [Vincent]
> >
> > v1 -> v2:
> > - add Zhang Qiao's s-o-b
> > - fix constant promotion on 32bit
> >
> >  kernel/sched/fair.c | 21 +++++++++++++++++++--
> >  1 file changed, 19 insertions(+), 2 deletions(-)

Turns out Peter took v2 through his tree, and it has already landed in
Linus' master.

What scares me, though, is that I've got a message from the test robot
that this commit dramatically affected hackbench results, see the quote
below.  I expected the commit not to affect any benchmarks.

Any idea what could have caused this change?

Thanks,
Roman.

On Tue, Feb 21, 2023 at 03:34:16PM +0800, kernel test robot wrote:
> FYI, we noticed a 125.5% improvement of hackbench.throughput due to commit:
>
> commit: 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ("sched/fair: sanitize vruntime of entity being placed")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: hackbench
> on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory
> with following parameters:
>
> 	nr_threads: 50%
> 	iterations: 8
> 	mode: process
> 	ipc: pipe
> 	cpufreq_governor: performance
>
> test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
> test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c > > In addition to that, the commit also has significant impact on the following tests: > > +------------------+--------------------------------------------------+ > | testcase: change | hackbench: hackbench.throughput -8.1% regression | > | test machine | 104 threads 2 sockets (Skylake) with 192G memory | > | test parameters | cpufreq_governor=performance | > | | ipc=socket | > | | iterations=4 | > | | mode=process | > | | nr_threads=100% | > +------------------+--------------------------------------------------+ > > Details are as below: > > ========================================================================================= > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase: > gcc-11/performance/pipe/8/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/hackbench > > commit: > a2e90611b9 ("sched/fair: Remove capacity inversion detection") > 829c1651e9 ("sched/fair: sanitize vruntime of entity being placed") > > a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765 > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 308887 ± 5% +125.5% 696539 hackbench.throughput > 259291 ± 2% +127.3% 589293 hackbench.throughput_avg > 308887 ± 5% +125.5% 696539 hackbench.throughput_best > 198770 ± 2% +105.5% 408552 ± 4% hackbench.throughput_worst > 319.60 ± 2% -55.8% 141.24 hackbench.time.elapsed_time > 319.60 ± 2% -55.8% 141.24 hackbench.time.elapsed_time.max > 1.298e+09 ± 8% -87.6% 1.613e+08 ± 7% hackbench.time.involuntary_context_switches > 477107 -12.5% 417660 hackbench.time.minor_page_faults > 24683 ± 2% -57.2% 10562 hackbench.time.system_time > 2136 ± 3% -45.0% 1174 hackbench.time.user_time > 3.21e+09 ± 4% -83.0% 5.442e+08 ± 3% hackbench.time.voluntary_context_switches > 5.28e+08 ± 4% +8.4% 5.723e+08 ± 3% cpuidle..time > 365.97 ± 2% -48.9% 187.12 uptime.boot > 3322559 ± 
3% +34.3% 4463206 ± 15% vmstat.memory.cache > 14194257 ± 2% -62.8% 5279904 ± 3% vmstat.system.cs > 2120781 ± 3% -72.8% 576421 ± 4% vmstat.system.in > 1.84 ± 12% +2.6 4.48 ± 5% mpstat.cpu.all.idle% > 2.49 ± 3% -1.1 1.39 ± 4% mpstat.cpu.all.irq% > 0.04 ± 12% +0.0 0.05 mpstat.cpu.all.soft% > 7.36 +2.2 9.56 mpstat.cpu.all.usr% > 61555 ± 6% -72.8% 16751 ± 16% numa-meminfo.node1.Active > 61515 ± 6% -72.8% 16717 ± 16% numa-meminfo.node1.Active(anon) > 960182 ±102% +225.6% 3125990 ± 42% numa-meminfo.node1.FilePages > 1754002 ± 53% +137.9% 4173379 ± 34% numa-meminfo.node1.MemUsed > 35296824 ± 6% +157.8% 91005048 numa-numastat.node0.local_node > 35310119 ± 6% +157.9% 91058472 numa-numastat.node0.numa_hit > 35512423 ± 5% +159.7% 92232951 numa-numastat.node1.local_node > 35577275 ± 4% +159.4% 92273266 numa-numastat.node1.numa_hit > 35310253 ± 6% +157.9% 91058211 numa-vmstat.node0.numa_hit > 35296958 ± 6% +157.8% 91004787 numa-vmstat.node0.numa_local > 15337 ± 6% -72.5% 4216 ± 17% numa-vmstat.node1.nr_active_anon > 239988 ±102% +225.7% 781607 ± 42% numa-vmstat.node1.nr_file_pages > 15337 ± 6% -72.5% 4216 ± 17% numa-vmstat.node1.nr_zone_active_anon > 35577325 ± 4% +159.4% 92273215 numa-vmstat.node1.numa_hit > 35512473 ± 5% +159.7% 92232900 numa-vmstat.node1.numa_local > 64500 ± 8% -61.8% 24643 ± 32% meminfo.Active > 64422 ± 8% -61.9% 24568 ± 32% meminfo.Active(anon) > 140271 ± 14% -38.0% 86979 ± 24% meminfo.AnonHugePages > 372672 ± 2% +13.3% 422069 meminfo.AnonPages > 3205235 ± 3% +35.1% 4329061 ± 15% meminfo.Cached > 1548601 ± 7% +77.4% 2747319 ± 24% meminfo.Committed_AS > 783193 ± 14% +154.9% 1996137 ± 33% meminfo.Inactive > 783010 ± 14% +154.9% 1995951 ± 33% meminfo.Inactive(anon) > 4986534 ± 2% +28.2% 6394741 ± 10% meminfo.Memused > 475092 ± 22% +236.5% 1598918 ± 41% meminfo.Shmem > 2777 -2.1% 2719 turbostat.Bzy_MHz > 11143123 ± 6% +72.0% 19162667 turbostat.C1 > 0.24 ± 7% +0.7 0.94 ± 3% turbostat.C1% > 100440 ± 18% +203.8% 305136 ± 15% turbostat.C1E > 0.06 ± 9% +0.1 0.18 ± 
11% turbostat.C1E% > 1.24 ± 3% +1.6 2.81 ± 4% turbostat.C6% > 1.38 ± 3% +156.1% 3.55 ± 3% turbostat.CPU%c1 > 0.33 ± 5% +76.5% 0.58 ± 7% turbostat.CPU%c6 > 0.16 +31.2% 0.21 turbostat.IPC > 6.866e+08 ± 5% -87.8% 83575393 ± 5% turbostat.IRQ > 0.33 ± 27% +0.2 0.57 turbostat.POLL% > 0.12 ± 10% +176.4% 0.33 ± 12% turbostat.Pkg%pc2 > 0.09 ± 7% -100.0% 0.00 turbostat.Pkg%pc6 > 61.33 +5.2% 64.50 ± 2% turbostat.PkgTmp > 14.81 +2.0% 15.11 turbostat.RAMWatt > 16242 ± 8% -62.0% 6179 ± 32% proc-vmstat.nr_active_anon > 93150 ± 2% +13.2% 105429 proc-vmstat.nr_anon_pages > 801219 ± 3% +35.1% 1082320 ± 15% proc-vmstat.nr_file_pages > 195506 ± 14% +155.2% 498919 ± 33% proc-vmstat.nr_inactive_anon > 118682 ± 22% +236.9% 399783 ± 41% proc-vmstat.nr_shmem > 16242 ± 8% -62.0% 6179 ± 32% proc-vmstat.nr_zone_active_anon > 195506 ± 14% +155.2% 498919 ± 33% proc-vmstat.nr_zone_inactive_anon > 70889233 ± 5% +158.6% 1.833e+08 proc-vmstat.numa_hit > 70811086 ± 5% +158.8% 1.832e+08 proc-vmstat.numa_local > 55885 ± 22% -67.2% 18327 ± 38% proc-vmstat.numa_pages_migrated > 422312 ± 10% -95.4% 19371 ± 7% proc-vmstat.pgactivate > 71068460 ± 5% +158.1% 1.834e+08 proc-vmstat.pgalloc_normal > 1554994 -19.6% 1250346 ± 4% proc-vmstat.pgfault > 71011267 ± 5% +155.9% 1.817e+08 proc-vmstat.pgfree > 55885 ± 22% -67.2% 18327 ± 38% proc-vmstat.pgmigrate_success > 111247 ± 2% -35.0% 72355 ± 2% proc-vmstat.pgreuse > 2506368 ± 2% -53.1% 1176320 proc-vmstat.unevictable_pgs_scanned > 20.06 ± 10% -22.4% 15.56 ± 8% sched_debug.cfs_rq:/.h_nr_running.max > 0.81 ± 32% -93.1% 0.06 ±223% sched_debug.cfs_rq:/.h_nr_running.min > 1917 ± 34% -100.0% 0.00 sched_debug.cfs_rq:/.load.min > 24.18 ± 10% +39.0% 33.62 ± 11% sched_debug.cfs_rq:/.load_avg.avg > 245.61 ± 25% +66.3% 408.33 ± 22% sched_debug.cfs_rq:/.load_avg.max > 47.52 ± 13% +72.6% 82.03 ± 8% sched_debug.cfs_rq:/.load_avg.stddev > 13431147 -64.9% 4717147 sched_debug.cfs_rq:/.min_vruntime.avg > 18161799 ± 7% -67.4% 5925316 ± 6% sched_debug.cfs_rq:/.min_vruntime.max > 
12413026 -65.0% 4340952 sched_debug.cfs_rq:/.min_vruntime.min > 739748 ± 16% -66.6% 247410 ± 17% sched_debug.cfs_rq:/.min_vruntime.stddev > 0.85 -16.4% 0.71 sched_debug.cfs_rq:/.nr_running.avg > 0.61 ± 25% -90.9% 0.06 ±223% sched_debug.cfs_rq:/.nr_running.min > 0.10 ± 25% +109.3% 0.22 ± 7% sched_debug.cfs_rq:/.nr_running.stddev > 169.22 +101.7% 341.33 sched_debug.cfs_rq:/.removed.load_avg.max > 32.41 ± 24% +100.2% 64.90 ± 16% sched_debug.cfs_rq:/.removed.load_avg.stddev > 82.92 ± 10% +108.1% 172.56 sched_debug.cfs_rq:/.removed.runnable_avg.max > 13.60 ± 28% +114.0% 29.10 ± 20% sched_debug.cfs_rq:/.removed.runnable_avg.stddev > 82.92 ± 10% +108.1% 172.56 sched_debug.cfs_rq:/.removed.util_avg.max > 13.60 ± 28% +114.0% 29.10 ± 20% sched_debug.cfs_rq:/.removed.util_avg.stddev > 2156 ± 12% -36.6% 1368 ± 27% sched_debug.cfs_rq:/.runnable_avg.min > 2285 ± 7% -19.8% 1833 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev > -2389921 -64.8% -840940 sched_debug.cfs_rq:/.spread0.min > 739781 ± 16% -66.5% 247837 ± 17% sched_debug.cfs_rq:/.spread0.stddev > 843.88 ± 2% -20.5% 670.53 sched_debug.cfs_rq:/.util_avg.avg > 433.64 ± 7% -43.5% 244.83 ± 17% sched_debug.cfs_rq:/.util_avg.min > 187.00 ± 6% +40.6% 263.02 ± 4% sched_debug.cfs_rq:/.util_avg.stddev > 394.15 ± 14% -29.5% 278.06 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.avg > 1128 ± 12% -17.6% 930.39 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max > 38.36 ± 29% -100.0% 0.00 sched_debug.cfs_rq:/.util_est_enqueued.min > 3596 ± 15% -39.5% 2175 ± 7% sched_debug.cpu.avg_idle.min > 160647 ± 9% -25.9% 118978 ± 9% sched_debug.cpu.avg_idle.stddev > 197365 -46.2% 106170 sched_debug.cpu.clock.avg > 197450 -46.2% 106208 sched_debug.cpu.clock.max > 197281 -46.2% 106128 sched_debug.cpu.clock.min > 49.96 ± 22% -53.1% 23.44 ± 19% sched_debug.cpu.clock.stddev > 193146 -45.7% 104898 sched_debug.cpu.clock_task.avg > 194592 -45.8% 105455 sched_debug.cpu.clock_task.max > 177878 -49.3% 90211 sched_debug.cpu.clock_task.min > 1794 ± 5% -10.7% 1602 ± 2% 
sched_debug.cpu.clock_task.stddev > 13154 ± 2% -20.3% 10479 sched_debug.cpu.curr->pid.avg > 15059 -17.2% 12468 sched_debug.cpu.curr->pid.max > 7263 ± 33% -100.0% 0.00 sched_debug.cpu.curr->pid.min > 9321 ± 36% +98.2% 18478 ± 44% sched_debug.cpu.max_idle_balance_cost.stddev > 0.00 ± 17% -41.6% 0.00 ± 13% sched_debug.cpu.next_balance.stddev > 20.00 ± 11% -21.4% 15.72 ± 7% sched_debug.cpu.nr_running.max > 0.86 ± 17% -87.1% 0.11 ±141% sched_debug.cpu.nr_running.min > 25069883 -83.7% 4084117 ± 4% sched_debug.cpu.nr_switches.avg > 26486718 -82.8% 4544009 ± 4% sched_debug.cpu.nr_switches.max > 23680077 -84.5% 3663816 ± 4% sched_debug.cpu.nr_switches.min > 589836 ± 3% -68.7% 184621 ± 16% sched_debug.cpu.nr_switches.stddev > 197278 -46.2% 106128 sched_debug.cpu_clk > 194327 -46.9% 103176 sched_debug.ktime > 197967 -46.0% 106821 sched_debug.sched_clk > 14.91 -37.6% 9.31 perf-stat.i.MPKI > 2.657e+10 +25.0% 3.32e+10 perf-stat.i.branch-instructions > 1.17 -0.4 0.78 perf-stat.i.branch-miss-rate% > 3.069e+08 -20.1% 2.454e+08 perf-stat.i.branch-misses > 6.43 ± 8% +2.2 8.59 ± 4% perf-stat.i.cache-miss-rate% > 1.952e+09 -24.3% 1.478e+09 perf-stat.i.cache-references > 14344055 ± 2% -58.6% 5932018 ± 3% perf-stat.i.context-switches > 1.83 -21.8% 1.43 perf-stat.i.cpi > 2.403e+11 -3.4% 2.322e+11 perf-stat.i.cpu-cycles > 1420139 ± 2% -38.8% 869692 ± 5% perf-stat.i.cpu-migrations > 2619 ± 7% -15.5% 2212 ± 8% perf-stat.i.cycles-between-cache-misses > 0.24 ± 19% -0.1 0.10 ± 17% perf-stat.i.dTLB-load-miss-rate% > 90403286 ± 19% -55.8% 39926283 ± 16% perf-stat.i.dTLB-load-misses > 3.823e+10 +28.6% 4.918e+10 perf-stat.i.dTLB-loads > 0.01 ± 34% -0.0 0.01 ± 33% perf-stat.i.dTLB-store-miss-rate% > 2779663 ± 34% -52.7% 1315899 ± 31% perf-stat.i.dTLB-store-misses > 2.19e+10 +24.2% 2.72e+10 perf-stat.i.dTLB-stores > 47.99 ± 2% +28.0 75.94 perf-stat.i.iTLB-load-miss-rate% > 89417955 ± 2% +38.7% 1.24e+08 ± 4% perf-stat.i.iTLB-load-misses > 97721514 ± 2% -58.2% 40865783 ± 3% perf-stat.i.iTLB-loads > 
1.329e+11 +26.3% 1.678e+11 perf-stat.i.instructions > 1503 -7.7% 1388 ± 3% perf-stat.i.instructions-per-iTLB-miss > 0.55 +30.2% 0.72 perf-stat.i.ipc > 1.64 ± 18% +217.4% 5.20 ± 11% perf-stat.i.major-faults > 2.73 -3.7% 2.63 perf-stat.i.metric.GHz > 1098 ± 2% -7.1% 1020 ± 3% perf-stat.i.metric.K/sec > 1008 +24.4% 1254 perf-stat.i.metric.M/sec > 4334 ± 2% +90.5% 8257 ± 7% perf-stat.i.minor-faults > 90.94 -14.9 75.99 perf-stat.i.node-load-miss-rate% > 41932510 ± 8% -43.0% 23899176 ± 10% perf-stat.i.node-load-misses > 3366677 ± 5% +86.2% 6267816 perf-stat.i.node-loads > 81.77 ± 3% -36.3 45.52 ± 3% perf-stat.i.node-store-miss-rate% > 18498318 ± 7% -31.8% 12613933 ± 7% perf-stat.i.node-store-misses > 3023556 ± 10% +508.7% 18405880 ± 2% perf-stat.i.node-stores > 4336 ± 2% +90.5% 8262 ± 7% perf-stat.i.page-faults > 14.70 -41.2% 8.65 perf-stat.overall.MPKI > 1.16 -0.4 0.72 perf-stat.overall.branch-miss-rate% > 6.22 ± 7% +2.4 8.59 ± 4% perf-stat.overall.cache-miss-rate% > 1.81 -24.3% 1.37 perf-stat.overall.cpi > 0.24 ± 19% -0.2 0.07 ± 15% perf-stat.overall.dTLB-load-miss-rate% > 0.01 ± 34% -0.0 0.00 ± 29% perf-stat.overall.dTLB-store-miss-rate% > 47.78 ± 2% +29.3 77.12 perf-stat.overall.iTLB-load-miss-rate% > 1486 -9.1% 1351 ± 4% perf-stat.overall.instructions-per-iTLB-miss > 0.55 +32.0% 0.73 perf-stat.overall.ipc > 92.54 -15.4 77.16 ± 2% perf-stat.overall.node-load-miss-rate% > 85.82 ± 2% -48.1 37.76 ± 5% perf-stat.overall.node-store-miss-rate% > 2.648e+10 +25.2% 3.314e+10 perf-stat.ps.branch-instructions > 3.06e+08 -22.1% 2.383e+08 perf-stat.ps.branch-misses > 1.947e+09 -25.5% 1.451e+09 perf-stat.ps.cache-references > 14298713 ± 2% -62.5% 5359285 ± 3% perf-stat.ps.context-switches > 2.396e+11 -4.0% 2.299e+11 perf-stat.ps.cpu-cycles > 1415512 ± 2% -42.2% 817981 ± 4% perf-stat.ps.cpu-migrations > 90073948 ± 19% -60.4% 35711862 ± 15% perf-stat.ps.dTLB-load-misses > 3.811e+10 +29.7% 4.944e+10 perf-stat.ps.dTLB-loads > 2767291 ± 34% -56.3% 1210210 ± 29% 
perf-stat.ps.dTLB-store-misses > 2.183e+10 +25.0% 2.729e+10 perf-stat.ps.dTLB-stores > 89118809 ± 2% +39.6% 1.244e+08 ± 4% perf-stat.ps.iTLB-load-misses > 97404381 ± 2% -62.2% 36860047 ± 3% perf-stat.ps.iTLB-loads > 1.324e+11 +26.7% 1.678e+11 perf-stat.ps.instructions > 1.62 ± 18% +164.7% 4.29 ± 8% perf-stat.ps.major-faults > 4310 ± 2% +75.1% 7549 ± 5% perf-stat.ps.minor-faults > 41743097 ± 8% -47.3% 21984450 ± 9% perf-stat.ps.node-load-misses > 3356259 ± 5% +92.6% 6462631 perf-stat.ps.node-loads > 18414647 ± 7% -35.7% 11833799 ± 6% perf-stat.ps.node-store-misses > 3019790 ± 10% +545.0% 19478071 perf-stat.ps.node-stores > 4312 ± 2% +75.2% 7553 ± 5% perf-stat.ps.page-faults > 4.252e+13 -43.7% 2.395e+13 perf-stat.total.instructions > 29.92 ± 4% -22.8 7.09 ± 29% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.vfs_write.ksys_write.do_syscall_64 > 28.53 ± 5% -21.6 6.92 ± 29% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write.ksys_write > 27.86 ± 5% -21.1 6.77 ± 29% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write > 27.55 ± 5% -20.9 6.68 ± 29% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write > 22.28 ± 4% -17.0 5.31 ± 30% perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64 > 21.98 ± 4% -16.7 5.24 ± 30% perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read > 12.62 ± 4% -9.6 3.00 ± 33% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 34.09 -9.2 24.92 ± 3% perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > 11.48 ± 5% -8.8 2.69 ± 38% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common > 9.60 ± 7% -7.2 2.40 ± 
35% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.vfs_read > 36.39 -6.2 30.20 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 40.40 -6.1 34.28 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 40.95 -5.7 35.26 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read > 37.43 -5.4 32.07 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 6.30 ± 11% -5.2 1.09 ± 36% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 5.66 ± 12% -5.1 0.58 ± 75% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe > 6.46 ± 10% -5.1 1.40 ± 28% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 5.53 ± 13% -5.0 0.56 ± 75% perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 5.42 ± 13% -4.9 0.56 ± 75% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode > 5.82 ± 9% -4.7 1.10 ± 37% perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 5.86 ± 16% -4.6 1.31 ± 37% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 5.26 ± 9% -4.4 0.89 ± 57% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common > 45.18 -3.5 41.68 perf-profile.calltrace.cycles-pp.__libc_read > 50.31 -3.2 47.12 
perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 4.00 ± 27% -2.9 1.09 ± 40% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read > 50.75 -2.7 48.06 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write > 40.80 -2.6 38.20 perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 3.10 ± 15% -2.5 0.62 ±103% perf-profile.calltrace.cycles-pp.update_cfs_group.dequeue_task_fair.__schedule.schedule.pipe_read > 2.94 ± 12% -2.3 0.62 ±102% perf-profile.calltrace.cycles-pp.update_cfs_group.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 2.38 ± 9% -2.0 0.38 ±102% perf-profile.calltrace.cycles-pp._raw_spin_lock.__schedule.schedule.pipe_read.vfs_read > 2.24 ± 7% -1.8 0.40 ± 71% perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64 > 2.08 ± 6% -1.8 0.29 ±100% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read > 2.10 ± 10% -1.8 0.32 ±104% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__schedule.schedule.pipe_read > 2.76 ± 7% -1.5 1.24 ± 17% perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 2.27 ± 5% -1.4 0.88 ± 11% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 2.43 ± 7% -1.3 1.16 ± 17% perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common > 2.46 ± 5% -1.3 1.20 ± 7% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 1.54 ± 5% -1.2 0.32 ±101% perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up > 0.97 ± 9% 
-0.3 0.66 ± 19% perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
> 0.86 ± 6% +0.2 1.02 perf-profile.calltrace.cycles-pp.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> 0.64 ± 9% +0.5 1.16 ± 5% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64
> 0.47 ± 45% +0.5 0.99 ± 5% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.60 ± 8% +0.5 1.13 ± 5% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> 0.00 +0.5 0.54 ± 5% perf-profile.calltrace.cycles-pp.current_time.file_update_time.pipe_write.vfs_write.ksys_write
> 0.00 +0.6 0.56 ± 4% perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write
> 0.00 +0.6 0.56 ± 7% perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read
> 0.00 +0.6 0.58 ± 5% perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_write.vfs_write.ksys_write
> 0.00 +0.6 0.62 ± 3% perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_read.vfs_read.ksys_read
> 0.00 +0.7 0.65 ± 6% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write
> 0.00 +0.7 0.65 ± 7% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 0.57 ± 5% +0.7 1.24 ± 6% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +0.7 0.72 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write.ksys_write
> 0.00 +0.8 0.75 ± 6% perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.vfs_write.ksys_write
> 0.74 ± 9% +0.8 1.48 ± 5% perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64
> 0.63 ± 5% +0.8 1.40 ± 5% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> 0.00 +0.8 0.78 ± 19% perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
> 0.00 +0.8 0.78 ± 19% perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
> 0.00 +0.8 0.78 ± 19% perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
> 0.00 +0.8 0.80 ± 15% perf-profile.calltrace.cycles-pp.__cmd_record
> 0.00 +0.8 0.82 ± 11% perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 0.00 +0.9 0.85 ± 6% perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_write.vfs_write.ksys_write.do_syscall_64
> 0.00 +0.9 0.86 ± 4% perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read
> 0.00 +0.9 0.87 ± 5% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
> 0.00 +0.9 0.88 ± 5% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
> 0.26 ±100% +1.0 1.22 ± 10% perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_write.vfs_write.ksys_write
> 0.00 +1.0 0.96 ± 6% perf-profile.calltrace.cycles-pp.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> 0.27 ±100% +1.0 1.23 ± 10% perf-profile.calltrace.cycles-pp.schedule.pipe_write.vfs_write.ksys_write.do_syscall_64
> 0.00 +1.0 0.97 ± 7% perf-profile.calltrace.cycles-pp.page_counter_uncharge.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read
> 0.87 ± 8% +1.1 1.98 ± 5% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
> 0.73 ± 6% +1.1 1.85 ± 5% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
> 0.00 +1.2 1.15 ± 7% perf-profile.calltrace.cycles-pp.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read
> 0.00 +1.2 1.23 ± 6% perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read.ksys_read
> 0.00 +1.2 1.24 ± 7% perf-profile.calltrace.cycles-pp.__folio_put.pipe_read.vfs_read.ksys_read.do_syscall_64
> 0.48 ± 45% +1.3 1.74 ± 6% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
> 0.60 ± 7% +1.3 1.87 ± 8% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> 1.23 ± 7% +1.3 2.51 ± 4% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> 43.42 +1.3 44.75 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> 0.83 ± 7% +1.3 2.17 ± 5% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.98 ± 7% +1.4 2.36 ± 6% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.27 ±100% +1.4 1.70 ± 9% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read.ksys_read
> 0.79 ± 8% +1.4 2.23 ± 6% perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
> 0.18 ±141% +1.5 1.63 ± 9% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
> 0.18 ±141% +1.5 1.67 ± 9% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read
> 0.00 +1.6 1.57 ± 10% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> 0.00 +1.6 1.57 ± 10% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> 1.05 ± 8% +1.7 2.73 ± 6% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.copy_page_from_iter.pipe_write
> 1.84 ± 9% +1.7 3.56 ± 5% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.copy_page_to_iter.pipe_read
> 1.41 ± 9% +1.8 3.17 ± 6% perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> 0.00 +1.8 1.79 ± 9% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 1.99 ± 9% +2.0 3.95 ± 5% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> 2.40 ± 7% +2.4 4.82 ± 5% perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
> 0.00 +2.5 2.50 ± 7% perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> 2.89 ± 8% +2.6 5.47 ± 5% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
> 1.04 ± 30% +2.8 3.86 ± 5% perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
> 0.00 +2.9 2.90 ± 11% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.00 +2.9 2.91 ± 11% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.00 +2.9 2.91 ± 11% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
> 0.85 ± 27% +2.9 3.80 ± 5% perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
> 0.00 +3.0 2.96 ± 11% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
> 2.60 ± 9% +3.1 5.74 ± 6% perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
> 2.93 ± 9% +3.7 6.66 ± 5% perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
> 1.60 ± 12% +4.6 6.18 ± 7% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64
> 2.60 ± 10% +4.6 7.24 ± 5% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> 28.75 ± 5% -21.6 7.19 ± 28% perf-profile.children.cycles-pp.schedule
> 30.52 ± 4% -21.6 8.97 ± 22% perf-profile.children.cycles-pp.__wake_up_common_lock
> 28.53 ± 6% -21.0 7.56 ± 26% perf-profile.children.cycles-pp.__schedule
> 29.04 ± 5% -20.4 8.63 ± 23% perf-profile.children.cycles-pp.__wake_up_common
> 28.37 ± 5% -19.9 8.44 ± 23% perf-profile.children.cycles-pp.autoremove_wake_function
> 28.08 ± 5% -19.7 8.33 ± 23% perf-profile.children.cycles-pp.try_to_wake_up
> 13.90 ± 2% -10.2 3.75 ± 28% perf-profile.children.cycles-pp.ttwu_do_activate
> 12.66 ± 3% -9.2 3.47 ± 29% perf-profile.children.cycles-pp.enqueue_task_fair
> 34.20 -9.2 25.05 ± 3% perf-profile.children.cycles-pp.pipe_read
> 90.86 -9.1 81.73 perf-profile.children.cycles-pp.do_syscall_64
> 91.80 -8.3 83.49 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 10.28 ± 7% -7.8 2.53 ± 27% perf-profile.children.cycles-pp._raw_spin_lock
> 9.85 ± 7% -6.9 2.92 ± 29% perf-profile.children.cycles-pp.dequeue_task_fair
> 8.69 ± 7% -6.6 2.05 ± 24% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> 8.99 ± 6% -6.2 2.81 ± 16% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> 36.46 -6.1 30.34 perf-profile.children.cycles-pp.vfs_read
> 8.38 ± 8% -5.8 2.60 ± 23% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 6.10 ± 11% -5.4 0.66 ± 61% perf-profile.children.cycles-pp.exit_to_user_mode_loop
> 37.45 -5.3 32.13 perf-profile.children.cycles-pp.ksys_read
> 6.50 ± 35% -4.9 1.62 ± 61% perf-profile.children.cycles-pp.update_curr
> 6.56 ± 15% -4.6 1.95 ± 57% perf-profile.children.cycles-pp.update_cfs_group
> 6.38 ± 14% -4.5 1.91 ± 28% perf-profile.children.cycles-pp.enqueue_entity
> 5.74 ± 5% -3.8 1.92 ± 25% perf-profile.children.cycles-pp.update_load_avg
> 45.56 -3.8 41.75 perf-profile.children.cycles-pp.__libc_read
> 3.99 ± 4% -3.1 0.92 ± 24% perf-profile.children.cycles-pp.pick_next_task_fair
> 4.12 ± 27% -2.7 1.39 ± 34% perf-profile.children.cycles-pp.dequeue_entity
> 40.88 -2.5 38.37 perf-profile.children.cycles-pp.pipe_write
> 3.11 ± 4% -2.4 0.75 ± 22% perf-profile.children.cycles-pp.switch_mm_irqs_off
> 2.06 ± 33% -1.8 0.27 ± 27% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
> 2.38 ± 41% -1.8 0.60 ± 72% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
> 2.29 ± 5% -1.7 0.60 ± 25% perf-profile.children.cycles-pp.switch_fpu_return
> 2.30 ± 6% -1.6 0.68 ± 18% perf-profile.children.cycles-pp.prepare_task_switch
> 1.82 ± 33% -1.6 0.22 ± 31% perf-profile.children.cycles-pp.sysvec_call_function_single
> 1.77 ± 33% -1.6 0.20 ± 32% perf-profile.children.cycles-pp.__sysvec_call_function_single
> 1.96 ± 5% -1.5 0.50 ± 20% perf-profile.children.cycles-pp.reweight_entity
> 2.80 ± 7% -1.2 1.60 ± 12% perf-profile.children.cycles-pp.select_task_rq
> 1.61 ± 6% -1.2 0.42 ± 25% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
> 1.34 ± 9% -1.2 0.16 ± 28% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
> 1.62 ± 4% -1.2 0.45 ± 22% perf-profile.children.cycles-pp.set_next_entity
> 1.55 ± 8% -1.1 0.43 ± 12% perf-profile.children.cycles-pp.update_rq_clock
> 1.49 ± 8% -1.1 0.41 ± 14% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> 1.30 ± 20% -1.0 0.26 ± 18% perf-profile.children.cycles-pp.finish_task_switch
> 1.44 ± 5% -1.0 0.42 ± 19% perf-profile.children.cycles-pp.__switch_to_asm
> 2.47 ± 7% -1.0 1.50 ± 12% perf-profile.children.cycles-pp.select_task_rq_fair
> 2.33 ± 7% -0.9 1.40 ± 3% perf-profile.children.cycles-pp.prepare_to_wait_event
> 1.24 ± 7% -0.9 0.35 ± 14% perf-profile.children.cycles-pp.__update_load_avg_se
> 1.41 ± 32% -0.9 0.56 ± 24% perf-profile.children.cycles-pp.sched_ttwu_pending
> 2.29 ± 8% -0.8 1.45 ± 3% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 1.04 ± 7% -0.8 0.24 ± 22% perf-profile.children.cycles-pp.check_preempt_curr
> 1.01 ± 3% -0.7 0.30 ± 20% perf-profile.children.cycles-pp.__switch_to
> 0.92 ± 7% -0.7 0.26 ± 12% perf-profile.children.cycles-pp.update_min_vruntime
> 0.71 ± 2% -0.6 0.08 ± 75% perf-profile.children.cycles-pp.put_prev_entity
> 0.76 ± 6% -0.6 0.14 ± 32% perf-profile.children.cycles-pp.check_preempt_wakeup
> 0.81 ± 66% -0.6 0.22 ± 34% perf-profile.children.cycles-pp.set_task_cpu
> 0.82 ± 17% -0.6 0.23 ± 10% perf-profile.children.cycles-pp.cpuacct_charge
> 1.08 ± 15% -0.6 0.51 ± 10% perf-profile.children.cycles-pp.wake_affine
> 0.56 ± 15% -0.5 0.03 ±100% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> 0.66 ± 3% -0.5 0.15 ± 28% perf-profile.children.cycles-pp.os_xsave
> 0.52 ± 44% -0.5 0.06 ±151% perf-profile.children.cycles-pp.native_irq_return_iret
> 0.55 ± 5% -0.4 0.15 ± 21% perf-profile.children.cycles-pp.__calc_delta
> 0.56 ± 10% -0.4 0.17 ± 26% perf-profile.children.cycles-pp.___perf_sw_event
> 0.70 ± 15% -0.4 0.32 ± 11% perf-profile.children.cycles-pp.task_h_load
> 0.40 ± 4% -0.3 0.06 ± 49% perf-profile.children.cycles-pp.pick_next_entity
> 0.57 ± 6% -0.3 0.26 ± 7% perf-profile.children.cycles-pp.__list_del_entry_valid
> 0.39 ± 8% -0.3 0.08 ± 24% perf-profile.children.cycles-pp.set_next_buddy
> 0.64 ± 6% -0.3 0.36 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_irq
> 0.53 ± 20% -0.3 0.25 ± 8% perf-profile.children.cycles-pp.ttwu_queue_wakelist
> 0.36 ± 8% -0.3 0.08 ± 11% perf-profile.children.cycles-pp.rb_insert_color
> 0.41 ± 6% -0.3 0.14 ± 17% perf-profile.children.cycles-pp.sched_clock_cpu
> 0.36 ± 33% -0.3 0.10 ± 17% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
> 0.37 ± 4% -0.2 0.13 ± 16% perf-profile.children.cycles-pp.native_sched_clock
> 0.28 ± 5% -0.2 0.07 ± 18% perf-profile.children.cycles-pp.rb_erase
> 0.32 ± 7% -0.2 0.12 ± 10% perf-profile.children.cycles-pp.__list_add_valid
> 0.23 ± 6% -0.2 0.03 ±103% perf-profile.children.cycles-pp.resched_curr
> 0.27 ± 5% -0.2 0.08 ± 20% perf-profile.children.cycles-pp.__wrgsbase_inactive
> 0.26 ± 6% -0.2 0.08 ± 17% perf-profile.children.cycles-pp.finish_wait
> 0.26 ± 4% -0.2 0.08 ± 11% perf-profile.children.cycles-pp.rcu_note_context_switch
> 0.33 ± 21% -0.2 0.15 ± 32% perf-profile.children.cycles-pp.migrate_task_rq_fair
> 0.22 ± 9% -0.2 0.07 ± 22% perf-profile.children.cycles-pp.perf_trace_buf_update
> 0.17 ± 8% -0.1 0.03 ±100% perf-profile.children.cycles-pp.rb_next
> 0.15 ± 32% -0.1 0.03 ±100% perf-profile.children.cycles-pp.llist_reverse_order
> 0.34 ± 7% -0.1 0.26 ± 3% perf-profile.children.cycles-pp.anon_pipe_buf_release
> 0.14 ± 6% -0.1 0.07 ± 17% perf-profile.children.cycles-pp.read@plt
> 0.10 ± 17% -0.1 0.04 ± 75% perf-profile.children.cycles-pp.remove_entity_load_avg
> 0.07 ± 10% -0.0 0.02 ± 99% perf-profile.children.cycles-pp.generic_update_time
> 0.11 ± 6% -0.0 0.07 ± 8% perf-profile.children.cycles-pp.__mark_inode_dirty
> 0.00 +0.1 0.06 ± 9% perf-profile.children.cycles-pp.load_balance
> 0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp._raw_spin_trylock
> 0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.uncharge_folio
> 0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.__do_softirq
> 0.00 +0.1 0.07 ± 10% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
> 0.00 +0.1 0.08 ± 14% perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
> 0.15 ± 23% +0.1 0.23 ± 7% perf-profile.children.cycles-pp.task_tick_fair
> 0.19 ± 17% +0.1 0.28 ± 7% perf-profile.children.cycles-pp.scheduler_tick
> 0.00 +0.1 0.10 ± 21% perf-profile.children.cycles-pp.select_idle_core
> 0.00 +0.1 0.10 ± 9% perf-profile.children.cycles-pp.osq_unlock
> 0.23 ± 12% +0.1 0.34 ± 6% perf-profile.children.cycles-pp.update_process_times
> 0.37 ± 13% +0.1 0.48 ± 5% perf-profile.children.cycles-pp.hrtimer_interrupt
> 0.24 ± 12% +0.1 0.35 ± 6% perf-profile.children.cycles-pp.tick_sched_handle
> 0.31 ± 14% +0.1 0.43 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
> 0.37 ± 12% +0.1 0.49 ± 5% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> 0.00 +0.1 0.12 ± 10% perf-profile.children.cycles-pp.__mod_memcg_state
> 0.26 ± 10% +0.1 0.38 ± 6% perf-profile.children.cycles-pp.tick_sched_timer
> 0.00 +0.1 0.13 ± 7% perf-profile.children.cycles-pp.free_unref_page
> 0.00 +0.1 0.14 ± 8% perf-profile.children.cycles-pp.rmqueue
> 0.15 ± 8% +0.2 0.30 ± 5% perf-profile.children.cycles-pp.rcu_all_qs
> 0.16 ± 6% +0.2 0.31 ± 5% perf-profile.children.cycles-pp.__x64_sys_write
> 0.00 +0.2 0.16 ± 10% perf-profile.children.cycles-pp.propagate_protected_usage
> 0.00 +0.2 0.16 ± 10% perf-profile.children.cycles-pp.menu_select
> 0.00 +0.2 0.16 ± 9% perf-profile.children.cycles-pp.memcg_account_kmem
> 0.42 ± 12% +0.2 0.57 ± 4% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> 0.15 ± 11% +0.2 0.31 ± 8% perf-profile.children.cycles-pp.__x64_sys_read
> 0.00 +0.2 0.17 ± 8% perf-profile.children.cycles-pp.get_page_from_freelist
> 0.44 ± 11% +0.2 0.62 ± 4% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 0.10 ± 31% +0.2 0.28 ± 24% perf-profile.children.cycles-pp.mnt_user_ns
> 0.16 ± 4% +0.2 0.35 ± 5% perf-profile.children.cycles-pp.kill_fasync
> 0.20 ± 10% +0.2 0.40 ± 3% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> 0.09 ± 7% +0.2 0.29 ± 4% perf-profile.children.cycles-pp.page_copy_sane
> 0.08 ± 8% +0.2 0.31 ± 6% perf-profile.children.cycles-pp.rw_verify_area
> 0.12 ± 11% +0.2 0.36 ± 8% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
> 0.28 ± 12% +0.2 0.52 ± 5% perf-profile.children.cycles-pp.inode_needs_update_time
> 0.00 +0.3 0.27 ± 7% perf-profile.children.cycles-pp.__memcg_kmem_charge_page
> 0.43 ± 6% +0.3 0.73 ± 5% perf-profile.children.cycles-pp.__cond_resched
> 0.21 ± 29% +0.3 0.54 ± 15% perf-profile.children.cycles-pp.select_idle_cpu
> 0.10 ± 10% +0.3 0.43 ± 17% perf-profile.children.cycles-pp.fsnotify_perm
> 0.23 ± 11% +0.3 0.56 ± 6% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
> 0.06 ± 75% +0.4 0.47 ± 27% perf-profile.children.cycles-pp.queue_event
> 0.21 ± 9% +0.4 0.62 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.06 ± 75% +0.4 0.48 ± 26% perf-profile.children.cycles-pp.ordered_events__queue
> 0.06 ± 73% +0.4 0.50 ± 24% perf-profile.children.cycles-pp.process_simple
> 0.01 ±223% +0.4 0.44 ± 9% perf-profile.children.cycles-pp.schedule_idle
> 0.05 ± 8% +0.5 0.52 ± 7% perf-profile.children.cycles-pp.__alloc_pages
> 0.45 ± 7% +0.5 0.94 ± 5% perf-profile.children.cycles-pp.__get_task_ioprio
> 0.89 ± 8% +0.5 1.41 ± 4% perf-profile.children.cycles-pp.__might_sleep
> 0.01 ±223% +0.5 0.54 ± 21% perf-profile.children.cycles-pp.flush_smp_call_function_queue
> 0.05 ± 46% +0.5 0.60 ± 7% perf-profile.children.cycles-pp.osq_lock
> 0.34 ± 8% +0.6 0.90 ± 5% perf-profile.children.cycles-pp.aa_file_perm
> 0.01 ±223% +0.7 0.67 ± 7% perf-profile.children.cycles-pp.poll_idle
> 0.14 ± 17% +0.7 0.82 ± 6% perf-profile.children.cycles-pp.mutex_spin_on_owner
> 0.12 ± 12% +0.7 0.82 ± 15% perf-profile.children.cycles-pp.__cmd_record
> 0.07 ± 72% +0.7 0.78 ± 19% perf-profile.children.cycles-pp.reader__read_event
> 0.07 ± 72% +0.7 0.78 ± 19% perf-profile.children.cycles-pp.record__finish_output
> 0.07 ± 72% +0.7 0.78 ± 19% perf-profile.children.cycles-pp.perf_session__process_events
> 0.76 ± 8% +0.8 1.52 ± 5% perf-profile.children.cycles-pp.file_update_time
> 0.08 ± 61% +0.8 0.85 ± 11% perf-profile.children.cycles-pp.intel_idle_irq
> 1.23 ± 8% +0.9 2.11 ± 4% perf-profile.children.cycles-pp.__might_fault
> 0.02 ±141% +1.0 0.97 ± 7% perf-profile.children.cycles-pp.page_counter_uncharge
> 0.51 ± 9% +1.0 1.48 ± 4% perf-profile.children.cycles-pp.current_time
> 0.05 ± 46% +1.1 1.15 ± 7% perf-profile.children.cycles-pp.uncharge_batch
> 1.12 ± 6% +1.1 2.23 ± 5% perf-profile.children.cycles-pp.__fget_light
> 0.06 ± 14% +1.2 1.23 ± 6% perf-profile.children.cycles-pp.__mem_cgroup_uncharge
> 0.06 ± 14% +1.2 1.24 ± 7% perf-profile.children.cycles-pp.__folio_put
> 0.64 ± 7% +1.2 1.83 ± 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 1.19 ± 8% +1.2 2.42 ± 4% perf-profile.children.cycles-pp.__might_resched
> 0.59 ± 9% +1.3 1.84 ± 6% perf-profile.children.cycles-pp.atime_needs_update
> 43.47 +1.4 44.83 perf-profile.children.cycles-pp.ksys_write
> 1.28 ± 6% +1.4 2.68 ± 5% perf-profile.children.cycles-pp.__fdget_pos
> 0.80 ± 8% +1.5 2.28 ± 6% perf-profile.children.cycles-pp.touch_atime
> 0.11 ± 49% +1.5 1.59 ± 9% perf-profile.children.cycles-pp.cpuidle_enter_state
> 0.11 ± 49% +1.5 1.60 ± 9% perf-profile.children.cycles-pp.cpuidle_enter
> 0.12 ± 51% +1.7 1.81 ± 9% perf-profile.children.cycles-pp.cpuidle_idle_call
> 1.44 ± 8% +1.8 3.22 ± 6% perf-profile.children.cycles-pp.copyin
> 2.00 ± 9% +2.0 4.03 ± 5% perf-profile.children.cycles-pp.copyout
> 1.02 ± 8% +2.0 3.07 ± 5% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 1.63 ± 7% +2.3 3.90 ± 5% perf-profile.children.cycles-pp.apparmor_file_permission
> 2.64 ± 8% +2.3 4.98 ± 5% perf-profile.children.cycles-pp._copy_from_iter
> 0.40 ± 14% +2.5 2.92 ± 7% perf-profile.children.cycles-pp.__mutex_lock
> 2.91 ± 8% +2.6 5.54 ± 5% perf-profile.children.cycles-pp.copy_page_from_iter
> 0.17 ± 62% +2.7 2.91 ± 11% perf-profile.children.cycles-pp.start_secondary
> 1.83 ± 7% +2.8 4.59 ± 5% perf-profile.children.cycles-pp.security_file_permission
> 0.17 ± 60% +2.8 2.94 ± 11% perf-profile.children.cycles-pp.do_idle
> 0.17 ± 60% +2.8 2.96 ± 11% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
> 0.17 ± 60% +2.8 2.96 ± 11% perf-profile.children.cycles-pp.cpu_startup_entry
> 2.62 ± 9% +3.2 5.84 ± 6% perf-profile.children.cycles-pp._copy_to_iter
> 1.55 ± 8% +3.2 4.79 ± 5% perf-profile.children.cycles-pp.__entry_text_start
> 3.09 ± 8% +3.7 6.77 ± 5% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> 2.95 ± 9% +3.8 6.73 ± 5% perf-profile.children.cycles-pp.copy_page_to_iter
> 2.28 ± 11% +5.1 7.40 ± 6% perf-profile.children.cycles-pp.mutex_unlock
> 3.92 ± 9% +6.0 9.94 ± 5% perf-profile.children.cycles-pp.mutex_lock
> 8.37 ± 9% -5.8 2.60 ± 23% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 6.54 ± 15% -4.6 1.95 ± 57% perf-profile.self.cycles-pp.update_cfs_group
> 3.08 ± 4% -2.3 0.74 ± 22% perf-profile.self.cycles-pp.switch_mm_irqs_off
> 2.96 ± 4% -1.8 1.13 ± 33% perf-profile.self.cycles-pp.update_load_avg
> 2.22 ± 8% -1.5 0.74 ± 12% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 1.96 ± 9% -1.5 0.48 ± 15% perf-profile.self.cycles-pp.update_curr
> 1.94 ± 5% -1.3 0.64 ± 16% perf-profile.self.cycles-pp._raw_spin_lock
> 1.78 ± 5% -1.3 0.50 ± 18% perf-profile.self.cycles-pp.__schedule
> 1.59 ± 7% -1.2 0.40 ± 12% perf-profile.self.cycles-pp.enqueue_entity
> 1.61 ± 6% -1.2 0.42 ± 25% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
> 1.44 ± 8% -1.0 0.39 ± 14% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> 1.42 ± 5% -1.0 0.41 ± 19% perf-profile.self.cycles-pp.__switch_to_asm
> 1.18 ± 7% -0.9 0.33 ± 14% perf-profile.self.cycles-pp.__update_load_avg_se
> 1.14 ± 10% -0.8 0.31 ± 9% perf-profile.self.cycles-pp.update_rq_clock
> 0.90 ± 7% -0.7 0.19 ± 21% perf-profile.self.cycles-pp.pick_next_task_fair
> 1.04 ± 7% -0.7 0.33 ± 13% perf-profile.self.cycles-pp.prepare_task_switch
> 0.98 ± 4% -0.7 0.29 ± 20% perf-profile.self.cycles-pp.__switch_to
> 0.88 ± 6% -0.7 0.20 ± 17% perf-profile.self.cycles-pp.enqueue_task_fair
> 1.01 ± 6% -0.7 0.35 ± 10% perf-profile.self.cycles-pp.prepare_to_wait_event
> 0.90 ± 8% -0.6 0.25 ± 12% perf-profile.self.cycles-pp.update_min_vruntime
> 0.79 ± 17% -0.6 0.22 ± 9% perf-profile.self.cycles-pp.cpuacct_charge
> 1.10 ± 5% -0.6 0.54 ± 9% perf-profile.self.cycles-pp.try_to_wake_up
> 0.66 ± 3% -0.5 0.15 ± 27% perf-profile.self.cycles-pp.os_xsave
> 0.71 ± 6% -0.5 0.22 ± 18% perf-profile.self.cycles-pp.reweight_entity
> 0.68 ± 9% -0.5 0.19 ± 10% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
> 0.67 ± 9% -0.5 0.18 ± 11% perf-profile.self.cycles-pp.__wake_up_common
> 0.65 ± 6% -0.5 0.17 ± 23% perf-profile.self.cycles-pp.switch_fpu_return
> 0.60 ± 11% -0.5 0.14 ± 28% perf-profile.self.cycles-pp.perf_tp_event
> 0.52 ± 44% -0.5 0.06 ±151% perf-profile.self.cycles-pp.native_irq_return_iret
> 0.52 ± 7% -0.4 0.08 ± 25% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> 0.55 ± 4% -0.4 0.15 ± 22% perf-profile.self.cycles-pp.__calc_delta
> 0.61 ± 5% -0.4 0.21 ± 12% perf-profile.self.cycles-pp.dequeue_task_fair
> 0.69 ± 14% -0.4 0.32 ± 11% perf-profile.self.cycles-pp.task_h_load
> 0.49 ± 11% -0.3 0.15 ± 29% perf-profile.self.cycles-pp.___perf_sw_event
> 0.37 ± 4% -0.3 0.05 ± 73% perf-profile.self.cycles-pp.pick_next_entity
> 0.50 ± 3% -0.3 0.19 ± 15% perf-profile.self.cycles-pp.select_idle_sibling
> 0.38 ± 9% -0.3 0.08 ± 24% perf-profile.self.cycles-pp.set_next_buddy
> 0.32 ± 4% -0.3 0.03 ±100% perf-profile.self.cycles-pp.put_prev_entity
> 0.64 ± 6% -0.3 0.35 ± 7% perf-profile.self.cycles-pp._raw_spin_lock_irq
> 0.52 ± 5% -0.3 0.25 ± 6% perf-profile.self.cycles-pp.__list_del_entry_valid
> 0.34 ± 5% -0.3 0.07 ± 29% perf-profile.self.cycles-pp.schedule
> 0.35 ± 9% -0.3 0.08 ± 10% perf-profile.self.cycles-pp.rb_insert_color
> 0.40 ± 5% -0.3 0.14 ± 16% perf-profile.self.cycles-pp.select_task_rq_fair
> 0.33 ± 6% -0.3 0.08 ± 16% perf-profile.self.cycles-pp.check_preempt_wakeup
> 0.33 ± 8% -0.2 0.10 ± 16% perf-profile.self.cycles-pp.select_task_rq
> 0.36 ± 3% -0.2 0.13 ± 16% perf-profile.self.cycles-pp.native_sched_clock
> 0.32 ± 7% -0.2 0.10 ± 14% perf-profile.self.cycles-pp.finish_task_switch
> 0.32 ± 4% -0.2 0.11 ± 13% perf-profile.self.cycles-pp.dequeue_entity
> 0.32 ± 8% -0.2 0.12 ± 10% perf-profile.self.cycles-pp.__list_add_valid
> 0.23 ± 5% -0.2 0.03 ±103% perf-profile.self.cycles-pp.resched_curr
> 0.27 ± 6% -0.2 0.07 ± 21% perf-profile.self.cycles-pp.rb_erase
> 0.27 ± 5% -0.2 0.08 ± 20% perf-profile.self.cycles-pp.__wrgsbase_inactive
> 0.28 ± 13% -0.2 0.09 ± 12% perf-profile.self.cycles-pp.check_preempt_curr
> 0.30 ± 13% -0.2 0.12 ± 7% perf-profile.self.cycles-pp.ttwu_queue_wakelist
> 0.24 ± 5% -0.2 0.06 ± 19% perf-profile.self.cycles-pp.set_next_entity
> 0.21 ± 34% -0.2 0.04 ± 71% perf-profile.self.cycles-pp.__flush_smp_call_function_queue
> 0.25 ± 5% -0.2 0.08 ± 16% perf-profile.self.cycles-pp.rcu_note_context_switch
> 0.19 ± 26% -0.1 0.04 ± 73% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
> 0.20 ± 8% -0.1 0.06 ± 13% perf-profile.self.cycles-pp.ttwu_do_activate
> 0.17 ± 8% -0.1 0.03 ±100% perf-profile.self.cycles-pp.rb_next
> 0.22 ± 23% -0.1 0.09 ± 31% perf-profile.self.cycles-pp.migrate_task_rq_fair
> 0.15 ± 32% -0.1 0.03 ±100% perf-profile.self.cycles-pp.llist_reverse_order
> 0.16 ± 8% -0.1 0.06 ± 14% perf-profile.self.cycles-pp.wake_affine
> 0.10 ± 31% -0.1 0.03 ±100% perf-profile.self.cycles-pp.sched_ttwu_pending
> 0.14 ± 5% -0.1 0.07 ± 20% perf-profile.self.cycles-pp.read@plt
> 0.32 ± 8% -0.1 0.26 ± 3% perf-profile.self.cycles-pp.anon_pipe_buf_release
> 0.10 ± 6% -0.1 0.04 ± 45% perf-profile.self.cycles-pp.__wake_up_common_lock
> 0.10 ± 9% -0.0 0.07 ± 8% perf-profile.self.cycles-pp.__mark_inode_dirty
> 0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.free_unref_page
> 0.00 +0.1 0.06 ± 6% perf-profile.self.cycles-pp.__alloc_pages
> 0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp._raw_spin_trylock
> 0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.uncharge_folio
> 0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.uncharge_batch
> 0.00 +0.1 0.07 ± 10% perf-profile.self.cycles-pp.menu_select
> 0.00 +0.1 0.08 ± 14% perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
> 0.00 +0.1 0.08 ± 7% perf-profile.self.cycles-pp.__memcg_kmem_charge_page
> 0.00 +0.1 0.10 ± 10% perf-profile.self.cycles-pp.osq_unlock
> 0.07 ± 5% +0.1 0.17 ± 8% perf-profile.self.cycles-pp.copyin
> 0.00 +0.1 0.11 ± 11% perf-profile.self.cycles-pp.__mod_memcg_state
> 0.13 ± 8% +0.1 0.24 ± 6% perf-profile.self.cycles-pp.rcu_all_qs
> 0.14 ± 5% +0.1 0.28 ± 5% perf-profile.self.cycles-pp.__x64_sys_write
> 0.07 ± 10% +0.1 0.21 ± 5% perf-profile.self.cycles-pp.page_copy_sane
> 0.13 ± 12% +0.1 0.28 ± 9% perf-profile.self.cycles-pp.__x64_sys_read
> 0.00 +0.2 0.15 ± 10% perf-profile.self.cycles-pp.propagate_protected_usage
> 0.18 ± 9% +0.2 0.33 ± 4% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
> 0.07 ± 8% +0.2 0.23 ± 5% perf-profile.self.cycles-pp.rw_verify_area
> 0.08 ± 34% +0.2 0.24 ± 27% perf-profile.self.cycles-pp.mnt_user_ns
> 0.13 ± 5% +0.2 0.31 ± 7% perf-profile.self.cycles-pp.kill_fasync
> 0.21 ± 8% +0.2 0.39 ± 5% perf-profile.self.cycles-pp.__might_fault
> 0.06 ± 13% +0.2 0.26 ± 9% perf-profile.self.cycles-pp.copyout
> 0.10 ± 11% +0.2 0.31 ± 8% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
> 0.26 ± 13% +0.2 0.49 ± 6% perf-profile.self.cycles-pp.inode_needs_update_time
> 0.23 ± 8% +0.2 0.47 ± 5% perf-profile.self.cycles-pp.copy_page_from_iter
> 0.14 ± 7% +0.2 0.38 ± 6% perf-profile.self.cycles-pp.file_update_time
> 0.36 ± 7% +0.3 0.62 ± 4% perf-profile.self.cycles-pp.ksys_read
> 0.54 ± 13% +0.3 0.80 ± 4% perf-profile.self.cycles-pp._copy_from_iter
> 0.15 ± 5% +0.3 0.41 ± 8% perf-profile.self.cycles-pp.touch_atime
> 0.14 ± 5% +0.3 0.40 ± 6% perf-profile.self.cycles-pp.__cond_resched
> 0.18 ± 5% +0.3 0.47 ± 4% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> 0.16 ± 8% +0.3 0.46 ± 6% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
> 0.16 ± 9% +0.3 0.47 ± 6% perf-profile.self.cycles-pp.__fdget_pos
> 1.79 ± 8% +0.3 2.12 ± 3% perf-profile.self.cycles-pp.pipe_read
> 0.10 ± 8% +0.3 0.43 ± 17% perf-profile.self.cycles-pp.fsnotify_perm
> 0.20 ± 4% +0.4 0.55 ± 5% perf-profile.self.cycles-pp.ksys_write
> 0.05 ± 76% +0.4 0.46 ± 27% perf-profile.self.cycles-pp.queue_event
> 0.32 ± 6% +0.4 0.73 ± 6% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
> 0.21 ± 9% +0.4 0.62 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.79 ± 8% +0.4 1.22 ± 4% perf-profile.self.cycles-pp.__might_sleep
> 0.44 ± 5% +0.4 0.88 ± 7% perf-profile.self.cycles-pp.do_syscall_64
> 0.26 ± 8% +0.4 0.70 ± 4% perf-profile.self.cycles-pp.atime_needs_update
> 0.42 ± 7% +0.5 0.88 ± 5% perf-profile.self.cycles-pp.__get_task_ioprio
> 0.28 ± 12% +0.5 0.75 ± 5% perf-profile.self.cycles-pp.copy_page_to_iter
> 0.19 ± 6% +0.5 0.68 ± 10% perf-profile.self.cycles-pp.security_file_permission
> 0.31 ± 8% +0.5 0.83 ± 5% perf-profile.self.cycles-pp.aa_file_perm
> 0.05 ± 46% +0.5 0.59 ± 8% perf-profile.self.cycles-pp.osq_lock
> 0.30 ± 7% +0.5 0.85 ± 6% perf-profile.self.cycles-pp._copy_to_iter
> 0.00 +0.6 0.59 ± 6% perf-profile.self.cycles-pp.poll_idle
> 0.13 ± 20% +0.7 0.81 ± 6% perf-profile.self.cycles-pp.mutex_spin_on_owner
> 0.38 ± 9% +0.7 1.12 ± 5% perf-profile.self.cycles-pp.current_time
> 0.08 ± 59% +0.8 0.82 ± 11% perf-profile.self.cycles-pp.intel_idle_irq
> 0.92 ± 6% +0.8 1.72 ± 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 0.01 ±223% +0.8 0.82 ± 6% perf-profile.self.cycles-pp.page_counter_uncharge
> 0.86 ± 7% +1.1 1.91 ± 4% perf-profile.self.cycles-pp.vfs_read
> 1.07 ± 6% +1.1 2.14 ± 5% perf-profile.self.cycles-pp.__fget_light
> 0.67 ± 7% +1.1 1.74 ± 6% perf-profile.self.cycles-pp.vfs_write
> 0.15 ± 12% +1.1 1.28 ± 7% perf-profile.self.cycles-pp.__mutex_lock
> 1.09 ± 6% +1.1 2.22 ± 5% perf-profile.self.cycles-pp.__libc_read
> 0.62 ± 6% +1.2 1.79 ± 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
> 1.16 ± 8% +1.2 2.38 ± 4% perf-profile.self.cycles-pp.__might_resched
> 0.91 ± 7% +1.3 2.20 ± 5% perf-profile.self.cycles-pp.__libc_write
> 0.59 ± 8% +1.3 1.93 ± 6% perf-profile.self.cycles-pp.__entry_text_start
> 1.27 ± 7% +1.7 3.00 ± 6% perf-profile.self.cycles-pp.apparmor_file_permission
> 0.99 ± 8% +2.0 2.98 ± 5% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 1.74 ± 8% +3.4 5.15 ± 6% perf-profile.self.cycles-pp.pipe_write
> 2.98 ± 8% +3.7 6.64 ± 5% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> 2.62 ± 10% +4.8 7.38 ± 5% perf-profile.self.cycles-pp.mutex_lock
> 2.20 ± 10% +5.1 7.30 ± 6% perf-profile.self.cycles-pp.mutex_unlock
>
>
> ***************************************************************************************************
> lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> gcc-11/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/hackbench
>
> commit:
> a2e90611b9 ("sched/fair: Remove capacity inversion detection")
> 829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
>
> a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 177139 -8.1% 162815 hackbench.throughput
> 174484 -18.8% 141618 ± 2% hackbench.throughput_avg
> 177139 -8.1% 162815 hackbench.throughput_best
> 168530 -37.3% 105615 ± 3% hackbench.throughput_worst
> 281.38 +23.1% 346.39 ± 2% hackbench.time.elapsed_time
> 281.38 +23.1% 346.39 ± 2% hackbench.time.elapsed_time.max
> 1.053e+08 ± 2% +688.4% 8.302e+08 ± 9% hackbench.time.involuntary_context_switches
> 21992 +27.8% 28116 ± 2% hackbench.time.system_time
> 6652 +8.2% 7196 hackbench.time.user_time
> 3.482e+08 +289.2% 1.355e+09 ± 9% hackbench.time.voluntary_context_switches
> 2110813 ± 5% +21.6% 2565791 ± 3% cpuidle..usage
> 333.95 +19.5% 399.05 uptime.boot
> 0.03 -0.0 0.03 mpstat.cpu.all.soft%
> 22.68 -2.9 19.77 mpstat.cpu.all.usr%
> 561083 ± 10% +45.5% 816171 ± 12% numa-numastat.node0.local_node
> 614314 ± 9% +36.9% 841173 ± 12% numa-numastat.node0.numa_hit
> 1393279 ± 7% -16.8% 1158997 ± 2% numa-numastat.node1.local_node
> 1443679 ± 5% -14.9% 1229074 ± 3% numa-numastat.node1.numa_hit
> 4129900 ± 8% -23.0% 3181115 vmstat.memory.cache
> 1731 +30.8% 2265 vmstat.procs.r
> 1598044 +290.3% 6237840 ± 7% vmstat.system.cs
> 320762 +60.5% 514672 ± 8% vmstat.system.in
> 962111 ± 6% +46.0% 1404646 ± 7% turbostat.C1
> 233987 ± 5% +51.2% 353892 turbostat.C1E
> 91515563 +97.3% 1.806e+08 ± 10% turbostat.IRQ
> 448466 ± 14% -34.2% 294934 ± 5% turbostat.POLL
> 34.60 -7.3% 32.07 turbostat.RAMWatt
> 514028 ± 2% -14.0% 442125 ± 2% meminfo.AnonPages
> 4006312 ± 8% -23.9% 3047078 meminfo.Cached
> 3321064 ± 10% -32.7% 2236362 ± 2% meminfo.Committed_AS
> 1714752 ± 21% -60.3% 680479 ± 8% meminfo.Inactive
> 1714585 ± 21% -60.3% 680305 ± 8% meminfo.Inactive(anon)
> 757124 ± 18% -67.2% 248485 ± 27% meminfo.Mapped
> 6476123 ± 6% -19.4% 5220738 meminfo.Memused
> 1275724 ± 26% -75.2% 316896 ± 15% meminfo.Shmem
> 6806047 ± 3% -13.3% 5901974 meminfo.max_used_kB
> 161311 ± 23% +31.7% 212494 ± 5% numa-meminfo.node0.AnonPages
> 165693 ± 22% +30.5% 216264 ± 5% numa-meminfo.node0.Inactive
> 165563 ± 22% +30.6% 216232 ± 5% numa-meminfo.node0.Inactive(anon)
> 140638 ± 19% -36.7% 89034 ± 11% numa-meminfo.node0.Mapped
> 352173 ± 14% -35.3% 227805 ± 8% numa-meminfo.node1.AnonPages
> 501396 ± 11% -22.6% 388042 ± 5% numa-meminfo.node1.AnonPages.max
> 1702242 ± 43% -77.8% 378325 ± 22% numa-meminfo.node1.FilePages
> 1540803 ± 25% -70.4% 455592 ± 13% numa-meminfo.node1.Inactive
> 1540767 ± 25% -70.4% 455451 ± 13% numa-meminfo.node1.Inactive(anon)
> 612123 ± 18% -74.9% 153752 ± 37% numa-meminfo.node1.Mapped
> 3085231 ± 24% -53.9% 1420940 ± 14% numa-meminfo.node1.MemUsed
> 254052 ± 4% -19.1% 205632 ± 21% numa-meminfo.node1.SUnreclaim
> 1259640 ± 27% -75.9% 303123 ± 15% numa-meminfo.node1.Shmem
> 304597 ± 7% -20.2% 242920 ± 17% numa-meminfo.node1.Slab
> 40345 ± 23% +31.5% 53054 ± 5% numa-vmstat.node0.nr_anon_pages
> 41412 ± 22% +30.4% 53988 ± 5% numa-vmstat.node0.nr_inactive_anon
> 35261 ± 19% -36.9% 22256 ± 12% numa-vmstat.node0.nr_mapped
> 41412 ± 22% +30.4% 53988 ± 5% numa-vmstat.node0.nr_zone_inactive_anon
> 614185 ± 9% +36.9% 841065 ± 12% numa-vmstat.node0.numa_hit
> 560955 ± 11% +45.5% 816063 ± 12% numa-vmstat.node0.numa_local
> 88129 ± 14% -35.2% 57097 ± 8% numa-vmstat.node1.nr_anon_pages
> 426425 ± 43% -77.9% 94199 ± 22% numa-vmstat.node1.nr_file_pages
> 386166 ± 25% -70.5% 113880 ± 13% numa-vmstat.node1.nr_inactive_anon
> 153658 ± 18% -75.3% 38021 ± 37% numa-vmstat.node1.nr_mapped
> 315775 ± 27% -76.1% 75399 ± 16% numa-vmstat.node1.nr_shmem
> 63411 ± 4% -18.6% 51593 ± 21% numa-vmstat.node1.nr_slab_unreclaimable
> 386166 ± 25% -70.5% 113880 ± 13% numa-vmstat.node1.nr_zone_inactive_anon
> 1443470 ± 5% -14.9% 1228740 ± 3% numa-vmstat.node1.numa_hit
> 1393069 ± 7% -16.8% 1158664 ± 2% numa-vmstat.node1.numa_local
> 128457 ± 2% -14.0% 110530 ± 3% proc-vmstat.nr_anon_pages
> 999461 ± 8% -23.8% 761774 proc-vmstat.nr_file_pages
> 426485 ± 21% -60.1% 170237 ± 9% proc-vmstat.nr_inactive_anon
> 82464 -2.6% 80281 proc-vmstat.nr_kernel_stack
> 187777 ± 18% -66.9% 62076 ± 28% proc-vmstat.nr_mapped
> 316813 ± 27% -75.0% 79228 ± 16% proc-vmstat.nr_shmem
> 31469 -2.0% 30840 proc-vmstat.nr_slab_reclaimable
> 117889 -8.4% 108036 proc-vmstat.nr_slab_unreclaimable
> 426485 ± 21% -60.1% 170237 ± 9% proc-vmstat.nr_zone_inactive_anon
> 187187 ± 12% -43.5% 105680 ± 9% proc-vmstat.numa_hint_faults
> 128363 ± 15% -61.5% 49371 ± 19% proc-vmstat.numa_hint_faults_local
> 47314 ± 22% +39.2% 65863 ± 13% proc-vmstat.numa_pages_migrated
> 457026 ± 9% -18.1% 374188 ± 13% proc-vmstat.numa_pte_updates
> 2586600 ± 3% +27.7% 3302787 ± 8% proc-vmstat.pgalloc_normal
> 1589970 -6.2% 1491838 proc-vmstat.pgfault
> 2347186 ± 10% +37.7% 3232369 ± 8% proc-vmstat.pgfree
> 47314 ± 22% +39.2% 65863 ± 13% proc-vmstat.pgmigrate_success
> 112713 +7.0% 120630 ± 3% proc-vmstat.pgreuse
> 2189056 +22.2% 2674944 ± 2% proc-vmstat.unevictable_pgs_scanned
> 14.08 ± 2% +29.3% 18.20 ± 5% sched_debug.cfs_rq:/.h_nr_running.avg
> 0.80 ± 14% +179.2% 2.23 ± 24% sched_debug.cfs_rq:/.h_nr_running.min
> 245.23 ± 12% -19.7% 196.97 ± 6% sched_debug.cfs_rq:/.load_avg.max
> 2.27 ± 16% +75.0% 3.97 ± 4% sched_debug.cfs_rq:/.load_avg.min
> 45.77 ± 16% -17.8% 37.60 ± 6% sched_debug.cfs_rq:/.load_avg.stddev
> 11842707 +39.9% 16567992 sched_debug.cfs_rq:/.min_vruntime.avg
> 13773080 ± 3% +113.9% 29460281 ± 7% sched_debug.cfs_rq:/.min_vruntime.max
> 11423218 +30.3% 14885830 sched_debug.cfs_rq:/.min_vruntime.min
> 301190 ± 12% +439.9% 1626088 ± 10% sched_debug.cfs_rq:/.min_vruntime.stddev
> 203.83 -16.3% 170.67 sched_debug.cfs_rq:/.removed.load_avg.max
> 14330 ± 3% +30.9% 18756 ± 5% sched_debug.cfs_rq:/.runnable_avg.avg
> 25115 ± 4% +15.5% 28999 ± 6% sched_debug.cfs_rq:/.runnable_avg.max
> 3811 ± 11% +68.0% 6404 ± 21% sched_debug.cfs_rq:/.runnable_avg.min
> 3818 ± 6% +15.3% 4404 ± 7% sched_debug.cfs_rq:/.runnable_avg.stddev
> -849635 +410.6% -4338612 sched_debug.cfs_rq:/.spread0.avg
> 1092373 ± 54% +691.1% 8641673 ± 21% sched_debug.cfs_rq:/.spread0.max
> -1263082 +378.1% -6038905 sched_debug.cfs_rq:/.spread0.min
> 300764 ± 12% +441.8% 1629507 ± 9% sched_debug.cfs_rq:/.spread0.stddev
> 1591 ± 4% -11.1% 1413 ± 3% sched_debug.cfs_rq:/.util_avg.max
> 288.90 ± 11% +64.5% 475.23 ± 13% sched_debug.cfs_rq:/.util_avg.min
> 240.33 ± 2% -32.1% 163.09 ± 3% sched_debug.cfs_rq:/.util_avg.stddev
> 494.27 ± 3% +41.6% 699.85 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.avg
> 11.23 ± 54% +634.1% 82.47 ± 22% sched_debug.cfs_rq:/.util_est_enqueued.min
> 174576 +20.7% 210681 sched_debug.cpu.clock.avg
> 174926 +21.2% 211944 sched_debug.cpu.clock.max
> 174164 +20.3% 209436 sched_debug.cpu.clock.min
> 230.84 ± 33% +226.1% 752.67 ± 20% sched_debug.cpu.clock.stddev
> 172836 +20.6% 208504 sched_debug.cpu.clock_task.avg
> 173552 +21.0% 210079 sched_debug.cpu.clock_task.max
> 156807 +22.3% 191789 sched_debug.cpu.clock_task.min
> 1634 +17.1% 1914 ± 5% sched_debug.cpu.clock_task.stddev
> 0.00 ± 32% +220.1% 0.00 ± 20% sched_debug.cpu.next_balance.stddev
> 14.12 ± 2% +28.7% 18.18 ± 5% sched_debug.cpu.nr_running.avg
> 0.73 ± 25% +213.6% 2.30
± 24% sched_debug.cpu.nr_running.min > 1810086 +461.3% 10159215 ± 10% sched_debug.cpu.nr_switches.avg > 2315994 ± 3% +515.6% 14258195 ± 9% sched_debug.cpu.nr_switches.max > 1529863 +380.3% 7348324 ± 9% sched_debug.cpu.nr_switches.min > 167487 ± 18% +770.8% 1458519 ± 21% sched_debug.cpu.nr_switches.stddev > 174149 +20.2% 209410 sched_debug.cpu_clk > 170980 +20.6% 206240 sched_debug.ktime > 174896 +20.2% 210153 sched_debug.sched_clk > 7.35 +24.9% 9.18 ± 4% perf-stat.i.MPKI > 1.918e+10 +14.4% 2.194e+10 perf-stat.i.branch-instructions > 2.16 -0.1 2.09 perf-stat.i.branch-miss-rate% > 4.133e+08 +6.6% 4.405e+08 perf-stat.i.branch-misses > 23.08 -9.2 13.86 ± 7% perf-stat.i.cache-miss-rate% > 1.714e+08 -37.2% 1.076e+08 ± 3% perf-stat.i.cache-misses > 7.497e+08 +33.7% 1.002e+09 ± 5% perf-stat.i.cache-references > 1636365 +382.4% 7893858 ± 5% perf-stat.i.context-switches > 2.74 -6.8% 2.56 perf-stat.i.cpi > 131725 +288.0% 511159 ± 10% perf-stat.i.cpu-migrations > 1672 +160.8% 4361 ± 4% perf-stat.i.cycles-between-cache-misses > 0.49 +0.6 1.11 ± 5% perf-stat.i.dTLB-load-miss-rate% > 1.417e+08 +158.7% 3.665e+08 ± 5% perf-stat.i.dTLB-load-misses > 2.908e+10 +9.1% 3.172e+10 perf-stat.i.dTLB-loads > 0.12 ± 4% +0.1 0.20 ± 4% perf-stat.i.dTLB-store-miss-rate% > 20805655 ± 4% +90.9% 39716345 ± 4% perf-stat.i.dTLB-store-misses > 1.755e+10 +8.6% 1.907e+10 perf-stat.i.dTLB-stores > 29.04 +3.6 32.62 ± 2% perf-stat.i.iTLB-load-miss-rate% > 56676082 +60.4% 90917582 ± 3% perf-stat.i.iTLB-load-misses > 1.381e+08 +30.6% 1.804e+08 perf-stat.i.iTLB-loads > 1.03e+11 +10.5% 1.139e+11 perf-stat.i.instructions > 1840 -21.1% 1451 ± 4% perf-stat.i.instructions-per-iTLB-miss > 0.37 +10.9% 0.41 perf-stat.i.ipc > 1084 -4.5% 1035 ± 2% perf-stat.i.metric.K/sec > 640.69 +10.3% 706.44 perf-stat.i.metric.M/sec > 5249 -9.3% 4762 ± 3% perf-stat.i.minor-faults > 23.57 +18.7 42.30 ± 8% perf-stat.i.node-load-miss-rate% > 40174555 -45.0% 22109431 ± 10% perf-stat.i.node-loads > 8.84 ± 2% +24.5 33.30 ± 10% 
perf-stat.i.node-store-miss-rate% > 2912322 +60.3% 4667137 ± 16% perf-stat.i.node-store-misses > 34046752 -50.6% 16826621 ± 9% perf-stat.i.node-stores > 5278 -9.2% 4791 ± 3% perf-stat.i.page-faults > 7.24 +12.1% 8.12 ± 4% perf-stat.overall.MPKI > 2.15 -0.1 2.05 perf-stat.overall.branch-miss-rate% > 22.92 -9.5 13.41 ± 7% perf-stat.overall.cache-miss-rate% > 2.73 -6.3% 2.56 perf-stat.overall.cpi > 1644 +43.4% 2358 ± 3% perf-stat.overall.cycles-between-cache-misses > 0.48 +0.5 0.99 ± 4% perf-stat.overall.dTLB-load-miss-rate% > 0.12 ± 4% +0.1 0.19 ± 4% perf-stat.overall.dTLB-store-miss-rate% > 29.06 +2.9 32.01 ± 2% perf-stat.overall.iTLB-load-miss-rate% > 1826 -26.6% 1340 ± 4% perf-stat.overall.instructions-per-iTLB-miss > 0.37 +6.8% 0.39 perf-stat.overall.ipc > 22.74 +6.8 29.53 ± 13% perf-stat.overall.node-load-miss-rate% > 7.63 +8.4 16.02 ± 20% perf-stat.overall.node-store-miss-rate% > 1.915e+10 +9.0% 2.088e+10 perf-stat.ps.branch-instructions > 4.119e+08 +3.9% 4.282e+08 perf-stat.ps.branch-misses > 1.707e+08 -30.5% 1.186e+08 ± 3% perf-stat.ps.cache-misses > 7.446e+08 +19.2% 8.874e+08 ± 4% perf-stat.ps.cache-references > 1611874 +289.1% 6271376 ± 7% perf-stat.ps.context-switches > 127362 +189.0% 368041 ± 11% perf-stat.ps.cpu-migrations > 1.407e+08 +116.2% 3.042e+08 ± 5% perf-stat.ps.dTLB-load-misses > 2.901e+10 +5.4% 3.057e+10 perf-stat.ps.dTLB-loads > 20667480 ± 4% +66.8% 34473793 ± 4% perf-stat.ps.dTLB-store-misses > 1.751e+10 +5.1% 1.84e+10 perf-stat.ps.dTLB-stores > 56310692 +45.0% 81644183 ± 4% perf-stat.ps.iTLB-load-misses > 1.375e+08 +26.1% 1.733e+08 perf-stat.ps.iTLB-loads > 1.028e+11 +6.3% 1.093e+11 perf-stat.ps.instructions > 4929 -24.5% 3723 ± 2% perf-stat.ps.minor-faults > 40134633 -32.9% 26946247 ± 9% perf-stat.ps.node-loads > 2805073 +39.5% 3914304 ± 16% perf-stat.ps.node-store-misses > 33938259 -38.9% 20726382 ± 8% perf-stat.ps.node-stores > 4952 -24.5% 3741 ± 2% perf-stat.ps.page-faults > 2.911e+13 +30.9% 3.809e+13 ± 2% perf-stat.total.instructions > 
15.30 ± 4% -8.6 6.66 ± 5% perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write > 13.84 ± 6% -7.9 5.98 ± 6% perf-profile.calltrace.cycles-pp.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter > 13.61 ± 6% -7.8 5.84 ± 6% perf-profile.calltrace.cycles-pp.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg > 9.00 ± 2% -5.5 3.48 ± 4% perf-profile.calltrace.cycles-pp.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read > 6.44 ± 4% -4.3 2.14 ± 6% perf-profile.calltrace.cycles-pp.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter > 5.83 ± 8% -3.4 2.44 ± 5% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg > 5.81 ± 6% -3.3 2.48 ± 6% perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg > 5.50 ± 7% -3.2 2.32 ± 6% perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb > 5.07 ± 8% -3.0 2.04 ± 6% perf-profile.calltrace.cycles-pp.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags > 6.22 ± 2% -2.9 3.33 ± 3% perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read > 6.17 ± 2% -2.9 3.30 ± 3% perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter > 6.11 ± 2% -2.9 3.24 ± 3% perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg > 50.99 -2.6 48.39 perf-profile.calltrace.cycles-pp.__libc_read > 5.66 ± 3% -2.3 3.35 ± 3% 
perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read > 5.52 ± 3% -2.3 3.27 ± 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write > 3.14 ± 2% -1.7 1.42 ± 4% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic > 2.73 ± 2% -1.6 1.15 ± 4% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor > 2.59 ± 2% -1.5 1.07 ± 4% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter > 2.72 ± 3% -1.4 1.34 ± 6% perf-profile.calltrace.cycles-pp.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read > 41.50 -1.2 40.27 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read > 2.26 ± 4% -1.1 1.12 perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter > 2.76 ± 3% -1.1 1.63 ± 3% perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor > 2.84 ± 3% -1.1 1.71 ± 2% perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic > 2.20 ± 4% -1.1 1.08 perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg > 2.98 ± 2% -1.1 1.90 ± 6% perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write > 1.99 ± 4% -1.1 0.92 ± 2% perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic > 2.10 ± 3% -1.0 1.08 ± 4% perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter > 2.08 ± 4% -0.8 1.24 ± 
3% perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write > 2.16 ± 3% -0.7 1.47 perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read > 2.20 ± 2% -0.7 1.52 ± 3% perf-profile.calltrace.cycles-pp.__kmem_cache_free.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg > 1.46 ± 3% -0.6 0.87 ± 8% perf-profile.calltrace.cycles-pp._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter > 4.82 ± 2% -0.6 4.24 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 1.31 ± 2% -0.4 0.90 ± 4% perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter > 0.96 ± 3% -0.4 0.57 ± 10% perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg > 1.14 ± 3% -0.4 0.76 ± 5% perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb > 0.99 ± 3% -0.3 0.65 ± 8% perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb > 1.30 ± 4% -0.3 0.99 ± 3% perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64 > 0.98 ± 2% -0.3 0.69 ± 3% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.67 -0.2 0.42 ± 50% perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg > 0.56 ± 4% -0.2 0.32 ± 81% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 0.86 ± 2% -0.2 0.63 ± 3% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64 > 1.15 ± 4% -0.2 0.93 ± 4% 
perf-profile.calltrace.cycles-pp.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read.ksys_read > 0.90 -0.2 0.69 ± 3% perf-profile.calltrace.cycles-pp.get_obj_cgroup_from_current.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb > 1.23 ± 3% -0.2 1.07 ± 3% perf-profile.calltrace.cycles-pp.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write > 1.05 ± 2% -0.2 0.88 ± 2% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.84 ± 4% -0.2 0.68 ± 4% perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read > 0.88 -0.1 0.78 ± 5% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64 > 0.94 ± 3% -0.1 0.88 ± 4% perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write > 0.62 ± 2% +0.3 0.90 ± 2% perf-profile.calltrace.cycles-pp.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read > 0.00 +0.6 0.58 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg > 0.00 +0.6 0.61 ± 6% perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 0.00 +0.6 0.62 ± 4% perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop > 0.00 +0.7 0.67 ± 11% perf-profile.calltrace.cycles-pp.update_load_avg.dequeue_entity.dequeue_task_fair.__schedule.schedule > 0.00 +0.7 0.67 ± 7% perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_write > 0.00 +0.8 0.76 ± 4% perf-profile.calltrace.cycles-pp.reweight_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout > 0.00 +0.8 0.77 ± 4% perf-profile.calltrace.cycles-pp.___perf_sw_event.prepare_task_switch.__schedule.schedule.schedule_timeout > 
0.00 +0.8 0.77 ± 8% perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop > 0.00 +0.8 0.81 ± 5% perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare > 0.00 +0.8 0.81 ± 5% perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 0.00 +0.8 0.82 ± 2% perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_read > 0.00 +0.8 0.82 ± 3% perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 0.00 +0.9 0.86 ± 5% perf-profile.calltrace.cycles-pp.perf_trace_sched_wakeup_template.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 0.00 +0.9 0.87 ± 8% perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up > 29.66 +0.9 30.58 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 0.00 +1.0 0.95 ± 3% perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.schedule_timeout > 0.00 +1.0 0.98 ± 4% perf-profile.calltrace.cycles-pp.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common > 0.00 +1.0 0.99 ± 3% perf-profile.calltrace.cycles-pp.update_curr.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up > 0.00 +1.0 1.05 ± 4% perf-profile.calltrace.cycles-pp.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter > 0.00 +1.1 1.07 ± 12% perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function > 27.81 ± 2% +1.2 28.98 perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64 > 27.36 ± 2% +1.2 28.59 
perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read > 0.00 +1.5 1.46 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common > 0.00 +1.6 1.55 ± 4% perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.schedule_timeout.unix_stream_data_wait > 0.00 +1.6 1.60 ± 4% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read > 27.58 +1.6 29.19 perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 +1.6 1.63 ± 5% perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__schedule.schedule > 0.00 +1.6 1.65 ± 5% perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 0.00 +1.7 1.66 ± 6% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare > 0.00 +1.8 1.80 perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 0.00 +1.8 1.84 ± 2% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait > 0.00 +2.0 1.97 ± 2% perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.schedule_timeout.unix_stream_data_wait > 26.63 ± 2% +2.0 28.61 perf-profile.calltrace.cycles-pp.sock_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64 > 0.00 +2.0 2.01 ± 6% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare > 0.00 +2.1 2.09 ± 6% perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common > 0.00 +2.1 2.11 ± 5% 
perf-profile.calltrace.cycles-pp.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe > 25.21 ± 2% +2.2 27.43 perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write > 0.00 +2.4 2.43 ± 5% perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 48.00 +2.7 50.69 perf-profile.calltrace.cycles-pp.__libc_write > 0.00 +2.9 2.87 ± 5% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout > 0.09 ±223% +3.4 3.47 ± 3% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function > 39.07 +4.8 43.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write > 0.66 ± 18% +5.0 5.62 ± 4% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait > 4.73 +5.1 9.88 ± 3% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 0.66 ± 20% +5.3 5.98 ± 3% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common > 35.96 +5.7 41.68 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 0.00 +6.0 6.02 ± 6% perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode > 0.00 +6.2 6.18 ± 6% perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 0.00 +6.4 6.36 ± 6% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.78 ± 19% +6.4 7.15 ± 3% 
perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock > 0.18 ±141% +7.0 7.18 ± 6% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write > 1.89 ± 15% +12.1 13.96 ± 3% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic > 1.92 ± 15% +12.3 14.23 ± 3% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg > 1.66 ± 19% +12.4 14.06 ± 2% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable > 1.96 ± 15% +12.5 14.48 ± 3% perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter > 1.69 ± 19% +12.7 14.38 ± 2% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg > 1.75 ± 19% +13.0 14.75 ± 2% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg > 2.53 ± 10% +13.4 15.90 ± 2% perf-profile.calltrace.cycles-pp.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write > 1.96 ± 16% +13.5 15.42 ± 2% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter > 2.28 ± 15% +14.6 16.86 ± 3% perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read > 15.31 ± 4% -8.6 6.67 ± 5% perf-profile.children.cycles-pp.sock_alloc_send_pskb > 13.85 ± 6% -7.9 5.98 ± 5% perf-profile.children.cycles-pp.alloc_skb_with_frags > 13.70 ± 6% -7.8 5.89 ± 6% perf-profile.children.cycles-pp.__alloc_skb > 9.01 ± 2% -5.5 3.48 ± 4% perf-profile.children.cycles-pp.consume_skb > 
6.86 ± 26% -4.7 2.15 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 11.27 ± 3% -4.6 6.67 ± 3% perf-profile.children.cycles-pp.syscall_return_via_sysret > 6.46 ± 4% -4.3 2.15 ± 6% perf-profile.children.cycles-pp.skb_release_data > 4.18 ± 25% -4.0 0.15 ± 69% perf-profile.children.cycles-pp.___slab_alloc > 5.76 ± 32% -3.9 1.91 ± 3% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 5.98 ± 8% -3.5 2.52 ± 5% perf-profile.children.cycles-pp.kmem_cache_alloc_node > 5.84 ± 6% -3.3 2.50 ± 6% perf-profile.children.cycles-pp.kmalloc_reserve > 3.33 ± 30% -3.3 0.05 ± 88% perf-profile.children.cycles-pp.get_partial_node > 5.63 ± 7% -3.3 2.37 ± 6% perf-profile.children.cycles-pp.__kmalloc_node_track_caller > 5.20 ± 7% -3.1 2.12 ± 6% perf-profile.children.cycles-pp.__kmem_cache_alloc_node > 6.23 ± 2% -2.9 3.33 ± 3% perf-profile.children.cycles-pp.unix_stream_read_actor > 6.18 ± 2% -2.9 3.31 ± 3% perf-profile.children.cycles-pp.skb_copy_datagram_iter > 6.11 ± 2% -2.9 3.25 ± 3% perf-profile.children.cycles-pp.__skb_datagram_iter > 51.39 -2.5 48.85 perf-profile.children.cycles-pp.__libc_read > 3.14 ± 3% -2.5 0.61 ± 13% perf-profile.children.cycles-pp.__slab_free > 5.34 ± 3% -2.1 3.23 ± 3% perf-profile.children.cycles-pp.__entry_text_start > 3.57 ± 2% -1.9 1.66 ± 6% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string > 3.16 ± 2% -1.7 1.43 ± 4% perf-profile.children.cycles-pp._copy_to_iter > 2.74 ± 2% -1.6 1.16 ± 4% perf-profile.children.cycles-pp.copyout > 4.16 ± 2% -1.5 2.62 ± 3% perf-profile.children.cycles-pp.__check_object_size > 2.73 ± 3% -1.4 1.35 ± 6% perf-profile.children.cycles-pp.kmem_cache_free > 2.82 ± 2% -1.2 1.63 ± 3% perf-profile.children.cycles-pp.check_heap_object > 2.27 ± 4% -1.1 1.13 ± 2% perf-profile.children.cycles-pp.skb_release_head_state > 2.85 ± 3% -1.1 1.72 ± 2% perf-profile.children.cycles-pp.simple_copy_to_iter > 2.22 ± 4% -1.1 1.10 perf-profile.children.cycles-pp.unix_destruct_scm > 3.00 ± 2% -1.1 1.91 ± 5% 
perf-profile.children.cycles-pp.skb_copy_datagram_from_iter > 2.00 ± 4% -1.1 0.92 ± 2% perf-profile.children.cycles-pp.sock_wfree > 2.16 ± 3% -0.7 1.43 ± 7% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook > 1.45 ± 3% -0.7 0.73 ± 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack > 2.21 ± 2% -0.7 1.52 ± 3% perf-profile.children.cycles-pp.__kmem_cache_free > 1.49 ± 3% -0.6 0.89 ± 8% perf-profile.children.cycles-pp._copy_from_iter > 1.40 ± 3% -0.6 0.85 ± 13% perf-profile.children.cycles-pp.mod_objcg_state > 0.74 -0.5 0.24 ± 16% perf-profile.children.cycles-pp.__build_skb_around > 1.48 -0.5 1.01 ± 2% perf-profile.children.cycles-pp.get_obj_cgroup_from_current > 2.05 ± 2% -0.5 1.59 ± 2% perf-profile.children.cycles-pp.security_file_permission > 0.98 ± 2% -0.4 0.59 ± 10% perf-profile.children.cycles-pp.copyin > 1.08 ± 3% -0.4 0.72 ± 3% perf-profile.children.cycles-pp.__might_resched > 1.75 -0.3 1.42 ± 4% perf-profile.children.cycles-pp.apparmor_file_permission > 1.32 ± 4% -0.3 1.00 ± 3% perf-profile.children.cycles-pp.sock_recvmsg > 0.54 ± 4% -0.3 0.25 ± 6% perf-profile.children.cycles-pp.skb_unlink > 0.54 ± 6% -0.3 0.26 ± 3% perf-profile.children.cycles-pp.unix_write_space > 0.66 ± 3% -0.3 0.39 ± 4% perf-profile.children.cycles-pp.obj_cgroup_charge > 0.68 ± 2% -0.3 0.41 ± 4% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.86 ± 4% -0.3 0.59 ± 3% perf-profile.children.cycles-pp.__check_heap_object > 0.75 ± 9% -0.3 0.48 ± 2% perf-profile.children.cycles-pp.skb_set_owner_w > 1.84 ± 3% -0.3 1.58 ± 4% perf-profile.children.cycles-pp.aa_sk_perm > 0.68 ± 11% -0.2 0.44 ± 3% perf-profile.children.cycles-pp.skb_queue_tail > 1.22 ± 4% -0.2 0.99 ± 5% perf-profile.children.cycles-pp.__fdget_pos > 0.70 ± 2% -0.2 0.48 ± 5% perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg > 1.16 ± 4% -0.2 0.93 ± 3% perf-profile.children.cycles-pp.security_socket_recvmsg > 0.48 ± 3% -0.2 0.29 ± 4% perf-profile.children.cycles-pp.__might_fault > 0.24 ± 7% 
-0.2 0.05 ± 56% perf-profile.children.cycles-pp.fsnotify_perm > 1.12 ± 4% -0.2 0.93 ± 6% perf-profile.children.cycles-pp.__fget_light > 1.24 ± 3% -0.2 1.07 ± 3% perf-profile.children.cycles-pp.security_socket_sendmsg > 0.61 ± 3% -0.2 0.45 ± 2% perf-profile.children.cycles-pp.__might_sleep > 0.33 ± 5% -0.2 0.17 ± 6% perf-profile.children.cycles-pp.refill_obj_stock > 0.40 ± 2% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.kmalloc_slab > 0.57 ± 2% -0.1 0.45 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt > 0.54 ± 3% -0.1 0.42 ± 2% perf-profile.children.cycles-pp.wait_for_unix_gc > 0.42 ± 2% -0.1 0.30 ± 3% perf-profile.children.cycles-pp.is_vmalloc_addr > 1.00 ± 2% -0.1 0.87 ± 5% perf-profile.children.cycles-pp.__virt_addr_valid > 0.52 ± 2% -0.1 0.41 perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt > 0.33 ± 3% -0.1 0.21 ± 3% perf-profile.children.cycles-pp.tick_sched_handle > 0.36 ± 2% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.tick_sched_timer > 0.47 ± 2% -0.1 0.36 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt > 0.48 ± 2% -0.1 0.36 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt > 0.32 ± 3% -0.1 0.21 ± 5% perf-profile.children.cycles-pp.update_process_times > 0.42 ± 3% -0.1 0.31 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues > 0.26 ± 6% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.kmalloc_size_roundup > 0.20 ± 4% -0.1 0.10 ± 9% perf-profile.children.cycles-pp.task_tick_fair > 0.24 ± 3% -0.1 0.15 ± 4% perf-profile.children.cycles-pp.scheduler_tick > 0.30 ± 5% -0.1 0.21 ± 8% perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages > 0.20 ± 2% -0.1 0.11 ± 6% perf-profile.children.cycles-pp.should_failslab > 0.51 ± 2% -0.1 0.43 ± 6% perf-profile.children.cycles-pp.syscall_enter_from_user_mode > 0.15 ± 8% -0.1 0.07 ± 13% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare > 0.19 ± 4% -0.1 0.12 ± 5% perf-profile.children.cycles-pp.apparmor_socket_sendmsg > 0.20 ± 4% -0.1 0.13 ± 5% 
perf-profile.children.cycles-pp.aa_file_perm > 0.18 ± 5% -0.1 0.12 ± 5% perf-profile.children.cycles-pp.apparmor_socket_recvmsg > 0.14 ± 13% -0.1 0.08 ± 55% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state > 0.24 ± 4% -0.1 0.18 ± 2% perf-profile.children.cycles-pp.rcu_all_qs > 0.18 ± 10% -0.1 0.12 ± 11% perf-profile.children.cycles-pp.memcg_account_kmem > 0.37 ± 3% -0.1 0.31 ± 3% perf-profile.children.cycles-pp.security_socket_getpeersec_dgram > 0.08 -0.0 0.06 ± 8% perf-profile.children.cycles-pp.put_pid > 0.18 ± 3% -0.0 0.16 ± 4% perf-profile.children.cycles-pp.apparmor_socket_getpeersec_dgram > 0.21 ± 3% +0.0 0.23 ± 2% perf-profile.children.cycles-pp.__get_task_ioprio > 0.00 +0.1 0.05 perf-profile.children.cycles-pp.perf_exclude_event > 0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.invalidate_user_asid > 0.00 +0.1 0.07 ± 6% perf-profile.children.cycles-pp.__bitmap_and > 0.05 +0.1 0.13 ± 8% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode > 0.00 +0.1 0.08 ± 7% perf-profile.children.cycles-pp.schedule_debug > 0.00 +0.1 0.08 ± 13% perf-profile.children.cycles-pp.read@plt > 0.00 +0.1 0.08 ± 5% perf-profile.children.cycles-pp.sysvec_reschedule_ipi > 0.00 +0.1 0.10 ± 4% perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test > 0.00 +0.1 0.10 ± 4% perf-profile.children.cycles-pp.place_entity > 0.00 +0.1 0.12 ± 10% perf-profile.children.cycles-pp.native_irq_return_iret > 0.07 ± 14% +0.1 0.19 ± 3% perf-profile.children.cycles-pp.__list_add_valid > 0.00 +0.1 0.13 ± 6% perf-profile.children.cycles-pp.perf_trace_buf_alloc > 0.00 +0.1 0.13 ± 34% perf-profile.children.cycles-pp._find_next_and_bit > 0.00 +0.1 0.14 ± 5% perf-profile.children.cycles-pp.switch_ldt > 0.00 +0.1 0.15 ± 5% perf-profile.children.cycles-pp.check_cfs_rq_runtime > 0.00 +0.1 0.15 ± 30% perf-profile.children.cycles-pp.migrate_task_rq_fair > 0.00 +0.2 0.15 ± 5% perf-profile.children.cycles-pp.__rdgsbase_inactive > 0.00 +0.2 0.16 ± 3% 
perf-profile.children.cycles-pp.save_fpregs_to_fpstate > 0.00 +0.2 0.16 ± 6% perf-profile.children.cycles-pp.ttwu_queue_wakelist > 0.00 +0.2 0.17 perf-profile.children.cycles-pp.perf_trace_buf_update > 0.00 +0.2 0.18 ± 2% perf-profile.children.cycles-pp.rb_insert_color > 0.00 +0.2 0.18 ± 4% perf-profile.children.cycles-pp.rb_next > 0.00 +0.2 0.18 ± 21% perf-profile.children.cycles-pp.__cgroup_account_cputime > 0.01 ±223% +0.2 0.21 ± 28% perf-profile.children.cycles-pp.perf_trace_sched_switch > 0.00 +0.2 0.20 ± 3% perf-profile.children.cycles-pp.select_idle_cpu > 0.00 +0.2 0.20 ± 3% perf-profile.children.cycles-pp.rcu_note_context_switch > 0.00 +0.2 0.21 ± 26% perf-profile.children.cycles-pp.set_task_cpu > 0.00 +0.2 0.22 ± 8% perf-profile.children.cycles-pp.resched_curr > 0.08 ± 5% +0.2 0.31 ± 11% perf-profile.children.cycles-pp.task_h_load > 0.00 +0.2 0.24 ± 3% perf-profile.children.cycles-pp.finish_wait > 0.04 ± 44% +0.3 0.29 ± 5% perf-profile.children.cycles-pp.rb_erase > 0.19 ± 6% +0.3 0.46 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore > 0.20 ± 6% +0.3 0.47 ± 3% perf-profile.children.cycles-pp.__list_del_entry_valid > 0.00 +0.3 0.28 ± 3% perf-profile.children.cycles-pp.__wrgsbase_inactive > 0.02 ±141% +0.3 0.30 ± 2% perf-profile.children.cycles-pp.native_sched_clock > 0.06 ± 13% +0.3 0.34 ± 2% perf-profile.children.cycles-pp.sched_clock_cpu > 0.64 ± 2% +0.3 0.93 perf-profile.children.cycles-pp.mutex_lock > 0.00 +0.3 0.30 ± 5% perf-profile.children.cycles-pp.cr4_update_irqsoff > 0.00 +0.3 0.30 ± 4% perf-profile.children.cycles-pp.clear_buddies > 0.07 ± 55% +0.3 0.37 ± 5% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime > 0.10 ± 66% +0.3 0.42 ± 5% perf-profile.children.cycles-pp.perf_tp_event > 0.02 ±142% +0.3 0.36 ± 6% perf-profile.children.cycles-pp.cpuacct_charge > 0.12 ± 9% +0.4 0.47 ± 11% perf-profile.children.cycles-pp.wake_affine > 0.00 +0.4 0.36 ± 13% perf-profile.children.cycles-pp.available_idle_cpu > 0.05 ± 48% +0.4 0.42 ± 6% 
perf-profile.children.cycles-pp.finish_task_switch > 0.12 ± 4% +0.4 0.49 ± 4% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi > 0.07 ± 17% +0.4 0.48 perf-profile.children.cycles-pp.__calc_delta > 0.03 ±100% +0.5 0.49 ± 4% perf-profile.children.cycles-pp.pick_next_entity > 0.00 +0.5 0.48 ± 8% perf-profile.children.cycles-pp.set_next_buddy > 0.08 ± 14% +0.6 0.66 ± 4% perf-profile.children.cycles-pp.update_min_vruntime > 0.07 ± 17% +0.6 0.68 ± 2% perf-profile.children.cycles-pp.os_xsave > 0.29 ± 7% +0.7 0.99 ± 3% perf-profile.children.cycles-pp.update_cfs_group > 0.17 ± 17% +0.7 0.87 ± 4% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template > 0.14 ± 7% +0.7 0.87 ± 3% perf-profile.children.cycles-pp.__update_load_avg_se > 0.14 ± 16% +0.8 0.90 ± 2% perf-profile.children.cycles-pp.update_rq_clock > 0.08 ± 17% +0.8 0.84 ± 5% perf-profile.children.cycles-pp.check_preempt_wakeup > 0.12 ± 14% +0.8 0.95 ± 3% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq > 0.22 ± 5% +0.8 1.07 ± 3% perf-profile.children.cycles-pp.prepare_to_wait > 0.10 ± 18% +0.9 0.98 ± 3% perf-profile.children.cycles-pp.check_preempt_curr > 29.72 +0.9 30.61 perf-profile.children.cycles-pp.vfs_write > 0.14 ± 11% +0.9 1.03 ± 4% perf-profile.children.cycles-pp.__switch_to > 0.07 ± 20% +0.9 0.99 ± 6% perf-profile.children.cycles-pp.put_prev_entity > 0.12 ± 16% +1.0 1.13 ± 5% perf-profile.children.cycles-pp.___perf_sw_event > 0.07 ± 17% +1.0 1.10 ± 13% perf-profile.children.cycles-pp.select_idle_sibling > 27.82 ± 2% +1.2 28.99 perf-profile.children.cycles-pp.unix_stream_recvmsg > 27.41 ± 2% +1.2 28.63 perf-profile.children.cycles-pp.unix_stream_read_generic > 0.20 ± 15% +1.4 1.59 ± 3% perf-profile.children.cycles-pp.reweight_entity > 0.21 ± 13% +1.4 1.60 ± 4% perf-profile.children.cycles-pp.__switch_to_asm > 0.23 ± 10% +1.4 1.65 ± 5% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate > 0.20 ± 13% +1.5 1.69 ± 3% perf-profile.children.cycles-pp.set_next_entity > 27.59 +1.6 
29.19 perf-profile.children.cycles-pp.sock_write_iter > 0.28 ± 10% +1.8 2.12 ± 5% perf-profile.children.cycles-pp.switch_fpu_return > 0.26 ± 11% +1.8 2.10 ± 6% perf-profile.children.cycles-pp.select_task_rq_fair > 26.66 ± 2% +2.0 28.63 perf-profile.children.cycles-pp.sock_sendmsg > 0.31 ± 12% +2.1 2.44 ± 5% perf-profile.children.cycles-pp.select_task_rq > 0.30 ± 14% +2.2 2.46 ± 4% perf-profile.children.cycles-pp.prepare_task_switch > 25.27 ± 2% +2.2 27.47 perf-profile.children.cycles-pp.unix_stream_sendmsg > 2.10 +2.3 4.38 ± 2% perf-profile.children.cycles-pp._raw_spin_lock > 0.40 ± 14% +2.5 2.92 ± 5% perf-profile.children.cycles-pp.dequeue_entity > 48.40 +2.6 51.02 perf-profile.children.cycles-pp.__libc_write > 0.46 ± 15% +3.1 3.51 ± 3% perf-profile.children.cycles-pp.enqueue_entity > 0.49 ± 10% +3.2 3.64 ± 7% perf-profile.children.cycles-pp.update_load_avg > 0.53 ± 20% +3.4 3.91 ± 3% perf-profile.children.cycles-pp.update_curr > 80.81 +3.4 84.24 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 0.50 ± 12% +3.5 4.00 ± 4% perf-profile.children.cycles-pp.switch_mm_irqs_off > 0.55 ± 9% +3.8 4.38 ± 4% perf-profile.children.cycles-pp.pick_next_task_fair > 9.60 +4.6 14.15 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode > 0.78 ± 13% +4.9 5.65 ± 4% perf-profile.children.cycles-pp.dequeue_task_fair > 0.78 ± 15% +5.2 5.99 ± 3% perf-profile.children.cycles-pp.enqueue_task_fair > 74.30 +5.6 79.86 perf-profile.children.cycles-pp.do_syscall_64 > 0.90 ± 15% +6.3 7.16 ± 3% perf-profile.children.cycles-pp.ttwu_do_activate > 0.33 ± 31% +6.3 6.61 ± 6% perf-profile.children.cycles-pp.exit_to_user_mode_loop > 0.82 ± 15% +8.1 8.92 ± 5% perf-profile.children.cycles-pp.exit_to_user_mode_prepare > 1.90 ± 16% +12.2 14.10 ± 2% perf-profile.children.cycles-pp.try_to_wake_up > 2.36 ± 11% +12.2 14.60 ± 3% perf-profile.children.cycles-pp.schedule_timeout > 1.95 ± 15% +12.5 14.41 ± 2% perf-profile.children.cycles-pp.autoremove_wake_function > 2.01 ± 15% +12.8 14.76 
± 2% perf-profile.children.cycles-pp.__wake_up_common > 2.23 ± 13% +13.2 15.45 ± 2% perf-profile.children.cycles-pp.__wake_up_common_lock > 2.53 ± 10% +13.4 15.90 ± 2% perf-profile.children.cycles-pp.sock_def_readable > 2.29 ± 15% +14.6 16.93 ± 3% perf-profile.children.cycles-pp.unix_stream_data_wait > 2.61 ± 13% +18.0 20.65 ± 4% perf-profile.children.cycles-pp.schedule > 2.66 ± 13% +18.1 20.77 ± 4% perf-profile.children.cycles-pp.__schedule > 11.25 ± 3% -4.6 6.67 ± 3% perf-profile.self.cycles-pp.syscall_return_via_sysret > 5.76 ± 32% -3.9 1.90 ± 3% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 8.69 ± 3% -3.4 5.27 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > 3.11 ± 3% -2.5 0.60 ± 13% perf-profile.self.cycles-pp.__slab_free > 6.65 ± 2% -2.2 4.47 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 4.78 ± 3% -1.9 2.88 ± 3% perf-profile.self.cycles-pp.__entry_text_start > 3.52 ± 2% -1.9 1.64 ± 6% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string > 2.06 ± 3% -1.1 0.96 ± 5% perf-profile.self.cycles-pp.kmem_cache_free > 1.42 ± 3% -1.0 0.46 ± 10% perf-profile.self.cycles-pp.check_heap_object > 1.43 ± 4% -0.8 0.64 perf-profile.self.cycles-pp.sock_wfree > 0.99 ± 3% -0.8 0.21 ± 12% perf-profile.self.cycles-pp.skb_release_data > 0.84 ± 8% -0.7 0.10 ± 64% perf-profile.self.cycles-pp.___slab_alloc > 1.97 ± 2% -0.6 1.32 perf-profile.self.cycles-pp.unix_stream_read_generic > 1.60 ± 3% -0.5 1.11 ± 4% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook > 1.24 ± 2% -0.5 0.75 ± 11% perf-profile.self.cycles-pp.mod_objcg_state > 0.71 -0.5 0.23 ± 15% perf-profile.self.cycles-pp.__build_skb_around > 0.95 ± 3% -0.5 0.50 ± 6% perf-profile.self.cycles-pp.__alloc_skb > 0.97 ± 4% -0.4 0.55 ± 5% perf-profile.self.cycles-pp.kmem_cache_alloc_node > 0.99 ± 3% -0.4 0.59 ± 4% perf-profile.self.cycles-pp.vfs_write > 1.38 ± 2% -0.4 0.99 perf-profile.self.cycles-pp.__kmem_cache_free > 0.86 ± 2% -0.4 0.50 ± 3% 
perf-profile.self.cycles-pp.__kmem_cache_alloc_node > 0.92 ± 4% -0.4 0.56 ± 4% perf-profile.self.cycles-pp.sock_write_iter > 1.06 ± 3% -0.4 0.70 ± 3% perf-profile.self.cycles-pp.__might_resched > 0.73 ± 4% -0.3 0.44 ± 4% perf-profile.self.cycles-pp.__cond_resched > 0.85 ± 3% -0.3 0.59 ± 4% perf-profile.self.cycles-pp.__check_heap_object > 1.46 ± 7% -0.3 1.20 ± 2% perf-profile.self.cycles-pp.unix_stream_sendmsg > 0.73 ± 9% -0.3 0.47 ± 2% perf-profile.self.cycles-pp.skb_set_owner_w > 1.54 -0.3 1.28 ± 4% perf-profile.self.cycles-pp.apparmor_file_permission > 0.74 ± 3% -0.2 0.50 ± 2% perf-profile.self.cycles-pp.get_obj_cgroup_from_current > 1.15 ± 3% -0.2 0.91 ± 8% perf-profile.self.cycles-pp.aa_sk_perm > 0.60 -0.2 0.36 ± 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.65 ± 4% -0.2 0.45 ± 6% perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg > 0.24 ± 6% -0.2 0.05 ± 56% perf-profile.self.cycles-pp.fsnotify_perm > 0.76 ± 3% -0.2 0.58 ± 2% perf-profile.self.cycles-pp.sock_read_iter > 1.10 ± 4% -0.2 0.92 ± 6% perf-profile.self.cycles-pp.__fget_light > 0.42 ± 3% -0.2 0.25 ± 4% perf-profile.self.cycles-pp.obj_cgroup_charge > 0.32 ± 4% -0.2 0.17 ± 6% perf-profile.self.cycles-pp.refill_obj_stock > 0.29 -0.2 0.14 ± 8% perf-profile.self.cycles-pp.__kmalloc_node_track_caller > 0.54 ± 3% -0.1 0.40 ± 2% perf-profile.self.cycles-pp.__might_sleep > 0.30 ± 7% -0.1 0.16 ± 22% perf-profile.self.cycles-pp.security_file_permission > 0.34 ± 3% -0.1 0.21 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack > 0.41 ± 3% -0.1 0.29 ± 3% perf-profile.self.cycles-pp.is_vmalloc_addr > 0.27 ± 3% -0.1 0.16 ± 6% perf-profile.self.cycles-pp._copy_from_iter > 0.24 ± 3% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.ksys_write > 0.95 ± 2% -0.1 0.84 ± 5% perf-profile.self.cycles-pp.__virt_addr_valid > 0.56 ± 11% -0.1 0.46 ± 4% perf-profile.self.cycles-pp.sock_def_readable > 0.16 ± 7% -0.1 0.06 ± 18% perf-profile.self.cycles-pp.sock_recvmsg > 0.22 ± 5% -0.1 0.14 ± 2% 
perf-profile.self.cycles-pp.ksys_read > 0.27 ± 4% -0.1 0.19 ± 5% perf-profile.self.cycles-pp.kmalloc_slab > 0.28 ± 2% -0.1 0.20 ± 2% perf-profile.self.cycles-pp.consume_skb > 0.35 ± 2% -0.1 0.28 ± 3% perf-profile.self.cycles-pp.__check_object_size > 0.13 ± 8% -0.1 0.06 ± 18% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare > 0.20 ± 5% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.kmalloc_reserve > 0.26 ± 5% -0.1 0.19 ± 4% perf-profile.self.cycles-pp.sock_alloc_send_pskb > 0.42 ± 2% -0.1 0.35 ± 7% perf-profile.self.cycles-pp.syscall_enter_from_user_mode > 0.19 ± 5% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.aa_file_perm > 0.16 ± 4% -0.1 0.10 ± 4% perf-profile.self.cycles-pp.skb_copy_datagram_from_iter > 0.18 ± 4% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.apparmor_socket_sendmsg > 0.18 ± 5% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.apparmor_socket_recvmsg > 0.15 ± 5% -0.1 0.10 ± 5% perf-profile.self.cycles-pp.alloc_skb_with_frags > 0.64 ± 3% -0.1 0.59 perf-profile.self.cycles-pp.__libc_write > 0.20 ± 4% -0.1 0.15 ± 3% perf-profile.self.cycles-pp._copy_to_iter > 0.15 ± 5% -0.1 0.10 ± 11% perf-profile.self.cycles-pp.sock_sendmsg > 0.08 ± 4% -0.1 0.03 ± 81% perf-profile.self.cycles-pp.copyout > 0.11 ± 6% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.__fdget_pos > 0.12 ± 5% -0.0 0.07 ± 10% perf-profile.self.cycles-pp.kmalloc_size_roundup > 0.34 ± 3% -0.0 0.29 perf-profile.self.cycles-pp.do_syscall_64 > 0.20 ± 4% -0.0 0.15 ± 4% perf-profile.self.cycles-pp.rcu_all_qs > 0.41 ± 3% -0.0 0.37 ± 8% perf-profile.self.cycles-pp.unix_stream_recvmsg > 0.22 ± 2% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.unix_destruct_scm > 0.09 ± 4% -0.0 0.05 perf-profile.self.cycles-pp.should_failslab > 0.10 ± 15% -0.0 0.06 ± 50% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state > 0.11 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.__might_fault > 0.16 ± 2% -0.0 0.13 ± 6% perf-profile.self.cycles-pp.obj_cgroup_uncharge_pages > 0.18 ± 4% -0.0 0.16 ± 3% 
perf-profile.self.cycles-pp.security_socket_getpeersec_dgram > 0.28 ± 2% -0.0 0.25 ± 2% perf-profile.self.cycles-pp.unix_write_space > 0.17 ± 2% -0.0 0.15 ± 5% perf-profile.self.cycles-pp.apparmor_socket_getpeersec_dgram > 0.08 ± 6% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.security_socket_sendmsg > 0.12 ± 4% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.__skb_datagram_iter > 0.24 ± 2% -0.0 0.22 perf-profile.self.cycles-pp.mutex_unlock > 0.08 ± 5% +0.0 0.10 ± 6% perf-profile.self.cycles-pp.scm_recv > 0.17 ± 2% +0.0 0.19 ± 3% perf-profile.self.cycles-pp.__x64_sys_read > 0.19 ± 3% +0.0 0.22 ± 2% perf-profile.self.cycles-pp.__get_task_ioprio > 0.00 +0.1 0.06 perf-profile.self.cycles-pp.finish_wait > 0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.cr4_update_irqsoff > 0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.invalidate_user_asid > 0.00 +0.1 0.07 ± 12% perf-profile.self.cycles-pp.wake_affine > 0.00 +0.1 0.07 ± 7% perf-profile.self.cycles-pp.check_cfs_rq_runtime > 0.00 +0.1 0.07 ± 5% perf-profile.self.cycles-pp.perf_trace_buf_update > 0.00 +0.1 0.07 ± 9% perf-profile.self.cycles-pp.asm_sysvec_reschedule_ipi > 0.00 +0.1 0.07 ± 10% perf-profile.self.cycles-pp.__bitmap_and > 0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp.schedule_debug > 0.00 +0.1 0.08 ± 13% perf-profile.self.cycles-pp.read@plt > 0.00 +0.1 0.08 ± 12% perf-profile.self.cycles-pp.perf_trace_buf_alloc > 0.00 +0.1 0.09 ± 35% perf-profile.self.cycles-pp.migrate_task_rq_fair > 0.00 +0.1 0.09 ± 5% perf-profile.self.cycles-pp.place_entity > 0.00 +0.1 0.10 ± 4% perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test > 0.00 +0.1 0.10 perf-profile.self.cycles-pp.__wake_up_common_lock > 0.07 ± 17% +0.1 0.18 ± 3% perf-profile.self.cycles-pp.__list_add_valid > 0.00 +0.1 0.11 ± 8% perf-profile.self.cycles-pp.native_irq_return_iret > 0.00 +0.1 0.12 ± 6% perf-profile.self.cycles-pp.select_idle_cpu > 0.00 +0.1 0.12 ± 34% perf-profile.self.cycles-pp._find_next_and_bit > 0.00 +0.1 0.13 ± 25% 
perf-profile.self.cycles-pp.__cgroup_account_cputime > 0.00 +0.1 0.13 ± 7% perf-profile.self.cycles-pp.switch_ldt > 0.00 +0.1 0.14 ± 5% perf-profile.self.cycles-pp.check_preempt_curr > 0.00 +0.1 0.15 ± 2% perf-profile.self.cycles-pp.save_fpregs_to_fpstate > 0.00 +0.1 0.15 ± 5% perf-profile.self.cycles-pp.__rdgsbase_inactive > 0.14 ± 3% +0.2 0.29 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore > 0.00 +0.2 0.15 ± 7% perf-profile.self.cycles-pp.ttwu_queue_wakelist > 0.00 +0.2 0.17 ± 4% perf-profile.self.cycles-pp.rb_insert_color > 0.00 +0.2 0.17 ± 5% perf-profile.self.cycles-pp.rb_next > 0.00 +0.2 0.18 ± 2% perf-profile.self.cycles-pp.autoremove_wake_function > 0.01 ±223% +0.2 0.19 ± 6% perf-profile.self.cycles-pp.ttwu_do_activate > 0.00 +0.2 0.20 ± 2% perf-profile.self.cycles-pp.rcu_note_context_switch > 0.00 +0.2 0.20 ± 7% perf-profile.self.cycles-pp.exit_to_user_mode_loop > 0.27 +0.2 0.47 ± 3% perf-profile.self.cycles-pp.mutex_lock > 0.00 +0.2 0.20 ± 28% perf-profile.self.cycles-pp.perf_trace_sched_switch > 0.00 +0.2 0.21 ± 9% perf-profile.self.cycles-pp.resched_curr > 0.04 ± 45% +0.2 0.26 ± 7% perf-profile.self.cycles-pp.perf_tp_event > 0.06 ± 7% +0.2 0.28 ± 8% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template > 0.19 ± 7% +0.2 0.41 ± 5% perf-profile.self.cycles-pp.__list_del_entry_valid > 0.08 ± 5% +0.2 0.31 ± 11% perf-profile.self.cycles-pp.task_h_load > 0.00 +0.2 0.23 ± 5% perf-profile.self.cycles-pp.finish_task_switch > 0.03 ± 70% +0.2 0.27 ± 5% perf-profile.self.cycles-pp.rb_erase > 0.02 ±142% +0.3 0.29 ± 2% perf-profile.self.cycles-pp.native_sched_clock > 0.00 +0.3 0.28 ± 3% perf-profile.self.cycles-pp.__wrgsbase_inactive > 0.00 +0.3 0.28 ± 6% perf-profile.self.cycles-pp.clear_buddies > 0.07 ± 10% +0.3 0.35 ± 3% perf-profile.self.cycles-pp.schedule_timeout > 0.03 ± 70% +0.3 0.33 ± 3% perf-profile.self.cycles-pp.select_task_rq > 0.06 ± 13% +0.3 0.36 ± 4% perf-profile.self.cycles-pp.__wake_up_common > 0.06 ± 13% +0.3 0.36 ± 3% 
perf-profile.self.cycles-pp.dequeue_entity > 0.06 ± 18% +0.3 0.37 ± 7% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime > 0.01 ±223% +0.3 0.33 ± 4% perf-profile.self.cycles-pp.schedule > 0.02 ±142% +0.3 0.35 ± 7% perf-profile.self.cycles-pp.cpuacct_charge > 0.01 ±223% +0.3 0.35 perf-profile.self.cycles-pp.set_next_entity > 0.00 +0.4 0.35 ± 13% perf-profile.self.cycles-pp.available_idle_cpu > 0.08 ± 10% +0.4 0.44 ± 5% perf-profile.self.cycles-pp.prepare_to_wait > 0.63 ± 3% +0.4 1.00 ± 4% perf-profile.self.cycles-pp.vfs_read > 0.02 ±142% +0.4 0.40 ± 4% perf-profile.self.cycles-pp.check_preempt_wakeup > 0.02 ±141% +0.4 0.42 ± 4% perf-profile.self.cycles-pp.pick_next_entity > 0.07 ± 17% +0.4 0.48 perf-profile.self.cycles-pp.__calc_delta > 0.06 ± 14% +0.4 0.47 ± 3% perf-profile.self.cycles-pp.unix_stream_data_wait > 0.04 ± 45% +0.4 0.45 ± 4% perf-profile.self.cycles-pp.switch_fpu_return > 0.00 +0.5 0.46 ± 7% perf-profile.self.cycles-pp.set_next_buddy > 0.07 ± 17% +0.5 0.53 ± 3% perf-profile.self.cycles-pp.select_task_rq_fair > 0.08 ± 16% +0.5 0.55 ± 4% perf-profile.self.cycles-pp.try_to_wake_up > 0.08 ± 19% +0.5 0.56 ± 3% perf-profile.self.cycles-pp.update_rq_clock > 0.02 ±141% +0.5 0.50 ± 10% perf-profile.self.cycles-pp.select_idle_sibling > 0.77 ± 2% +0.5 1.25 ± 2% perf-profile.self.cycles-pp.__libc_read > 0.09 ± 19% +0.5 0.59 ± 3% perf-profile.self.cycles-pp.reweight_entity > 0.08 ± 14% +0.5 0.59 ± 2% perf-profile.self.cycles-pp.dequeue_task_fair > 0.08 ± 13% +0.6 0.64 ± 5% perf-profile.self.cycles-pp.update_min_vruntime > 0.02 ±141% +0.6 0.58 ± 7% perf-profile.self.cycles-pp.put_prev_entity > 0.06 ± 11% +0.6 0.64 ± 4% perf-profile.self.cycles-pp.enqueue_task_fair > 0.07 ± 18% +0.6 0.68 ± 3% perf-profile.self.cycles-pp.os_xsave > 1.39 ± 2% +0.7 2.06 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.28 ± 8% +0.7 0.97 ± 4% perf-profile.self.cycles-pp.update_cfs_group > 0.14 ± 8% +0.7 0.83 ± 3% perf-profile.self.cycles-pp.__update_load_avg_se > 1.76 ± 
3% +0.7 2.47 ± 3% perf-profile.self.cycles-pp._raw_spin_lock > 0.12 ± 12% +0.7 0.85 ± 5% perf-profile.self.cycles-pp.prepare_task_switch > 0.12 ± 12% +0.8 0.91 ± 3% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq > 0.13 ± 12% +0.8 0.93 ± 5% perf-profile.self.cycles-pp.pick_next_task_fair > 0.13 ± 12% +0.9 0.98 ± 4% perf-profile.self.cycles-pp.__switch_to > 0.11 ± 18% +0.9 1.06 ± 5% perf-profile.self.cycles-pp.___perf_sw_event > 0.16 ± 11% +1.2 1.34 ± 4% perf-profile.self.cycles-pp.enqueue_entity > 0.20 ± 12% +1.4 1.58 ± 4% perf-profile.self.cycles-pp.__switch_to_asm > 0.23 ± 10% +1.4 1.65 ± 5% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate > 0.25 ± 12% +1.5 1.77 ± 4% perf-profile.self.cycles-pp.__schedule > 0.22 ± 10% +1.6 1.78 ± 10% perf-profile.self.cycles-pp.update_load_avg > 0.23 ± 16% +1.7 1.91 ± 7% perf-profile.self.cycles-pp.update_curr > 0.48 ± 11% +3.4 3.86 ± 4% perf-profile.self.cycles-pp.switch_mm_irqs_off > > > To reproduce: > > git clone https://github.com/intel/lkp-tests.git > cd lkp-tests > sudo bin/lkp install job.yaml # job file is attached in this email > bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run > sudo bin/lkp run generated-yaml-file > > # if come across any failure that blocks the test, > # please remove ~/.lkp and /lkp dir to run from a clean state. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-02-21 16:57 ` Roman Kagan @ 2023-02-21 17:26 ` Vincent Guittot 2023-02-27 8:42 ` Roman Kagan 0 siblings, 1 reply; 14+ messages in thread From: Vincent Guittot @ 2023-02-21 17:26 UTC (permalink / raw) To: Roman Kagan, Vincent Guittot, Peter Zijlstra, linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: > > On Tue, Feb 21, 2023 at 10:38:44AM +0100, Vincent Guittot wrote: > > On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote: > > > > > > From: Zhang Qiao <zhangqiao22@huawei.com> > > > > > > When a scheduling entity is placed onto cfs_rq, its vruntime is pulled > > > to the base level (around cfs_rq->min_vruntime), so that the entity > > > doesn't gain extra boost when placed backwards. > > > > > > However, if the entity being placed wasn't executed for a long time, its > > > vruntime may get too far behind (e.g. while cfs_rq was executing a > > > low-weight hog), which can inverse the vruntime comparison due to s64 > > > overflow. This results in the entity being placed with its original > > > vruntime way forwards, so that it will effectively never get to the cpu. > > > > > > To prevent that, ignore the vruntime of the entity being placed if it > > > didn't execute for longer than the time that can lead to an overflow. 
> > > > > > Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com> > > > [rkagan: formatted, adjusted commit log, comments, cutoff value] > > > Co-developed-by: Roman Kagan <rkagan@amazon.de> > > > Signed-off-by: Roman Kagan <rkagan@amazon.de> > > > > Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> > > > > > --- > > > v2 -> v3: > > > - make cutoff less arbitrary and update comments [Vincent] > > > > > > v1 -> v2: > > > - add Zhang Qiao's s-o-b > > > - fix constant promotion on 32bit > > > > > > kernel/sched/fair.c | 21 +++++++++++++++++++-- > > > 1 file changed, 19 insertions(+), 2 deletions(-) > > Turns out Peter took v2 through his tree, and it has already landed in > Linus' master. > > What scares me, though, is that I've got a message from the test robot > that this commit dramatically affected hackbench results, see the quote > below. I expected the commit not to affect any benchmarks. > > Any idea what could have caused this change? Hmm, it's most probably because se->exec_start is reset after a migration and the condition becomes true for a newly migrated task, whereas its vruntime should be after min_vruntime. We have missed this condition. > > Thanks, > Roman. > > > On Tue, Feb 21, 2023 at 03:34:16PM +0800, kernel test robot wrote: > > FYI, we noticed a 125.5% improvement of hackbench.throughput due to commit: > > > > commit: 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ("sched/fair: sanitize vruntime of entity being placed") > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > in testcase: hackbench > > on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory > > with following parameters: > > > > nr_threads: 50% > > iterations: 8 > > mode: process > > ipc: pipe > > cpufreq_governor: performance > > > > test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
> > test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c > > > > In addition to that, the commit also has significant impact on the following tests: > > > > +------------------+--------------------------------------------------+ > > | testcase: change | hackbench: hackbench.throughput -8.1% regression | > > | test machine | 104 threads 2 sockets (Skylake) with 192G memory | > > | test parameters | cpufreq_governor=performance | > > | | ipc=socket | > > | | iterations=4 | > > | | mode=process | > > | | nr_threads=100% | > > +------------------+--------------------------------------------------+ > > > > Details are as below: > > > > ========================================================================================= > > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase: > > gcc-11/performance/pipe/8/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/hackbench > > > > commit: > > a2e90611b9 ("sched/fair: Remove capacity inversion detection") > > 829c1651e9 ("sched/fair: sanitize vruntime of entity being placed") > > > > a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 308887 ± 5% +125.5% 696539 hackbench.throughput > > 259291 ± 2% +127.3% 589293 hackbench.throughput_avg > > 308887 ± 5% +125.5% 696539 hackbench.throughput_best > > 198770 ± 2% +105.5% 408552 ± 4% hackbench.throughput_worst > > 319.60 ± 2% -55.8% 141.24 hackbench.time.elapsed_time > > 319.60 ± 2% -55.8% 141.24 hackbench.time.elapsed_time.max > > 1.298e+09 ± 8% -87.6% 1.613e+08 ± 7% hackbench.time.involuntary_context_switches > > 477107 -12.5% 417660 hackbench.time.minor_page_faults > > 24683 ± 2% -57.2% 10562 hackbench.time.system_time > > 2136 ± 3% -45.0% 1174 hackbench.time.user_time > > 3.21e+09 ± 4% -83.0% 5.442e+08 ± 3% hackbench.time.voluntary_context_switches > > 5.28e+08 ± 4% +8.4% 
5.723e+08 ± 3% cpuidle..time > > 365.97 ± 2% -48.9% 187.12 uptime.boot > > 3322559 ± 3% +34.3% 4463206 ± 15% vmstat.memory.cache > > 14194257 ± 2% -62.8% 5279904 ± 3% vmstat.system.cs > > 2120781 ± 3% -72.8% 576421 ± 4% vmstat.system.in > > 1.84 ± 12% +2.6 4.48 ± 5% mpstat.cpu.all.idle% > > 2.49 ± 3% -1.1 1.39 ± 4% mpstat.cpu.all.irq% > > 0.04 ± 12% +0.0 0.05 mpstat.cpu.all.soft% > > 7.36 +2.2 9.56 mpstat.cpu.all.usr% > > 61555 ± 6% -72.8% 16751 ± 16% numa-meminfo.node1.Active > > 61515 ± 6% -72.8% 16717 ± 16% numa-meminfo.node1.Active(anon) > > 960182 ±102% +225.6% 3125990 ± 42% numa-meminfo.node1.FilePages > > 1754002 ± 53% +137.9% 4173379 ± 34% numa-meminfo.node1.MemUsed > > 35296824 ± 6% +157.8% 91005048 numa-numastat.node0.local_node > > 35310119 ± 6% +157.9% 91058472 numa-numastat.node0.numa_hit > > 35512423 ± 5% +159.7% 92232951 numa-numastat.node1.local_node > > 35577275 ± 4% +159.4% 92273266 numa-numastat.node1.numa_hit > > 35310253 ± 6% +157.9% 91058211 numa-vmstat.node0.numa_hit > > 35296958 ± 6% +157.8% 91004787 numa-vmstat.node0.numa_local > > 15337 ± 6% -72.5% 4216 ± 17% numa-vmstat.node1.nr_active_anon > > 239988 ±102% +225.7% 781607 ± 42% numa-vmstat.node1.nr_file_pages > > 15337 ± 6% -72.5% 4216 ± 17% numa-vmstat.node1.nr_zone_active_anon > > 35577325 ± 4% +159.4% 92273215 numa-vmstat.node1.numa_hit > > 35512473 ± 5% +159.7% 92232900 numa-vmstat.node1.numa_local > > 64500 ± 8% -61.8% 24643 ± 32% meminfo.Active > > 64422 ± 8% -61.9% 24568 ± 32% meminfo.Active(anon) > > 140271 ± 14% -38.0% 86979 ± 24% meminfo.AnonHugePages > > 372672 ± 2% +13.3% 422069 meminfo.AnonPages > > 3205235 ± 3% +35.1% 4329061 ± 15% meminfo.Cached > > 1548601 ± 7% +77.4% 2747319 ± 24% meminfo.Committed_AS > > 783193 ± 14% +154.9% 1996137 ± 33% meminfo.Inactive > > 783010 ± 14% +154.9% 1995951 ± 33% meminfo.Inactive(anon) > > 4986534 ± 2% +28.2% 6394741 ± 10% meminfo.Memused > > 475092 ± 22% +236.5% 1598918 ± 41% meminfo.Shmem > > 2777 -2.1% 2719 turbostat.Bzy_MHz > > 11143123 
± 6% +72.0% 19162667 turbostat.C1 > > 0.24 ± 7% +0.7 0.94 ± 3% turbostat.C1% > > 100440 ± 18% +203.8% 305136 ± 15% turbostat.C1E > > 0.06 ± 9% +0.1 0.18 ± 11% turbostat.C1E% > > 1.24 ± 3% +1.6 2.81 ± 4% turbostat.C6% > > 1.38 ± 3% +156.1% 3.55 ± 3% turbostat.CPU%c1 > > 0.33 ± 5% +76.5% 0.58 ± 7% turbostat.CPU%c6 > > 0.16 +31.2% 0.21 turbostat.IPC > > 6.866e+08 ± 5% -87.8% 83575393 ± 5% turbostat.IRQ > > 0.33 ± 27% +0.2 0.57 turbostat.POLL% > > 0.12 ± 10% +176.4% 0.33 ± 12% turbostat.Pkg%pc2 > > 0.09 ± 7% -100.0% 0.00 turbostat.Pkg%pc6 > > 61.33 +5.2% 64.50 ± 2% turbostat.PkgTmp > > 14.81 +2.0% 15.11 turbostat.RAMWatt > > 16242 ± 8% -62.0% 6179 ± 32% proc-vmstat.nr_active_anon > > 93150 ± 2% +13.2% 105429 proc-vmstat.nr_anon_pages > > 801219 ± 3% +35.1% 1082320 ± 15% proc-vmstat.nr_file_pages > > 195506 ± 14% +155.2% 498919 ± 33% proc-vmstat.nr_inactive_anon > > 118682 ± 22% +236.9% 399783 ± 41% proc-vmstat.nr_shmem > > 16242 ± 8% -62.0% 6179 ± 32% proc-vmstat.nr_zone_active_anon > > 195506 ± 14% +155.2% 498919 ± 33% proc-vmstat.nr_zone_inactive_anon > > 70889233 ± 5% +158.6% 1.833e+08 proc-vmstat.numa_hit > > 70811086 ± 5% +158.8% 1.832e+08 proc-vmstat.numa_local > > 55885 ± 22% -67.2% 18327 ± 38% proc-vmstat.numa_pages_migrated > > 422312 ± 10% -95.4% 19371 ± 7% proc-vmstat.pgactivate > > 71068460 ± 5% +158.1% 1.834e+08 proc-vmstat.pgalloc_normal > > 1554994 -19.6% 1250346 ± 4% proc-vmstat.pgfault > > 71011267 ± 5% +155.9% 1.817e+08 proc-vmstat.pgfree > > 55885 ± 22% -67.2% 18327 ± 38% proc-vmstat.pgmigrate_success > > 111247 ± 2% -35.0% 72355 ± 2% proc-vmstat.pgreuse > > 2506368 ± 2% -53.1% 1176320 proc-vmstat.unevictable_pgs_scanned > > 20.06 ± 10% -22.4% 15.56 ± 8% sched_debug.cfs_rq:/.h_nr_running.max > > 0.81 ± 32% -93.1% 0.06 ±223% sched_debug.cfs_rq:/.h_nr_running.min > > 1917 ± 34% -100.0% 0.00 sched_debug.cfs_rq:/.load.min > > 24.18 ± 10% +39.0% 33.62 ± 11% sched_debug.cfs_rq:/.load_avg.avg > > 245.61 ± 25% +66.3% 408.33 ± 22% 
sched_debug.cfs_rq:/.load_avg.max > > 47.52 ± 13% +72.6% 82.03 ± 8% sched_debug.cfs_rq:/.load_avg.stddev > > 13431147 -64.9% 4717147 sched_debug.cfs_rq:/.min_vruntime.avg > > 18161799 ± 7% -67.4% 5925316 ± 6% sched_debug.cfs_rq:/.min_vruntime.max > > 12413026 -65.0% 4340952 sched_debug.cfs_rq:/.min_vruntime.min > > 739748 ± 16% -66.6% 247410 ± 17% sched_debug.cfs_rq:/.min_vruntime.stddev > > 0.85 -16.4% 0.71 sched_debug.cfs_rq:/.nr_running.avg > > 0.61 ± 25% -90.9% 0.06 ±223% sched_debug.cfs_rq:/.nr_running.min > > 0.10 ± 25% +109.3% 0.22 ± 7% sched_debug.cfs_rq:/.nr_running.stddev > > 169.22 +101.7% 341.33 sched_debug.cfs_rq:/.removed.load_avg.max > > 32.41 ± 24% +100.2% 64.90 ± 16% sched_debug.cfs_rq:/.removed.load_avg.stddev > > 82.92 ± 10% +108.1% 172.56 sched_debug.cfs_rq:/.removed.runnable_avg.max > > 13.60 ± 28% +114.0% 29.10 ± 20% sched_debug.cfs_rq:/.removed.runnable_avg.stddev > > 82.92 ± 10% +108.1% 172.56 sched_debug.cfs_rq:/.removed.util_avg.max > > 13.60 ± 28% +114.0% 29.10 ± 20% sched_debug.cfs_rq:/.removed.util_avg.stddev > > 2156 ± 12% -36.6% 1368 ± 27% sched_debug.cfs_rq:/.runnable_avg.min > > 2285 ± 7% -19.8% 1833 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev > > -2389921 -64.8% -840940 sched_debug.cfs_rq:/.spread0.min > > 739781 ± 16% -66.5% 247837 ± 17% sched_debug.cfs_rq:/.spread0.stddev > > 843.88 ± 2% -20.5% 670.53 sched_debug.cfs_rq:/.util_avg.avg > > 433.64 ± 7% -43.5% 244.83 ± 17% sched_debug.cfs_rq:/.util_avg.min > > 187.00 ± 6% +40.6% 263.02 ± 4% sched_debug.cfs_rq:/.util_avg.stddev > > 394.15 ± 14% -29.5% 278.06 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.avg > > 1128 ± 12% -17.6% 930.39 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max > > 38.36 ± 29% -100.0% 0.00 sched_debug.cfs_rq:/.util_est_enqueued.min > > 3596 ± 15% -39.5% 2175 ± 7% sched_debug.cpu.avg_idle.min > > 160647 ± 9% -25.9% 118978 ± 9% sched_debug.cpu.avg_idle.stddev > > 197365 -46.2% 106170 sched_debug.cpu.clock.avg > > 197450 -46.2% 106208 sched_debug.cpu.clock.max > > 
197281 -46.2% 106128 sched_debug.cpu.clock.min > > 49.96 ± 22% -53.1% 23.44 ± 19% sched_debug.cpu.clock.stddev > > 193146 -45.7% 104898 sched_debug.cpu.clock_task.avg > > 194592 -45.8% 105455 sched_debug.cpu.clock_task.max > > 177878 -49.3% 90211 sched_debug.cpu.clock_task.min > > 1794 ± 5% -10.7% 1602 ± 2% sched_debug.cpu.clock_task.stddev > > 13154 ± 2% -20.3% 10479 sched_debug.cpu.curr->pid.avg > > 15059 -17.2% 12468 sched_debug.cpu.curr->pid.max > > 7263 ± 33% -100.0% 0.00 sched_debug.cpu.curr->pid.min > > 9321 ± 36% +98.2% 18478 ± 44% sched_debug.cpu.max_idle_balance_cost.stddev > > 0.00 ± 17% -41.6% 0.00 ± 13% sched_debug.cpu.next_balance.stddev > > 20.00 ± 11% -21.4% 15.72 ± 7% sched_debug.cpu.nr_running.max > > 0.86 ± 17% -87.1% 0.11 ±141% sched_debug.cpu.nr_running.min > > 25069883 -83.7% 4084117 ± 4% sched_debug.cpu.nr_switches.avg > > 26486718 -82.8% 4544009 ± 4% sched_debug.cpu.nr_switches.max > > 23680077 -84.5% 3663816 ± 4% sched_debug.cpu.nr_switches.min > > 589836 ± 3% -68.7% 184621 ± 16% sched_debug.cpu.nr_switches.stddev > > 197278 -46.2% 106128 sched_debug.cpu_clk > > 194327 -46.9% 103176 sched_debug.ktime > > 197967 -46.0% 106821 sched_debug.sched_clk > > 14.91 -37.6% 9.31 perf-stat.i.MPKI > > 2.657e+10 +25.0% 3.32e+10 perf-stat.i.branch-instructions > > 1.17 -0.4 0.78 perf-stat.i.branch-miss-rate% > > 3.069e+08 -20.1% 2.454e+08 perf-stat.i.branch-misses > > 6.43 ± 8% +2.2 8.59 ± 4% perf-stat.i.cache-miss-rate% > > 1.952e+09 -24.3% 1.478e+09 perf-stat.i.cache-references > > 14344055 ± 2% -58.6% 5932018 ± 3% perf-stat.i.context-switches > > 1.83 -21.8% 1.43 perf-stat.i.cpi > > 2.403e+11 -3.4% 2.322e+11 perf-stat.i.cpu-cycles > > 1420139 ± 2% -38.8% 869692 ± 5% perf-stat.i.cpu-migrations > > 2619 ± 7% -15.5% 2212 ± 8% perf-stat.i.cycles-between-cache-misses > > 0.24 ± 19% -0.1 0.10 ± 17% perf-stat.i.dTLB-load-miss-rate% > > 90403286 ± 19% -55.8% 39926283 ± 16% perf-stat.i.dTLB-load-misses > > 3.823e+10 +28.6% 4.918e+10 perf-stat.i.dTLB-loads > > 
> >       0.01 ± 34%      -0.0        0.01 ± 33%  perf-stat.i.dTLB-store-miss-rate%
> >    2779663 ± 34%     -52.7%    1315899 ± 31%  perf-stat.i.dTLB-store-misses
> >   2.19e+10           +24.2%   2.72e+10        perf-stat.i.dTLB-stores
> >      47.99 ± 2%      +28.0       75.94        perf-stat.i.iTLB-load-miss-rate%
> >   89417955 ± 2%      +38.7%   1.24e+08 ± 4%   perf-stat.i.iTLB-load-misses
> >   97721514 ± 2%      -58.2%   40865783 ± 3%   perf-stat.i.iTLB-loads
> >  1.329e+11           +26.3%  1.678e+11        perf-stat.i.instructions
> >       1503            -7.7%       1388 ± 3%   perf-stat.i.instructions-per-iTLB-miss
> >       0.55           +30.2%       0.72        perf-stat.i.ipc
> >       1.64 ± 18%    +217.4%       5.20 ± 11%  perf-stat.i.major-faults
> >       2.73            -3.7%       2.63        perf-stat.i.metric.GHz
> >       1098 ± 2%       -7.1%       1020 ± 3%   perf-stat.i.metric.K/sec
> >       1008           +24.4%       1254        perf-stat.i.metric.M/sec
> >       4334 ± 2%      +90.5%       8257 ± 7%   perf-stat.i.minor-faults
> >      90.94           -14.9        75.99       perf-stat.i.node-load-miss-rate%
> >   41932510 ± 8%      -43.0%   23899176 ± 10%  perf-stat.i.node-load-misses
> >    3366677 ± 5%      +86.2%    6267816        perf-stat.i.node-loads
> >      81.77 ± 3%      -36.3       45.52 ± 3%   perf-stat.i.node-store-miss-rate%
> >   18498318 ± 7%      -31.8%   12613933 ± 7%   perf-stat.i.node-store-misses
> >    3023556 ± 10%    +508.7%   18405880 ± 2%   perf-stat.i.node-stores
> >       4336 ± 2%      +90.5%       8262 ± 7%   perf-stat.i.page-faults
> >      14.70           -41.2%       8.65        perf-stat.overall.MPKI
> >       1.16            -0.4        0.72        perf-stat.overall.branch-miss-rate%
> >       6.22 ± 7%       +2.4        8.59 ± 4%   perf-stat.overall.cache-miss-rate%
> >       1.81           -24.3%       1.37        perf-stat.overall.cpi
> >       0.24 ± 19%      -0.2        0.07 ± 15%  perf-stat.overall.dTLB-load-miss-rate%
> >       0.01 ± 34%      -0.0        0.00 ± 29%  perf-stat.overall.dTLB-store-miss-rate%
> >      47.78 ± 2%      +29.3       77.12        perf-stat.overall.iTLB-load-miss-rate%
> >       1486            -9.1%       1351 ± 4%   perf-stat.overall.instructions-per-iTLB-miss
> >       0.55           +32.0%       0.73        perf-stat.overall.ipc
> >      92.54           -15.4       77.16 ± 2%   perf-stat.overall.node-load-miss-rate%
> >      85.82 ± 2%      -48.1       37.76 ± 5%   perf-stat.overall.node-store-miss-rate%
> >  2.648e+10           +25.2%  3.314e+10        perf-stat.ps.branch-instructions
> >   3.06e+08           -22.1%  2.383e+08        perf-stat.ps.branch-misses
> >  1.947e+09           -25.5%  1.451e+09        perf-stat.ps.cache-references
> >   14298713 ± 2%      -62.5%    5359285 ± 3%   perf-stat.ps.context-switches
> >  2.396e+11            -4.0%  2.299e+11        perf-stat.ps.cpu-cycles
> >    1415512 ± 2%      -42.2%     817981 ± 4%   perf-stat.ps.cpu-migrations
> >   90073948 ± 19%     -60.4%   35711862 ± 15%  perf-stat.ps.dTLB-load-misses
> >  3.811e+10           +29.7%  4.944e+10        perf-stat.ps.dTLB-loads
> >    2767291 ± 34%     -56.3%    1210210 ± 29%  perf-stat.ps.dTLB-store-misses
> >  2.183e+10           +25.0%  2.729e+10        perf-stat.ps.dTLB-stores
> >   89118809 ± 2%      +39.6%  1.244e+08 ± 4%   perf-stat.ps.iTLB-load-misses
> >   97404381 ± 2%      -62.2%   36860047 ± 3%   perf-stat.ps.iTLB-loads
> >  1.324e+11           +26.7%  1.678e+11        perf-stat.ps.instructions
> >       1.62 ± 18%    +164.7%       4.29 ± 8%   perf-stat.ps.major-faults
> >       4310 ± 2%      +75.1%       7549 ± 5%   perf-stat.ps.minor-faults
> >   41743097 ± 8%      -47.3%   21984450 ± 9%   perf-stat.ps.node-load-misses
> >    3356259 ± 5%      +92.6%    6462631        perf-stat.ps.node-loads
> >   18414647 ± 7%      -35.7%   11833799 ± 6%   perf-stat.ps.node-store-misses
> >    3019790 ± 10%    +545.0%   19478071        perf-stat.ps.node-stores
> >       4312 ± 2%      +75.2%       7553 ± 5%   perf-stat.ps.page-faults
> >  4.252e+13           -43.7%  2.395e+13        perf-stat.total.instructions
> >      29.92 ± 4%      -22.8        7.09 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >      28.53 ± 5%      -21.6        6.92 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write.ksys_write
> >      27.86 ± 5%      -21.1        6.77 ± 29%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write
> >      27.55 ± 5%      -20.9        6.68 ± 29%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
> >      22.28 ± 4%      -17.0        5.31 ± 30%  perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
> >      21.98 ± 4%      -16.7        5.24 ± 30%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
> >      12.62 ± 4%       -9.6        3.00 ± 33%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >      34.09            -9.2       24.92 ± 3%   perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >      11.48 ± 5%       -8.8        2.69 ± 38%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       9.60 ± 7%       -7.2        2.40 ± 35%  perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.vfs_read
> >      36.39            -6.2       30.20        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >      40.40            -6.1       34.28        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >      40.95            -5.7       35.26        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
> >      37.43            -5.4       32.07        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       6.30 ± 11%      -5.2        1.09 ± 36%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       5.66 ± 12%      -5.1        0.58 ± 75%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       6.46 ± 10%      -5.1        1.40 ± 28%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       5.53 ± 13%      -5.0        0.56 ± 75%  perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> >       5.42 ± 13%      -4.9        0.56 ± 75%  perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
> >       5.82 ± 9%       -4.7        1.10 ± 37%  perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       5.86 ± 16%      -4.6        1.31 ± 37%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       5.26 ± 9%       -4.4        0.89 ± 57%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >      45.18            -3.5       41.68        perf-profile.calltrace.cycles-pp.__libc_read
> >      50.31            -3.2       47.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       4.00 ± 27%      -2.9        1.09 ± 40%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read
> >      50.75            -2.7       48.06        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
> >      40.80            -2.6       38.20        perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       3.10 ± 15%      -2.5        0.62 ±103%  perf-profile.calltrace.cycles-pp.update_cfs_group.dequeue_task_fair.__schedule.schedule.pipe_read
> >       2.94 ± 12%      -2.3        0.62 ±102%  perf-profile.calltrace.cycles-pp.update_cfs_group.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       2.38 ± 9%       -2.0        0.38 ±102%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__schedule.schedule.pipe_read.vfs_read
> >       2.24 ± 7%       -1.8        0.40 ± 71%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       2.08 ± 6%       -1.8        0.29 ±100%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read
> >       2.10 ± 10%      -1.8        0.32 ±104%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__schedule.schedule.pipe_read
> >       2.76 ± 7%       -1.5        1.24 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       2.27 ± 5%       -1.4        0.88 ± 11%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       2.43 ± 7%       -1.3        1.16 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       2.46 ± 5%       -1.3        1.20 ± 7%   perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       1.54 ± 5%       -1.2        0.32 ±101%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
> >       0.97 ± 9%       -0.3        0.66 ± 19%  perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
> >       0.86 ± 6%       +0.2        1.02        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> >       0.64 ± 9%       +0.5        1.16 ± 5%   perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.47 ± 45%      +0.5        0.99 ± 5%   perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.60 ± 8%       +0.5        1.13 ± 5%   perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       0.00            +0.5        0.54 ± 5%   perf-profile.calltrace.cycles-pp.current_time.file_update_time.pipe_write.vfs_write.ksys_write
> >       0.00            +0.6        0.56 ± 4%   perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write
> >       0.00            +0.6        0.56 ± 7%   perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read
> >       0.00            +0.6        0.58 ± 5%   perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_write.vfs_write.ksys_write
> >       0.00            +0.6        0.62 ± 3%   perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_read.vfs_read.ksys_read
> >       0.00            +0.7        0.65 ± 6%   perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write
> >       0.00            +0.7        0.65 ± 7%   perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >       0.57 ± 5%       +0.7        1.24 ± 6%   perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.00            +0.7        0.72 ± 6%   perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write.ksys_write
> >       0.00            +0.8        0.75 ± 6%   perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.vfs_write.ksys_write
> >       0.74 ± 9%       +0.8        1.48 ± 5%   perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.63 ± 5%       +0.8        1.40 ± 5%   perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
> >       0.00            +0.8        0.80 ± 15%  perf-profile.calltrace.cycles-pp.__cmd_record
> >       0.00            +0.8        0.82 ± 11%  perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >       0.00            +0.9        0.85 ± 6%   perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.00            +0.9        0.86 ± 4%   perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read
> >       0.00            +0.9        0.87 ± 5%   perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
> >       0.00            +0.9        0.88 ± 5%   perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
> >       0.26 ±100%      +1.0        1.22 ± 10%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_write.vfs_write.ksys_write
> >       0.00            +1.0        0.96 ± 6%   perf-profile.calltrace.cycles-pp.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> >       0.27 ±100%      +1.0        1.23 ± 10%  perf-profile.calltrace.cycles-pp.schedule.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.00            +1.0        0.97 ± 7%   perf-profile.calltrace.cycles-pp.page_counter_uncharge.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read
> >       0.87 ± 8%       +1.1        1.98 ± 5%   perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
> >       0.73 ± 6%       +1.1        1.85 ± 5%   perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
> >       0.00            +1.2        1.15 ± 7%   perf-profile.calltrace.cycles-pp.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read
> >       0.00            +1.2        1.23 ± 6%   perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read.ksys_read
> >       0.00            +1.2        1.24 ± 7%   perf-profile.calltrace.cycles-pp.__folio_put.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.48 ± 45%      +1.3        1.74 ± 6%   perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
> >       0.60 ± 7%       +1.3        1.87 ± 8%   perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       1.23 ± 7%       +1.3        2.51 ± 4%   perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >      43.42            +1.3       44.75        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.83 ± 7%       +1.3        2.17 ± 5%   perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.98 ± 7%       +1.4        2.36 ± 6%   perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.27 ±100%      +1.4        1.70 ± 9%   perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read.ksys_read
> >       0.79 ± 8%       +1.4        2.23 ± 6%   perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.18 ±141%      +1.5        1.63 ± 9%   perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
> >       0.18 ±141%      +1.5        1.67 ± 9%   perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read
> >       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> >       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> >       1.05 ± 8%       +1.7        2.73 ± 6%   perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.copy_page_from_iter.pipe_write
> >       1.84 ± 9%       +1.7        3.56 ± 5%   perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.copy_page_to_iter.pipe_read
> >       1.41 ± 9%       +1.8        3.17 ± 6%   perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> >       0.00            +1.8        1.79 ± 9%   perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       1.99 ± 9%       +2.0        3.95 ± 5%   perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> >       2.40 ± 7%       +2.4        4.82 ± 5%   perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
> >       0.00            +2.5        2.50 ± 7%   perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       2.89 ± 8%       +2.6        5.47 ± 5%   perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       1.04 ± 30%      +2.8        3.86 ± 5%   perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
> >       0.00            +2.9        2.90 ± 11%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
> >       0.85 ± 27%      +2.9        3.80 ± 5%   perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
> >       0.00            +3.0        2.96 ± 11%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
> >       2.60 ± 9%       +3.1        5.74 ± 6%   perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
> >       2.93 ± 9%       +3.7        6.66 ± 5%   perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       1.60 ± 12%      +4.6        6.18 ± 7%   perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       2.60 ± 10%      +4.6        7.24 ± 5%   perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >      28.75 ± 5%      -21.6        7.19 ± 28%  perf-profile.children.cycles-pp.schedule
> >      30.52 ± 4%      -21.6        8.97 ± 22%  perf-profile.children.cycles-pp.__wake_up_common_lock
> >      28.53 ± 6%      -21.0        7.56 ± 26%  perf-profile.children.cycles-pp.__schedule
> >      29.04 ± 5%      -20.4        8.63 ± 23%  perf-profile.children.cycles-pp.__wake_up_common
> >      28.37 ± 5%      -19.9        8.44 ± 23%  perf-profile.children.cycles-pp.autoremove_wake_function
> >      28.08 ± 5%      -19.7        8.33 ± 23%  perf-profile.children.cycles-pp.try_to_wake_up
> >      13.90 ± 2%      -10.2        3.75 ± 28%  perf-profile.children.cycles-pp.ttwu_do_activate
> >      12.66 ± 3%       -9.2        3.47 ± 29%  perf-profile.children.cycles-pp.enqueue_task_fair
> >      34.20            -9.2       25.05 ± 3%   perf-profile.children.cycles-pp.pipe_read
> >      90.86            -9.1       81.73        perf-profile.children.cycles-pp.do_syscall_64
> >      91.80            -8.3       83.49        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> >      10.28 ± 7%       -7.8        2.53 ± 27%  perf-profile.children.cycles-pp._raw_spin_lock
> >       9.85 ± 7%       -6.9        2.92 ± 29%  perf-profile.children.cycles-pp.dequeue_task_fair
> >       8.69 ± 7%       -6.6        2.05 ± 24%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> >       8.99 ± 6%       -6.2        2.81 ± 16%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> >      36.46            -6.1       30.34        perf-profile.children.cycles-pp.vfs_read
> >       8.38 ± 8%       -5.8        2.60 ± 23%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> >       6.10 ± 11%      -5.4        0.66 ± 61%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
> >      37.45            -5.3       32.13        perf-profile.children.cycles-pp.ksys_read
> >       6.50 ± 35%      -4.9        1.62 ± 61%  perf-profile.children.cycles-pp.update_curr
> >       6.56 ± 15%      -4.6        1.95 ± 57%  perf-profile.children.cycles-pp.update_cfs_group
> >       6.38 ± 14%      -4.5        1.91 ± 28%  perf-profile.children.cycles-pp.enqueue_entity
> >       5.74 ± 5%       -3.8        1.92 ± 25%  perf-profile.children.cycles-pp.update_load_avg
> >      45.56            -3.8       41.75        perf-profile.children.cycles-pp.__libc_read
> >       3.99 ± 4%       -3.1        0.92 ± 24%  perf-profile.children.cycles-pp.pick_next_task_fair
> >       4.12 ± 27%      -2.7        1.39 ± 34%  perf-profile.children.cycles-pp.dequeue_entity
> >      40.88            -2.5       38.37        perf-profile.children.cycles-pp.pipe_write
> >       3.11 ± 4%       -2.4        0.75 ± 22%  perf-profile.children.cycles-pp.switch_mm_irqs_off
> >       2.06 ± 33%      -1.8        0.27 ± 27%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
> >       2.38 ± 41%      -1.8        0.60 ± 72%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
> >       2.29 ± 5%       -1.7        0.60 ± 25%  perf-profile.children.cycles-pp.switch_fpu_return
> >       2.30 ± 6%       -1.6        0.68 ± 18%  perf-profile.children.cycles-pp.prepare_task_switch
> >       1.82 ± 33%      -1.6        0.22 ± 31%  perf-profile.children.cycles-pp.sysvec_call_function_single
> >       1.77 ± 33%      -1.6        0.20 ± 32%  perf-profile.children.cycles-pp.__sysvec_call_function_single
> >       1.96 ± 5%       -1.5        0.50 ± 20%  perf-profile.children.cycles-pp.reweight_entity
> >       2.80 ± 7%       -1.2        1.60 ± 12%  perf-profile.children.cycles-pp.select_task_rq
> >       1.61 ± 6%       -1.2        0.42 ± 25%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
> >       1.34 ± 9%       -1.2        0.16 ± 28%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
> >       1.62 ± 4%       -1.2        0.45 ± 22%  perf-profile.children.cycles-pp.set_next_entity
> >       1.55 ± 8%       -1.1        0.43 ± 12%  perf-profile.children.cycles-pp.update_rq_clock
> >       1.49 ± 8%       -1.1        0.41 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> >       1.30 ± 20%      -1.0        0.26 ± 18%  perf-profile.children.cycles-pp.finish_task_switch
> >       1.44 ± 5%       -1.0        0.42 ± 19%  perf-profile.children.cycles-pp.__switch_to_asm
> >       2.47 ± 7%       -1.0        1.50 ± 12%  perf-profile.children.cycles-pp.select_task_rq_fair
> >       2.33 ± 7%       -0.9        1.40 ± 3%   perf-profile.children.cycles-pp.prepare_to_wait_event
> >       1.24 ± 7%       -0.9        0.35 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_se
> >       1.41 ± 32%      -0.9        0.56 ± 24%  perf-profile.children.cycles-pp.sched_ttwu_pending
> >       2.29 ± 8%       -0.8        1.45 ± 3%   perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> >       1.04 ± 7%       -0.8        0.24 ± 22%  perf-profile.children.cycles-pp.check_preempt_curr
> >       1.01 ± 3%       -0.7        0.30 ± 20%  perf-profile.children.cycles-pp.__switch_to
> >       0.92 ± 7%       -0.7        0.26 ± 12%  perf-profile.children.cycles-pp.update_min_vruntime
> >       0.71 ± 2%       -0.6        0.08 ± 75%  perf-profile.children.cycles-pp.put_prev_entity
> >       0.76 ± 6%       -0.6        0.14 ± 32%  perf-profile.children.cycles-pp.check_preempt_wakeup
> >       0.81 ± 66%      -0.6        0.22 ± 34%  perf-profile.children.cycles-pp.set_task_cpu
> >       0.82 ± 17%      -0.6        0.23 ± 10%  perf-profile.children.cycles-pp.cpuacct_charge
> >       1.08 ± 15%      -0.6        0.51 ± 10%  perf-profile.children.cycles-pp.wake_affine
> >       0.56 ± 15%      -0.5        0.03 ±100%  perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> >       0.66 ± 3%       -0.5        0.15 ± 28%  perf-profile.children.cycles-pp.os_xsave
> >       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.children.cycles-pp.native_irq_return_iret
> >       0.55 ± 5%       -0.4        0.15 ± 21%  perf-profile.children.cycles-pp.__calc_delta
> >       0.56 ± 10%      -0.4        0.17 ± 26%  perf-profile.children.cycles-pp.___perf_sw_event
> >       0.70 ± 15%      -0.4        0.32 ± 11%  perf-profile.children.cycles-pp.task_h_load
> >       0.40 ± 4%       -0.3        0.06 ± 49%  perf-profile.children.cycles-pp.pick_next_entity
> >       0.57 ± 6%       -0.3        0.26 ± 7%   perf-profile.children.cycles-pp.__list_del_entry_valid
> >       0.39 ± 8%       -0.3        0.08 ± 24%  perf-profile.children.cycles-pp.set_next_buddy
> >       0.64 ± 6%       -0.3        0.36 ± 6%   perf-profile.children.cycles-pp._raw_spin_lock_irq
> >       0.53 ± 20%      -0.3        0.25 ± 8%   perf-profile.children.cycles-pp.ttwu_queue_wakelist
> >       0.36 ± 8%       -0.3        0.08 ± 11%  perf-profile.children.cycles-pp.rb_insert_color
> >       0.41 ± 6%       -0.3        0.14 ± 17%  perf-profile.children.cycles-pp.sched_clock_cpu
> >       0.36 ± 33%      -0.3        0.10 ± 17%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
> >       0.37 ± 4%       -0.2        0.13 ± 16%  perf-profile.children.cycles-pp.native_sched_clock
> >       0.28 ± 5%       -0.2        0.07 ± 18%  perf-profile.children.cycles-pp.rb_erase
> >       0.32 ± 7%       -0.2        0.12 ± 10%  perf-profile.children.cycles-pp.__list_add_valid
> >       0.23 ± 6%       -0.2        0.03 ±103%  perf-profile.children.cycles-pp.resched_curr
> >       0.27 ± 5%       -0.2        0.08 ± 20%  perf-profile.children.cycles-pp.__wrgsbase_inactive
> >       0.26 ± 6%       -0.2        0.08 ± 17%  perf-profile.children.cycles-pp.finish_wait
> >       0.26 ± 4%       -0.2        0.08 ± 11%  perf-profile.children.cycles-pp.rcu_note_context_switch
> >       0.33 ± 21%      -0.2        0.15 ± 32%  perf-profile.children.cycles-pp.migrate_task_rq_fair
> >       0.22 ± 9%       -0.2        0.07 ± 22%  perf-profile.children.cycles-pp.perf_trace_buf_update
> >       0.17 ± 8%       -0.1        0.03 ±100%  perf-profile.children.cycles-pp.rb_next
> >       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.llist_reverse_order
> >       0.34 ± 7%       -0.1        0.26 ± 3%   perf-profile.children.cycles-pp.anon_pipe_buf_release
> >       0.14 ± 6%       -0.1        0.07 ± 17%  perf-profile.children.cycles-pp.read@plt
> >       0.10 ± 17%      -0.1        0.04 ± 75%  perf-profile.children.cycles-pp.remove_entity_load_avg
> >       0.07 ± 10%      -0.0        0.02 ± 99%  perf-profile.children.cycles-pp.generic_update_time
> >       0.11 ± 6%       -0.0        0.07 ± 8%   perf-profile.children.cycles-pp.__mark_inode_dirty
> >       0.00            +0.1        0.06 ± 9%   perf-profile.children.cycles-pp.load_balance
> >       0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp._raw_spin_trylock
> >       0.00            +0.1        0.06 ± 7%   perf-profile.children.cycles-pp.uncharge_folio
> >       0.00            +0.1        0.06 ± 7%   perf-profile.children.cycles-pp.__do_softirq
> >       0.00            +0.1        0.07 ± 10%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
> >       0.00            +0.1        0.08 ± 14%  perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
> >       0.15 ± 23%      +0.1        0.23 ± 7%   perf-profile.children.cycles-pp.task_tick_fair
> >       0.19 ± 17%      +0.1        0.28 ± 7%   perf-profile.children.cycles-pp.scheduler_tick
> >       0.00            +0.1        0.10 ± 21%  perf-profile.children.cycles-pp.select_idle_core
> >       0.00            +0.1        0.10 ± 9%   perf-profile.children.cycles-pp.osq_unlock
> >       0.23 ± 12%      +0.1        0.34 ± 6%   perf-profile.children.cycles-pp.update_process_times
> >       0.37 ± 13%      +0.1        0.48 ± 5%   perf-profile.children.cycles-pp.hrtimer_interrupt
> >       0.24 ± 12%      +0.1        0.35 ± 6%   perf-profile.children.cycles-pp.tick_sched_handle
> >       0.31 ± 14%      +0.1        0.43 ± 4%   perf-profile.children.cycles-pp.__hrtimer_run_queues
> >       0.37 ± 12%      +0.1        0.49 ± 5%   perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> >       0.00            +0.1        0.12 ± 10%  perf-profile.children.cycles-pp.__mod_memcg_state
> >       0.26 ± 10%      +0.1        0.38 ± 6%   perf-profile.children.cycles-pp.tick_sched_timer
> >       0.00            +0.1        0.13 ± 7%   perf-profile.children.cycles-pp.free_unref_page
> >       0.00            +0.1        0.14 ± 8%   perf-profile.children.cycles-pp.rmqueue
> >       0.15 ± 8%       +0.2        0.30 ± 5%   perf-profile.children.cycles-pp.rcu_all_qs
> >       0.16 ± 6%       +0.2        0.31 ± 5%   perf-profile.children.cycles-pp.__x64_sys_write
> >       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.propagate_protected_usage
> >       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.menu_select
> >       0.00            +0.2        0.16 ± 9%   perf-profile.children.cycles-pp.memcg_account_kmem
> >       0.42 ± 12%      +0.2        0.57 ± 4%   perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >       0.15 ± 11%      +0.2        0.31 ± 8%   perf-profile.children.cycles-pp.__x64_sys_read
> >       0.00            +0.2        0.17 ± 8%   perf-profile.children.cycles-pp.get_page_from_freelist
> >       0.44 ± 11%      +0.2        0.62 ± 4%   perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >       0.10 ± 31%      +0.2        0.28 ± 24%  perf-profile.children.cycles-pp.mnt_user_ns
> >       0.16 ± 4%       +0.2        0.35 ± 5%   perf-profile.children.cycles-pp.kill_fasync
> >       0.20 ± 10%      +0.2        0.40 ± 3%   perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.09 ± 7%       +0.2        0.29 ± 4%   perf-profile.children.cycles-pp.page_copy_sane
> >       0.08 ± 8%       +0.2        0.31 ± 6%   perf-profile.children.cycles-pp.rw_verify_area
> >       0.12 ± 11%      +0.2        0.36 ± 8%   perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
> >       0.28 ± 12%      +0.2        0.52 ± 5%   perf-profile.children.cycles-pp.inode_needs_update_time
> >       0.00            +0.3        0.27 ± 7%   perf-profile.children.cycles-pp.__memcg_kmem_charge_page
> >       0.43 ± 6%       +0.3        0.73 ± 5%   perf-profile.children.cycles-pp.__cond_resched
> >       0.21 ± 29%      +0.3        0.54 ± 15%  perf-profile.children.cycles-pp.select_idle_cpu
> >       0.10 ± 10%      +0.3        0.43 ± 17%  perf-profile.children.cycles-pp.fsnotify_perm
> >       0.23 ± 11%      +0.3        0.56 ± 6%   perf-profile.children.cycles-pp.syscall_enter_from_user_mode
> >       0.06 ± 75%      +0.4        0.47 ± 27%  perf-profile.children.cycles-pp.queue_event
> >       0.21 ± 9%       +0.4        0.62 ± 5%   perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> >       0.06 ± 75%      +0.4        0.48 ± 26%  perf-profile.children.cycles-pp.ordered_events__queue
> >       0.06 ± 73%      +0.4        0.50 ± 24%  perf-profile.children.cycles-pp.process_simple
> >       0.01 ±223%      +0.4        0.44 ± 9%   perf-profile.children.cycles-pp.schedule_idle
> >       0.05 ± 8%       +0.5        0.52 ± 7%   perf-profile.children.cycles-pp.__alloc_pages
> >       0.45 ± 7%       +0.5        0.94 ± 5%   perf-profile.children.cycles-pp.__get_task_ioprio
> >       0.89 ± 8%       +0.5        1.41 ± 4%   perf-profile.children.cycles-pp.__might_sleep
> >       0.01 ±223%      +0.5        0.54 ± 21%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
> >       0.05 ± 46%      +0.5        0.60 ± 7%   perf-profile.children.cycles-pp.osq_lock
> >       0.34 ± 8%       +0.6        0.90 ± 5%   perf-profile.children.cycles-pp.aa_file_perm
> >       0.01 ±223%      +0.7        0.67 ± 7%   perf-profile.children.cycles-pp.poll_idle
> >       0.14 ± 17%      +0.7        0.82 ± 6%   perf-profile.children.cycles-pp.mutex_spin_on_owner
> >       0.12 ± 12%      +0.7        0.82 ± 15%  perf-profile.children.cycles-pp.__cmd_record
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.reader__read_event
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.record__finish_output
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.perf_session__process_events
> >       0.76 ± 8%       +0.8        1.52 ± 5%   perf-profile.children.cycles-pp.file_update_time
> >       0.08 ± 61%      +0.8        0.85 ± 11%  perf-profile.children.cycles-pp.intel_idle_irq
> >       1.23 ± 8%       +0.9        2.11 ± 4%   perf-profile.children.cycles-pp.__might_fault
> >       0.02 ±141%      +1.0        0.97 ± 7%   perf-profile.children.cycles-pp.page_counter_uncharge
> >       0.51 ± 9%       +1.0        1.48 ± 4%   perf-profile.children.cycles-pp.current_time
> >       0.05 ± 46%      +1.1        1.15 ± 7%   perf-profile.children.cycles-pp.uncharge_batch
> >       1.12 ± 6%       +1.1        2.23 ± 5%   perf-profile.children.cycles-pp.__fget_light
> >       0.06 ± 14%      +1.2        1.23 ± 6%   perf-profile.children.cycles-pp.__mem_cgroup_uncharge
> >       0.06 ± 14%      +1.2        1.24 ± 7%   perf-profile.children.cycles-pp.__folio_put
> >       0.64 ± 7%       +1.2        1.83 ± 5%   perf-profile.children.cycles-pp.syscall_return_via_sysret
> >       1.19 ± 8%       +1.2        2.42 ± 4%   perf-profile.children.cycles-pp.__might_resched
> >       0.59 ± 9%       +1.3        1.84 ± 6%   perf-profile.children.cycles-pp.atime_needs_update
> >      43.47            +1.4       44.83        perf-profile.children.cycles-pp.ksys_write
> >       1.28 ± 6%       +1.4        2.68 ± 5%   perf-profile.children.cycles-pp.__fdget_pos
> >       0.80 ± 8%       +1.5        2.28 ± 6%   perf-profile.children.cycles-pp.touch_atime
> >       0.11 ± 49%      +1.5        1.59 ± 9%   perf-profile.children.cycles-pp.cpuidle_enter_state
> >       0.11 ± 49%      +1.5        1.60 ± 9%   perf-profile.children.cycles-pp.cpuidle_enter
> >       0.12 ± 51%      +1.7        1.81 ± 9%   perf-profile.children.cycles-pp.cpuidle_idle_call
> >       1.44 ± 8%       +1.8        3.22 ± 6%   perf-profile.children.cycles-pp.copyin
> >       2.00 ± 9%       +2.0        4.03 ± 5%   perf-profile.children.cycles-pp.copyout
> >       1.02 ± 8%       +2.0        3.07 ± 5%   perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       1.63 ± 7%       +2.3        3.90 ± 5%   perf-profile.children.cycles-pp.apparmor_file_permission
> >       2.64 ± 8%       +2.3        4.98 ± 5%   perf-profile.children.cycles-pp._copy_from_iter
> >       0.40 ± 14%      +2.5        2.92 ± 7%   perf-profile.children.cycles-pp.__mutex_lock
> >       2.91 ± 8%       +2.6        5.54 ± 5%   perf-profile.children.cycles-pp.copy_page_from_iter
> >       0.17 ± 62%      +2.7        2.91 ± 11%  perf-profile.children.cycles-pp.start_secondary
> >       1.83 ± 7%       +2.8        4.59 ± 5%   perf-profile.children.cycles-pp.security_file_permission
> >       0.17 ± 60%      +2.8        2.94 ± 11%  perf-profile.children.cycles-pp.do_idle
> >       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
> >       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.cpu_startup_entry
> >       2.62 ± 9%       +3.2        5.84 ± 6%   perf-profile.children.cycles-pp._copy_to_iter
> >       1.55 ± 8%       +3.2        4.79 ± 5%   perf-profile.children.cycles-pp.__entry_text_start
> >       3.09 ± 8%       +3.7        6.77 ± 5%   perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> >       2.95 ± 9%       +3.8        6.73 ± 5%   perf-profile.children.cycles-pp.copy_page_to_iter
> >       2.28 ± 11%      +5.1        7.40 ± 6%   perf-profile.children.cycles-pp.mutex_unlock
> >       3.92 ± 9%       +6.0        9.94 ± 5%   perf-profile.children.cycles-pp.mutex_lock
> >       8.37 ± 9%       -5.8        2.60 ± 23%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> >       6.54 ± 15%      -4.6        1.95 ± 57%  perf-profile.self.cycles-pp.update_cfs_group
> >       3.08 ± 4%       -2.3        0.74 ± 22%  perf-profile.self.cycles-pp.switch_mm_irqs_off
> >       2.96 ± 4%       -1.8        1.13 ± 33%  perf-profile.self.cycles-pp.update_load_avg
> >       2.22 ± 8%       -1.5        0.74 ± 12%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >       1.96 ± 9%       -1.5        0.48 ± 15%  perf-profile.self.cycles-pp.update_curr
> >       1.94 ± 5%       -1.3        0.64 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock
> >       1.78 ± 5%       -1.3        0.50 ± 18%  perf-profile.self.cycles-pp.__schedule
> >       1.59 ± 7%       -1.2        0.40 ± 12%  perf-profile.self.cycles-pp.enqueue_entity
> >       1.61 ± 6%       -1.2        0.42 ± 25%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
> >       1.44 ± 8%       -1.0        0.39 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> >       1.42 ± 5%       -1.0        0.41 ± 19%  perf-profile.self.cycles-pp.__switch_to_asm
> >       1.18 ± 7%       -0.9        0.33 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_se
> >       1.14 ± 10%      -0.8        0.31 ± 9%   perf-profile.self.cycles-pp.update_rq_clock
> >       0.90 ± 7%       -0.7        0.19 ± 21%  perf-profile.self.cycles-pp.pick_next_task_fair
> >       1.04 ± 7%       -0.7        0.33 ± 13%  perf-profile.self.cycles-pp.prepare_task_switch
> >       0.98 ± 4%       -0.7        0.29 ± 20%  perf-profile.self.cycles-pp.__switch_to
> >       0.88 ± 6%       -0.7        0.20 ± 17%  perf-profile.self.cycles-pp.enqueue_task_fair
> >       1.01 ± 6%       -0.7        0.35 ± 10%  perf-profile.self.cycles-pp.prepare_to_wait_event
> >       0.90 ± 8%       -0.6        0.25 ± 12%  perf-profile.self.cycles-pp.update_min_vruntime
> >       0.79 ± 17%      -0.6        0.22 ± 9%   perf-profile.self.cycles-pp.cpuacct_charge
> >       1.10 ± 5%       -0.6        0.54 ± 9%   perf-profile.self.cycles-pp.try_to_wake_up
> >       0.66 ± 3%       -0.5        0.15 ± 27%  perf-profile.self.cycles-pp.os_xsave
> >       0.71 ± 6%       -0.5        0.22 ± 18%  perf-profile.self.cycles-pp.reweight_entity
> >       0.68 ± 9%       -0.5        0.19 ± 10%  perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
> >       0.67 ± 9%       -0.5        0.18 ± 11%  perf-profile.self.cycles-pp.__wake_up_common
> >       0.65 ± 6%       -0.5        0.17 ± 23%  perf-profile.self.cycles-pp.switch_fpu_return
> >       0.60 ± 11%      -0.5        0.14 ± 28%  perf-profile.self.cycles-pp.perf_tp_event
> >       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.self.cycles-pp.native_irq_return_iret
> >       0.52 ± 7%       -0.4        0.08 ± 25%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> >       0.55 ± 4%       -0.4        0.15 ± 22%  perf-profile.self.cycles-pp.__calc_delta
> >       0.61 ± 5%       -0.4        0.21 ± 12%  perf-profile.self.cycles-pp.dequeue_task_fair
> >       0.69 ± 14%      -0.4        0.32 ± 11%  perf-profile.self.cycles-pp.task_h_load
> >       0.49 ± 11%      -0.3        0.15 ± 29%  perf-profile.self.cycles-pp.___perf_sw_event
> >       0.37 ± 4%       -0.3        0.05 ± 73%  perf-profile.self.cycles-pp.pick_next_entity
> >       0.50 ± 3%       -0.3        0.19 ± 15%  perf-profile.self.cycles-pp.select_idle_sibling
> >       0.38 ± 9%       -0.3        0.08 ± 24%  perf-profile.self.cycles-pp.set_next_buddy
> >       0.32 ± 4%       -0.3        0.03 ±100%  perf-profile.self.cycles-pp.put_prev_entity
> >       0.64 ± 6%       -0.3        0.35 ± 7%   perf-profile.self.cycles-pp._raw_spin_lock_irq
> >       0.52 ± 5%       -0.3        0.25 ± 6%   perf-profile.self.cycles-pp.__list_del_entry_valid
> >       0.34 ± 5%       -0.3        0.07 ± 29%  perf-profile.self.cycles-pp.schedule
> >       0.35 ± 9%       -0.3        0.08 ± 10%  perf-profile.self.cycles-pp.rb_insert_color
> >       0.40 ± 5%       -0.3        0.14 ± 16%  perf-profile.self.cycles-pp.select_task_rq_fair
> >       0.33 ± 6%       -0.3        0.08 ± 16%  perf-profile.self.cycles-pp.check_preempt_wakeup
> >       0.33 ± 8%       -0.2        0.10 ± 16%  perf-profile.self.cycles-pp.select_task_rq
> >       0.36 ± 3%       -0.2        0.13 ± 16%  perf-profile.self.cycles-pp.native_sched_clock
> >       0.32 ± 7%       -0.2        0.10 ± 14%  perf-profile.self.cycles-pp.finish_task_switch
> >       0.32 ± 4%       -0.2        0.11 ± 13%  perf-profile.self.cycles-pp.dequeue_entity
> >       0.32 ± 8%       -0.2        0.12 ± 10%  perf-profile.self.cycles-pp.__list_add_valid
> >       0.23 ± 5%       -0.2        0.03 ±103%  perf-profile.self.cycles-pp.resched_curr
> >       0.27 ± 6%       -0.2        0.07 ± 21%  perf-profile.self.cycles-pp.rb_erase
> >       0.27 ± 5%       -0.2        0.08 ± 20%  perf-profile.self.cycles-pp.__wrgsbase_inactive
> >       0.28 ± 13%      -0.2        0.09 ± 12%  perf-profile.self.cycles-pp.check_preempt_curr
> >       0.30 ± 13%      -0.2        0.12 ± 7%   perf-profile.self.cycles-pp.ttwu_queue_wakelist
> >       0.24 ± 5%       -0.2        0.06 ± 19%  perf-profile.self.cycles-pp.set_next_entity
> >       0.21 ± 34%      -0.2        0.04 ± 71%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
> >       0.25 ± 5%       -0.2        0.08 ± 16%  perf-profile.self.cycles-pp.rcu_note_context_switch
> >       0.19 ± 26%      -0.1        0.04 ± 73%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
> >       0.20 ± 8%       -0.1        0.06 ± 13%  perf-profile.self.cycles-pp.ttwu_do_activate
> >       0.17 ± 8%       -0.1        0.03 ±100%  perf-profile.self.cycles-pp.rb_next
> >       0.22 ± 23%      -0.1        0.09 ± 31%  perf-profile.self.cycles-pp.migrate_task_rq_fair
> >       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.llist_reverse_order
> >       0.16 ± 8%       -0.1        0.06 ± 14%  perf-profile.self.cycles-pp.wake_affine
> >       0.10 ± 31%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.sched_ttwu_pending
> >       0.14 ± 5%       -0.1        0.07 ± 20%  perf-profile.self.cycles-pp.read@plt
> >       0.32 ± 8%       -0.1        0.26 ± 3%   perf-profile.self.cycles-pp.anon_pipe_buf_release
> >       0.10 ± 6%       -0.1        0.04 ± 45%  perf-profile.self.cycles-pp.__wake_up_common_lock
> >       0.10 ± 9%       -0.0        0.07 ± 8%   perf-profile.self.cycles-pp.__mark_inode_dirty
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.free_unref_page
> >       0.00            +0.1        0.06 ± 6%   perf-profile.self.cycles-pp.__alloc_pages
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp._raw_spin_trylock
> >       0.00            +0.1        0.06 ± 7%   perf-profile.self.cycles-pp.uncharge_folio
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.uncharge_batch
> >       0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.menu_select
> >       0.00            +0.1        0.08 ± 14%  perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
> >       0.00            +0.1        0.08 ± 7%   perf-profile.self.cycles-pp.__memcg_kmem_charge_page
> >       0.00            +0.1        0.10 ± 10%  perf-profile.self.cycles-pp.osq_unlock
> >       0.07 ± 5%       +0.1        0.17 ± 8%   perf-profile.self.cycles-pp.copyin
> >       0.00            +0.1        0.11 ± 11%  perf-profile.self.cycles-pp.__mod_memcg_state
> >       0.13 ± 8%       +0.1        0.24 ± 6%   perf-profile.self.cycles-pp.rcu_all_qs
> >       0.14 ± 5%       +0.1        0.28 ± 5%   perf-profile.self.cycles-pp.__x64_sys_write
> >       0.07 ± 10%      +0.1        0.21 ± 5%   perf-profile.self.cycles-pp.page_copy_sane
> >       0.13 ± 12%      +0.1        0.28 ± 9%   perf-profile.self.cycles-pp.__x64_sys_read
> >       0.00            +0.2        0.15 ± 10%  perf-profile.self.cycles-pp.propagate_protected_usage
> >       0.18 ± 9%       +0.2        0.33 ± 4%   perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.07 ± 8%       +0.2        0.23 ± 5%   perf-profile.self.cycles-pp.rw_verify_area
> >       0.08 ± 34%      +0.2        0.24 ± 27%  perf-profile.self.cycles-pp.mnt_user_ns
> >       0.13 ± 5%       +0.2        0.31 ± 7%   perf-profile.self.cycles-pp.kill_fasync
> >       0.21 ± 8%       +0.2        0.39 ± 5%   perf-profile.self.cycles-pp.__might_fault
> >       0.06 ± 13%      +0.2        0.26 ± 9%   perf-profile.self.cycles-pp.copyout
> >       0.10 ± 11%      +0.2        0.31 ± 8%   perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
> >       0.26 ± 13%      +0.2        0.49 ± 6%   perf-profile.self.cycles-pp.inode_needs_update_time
> >       0.23 ± 8%       +0.2        0.47 ± 5%   perf-profile.self.cycles-pp.copy_page_from_iter
> >       0.14 ± 7%       +0.2        0.38 ± 6%   perf-profile.self.cycles-pp.file_update_time
> >       0.36 ± 7%       +0.3        0.62 ± 4%   perf-profile.self.cycles-pp.ksys_read
> >       0.54 ± 13%      +0.3        0.80 ± 4%   perf-profile.self.cycles-pp._copy_from_iter
> >       0.15 ± 5%       +0.3        0.41 ± 8%   perf-profile.self.cycles-pp.touch_atime
> >       0.14 ± 5%       +0.3        0.40 ± 6%   perf-profile.self.cycles-pp.__cond_resched
> >       0.18 ± 5%       +0.3        0.47 ± 4%   perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> >       0.16 ± 8%       +0.3        0.46 ± 6%   perf-profile.self.cycles-pp.syscall_enter_from_user_mode
> >       0.16 ± 9%       +0.3        0.47 ± 6%   perf-profile.self.cycles-pp.__fdget_pos
> >       1.79 ± 8%       +0.3        2.12 ± 3%   perf-profile.self.cycles-pp.pipe_read
> >       0.10 ± 8%       +0.3        0.43 ± 17%  perf-profile.self.cycles-pp.fsnotify_perm
> >       0.20 ± 4%       +0.4        0.55 ± 5%   perf-profile.self.cycles-pp.ksys_write
> >       0.05 ± 76%      +0.4        0.46 ± 27%  perf-profile.self.cycles-pp.queue_event
> >       0.32 ± 6%       +0.4        0.73 ± 6%   perf-profile.self.cycles-pp.exit_to_user_mode_prepare
> >       0.21 ± 9%       +0.4        0.62 ± 6%   perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> >       0.79 ± 8%       +0.4        1.22 ± 4%   perf-profile.self.cycles-pp.__might_sleep
> >       0.44 ± 5%       +0.4        0.88 ± 7%   perf-profile.self.cycles-pp.do_syscall_64
> >       0.26 ± 8%       +0.4        0.70 ± 4%   perf-profile.self.cycles-pp.atime_needs_update
> >       0.42 ± 7%       +0.5        0.88 ± 5%   perf-profile.self.cycles-pp.__get_task_ioprio
> >       0.28 ± 12%      +0.5        0.75 ± 5%   perf-profile.self.cycles-pp.copy_page_to_iter
> >       0.19 ± 6%       +0.5        0.68 ± 10%  perf-profile.self.cycles-pp.security_file_permission
> >       0.31 ± 8%       +0.5        0.83 ± 5%   perf-profile.self.cycles-pp.aa_file_perm
> >       0.05 ± 46%      +0.5        0.59 ± 8%   perf-profile.self.cycles-pp.osq_lock
> >       0.30 ± 7%       +0.5        0.85 ± 6%   perf-profile.self.cycles-pp._copy_to_iter
> >       0.00            +0.6        0.59 ± 6%   perf-profile.self.cycles-pp.poll_idle
> >       0.13 ± 20%      +0.7        0.81 ± 6%   perf-profile.self.cycles-pp.mutex_spin_on_owner
> >       0.38 ± 9%       +0.7        1.12 ± 5%   perf-profile.self.cycles-pp.current_time
> >       0.08 ± 59%      +0.8        0.82 ± 11%  perf-profile.self.cycles-pp.intel_idle_irq
> >       0.92 ± 6%       +0.8        1.72 ± 4%   perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> >       0.01 ±223%      +0.8        0.82 ± 6%   perf-profile.self.cycles-pp.page_counter_uncharge
> >       0.86 ± 7%       +1.1        1.91 ± 4%   perf-profile.self.cycles-pp.vfs_read
> >       1.07 ± 6%       +1.1        2.14 ± 5%   perf-profile.self.cycles-pp.__fget_light
> >       0.67 ± 7%       +1.1        1.74 ± 6%   perf-profile.self.cycles-pp.vfs_write
> >       0.15 ± 12%      +1.1        1.28 ± 7%   perf-profile.self.cycles-pp.__mutex_lock
> >       1.09 ± 6%       +1.1        2.22 ± 5%   perf-profile.self.cycles-pp.__libc_read
> >       0.62 ± 6%       +1.2        1.79 ± 5%   perf-profile.self.cycles-pp.syscall_return_via_sysret
> >       1.16 ± 8%       +1.2        2.38 ± 4%   perf-profile.self.cycles-pp.__might_resched
> >       0.91 ± 7%       +1.3        2.20 ± 5%   perf-profile.self.cycles-pp.__libc_write
> >       0.59 ± 8%       +1.3        1.93 ± 6%   perf-profile.self.cycles-pp.__entry_text_start
> >       1.27 ± 7%       +1.7        3.00 ± 6%   perf-profile.self.cycles-pp.apparmor_file_permission
> >       0.99 ± 8%       +2.0        2.98 ± 5%   perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       1.74 ± 8%       +3.4        5.15 ± 6%   perf-profile.self.cycles-pp.pipe_write
> >       2.98 ± 8%       +3.7        6.64 ± 5%   perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> >       2.62 ± 10%      +4.8        7.38 ± 5%   perf-profile.self.cycles-pp.mutex_lock
> >       2.20 ± 10%      +5.1        7.30 ± 6%   perf-profile.self.cycles-pp.mutex_unlock
> >
> >
> > ***************************************************************************************************
> > lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
> > =========================================================================================
> >
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> >   gcc-11/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/hackbench
> >
> > commit:
> >   a2e90611b9 ("sched/fair: Remove capacity inversion detection")
> >   829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
> >
> > a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> > ---------------- ---------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >     177139            -8.1%     162815        hackbench.throughput
> >     174484           -18.8%     141618 ±  2%  hackbench.throughput_avg
> >     177139            -8.1%     162815        hackbench.throughput_best
> >     168530           -37.3%     105615 ±  3%  hackbench.throughput_worst
> >     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time
> >     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time.max
> >  1.053e+08 ±  2%    +688.4%  8.302e+08 ±  9%  hackbench.time.involuntary_context_switches
> >      21992           +27.8%      28116 ±  2%  hackbench.time.system_time
> >       6652            +8.2%       7196        hackbench.time.user_time
> >  3.482e+08          +289.2%  1.355e+09 ±  9%  hackbench.time.voluntary_context_switches
> >       1731           +30.8%       2265        vmstat.procs.r
> >    1598044          +290.3%    6237840 ±  7%  vmstat.system.cs
> > [... cpuidle, turbostat, meminfo, numa-meminfo, numa-vmstat, proc-vmstat entries trimmed ...]
> >   11842707           +39.9%   16567992        sched_debug.cfs_rq:/.min_vruntime.avg
> >   13773080 ±  3%    +113.9%   29460281 ±  7%  sched_debug.cfs_rq:/.min_vruntime.max
> >     301190 ± 12%    +439.9%    1626088 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
> >    1810086          +461.3%   10159215 ± 10%  sched_debug.cpu.nr_switches.avg
> > [... remaining sched_debug entries trimmed ...]
> >    1636365          +382.4%    7893858 ±  5%  perf-stat.i.context-switches
> >     131725          +288.0%     511159 ± 10%  perf-stat.i.cpu-migrations
> >  2.911e+13           +30.9%  3.809e+13 ±  2%  perf-stat.total.instructions
> > [... remaining perf-stat entries trimmed ...]
> >      15.30 ±  4%      -8.6        6.66 ±  5%
perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >      13.84 ±  6%      -7.9        5.98 ±  6%  perf-profile.calltrace.cycles-pp.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
> >      13.61 ±  6%      -7.8        5.84 ±  6%  perf-profile.calltrace.cycles-pp.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg
> >       9.00 ±  2%      -5.5        3.48 ±  4%  perf-profile.calltrace.cycles-pp.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> > [... further skb-allocation/copy calltrace entries, each down by 0.1-4.3, and scheduler-path entries up by 0.3-7.0, trimmed ...]
> >       1.89 ± 15%     +12.1       13.96 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
> >       1.66 ± 19%     +12.4       14.06 ±  2%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
> >       2.53 ± 10%     +13.4       15.90 ±  2%  perf-profile.calltrace.cycles-pp.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >       2.28 ± 15%     +14.6       16.86 ±  3%  perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >      15.31 ±  4%      -8.6        6.67 ±  5%  perf-profile.children.cycles-pp.sock_alloc_send_pskb
> >      13.85 ±  6%      -7.9        5.98 ±  5%  perf-profile.children.cycles-pp.alloc_skb_with_frags
> >       9.01 ±  2%      -5.5        3.48 ±  4%  perf-profile.children.cycles-pp.consume_skb
> >       5.76 ± 32%      -3.9        1.91 ±  3%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> > [... remaining children.cycles-pp entries trimmed ...]
2% perf-profile.children.cycles-pp.skb_set_owner_w > > 1.84 ± 3% -0.3 1.58 ± 4% perf-profile.children.cycles-pp.aa_sk_perm > > 0.68 ± 11% -0.2 0.44 ± 3% perf-profile.children.cycles-pp.skb_queue_tail > > 1.22 ± 4% -0.2 0.99 ± 5% perf-profile.children.cycles-pp.__fdget_pos > > 0.70 ± 2% -0.2 0.48 ± 5% perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg > > 1.16 ± 4% -0.2 0.93 ± 3% perf-profile.children.cycles-pp.security_socket_recvmsg > > 0.48 ± 3% -0.2 0.29 ± 4% perf-profile.children.cycles-pp.__might_fault > > 0.24 ± 7% -0.2 0.05 ± 56% perf-profile.children.cycles-pp.fsnotify_perm > > 1.12 ± 4% -0.2 0.93 ± 6% perf-profile.children.cycles-pp.__fget_light > > 1.24 ± 3% -0.2 1.07 ± 3% perf-profile.children.cycles-pp.security_socket_sendmsg > > 0.61 ± 3% -0.2 0.45 ± 2% perf-profile.children.cycles-pp.__might_sleep > > 0.33 ± 5% -0.2 0.17 ± 6% perf-profile.children.cycles-pp.refill_obj_stock > > 0.40 ± 2% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.kmalloc_slab > > 0.57 ± 2% -0.1 0.45 perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt > > 0.54 ± 3% -0.1 0.42 ± 2% perf-profile.children.cycles-pp.wait_for_unix_gc > > 0.42 ± 2% -0.1 0.30 ± 3% perf-profile.children.cycles-pp.is_vmalloc_addr > > 1.00 ± 2% -0.1 0.87 ± 5% perf-profile.children.cycles-pp.__virt_addr_valid > > 0.52 ± 2% -0.1 0.41 perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt > > 0.33 ± 3% -0.1 0.21 ± 3% perf-profile.children.cycles-pp.tick_sched_handle > > 0.36 ± 2% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.tick_sched_timer > > 0.47 ± 2% -0.1 0.36 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt > > 0.48 ± 2% -0.1 0.36 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt > > 0.32 ± 3% -0.1 0.21 ± 5% perf-profile.children.cycles-pp.update_process_times > > 0.42 ± 3% -0.1 0.31 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues > > 0.26 ± 6% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.kmalloc_size_roundup > > 0.20 ± 4% -0.1 0.10 ± 9% 
perf-profile.children.cycles-pp.task_tick_fair > > 0.24 ± 3% -0.1 0.15 ± 4% perf-profile.children.cycles-pp.scheduler_tick > > 0.30 ± 5% -0.1 0.21 ± 8% perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages > > 0.20 ± 2% -0.1 0.11 ± 6% perf-profile.children.cycles-pp.should_failslab > > 0.51 ± 2% -0.1 0.43 ± 6% perf-profile.children.cycles-pp.syscall_enter_from_user_mode > > 0.15 ± 8% -0.1 0.07 ± 13% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare > > 0.19 ± 4% -0.1 0.12 ± 5% perf-profile.children.cycles-pp.apparmor_socket_sendmsg > > 0.20 ± 4% -0.1 0.13 ± 5% perf-profile.children.cycles-pp.aa_file_perm > > 0.18 ± 5% -0.1 0.12 ± 5% perf-profile.children.cycles-pp.apparmor_socket_recvmsg > > 0.14 ± 13% -0.1 0.08 ± 55% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state > > 0.24 ± 4% -0.1 0.18 ± 2% perf-profile.children.cycles-pp.rcu_all_qs > > 0.18 ± 10% -0.1 0.12 ± 11% perf-profile.children.cycles-pp.memcg_account_kmem > > 0.37 ± 3% -0.1 0.31 ± 3% perf-profile.children.cycles-pp.security_socket_getpeersec_dgram > > 0.08 -0.0 0.06 ± 8% perf-profile.children.cycles-pp.put_pid > > 0.18 ± 3% -0.0 0.16 ± 4% perf-profile.children.cycles-pp.apparmor_socket_getpeersec_dgram > > 0.21 ± 3% +0.0 0.23 ± 2% perf-profile.children.cycles-pp.__get_task_ioprio > > 0.00 +0.1 0.05 perf-profile.children.cycles-pp.perf_exclude_event > > 0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.invalidate_user_asid > > 0.00 +0.1 0.07 ± 6% perf-profile.children.cycles-pp.__bitmap_and > > 0.05 +0.1 0.13 ± 8% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode > > 0.00 +0.1 0.08 ± 7% perf-profile.children.cycles-pp.schedule_debug > > 0.00 +0.1 0.08 ± 13% perf-profile.children.cycles-pp.read@plt > > 0.00 +0.1 0.08 ± 5% perf-profile.children.cycles-pp.sysvec_reschedule_ipi > > 0.00 +0.1 0.10 ± 4% perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test > > 0.00 +0.1 0.10 ± 4% perf-profile.children.cycles-pp.place_entity > > 0.00 +0.1 0.12 ± 10% 
perf-profile.children.cycles-pp.native_irq_return_iret > > 0.07 ± 14% +0.1 0.19 ± 3% perf-profile.children.cycles-pp.__list_add_valid > > 0.00 +0.1 0.13 ± 6% perf-profile.children.cycles-pp.perf_trace_buf_alloc > > 0.00 +0.1 0.13 ± 34% perf-profile.children.cycles-pp._find_next_and_bit > > 0.00 +0.1 0.14 ± 5% perf-profile.children.cycles-pp.switch_ldt > > 0.00 +0.1 0.15 ± 5% perf-profile.children.cycles-pp.check_cfs_rq_runtime > > 0.00 +0.1 0.15 ± 30% perf-profile.children.cycles-pp.migrate_task_rq_fair > > 0.00 +0.2 0.15 ± 5% perf-profile.children.cycles-pp.__rdgsbase_inactive > > 0.00 +0.2 0.16 ± 3% perf-profile.children.cycles-pp.save_fpregs_to_fpstate > > 0.00 +0.2 0.16 ± 6% perf-profile.children.cycles-pp.ttwu_queue_wakelist > > 0.00 +0.2 0.17 perf-profile.children.cycles-pp.perf_trace_buf_update > > 0.00 +0.2 0.18 ± 2% perf-profile.children.cycles-pp.rb_insert_color > > 0.00 +0.2 0.18 ± 4% perf-profile.children.cycles-pp.rb_next > > 0.00 +0.2 0.18 ± 21% perf-profile.children.cycles-pp.__cgroup_account_cputime > > 0.01 ±223% +0.2 0.21 ± 28% perf-profile.children.cycles-pp.perf_trace_sched_switch > > 0.00 +0.2 0.20 ± 3% perf-profile.children.cycles-pp.select_idle_cpu > > 0.00 +0.2 0.20 ± 3% perf-profile.children.cycles-pp.rcu_note_context_switch > > 0.00 +0.2 0.21 ± 26% perf-profile.children.cycles-pp.set_task_cpu > > 0.00 +0.2 0.22 ± 8% perf-profile.children.cycles-pp.resched_curr > > 0.08 ± 5% +0.2 0.31 ± 11% perf-profile.children.cycles-pp.task_h_load > > 0.00 +0.2 0.24 ± 3% perf-profile.children.cycles-pp.finish_wait > > 0.04 ± 44% +0.3 0.29 ± 5% perf-profile.children.cycles-pp.rb_erase > > 0.19 ± 6% +0.3 0.46 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore > > 0.20 ± 6% +0.3 0.47 ± 3% perf-profile.children.cycles-pp.__list_del_entry_valid > > 0.00 +0.3 0.28 ± 3% perf-profile.children.cycles-pp.__wrgsbase_inactive > > 0.02 ±141% +0.3 0.30 ± 2% perf-profile.children.cycles-pp.native_sched_clock > > 0.06 ± 13% +0.3 0.34 ± 2% 
perf-profile.children.cycles-pp.sched_clock_cpu > > 0.64 ± 2% +0.3 0.93 perf-profile.children.cycles-pp.mutex_lock > > 0.00 +0.3 0.30 ± 5% perf-profile.children.cycles-pp.cr4_update_irqsoff > > 0.00 +0.3 0.30 ± 4% perf-profile.children.cycles-pp.clear_buddies > > 0.07 ± 55% +0.3 0.37 ± 5% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime > > 0.10 ± 66% +0.3 0.42 ± 5% perf-profile.children.cycles-pp.perf_tp_event > > 0.02 ±142% +0.3 0.36 ± 6% perf-profile.children.cycles-pp.cpuacct_charge > > 0.12 ± 9% +0.4 0.47 ± 11% perf-profile.children.cycles-pp.wake_affine > > 0.00 +0.4 0.36 ± 13% perf-profile.children.cycles-pp.available_idle_cpu > > 0.05 ± 48% +0.4 0.42 ± 6% perf-profile.children.cycles-pp.finish_task_switch > > 0.12 ± 4% +0.4 0.49 ± 4% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi > > 0.07 ± 17% +0.4 0.48 perf-profile.children.cycles-pp.__calc_delta > > 0.03 ±100% +0.5 0.49 ± 4% perf-profile.children.cycles-pp.pick_next_entity > > 0.00 +0.5 0.48 ± 8% perf-profile.children.cycles-pp.set_next_buddy > > 0.08 ± 14% +0.6 0.66 ± 4% perf-profile.children.cycles-pp.update_min_vruntime > > 0.07 ± 17% +0.6 0.68 ± 2% perf-profile.children.cycles-pp.os_xsave > > 0.29 ± 7% +0.7 0.99 ± 3% perf-profile.children.cycles-pp.update_cfs_group > > 0.17 ± 17% +0.7 0.87 ± 4% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template > > 0.14 ± 7% +0.7 0.87 ± 3% perf-profile.children.cycles-pp.__update_load_avg_se > > 0.14 ± 16% +0.8 0.90 ± 2% perf-profile.children.cycles-pp.update_rq_clock > > 0.08 ± 17% +0.8 0.84 ± 5% perf-profile.children.cycles-pp.check_preempt_wakeup > > 0.12 ± 14% +0.8 0.95 ± 3% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq > > 0.22 ± 5% +0.8 1.07 ± 3% perf-profile.children.cycles-pp.prepare_to_wait > > 0.10 ± 18% +0.9 0.98 ± 3% perf-profile.children.cycles-pp.check_preempt_curr > > 29.72 +0.9 30.61 perf-profile.children.cycles-pp.vfs_write > > 0.14 ± 11% +0.9 1.03 ± 4% perf-profile.children.cycles-pp.__switch_to > > 
0.07 ± 20% +0.9 0.99 ± 6% perf-profile.children.cycles-pp.put_prev_entity > > 0.12 ± 16% +1.0 1.13 ± 5% perf-profile.children.cycles-pp.___perf_sw_event > > 0.07 ± 17% +1.0 1.10 ± 13% perf-profile.children.cycles-pp.select_idle_sibling > > 27.82 ± 2% +1.2 28.99 perf-profile.children.cycles-pp.unix_stream_recvmsg > > 27.41 ± 2% +1.2 28.63 perf-profile.children.cycles-pp.unix_stream_read_generic > > 0.20 ± 15% +1.4 1.59 ± 3% perf-profile.children.cycles-pp.reweight_entity > > 0.21 ± 13% +1.4 1.60 ± 4% perf-profile.children.cycles-pp.__switch_to_asm > > 0.23 ± 10% +1.4 1.65 ± 5% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate > > 0.20 ± 13% +1.5 1.69 ± 3% perf-profile.children.cycles-pp.set_next_entity > > 27.59 +1.6 29.19 perf-profile.children.cycles-pp.sock_write_iter > > 0.28 ± 10% +1.8 2.12 ± 5% perf-profile.children.cycles-pp.switch_fpu_return > > 0.26 ± 11% +1.8 2.10 ± 6% perf-profile.children.cycles-pp.select_task_rq_fair > > 26.66 ± 2% +2.0 28.63 perf-profile.children.cycles-pp.sock_sendmsg > > 0.31 ± 12% +2.1 2.44 ± 5% perf-profile.children.cycles-pp.select_task_rq > > 0.30 ± 14% +2.2 2.46 ± 4% perf-profile.children.cycles-pp.prepare_task_switch > > 25.27 ± 2% +2.2 27.47 perf-profile.children.cycles-pp.unix_stream_sendmsg > > 2.10 +2.3 4.38 ± 2% perf-profile.children.cycles-pp._raw_spin_lock > > 0.40 ± 14% +2.5 2.92 ± 5% perf-profile.children.cycles-pp.dequeue_entity > > 48.40 +2.6 51.02 perf-profile.children.cycles-pp.__libc_write > > 0.46 ± 15% +3.1 3.51 ± 3% perf-profile.children.cycles-pp.enqueue_entity > > 0.49 ± 10% +3.2 3.64 ± 7% perf-profile.children.cycles-pp.update_load_avg > > 0.53 ± 20% +3.4 3.91 ± 3% perf-profile.children.cycles-pp.update_curr > > 80.81 +3.4 84.24 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > > 0.50 ± 12% +3.5 4.00 ± 4% perf-profile.children.cycles-pp.switch_mm_irqs_off > > 0.55 ± 9% +3.8 4.38 ± 4% perf-profile.children.cycles-pp.pick_next_task_fair > > 9.60 +4.6 14.15 ± 2% 
perf-profile.children.cycles-pp.syscall_exit_to_user_mode > > 0.78 ± 13% +4.9 5.65 ± 4% perf-profile.children.cycles-pp.dequeue_task_fair > > 0.78 ± 15% +5.2 5.99 ± 3% perf-profile.children.cycles-pp.enqueue_task_fair > > 74.30 +5.6 79.86 perf-profile.children.cycles-pp.do_syscall_64 > > 0.90 ± 15% +6.3 7.16 ± 3% perf-profile.children.cycles-pp.ttwu_do_activate > > 0.33 ± 31% +6.3 6.61 ± 6% perf-profile.children.cycles-pp.exit_to_user_mode_loop > > 0.82 ± 15% +8.1 8.92 ± 5% perf-profile.children.cycles-pp.exit_to_user_mode_prepare > > 1.90 ± 16% +12.2 14.10 ± 2% perf-profile.children.cycles-pp.try_to_wake_up > > 2.36 ± 11% +12.2 14.60 ± 3% perf-profile.children.cycles-pp.schedule_timeout > > 1.95 ± 15% +12.5 14.41 ± 2% perf-profile.children.cycles-pp.autoremove_wake_function > > 2.01 ± 15% +12.8 14.76 ± 2% perf-profile.children.cycles-pp.__wake_up_common > > 2.23 ± 13% +13.2 15.45 ± 2% perf-profile.children.cycles-pp.__wake_up_common_lock > > 2.53 ± 10% +13.4 15.90 ± 2% perf-profile.children.cycles-pp.sock_def_readable > > 2.29 ± 15% +14.6 16.93 ± 3% perf-profile.children.cycles-pp.unix_stream_data_wait > > 2.61 ± 13% +18.0 20.65 ± 4% perf-profile.children.cycles-pp.schedule > > 2.66 ± 13% +18.1 20.77 ± 4% perf-profile.children.cycles-pp.__schedule > > 11.25 ± 3% -4.6 6.67 ± 3% perf-profile.self.cycles-pp.syscall_return_via_sysret > > 5.76 ± 32% -3.9 1.90 ± 3% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > > 8.69 ± 3% -3.4 5.27 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > > 3.11 ± 3% -2.5 0.60 ± 13% perf-profile.self.cycles-pp.__slab_free > > 6.65 ± 2% -2.2 4.47 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > > 4.78 ± 3% -1.9 2.88 ± 3% perf-profile.self.cycles-pp.__entry_text_start > > 3.52 ± 2% -1.9 1.64 ± 6% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string > > 2.06 ± 3% -1.1 0.96 ± 5% perf-profile.self.cycles-pp.kmem_cache_free > > 1.42 ± 3% -1.0 0.46 ± 10% perf-profile.self.cycles-pp.check_heap_object > 
> 1.43 ± 4% -0.8 0.64 perf-profile.self.cycles-pp.sock_wfree > > 0.99 ± 3% -0.8 0.21 ± 12% perf-profile.self.cycles-pp.skb_release_data > > 0.84 ± 8% -0.7 0.10 ± 64% perf-profile.self.cycles-pp.___slab_alloc > > 1.97 ± 2% -0.6 1.32 perf-profile.self.cycles-pp.unix_stream_read_generic > > 1.60 ± 3% -0.5 1.11 ± 4% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook > > 1.24 ± 2% -0.5 0.75 ± 11% perf-profile.self.cycles-pp.mod_objcg_state > > 0.71 -0.5 0.23 ± 15% perf-profile.self.cycles-pp.__build_skb_around > > 0.95 ± 3% -0.5 0.50 ± 6% perf-profile.self.cycles-pp.__alloc_skb > > 0.97 ± 4% -0.4 0.55 ± 5% perf-profile.self.cycles-pp.kmem_cache_alloc_node > > 0.99 ± 3% -0.4 0.59 ± 4% perf-profile.self.cycles-pp.vfs_write > > 1.38 ± 2% -0.4 0.99 perf-profile.self.cycles-pp.__kmem_cache_free > > 0.86 ± 2% -0.4 0.50 ± 3% perf-profile.self.cycles-pp.__kmem_cache_alloc_node > > 0.92 ± 4% -0.4 0.56 ± 4% perf-profile.self.cycles-pp.sock_write_iter > > 1.06 ± 3% -0.4 0.70 ± 3% perf-profile.self.cycles-pp.__might_resched > > 0.73 ± 4% -0.3 0.44 ± 4% perf-profile.self.cycles-pp.__cond_resched > > 0.85 ± 3% -0.3 0.59 ± 4% perf-profile.self.cycles-pp.__check_heap_object > > 1.46 ± 7% -0.3 1.20 ± 2% perf-profile.self.cycles-pp.unix_stream_sendmsg > > 0.73 ± 9% -0.3 0.47 ± 2% perf-profile.self.cycles-pp.skb_set_owner_w > > 1.54 -0.3 1.28 ± 4% perf-profile.self.cycles-pp.apparmor_file_permission > > 0.74 ± 3% -0.2 0.50 ± 2% perf-profile.self.cycles-pp.get_obj_cgroup_from_current > > 1.15 ± 3% -0.2 0.91 ± 8% perf-profile.self.cycles-pp.aa_sk_perm > > 0.60 -0.2 0.36 ± 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > 0.65 ± 4% -0.2 0.45 ± 6% perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg > > 0.24 ± 6% -0.2 0.05 ± 56% perf-profile.self.cycles-pp.fsnotify_perm > > 0.76 ± 3% -0.2 0.58 ± 2% perf-profile.self.cycles-pp.sock_read_iter > > 1.10 ± 4% -0.2 0.92 ± 6% perf-profile.self.cycles-pp.__fget_light > > 0.42 ± 3% -0.2 0.25 ± 4% 
perf-profile.self.cycles-pp.obj_cgroup_charge > > 0.32 ± 4% -0.2 0.17 ± 6% perf-profile.self.cycles-pp.refill_obj_stock > > 0.29 -0.2 0.14 ± 8% perf-profile.self.cycles-pp.__kmalloc_node_track_caller > > 0.54 ± 3% -0.1 0.40 ± 2% perf-profile.self.cycles-pp.__might_sleep > > 0.30 ± 7% -0.1 0.16 ± 22% perf-profile.self.cycles-pp.security_file_permission > > 0.34 ± 3% -0.1 0.21 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack > > 0.41 ± 3% -0.1 0.29 ± 3% perf-profile.self.cycles-pp.is_vmalloc_addr > > 0.27 ± 3% -0.1 0.16 ± 6% perf-profile.self.cycles-pp._copy_from_iter > > 0.24 ± 3% -0.1 0.12 ± 3% perf-profile.self.cycles-pp.ksys_write > > 0.95 ± 2% -0.1 0.84 ± 5% perf-profile.self.cycles-pp.__virt_addr_valid > > 0.56 ± 11% -0.1 0.46 ± 4% perf-profile.self.cycles-pp.sock_def_readable > > 0.16 ± 7% -0.1 0.06 ± 18% perf-profile.self.cycles-pp.sock_recvmsg > > 0.22 ± 5% -0.1 0.14 ± 2% perf-profile.self.cycles-pp.ksys_read > > 0.27 ± 4% -0.1 0.19 ± 5% perf-profile.self.cycles-pp.kmalloc_slab > > 0.28 ± 2% -0.1 0.20 ± 2% perf-profile.self.cycles-pp.consume_skb > > 0.35 ± 2% -0.1 0.28 ± 3% perf-profile.self.cycles-pp.__check_object_size > > 0.13 ± 8% -0.1 0.06 ± 18% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare > > 0.20 ± 5% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.kmalloc_reserve > > 0.26 ± 5% -0.1 0.19 ± 4% perf-profile.self.cycles-pp.sock_alloc_send_pskb > > 0.42 ± 2% -0.1 0.35 ± 7% perf-profile.self.cycles-pp.syscall_enter_from_user_mode > > 0.19 ± 5% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.aa_file_perm > > 0.16 ± 4% -0.1 0.10 ± 4% perf-profile.self.cycles-pp.skb_copy_datagram_from_iter > > 0.18 ± 4% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.apparmor_socket_sendmsg > > 0.18 ± 5% -0.1 0.12 ± 4% perf-profile.self.cycles-pp.apparmor_socket_recvmsg > > 0.15 ± 5% -0.1 0.10 ± 5% perf-profile.self.cycles-pp.alloc_skb_with_frags > > 0.64 ± 3% -0.1 0.59 perf-profile.self.cycles-pp.__libc_write > > 0.20 ± 4% -0.1 0.15 ± 3% 
perf-profile.self.cycles-pp._copy_to_iter > > 0.15 ± 5% -0.1 0.10 ± 11% perf-profile.self.cycles-pp.sock_sendmsg > > 0.08 ± 4% -0.1 0.03 ± 81% perf-profile.self.cycles-pp.copyout > > 0.11 ± 6% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.__fdget_pos > > 0.12 ± 5% -0.0 0.07 ± 10% perf-profile.self.cycles-pp.kmalloc_size_roundup > > 0.34 ± 3% -0.0 0.29 perf-profile.self.cycles-pp.do_syscall_64 > > 0.20 ± 4% -0.0 0.15 ± 4% perf-profile.self.cycles-pp.rcu_all_qs > > 0.41 ± 3% -0.0 0.37 ± 8% perf-profile.self.cycles-pp.unix_stream_recvmsg > > 0.22 ± 2% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.unix_destruct_scm > > 0.09 ± 4% -0.0 0.05 perf-profile.self.cycles-pp.should_failslab > > 0.10 ± 15% -0.0 0.06 ± 50% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state > > 0.11 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.__might_fault > > 0.16 ± 2% -0.0 0.13 ± 6% perf-profile.self.cycles-pp.obj_cgroup_uncharge_pages > > 0.18 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.security_socket_getpeersec_dgram > > 0.28 ± 2% -0.0 0.25 ± 2% perf-profile.self.cycles-pp.unix_write_space > > 0.17 ± 2% -0.0 0.15 ± 5% perf-profile.self.cycles-pp.apparmor_socket_getpeersec_dgram > > 0.08 ± 6% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.security_socket_sendmsg > > 0.12 ± 4% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.__skb_datagram_iter > > 0.24 ± 2% -0.0 0.22 perf-profile.self.cycles-pp.mutex_unlock > > 0.08 ± 5% +0.0 0.10 ± 6% perf-profile.self.cycles-pp.scm_recv > > 0.17 ± 2% +0.0 0.19 ± 3% perf-profile.self.cycles-pp.__x64_sys_read > > 0.19 ± 3% +0.0 0.22 ± 2% perf-profile.self.cycles-pp.__get_task_ioprio > > 0.00 +0.1 0.06 perf-profile.self.cycles-pp.finish_wait > > 0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.cr4_update_irqsoff > > 0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.invalidate_user_asid > > 0.00 +0.1 0.07 ± 12% perf-profile.self.cycles-pp.wake_affine > > 0.00 +0.1 0.07 ± 7% perf-profile.self.cycles-pp.check_cfs_rq_runtime > > 0.00 +0.1 0.07 ± 5% 
perf-profile.self.cycles-pp.perf_trace_buf_update > > 0.00 +0.1 0.07 ± 9% perf-profile.self.cycles-pp.asm_sysvec_reschedule_ipi > > 0.00 +0.1 0.07 ± 10% perf-profile.self.cycles-pp.__bitmap_and > > 0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp.schedule_debug > > 0.00 +0.1 0.08 ± 13% perf-profile.self.cycles-pp.read@plt > > 0.00 +0.1 0.08 ± 12% perf-profile.self.cycles-pp.perf_trace_buf_alloc > > 0.00 +0.1 0.09 ± 35% perf-profile.self.cycles-pp.migrate_task_rq_fair > > 0.00 +0.1 0.09 ± 5% perf-profile.self.cycles-pp.place_entity > > 0.00 +0.1 0.10 ± 4% perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test > > 0.00 +0.1 0.10 perf-profile.self.cycles-pp.__wake_up_common_lock > > 0.07 ± 17% +0.1 0.18 ± 3% perf-profile.self.cycles-pp.__list_add_valid > > 0.00 +0.1 0.11 ± 8% perf-profile.self.cycles-pp.native_irq_return_iret > > 0.00 +0.1 0.12 ± 6% perf-profile.self.cycles-pp.select_idle_cpu > > 0.00 +0.1 0.12 ± 34% perf-profile.self.cycles-pp._find_next_and_bit > > 0.00 +0.1 0.13 ± 25% perf-profile.self.cycles-pp.__cgroup_account_cputime > > 0.00 +0.1 0.13 ± 7% perf-profile.self.cycles-pp.switch_ldt > > 0.00 +0.1 0.14 ± 5% perf-profile.self.cycles-pp.check_preempt_curr > > 0.00 +0.1 0.15 ± 2% perf-profile.self.cycles-pp.save_fpregs_to_fpstate > > 0.00 +0.1 0.15 ± 5% perf-profile.self.cycles-pp.__rdgsbase_inactive > > 0.14 ± 3% +0.2 0.29 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore > > 0.00 +0.2 0.15 ± 7% perf-profile.self.cycles-pp.ttwu_queue_wakelist > > 0.00 +0.2 0.17 ± 4% perf-profile.self.cycles-pp.rb_insert_color > > 0.00 +0.2 0.17 ± 5% perf-profile.self.cycles-pp.rb_next > > 0.00 +0.2 0.18 ± 2% perf-profile.self.cycles-pp.autoremove_wake_function > > 0.01 ±223% +0.2 0.19 ± 6% perf-profile.self.cycles-pp.ttwu_do_activate > > 0.00 +0.2 0.20 ± 2% perf-profile.self.cycles-pp.rcu_note_context_switch > > 0.00 +0.2 0.20 ± 7% perf-profile.self.cycles-pp.exit_to_user_mode_loop > > 0.27 +0.2 0.47 ± 3% perf-profile.self.cycles-pp.mutex_lock > > 0.00 +0.2 0.20 ± 
28% perf-profile.self.cycles-pp.perf_trace_sched_switch > > 0.00 +0.2 0.21 ± 9% perf-profile.self.cycles-pp.resched_curr > > 0.04 ± 45% +0.2 0.26 ± 7% perf-profile.self.cycles-pp.perf_tp_event > > 0.06 ± 7% +0.2 0.28 ± 8% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template > > 0.19 ± 7% +0.2 0.41 ± 5% perf-profile.self.cycles-pp.__list_del_entry_valid > > 0.08 ± 5% +0.2 0.31 ± 11% perf-profile.self.cycles-pp.task_h_load > > 0.00 +0.2 0.23 ± 5% perf-profile.self.cycles-pp.finish_task_switch > > 0.03 ± 70% +0.2 0.27 ± 5% perf-profile.self.cycles-pp.rb_erase > > 0.02 ±142% +0.3 0.29 ± 2% perf-profile.self.cycles-pp.native_sched_clock > > 0.00 +0.3 0.28 ± 3% perf-profile.self.cycles-pp.__wrgsbase_inactive > > 0.00 +0.3 0.28 ± 6% perf-profile.self.cycles-pp.clear_buddies > > 0.07 ± 10% +0.3 0.35 ± 3% perf-profile.self.cycles-pp.schedule_timeout > > 0.03 ± 70% +0.3 0.33 ± 3% perf-profile.self.cycles-pp.select_task_rq > > 0.06 ± 13% +0.3 0.36 ± 4% perf-profile.self.cycles-pp.__wake_up_common > > 0.06 ± 13% +0.3 0.36 ± 3% perf-profile.self.cycles-pp.dequeue_entity > > 0.06 ± 18% +0.3 0.37 ± 7% perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime > > 0.01 ±223% +0.3 0.33 ± 4% perf-profile.self.cycles-pp.schedule > > 0.02 ±142% +0.3 0.35 ± 7% perf-profile.self.cycles-pp.cpuacct_charge > > 0.01 ±223% +0.3 0.35 perf-profile.self.cycles-pp.set_next_entity > > 0.00 +0.4 0.35 ± 13% perf-profile.self.cycles-pp.available_idle_cpu > > 0.08 ± 10% +0.4 0.44 ± 5% perf-profile.self.cycles-pp.prepare_to_wait > > 0.63 ± 3% +0.4 1.00 ± 4% perf-profile.self.cycles-pp.vfs_read > > 0.02 ±142% +0.4 0.40 ± 4% perf-profile.self.cycles-pp.check_preempt_wakeup > > 0.02 ±141% +0.4 0.42 ± 4% perf-profile.self.cycles-pp.pick_next_entity > > 0.07 ± 17% +0.4 0.48 perf-profile.self.cycles-pp.__calc_delta > > 0.06 ± 14% +0.4 0.47 ± 3% perf-profile.self.cycles-pp.unix_stream_data_wait > > 0.04 ± 45% +0.4 0.45 ± 4% perf-profile.self.cycles-pp.switch_fpu_return > > 0.00 +0.5 0.46 ± 7% 
perf-profile.self.cycles-pp.set_next_buddy > > 0.07 ± 17% +0.5 0.53 ± 3% perf-profile.self.cycles-pp.select_task_rq_fair > > 0.08 ± 16% +0.5 0.55 ± 4% perf-profile.self.cycles-pp.try_to_wake_up > > 0.08 ± 19% +0.5 0.56 ± 3% perf-profile.self.cycles-pp.update_rq_clock > > 0.02 ±141% +0.5 0.50 ± 10% perf-profile.self.cycles-pp.select_idle_sibling > > 0.77 ± 2% +0.5 1.25 ± 2% perf-profile.self.cycles-pp.__libc_read > > 0.09 ± 19% +0.5 0.59 ± 3% perf-profile.self.cycles-pp.reweight_entity > > 0.08 ± 14% +0.5 0.59 ± 2% perf-profile.self.cycles-pp.dequeue_task_fair > > 0.08 ± 13% +0.6 0.64 ± 5% perf-profile.self.cycles-pp.update_min_vruntime > > 0.02 ±141% +0.6 0.58 ± 7% perf-profile.self.cycles-pp.put_prev_entity > > 0.06 ± 11% +0.6 0.64 ± 4% perf-profile.self.cycles-pp.enqueue_task_fair > > 0.07 ± 18% +0.6 0.68 ± 3% perf-profile.self.cycles-pp.os_xsave > > 1.39 ± 2% +0.7 2.06 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irqsave > > 0.28 ± 8% +0.7 0.97 ± 4% perf-profile.self.cycles-pp.update_cfs_group > > 0.14 ± 8% +0.7 0.83 ± 3% perf-profile.self.cycles-pp.__update_load_avg_se > > 1.76 ± 3% +0.7 2.47 ± 3% perf-profile.self.cycles-pp._raw_spin_lock > > 0.12 ± 12% +0.7 0.85 ± 5% perf-profile.self.cycles-pp.prepare_task_switch > > 0.12 ± 12% +0.8 0.91 ± 3% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq > > 0.13 ± 12% +0.8 0.93 ± 5% perf-profile.self.cycles-pp.pick_next_task_fair > > 0.13 ± 12% +0.9 0.98 ± 4% perf-profile.self.cycles-pp.__switch_to > > 0.11 ± 18% +0.9 1.06 ± 5% perf-profile.self.cycles-pp.___perf_sw_event > > 0.16 ± 11% +1.2 1.34 ± 4% perf-profile.self.cycles-pp.enqueue_entity > > 0.20 ± 12% +1.4 1.58 ± 4% perf-profile.self.cycles-pp.__switch_to_asm > > 0.23 ± 10% +1.4 1.65 ± 5% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate > > 0.25 ± 12% +1.5 1.77 ± 4% perf-profile.self.cycles-pp.__schedule > > 0.22 ± 10% +1.6 1.78 ± 10% perf-profile.self.cycles-pp.update_load_avg > > 0.23 ± 16% +1.7 1.91 ± 7% perf-profile.self.cycles-pp.update_curr > > 
0.48 ± 11% +3.4 3.86 ± 4% perf-profile.self.cycles-pp.switch_mm_irqs_off > > > > > > To reproduce: > > > > git clone https://github.com/intel/lkp-tests.git > > cd lkp-tests > > sudo bin/lkp install job.yaml # job file is attached in this email > > bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run > > sudo bin/lkp run generated-yaml-file > > > > # if come across any failure that blocks the test, > > # please remove ~/.lkp and /lkp dir to run from a clean state. > > > > Amazon Development Center Germany GmbH > Krausenstr. 38 > 10117 Berlin > Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss > Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B > Sitz: Berlin > Ust-ID: DE 289 237 879 > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
  2023-02-21 17:26 ` Vincent Guittot
@ 2023-02-27  8:42 ` Roman Kagan
  2023-02-27 14:37 ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Roman Kagan @ 2023-02-27 8:42 UTC (permalink / raw)
To: Vincent Guittot
Cc: Peter Zijlstra, linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> > What scares me, though, is that I've got a message from the test robot
> > that this commit dramatically affected hackbench results, see the quote
> > below. I expected the commit not to affect any benchmarks.
> >
> > Any idea what could have caused this change?
>
> Hmm, it's most probably because se->exec_start is reset after a
> migration and the condition becomes true for a newly migrated task
> whereas its vruntime should be after min_vruntime.
>
> We have missed this condition.

Makes sense to me.

But what would then be the reliable way to detect a sched_entity which
has slept for long and risks overflowing in the .vruntime comparison?

Thanks,
Roman.
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
  2023-02-27  8:42 ` Roman Kagan
@ 2023-02-27 14:37 ` Vincent Guittot
  2023-02-27 17:00 ` Dietmar Eggemann
  2023-03-02  9:36 ` Zhang Qiao
  0 siblings, 2 replies; 14+ messages in thread
From: Vincent Guittot @ 2023-02-27 14:37 UTC (permalink / raw)
To: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>
> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> > On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> > > What scares me, though, is that I've got a message from the test robot
> > > that this commit dramatically affected hackbench results, see the quote
> > > below. I expected the commit not to affect any benchmarks.
> > >
> > > Any idea what could have caused this change?
> >
> > Hmm, it's most probably because se->exec_start is reset after a
> > migration and the condition becomes true for a newly migrated task
> > whereas its vruntime should be after min_vruntime.
> >
> > We have missed this condition.
>
> Makes sense to me.
>
> But what would then be the reliable way to detect a sched_entity which
> has slept for long and risks overflowing in the .vruntime comparison?

For now I don't have a better idea than adding the same check in
migrate_task_rq_fair()
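[Editor's note: for reference, the cutoff argued for in v3 of the patch can be sketched in userspace C as below. The helper name is made up for this sketch, and the NICE_0_LOAD value assumes a 64-bit kernel (load weights scaled by SCHED_FIXEDPOINT_SHIFT); note that this is exactly the check that misfires when se->exec_start has just been reset to 0 by a migration, which is the regression under discussion.]

```c
#include <stdint.h>

#define SCHED_FIXEDPOINT_SHIFT	10
#define NICE_0_LOAD		(UINT64_C(1024) << SCHED_FIXEDPOINT_SHIFT)

/*
 * min_vruntime can advance at most NICE_0_LOAD/MIN_SHARES times faster
 * than real time, so if the entity was off the cpu for less than
 * 2^63 / NICE_0_LOAD nanoseconds (2^43 ns, roughly 2.4 hours), the s64
 * comparison against min_vruntime cannot have overflowed and the old
 * vruntime may still be trusted.
 *
 * Caveat from the thread: exec_start is zeroed on migration, so a
 * freshly migrated task shows an enormous "sleep_time" and spuriously
 * fails this check.
 */
static int vruntime_comparable(uint64_t now_ns, uint64_t exec_start_ns)
{
	uint64_t sleep_time = now_ns - exec_start_ns;

	return sleep_time <= (UINT64_C(1) << 63) / NICE_0_LOAD;
}
```

For example, an entity asleep for one hour stays well under the ~146-minute cutoff, while one asleep for three hours exceeds it and would have its original vruntime ignored.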
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
  2023-02-27 14:37 ` Vincent Guittot
@ 2023-02-27 17:00 ` Dietmar Eggemann
  2023-02-27 17:15 ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Dietmar Eggemann @ 2023-02-27 17:00 UTC (permalink / raw)
To: Vincent Guittot, Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On 27/02/2023 15:37, Vincent Guittot wrote:
> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>>
>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>>>> What scares me, though, is that I've got a message from the test robot
>>>> that this commit dramatically affected hackbench results, see the quote
>>>> below. I expected the commit not to affect any benchmarks.
>>>>
>>>> Any idea what could have caused this change?
>>>
>>> Hmm, it's most probably because se->exec_start is reset after a
>>> migration and the condition becomes true for a newly migrated task
>>> whereas its vruntime should be after min_vruntime.
>>>
>>> We have missed this condition.
>>
>> Makes sense to me.
>>
>> But what would then be the reliable way to detect a sched_entity which
>> has slept for long and risks overflowing in the .vruntime comparison?
>
> For now I don't have a better idea than adding the same check in
> migrate_task_rq_fair()

Don't we have the issue that we could have a non-up-to-date rq clock in
migrate? No rq lock held in `!task_on_rq_migrating(p)`.

Also deferring `se->exec_start = 0` from `migrate` into `enqueue ->
place_entity` doesn't seem to work since the rq clocks of different CPUs
are not in sync.
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
  2023-02-27 17:00 ` Dietmar Eggemann
@ 2023-02-27 17:15 ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2023-02-27 17:15 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Zhang Qiao, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli

On Mon, 27 Feb 2023 at 18:00, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 27/02/2023 15:37, Vincent Guittot wrote:
> > On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
> >>
> >> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> >>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> >>>> What scares me, though, is that I've got a message from the test robot
> >>>> that this commit dramatically affected hackbench results, see the quote
> >>>> below. I expected the commit not to affect any benchmarks.
> >>>>
> >>>> Any idea what could have caused this change?
> >>>
> >>> Hmm, it's most probably because se->exec_start is reset after a
> >>> migration and the condition becomes true for a newly migrated task
> >>> whereas its vruntime should be after min_vruntime.
> >>>
> >>> We have missed this condition.
> >>
> >> Makes sense to me.
> >>
> >> But what would then be the reliable way to detect a sched_entity which
> >> has slept for long and risks overflowing in the .vruntime comparison?
> >
> > For now I don't have a better idea than adding the same check in
> > migrate_task_rq_fair()
>
> Don't we have the issue that we could have a non-up-to-date rq clock in
> migrate? No rq lock held in `!task_on_rq_migrating(p)`.
Yes, the rq clock may not be up to date, but that would also mean that the cfs_rq was idle; as a result, its min_vruntime has not moved forward, and there is no possible-overflow problem. > > Also deferring `se->exec_start = 0` from `migrate` into `enqueue -> place entity` doesn't seem to work since the rq clocks of different CPUs are not in sync. yes > ^ permalink raw reply [flat|nested] 14+ messages in thread
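The s64 overflow being discussed can be demonstrated with a standalone userspace sketch of the comparison used by the kernel's max_vruntime() (the function body mirrors kernel/sched/fair.c, with userspace typedefs substituted; the scenario values below are illustrative, not taken from the thread):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

/*
 * The s64 cast of the difference copes with u64 wraparound, but it
 * inverts once the two vruntimes drift more than 2^63 ns apart --
 * exactly the overflow this thread is about.
 */
static u64 max_vruntime(u64 max_vr, u64 vruntime)
{
	s64 delta = (s64)(vruntime - max_vr);

	if (delta > 0)
		max_vr = vruntime;

	return max_vr;
}
```

With a stale entity vruntime more than 2^63 ns behind the cfs_rq base, the stale value "wins" the comparison even though the base is numerically larger.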
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-02-27 14:37 ` Vincent Guittot 2023-02-27 17:00 ` Dietmar Eggemann @ 2023-03-02 9:36 ` Zhang Qiao 2023-03-02 13:34 ` Vincent Guittot 1 sibling, 1 reply; 14+ messages in thread From: Zhang Qiao @ 2023-03-02 9:36 UTC (permalink / raw) To: Vincent Guittot Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli 在 2023/2/27 22:37, Vincent Guittot 写道: > On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: >> >> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: >>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: >>>> What scares me, though, is that I've got a message from the test robot >>>> that this commit drammatically affected hackbench results, see the quote >>>> below. I expected the commit not to affect any benchmarks. >>>> >>>> Any idea what could have caused this change? >>> >>> Hmm, It's most probably because se->exec_start is reset after a >>> migration and the condition becomes true for newly migrated task >>> whereas its vruntime should be after min_vruntime. >>> >>> We have missed this condition >> >> Makes sense to me. >> >> But what would then be the reliable way to detect a sched_entity which >> has slept for long and risks overflowing in .vruntime comparison? > > For now I don't have a better idea than adding the same check in > migrate_task_rq_fair() Hi, Vincent, I fixed this condition as you said, and the test results are as follows. 
testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
version1: v6.2
version2: v6.2 + commit 829c1651e9c4
version3: v6.2 + commit 829c1651e9c4 + this patch

-------------------------------------------------
          version1    version2    version3
test1     81.0        118.1       82.1
test2     82.1        116.9       80.3
test3     83.2        103.9       83.3
avg(s)    82.1        113.0       81.9
-------------------------------------------------

After dealing with the task-migration case, the hackbench result has been restored.

The patch is as follows; how does this look?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff4dbbae3b10..3a88d20fd29e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
 #endif
 }

+static inline u64 sched_sleeper_credit(struct sched_entity *se)
+{
+
+	unsigned long thresh;
+
+	if (se_is_idle(se))
+		thresh = sysctl_sched_min_granularity;
+	else
+		thresh = sysctl_sched_latency;
+
+	/*
+	 * Halve their sleep time's effect, to allow
+	 * for a gentler effect of sleepers:
+	 */
+	if (sched_feat(GENTLE_FAIR_SLEEPERS))
+		thresh >>= 1;
+
+	return thresh;
+}
+
 static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
@@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 	vruntime += sched_vslice(cfs_rq, se);

 	/* sleeps up to a single latency don't count. */
-	if (!initial) {
-		unsigned long thresh;
-
-		if (se_is_idle(se))
-			thresh = sysctl_sched_min_granularity;
-		else
-			thresh = sysctl_sched_latency;
-
-		/*
-		 * Halve their sleep time's effect, to allow
-		 * for a gentler effect of sleepers:
-		 */
-		if (sched_feat(GENTLE_FAIR_SLEEPERS))
-			thresh >>= 1;
-
-		vruntime -= thresh;
-	}
+	if (!initial)
+		vruntime -= sched_sleeper_credit(se);

 	/*
 	 * Pull vruntime of the entity being placed to the base level of
@@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 	 * inversed due to s64 overflow.
 	 */
 	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
-	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+	if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
 		se->vruntime = vruntime;
 	else
 		se->vruntime = max_vruntime(se->vruntime, vruntime);
@@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 	 */
 	if (READ_ONCE(p->__state) == TASK_WAKING) {
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;

-		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
+		if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+			se->vruntime = -sched_sleeper_credit(se);
+		else
+			se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
 	}

 	if (!task_on_rq_migrating(p)) {

Thanks.
Zhang Qiao.

> >> >> Thanks, >> Roman. >> >> >> >> Amazon Development Center Germany GmbH >> Krausenstr. 38 >> 10117 Berlin >> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss >> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B >> Sitz: Berlin >> Ust-ID: DE 289 237 879 >> >> >> > . > ^ permalink raw reply related [flat|nested] 14+ messages in thread
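The `se->exec_start != 0` guard in the patch above can be exercised in isolation with a userspace model (the function name `vruntime_is_stale` and the passed-in clock value are illustrative stand-ins; the 60-second cutoff matches the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

#define NSEC_PER_SEC 1000000000LL

/*
 * exec_start == 0 marks a freshly migrated task: its timestamp came
 * from another CPU's clock, so it must not be mistaken for a long
 * sleeper (the bug behind the hackbench regression).
 */
static bool vruntime_is_stale(u64 now, u64 exec_start)
{
	s64 sleep_time = (s64)(now - exec_start);

	return exec_start != 0 && sleep_time > 60LL * NSEC_PER_SEC;
}
```

A task that genuinely slept longer than 60 s is flagged; a just-migrated task (exec_start == 0) and a short sleeper are not.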
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-03-02 9:36 ` Zhang Qiao @ 2023-03-02 13:34 ` Vincent Guittot 2023-03-02 14:29 ` Zhang Qiao 0 siblings, 1 reply; 14+ messages in thread From: Vincent Guittot @ 2023-03-02 13:34 UTC (permalink / raw) To: Zhang Qiao Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote: > > > > 在 2023/2/27 22:37, Vincent Guittot 写道: > > On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: > >> > >> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: > >>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: > >>>> What scares me, though, is that I've got a message from the test robot > >>>> that this commit drammatically affected hackbench results, see the quote > >>>> below. I expected the commit not to affect any benchmarks. > >>>> > >>>> Any idea what could have caused this change? > >>> > >>> Hmm, It's most probably because se->exec_start is reset after a > >>> migration and the condition becomes true for newly migrated task > >>> whereas its vruntime should be after min_vruntime. > >>> > >>> We have missed this condition > >> > >> Makes sense to me. > >> > >> But what would then be the reliable way to detect a sched_entity which > >> has slept for long and risks overflowing in .vruntime comparison? > > > > For now I don't have a better idea than adding the same check in > > migrate_task_rq_fair() > > Hi, Vincent, > I fixed this condition as you said, and the test results are as follows. 
> > testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100 > version1: v6.2 > version2: v6.2 + commit 829c1651e9c4 > version3: v6.2 + commit 829c1651e9c4 + this patch > > ------------------------------------------------- > version1 version2 version3 > test1 81.0 118.1 82.1 > test2 82.1 116.9 80.3 > test3 83.2 103.9 83.3 > avg(s) 82.1 113.0 81.9 > > ------------------------------------------------- > After deal with the task migration case, the hackbench result has restored. > > The patch as follow, how does this look? > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index ff4dbbae3b10..3a88d20fd29e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se) > #endif > } > > +static inline u64 sched_sleeper_credit(struct sched_entity *se) > +{ > + > + unsigned long thresh; > + > + if (se_is_idle(se)) > + thresh = sysctl_sched_min_granularity; > + else > + thresh = sysctl_sched_latency; > + > + /* > + * Halve their sleep time's effect, to allow > + * for a gentler effect of sleepers: > + */ > + if (sched_feat(GENTLE_FAIR_SLEEPERS)) > + thresh >>= 1; > + > + return thresh; > +} > + > static void > place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > { > @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > vruntime += sched_vslice(cfs_rq, se); > > /* sleeps up to a single latency don't count. 
*/ > - if (!initial) { > - unsigned long thresh; > - > - if (se_is_idle(se)) > - thresh = sysctl_sched_min_granularity; > - else > - thresh = sysctl_sched_latency; > - > - /* > - * Halve their sleep time's effect, to allow > - * for a gentler effect of sleepers: > - */ > - if (sched_feat(GENTLE_FAIR_SLEEPERS)) > - thresh >>= 1; > - > - vruntime -= thresh; > - } > + if (!initial) > + vruntime -= sched_sleeper_credit(se); > > /* > * Pull vruntime of the entity being placed to the base level of > @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > * inversed due to s64 overflow. > */ > sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > - if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > + if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC) > se->vruntime = vruntime; > else > se->vruntime = max_vruntime(se->vruntime, vruntime); > @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu) > */ > if (READ_ONCE(p->__state) == TASK_WAKING) { > struct cfs_rq *cfs_rq = cfs_rq_of(se); > + u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > > - se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); > + if ((s64)sleep_time > 60LL * NSEC_PER_SEC)

You also need to test (se->exec_start != 0) here, because the task might migrate another time before being scheduled. You should create a helper function like the one below and use it in both places:

static inline bool entity_long_sleep(struct sched_entity *se)
{
	struct cfs_rq *cfs_rq;
	u64 sleep_time;

	if (se->exec_start == 0)
		return false;

	cfs_rq = cfs_rq_of(se);
	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
		return true;

	return false;
}

> + se->vruntime = -sched_sleeper_credit(se); > + else > + se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); > } > > if (!task_on_rq_migrating(p)) { > > > > Thanks. > Zhang Qiao. > > > >> > >> Thanks, > >> Roman.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-03-02 13:34 ` Vincent Guittot @ 2023-03-02 14:29 ` Zhang Qiao 2023-03-02 14:55 ` Vincent Guittot 0 siblings, 1 reply; 14+ messages in thread From: Zhang Qiao @ 2023-03-02 14:29 UTC (permalink / raw) To: Vincent Guittot Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli 在 2023/3/2 21:34, Vincent Guittot 写道: > On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote: >> >> >> >> 在 2023/2/27 22:37, Vincent Guittot 写道: >>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: >>>> >>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: >>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: >>>>>> What scares me, though, is that I've got a message from the test robot >>>>>> that this commit drammatically affected hackbench results, see the quote >>>>>> below. I expected the commit not to affect any benchmarks. >>>>>> >>>>>> Any idea what could have caused this change? >>>>> >>>>> Hmm, It's most probably because se->exec_start is reset after a >>>>> migration and the condition becomes true for newly migrated task >>>>> whereas its vruntime should be after min_vruntime. >>>>> >>>>> We have missed this condition >>>> >>>> Makes sense to me. >>>> >>>> But what would then be the reliable way to detect a sched_entity which >>>> has slept for long and risks overflowing in .vruntime comparison? >>> >>> For now I don't have a better idea than adding the same check in >>> migrate_task_rq_fair() >> >> Hi, Vincent, >> I fixed this condition as you said, and the test results are as follows. 
>> >> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100 >> version1: v6.2 >> version2: v6.2 + commit 829c1651e9c4 >> version3: v6.2 + commit 829c1651e9c4 + this patch >> >> ------------------------------------------------- >> version1 version2 version3 >> test1 81.0 118.1 82.1 >> test2 82.1 116.9 80.3 >> test3 83.2 103.9 83.3 >> avg(s) 82.1 113.0 81.9 >> >> ------------------------------------------------- >> After deal with the task migration case, the hackbench result has restored. >> >> The patch as follow, how does this look? >> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >> index ff4dbbae3b10..3a88d20fd29e 100644 >> --- a/kernel/sched/fair.c >> +++ b/kernel/sched/fair.c >> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se) >> #endif >> } >> >> +static inline u64 sched_sleeper_credit(struct sched_entity *se) >> +{ >> + >> + unsigned long thresh; >> + >> + if (se_is_idle(se)) >> + thresh = sysctl_sched_min_granularity; >> + else >> + thresh = sysctl_sched_latency; >> + >> + /* >> + * Halve their sleep time's effect, to allow >> + * for a gentler effect of sleepers: >> + */ >> + if (sched_feat(GENTLE_FAIR_SLEEPERS)) >> + thresh >>= 1; >> + >> + return thresh; >> +} >> + >> static void >> place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >> { >> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >> vruntime += sched_vslice(cfs_rq, se); >> >> /* sleeps up to a single latency don't count. 
*/ >> - if (!initial) { >> - unsigned long thresh; >> - >> - if (se_is_idle(se)) >> - thresh = sysctl_sched_min_granularity; >> - else >> - thresh = sysctl_sched_latency; >> - >> - /* >> - * Halve their sleep time's effect, to allow >> - * for a gentler effect of sleepers: >> - */ >> - if (sched_feat(GENTLE_FAIR_SLEEPERS)) >> - thresh >>= 1; >> - >> - vruntime -= thresh; >> - } >> + if (!initial) >> + vruntime -= sched_sleeper_credit(se); >> >> /* >> * Pull vruntime of the entity being placed to the base level of >> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >> * inversed due to s64 overflow. >> */ >> sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; >> - if ((s64)sleep_time > 60LL * NSEC_PER_SEC) >> + if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC) >> se->vruntime = vruntime; >> else >> se->vruntime = max_vruntime(se->vruntime, vruntime); >> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu) >> */ >> if (READ_ONCE(p->__state) == TASK_WAKING) { >> struct cfs_rq *cfs_rq = cfs_rq_of(se); >> + u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; >> >> - se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); >> + if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > > You also need to test (se->exec_start !=0) here because the task might Hi, I don't understand when this other migration would happen. Could you explain in more detail? I think the next migration would happen after the wakee task is enqueued, but at that point p->__state isn't TASK_WAKING; p->__state has already been changed to TASK_RUNNING in ttwu_do_wakeup(). If such a migration exists, the previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" might be performed multiple times; wouldn't it go wrong that way? > migrate another time before being scheduled. You should create a > helper function like below and use it in both place OK, I will update in the next version. Thanks, ZhangQiao.
> > static inline bool entity_long_sleep(se) > { > struct cfs_rq *cfs_rq; > u64 sleep_time; > > if (se->exec_start == 0) > return false; > > cfs_rq = cfs_rq_of(se); > sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > return true; > > return false; > } > > >> + se->vruntime = -sched_sleeper_credit(se); >> + else >> + se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); >> } >> >> if (!task_on_rq_migrating(p)) { >> >> >> >> Thanks. >> Zhang Qiao. >> >>> >>>> >>>> Thanks, >>>> Roman. >>>> >>>> >>>> >>>> Amazon Development Center Germany GmbH >>>> Krausenstr. 38 >>>> 10117 Berlin >>>> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss >>>> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B >>>> Sitz: Berlin >>>> Ust-ID: DE 289 237 879 >>>> >>>> >>>> >>> . >>> > . > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-03-02 14:29 ` Zhang Qiao @ 2023-03-02 14:55 ` Vincent Guittot 2023-03-03 6:51 ` Zhang Qiao 0 siblings, 1 reply; 14+ messages in thread From: Vincent Guittot @ 2023-03-02 14:55 UTC (permalink / raw) To: Zhang Qiao Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote: > > > > 在 2023/3/2 21:34, Vincent Guittot 写道: > > On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote: > >> > >> > >> > >> 在 2023/2/27 22:37, Vincent Guittot 写道: > >>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: > >>>> > >>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: > >>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: > >>>>>> What scares me, though, is that I've got a message from the test robot > >>>>>> that this commit drammatically affected hackbench results, see the quote > >>>>>> below. I expected the commit not to affect any benchmarks. > >>>>>> > >>>>>> Any idea what could have caused this change? > >>>>> > >>>>> Hmm, It's most probably because se->exec_start is reset after a > >>>>> migration and the condition becomes true for newly migrated task > >>>>> whereas its vruntime should be after min_vruntime. > >>>>> > >>>>> We have missed this condition > >>>> > >>>> Makes sense to me. > >>>> > >>>> But what would then be the reliable way to detect a sched_entity which > >>>> has slept for long and risks overflowing in .vruntime comparison? > >>> > >>> For now I don't have a better idea than adding the same check in > >>> migrate_task_rq_fair() > >> > >> Hi, Vincent, > >> I fixed this condition as you said, and the test results are as follows. 
> >> > >> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100 > >> version1: v6.2 > >> version2: v6.2 + commit 829c1651e9c4 > >> version3: v6.2 + commit 829c1651e9c4 + this patch > >> > >> ------------------------------------------------- > >> version1 version2 version3 > >> test1 81.0 118.1 82.1 > >> test2 82.1 116.9 80.3 > >> test3 83.2 103.9 83.3 > >> avg(s) 82.1 113.0 81.9 > >> > >> ------------------------------------------------- > >> After deal with the task migration case, the hackbench result has restored. > >> > >> The patch as follow, how does this look? > >> > >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > >> index ff4dbbae3b10..3a88d20fd29e 100644 > >> --- a/kernel/sched/fair.c > >> +++ b/kernel/sched/fair.c > >> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se) > >> #endif > >> } > >> > >> +static inline u64 sched_sleeper_credit(struct sched_entity *se) > >> +{ > >> + > >> + unsigned long thresh; > >> + > >> + if (se_is_idle(se)) > >> + thresh = sysctl_sched_min_granularity; > >> + else > >> + thresh = sysctl_sched_latency; > >> + > >> + /* > >> + * Halve their sleep time's effect, to allow > >> + * for a gentler effect of sleepers: > >> + */ > >> + if (sched_feat(GENTLE_FAIR_SLEEPERS)) > >> + thresh >>= 1; > >> + > >> + return thresh; > >> +} > >> + > >> static void > >> place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > >> { > >> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > >> vruntime += sched_vslice(cfs_rq, se); > >> > >> /* sleeps up to a single latency don't count. 
*/ > >> - if (!initial) { > >> - unsigned long thresh; > >> - > >> - if (se_is_idle(se)) > >> - thresh = sysctl_sched_min_granularity; > >> - else > >> - thresh = sysctl_sched_latency; > >> - > >> - /* > >> - * Halve their sleep time's effect, to allow > >> - * for a gentler effect of sleepers: > >> - */ > >> - if (sched_feat(GENTLE_FAIR_SLEEPERS)) > >> - thresh >>= 1; > >> - > >> - vruntime -= thresh; > >> - } > >> + if (!initial) > >> + vruntime -= sched_sleeper_credit(se); > >> > >> /* > >> * Pull vruntime of the entity being placed to the base level of > >> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > >> * inversed due to s64 overflow. > >> */ > >> sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > >> - if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > >> + if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC) > >> se->vruntime = vruntime; > >> else > >> se->vruntime = max_vruntime(se->vruntime, vruntime); > >> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu) > >> */ > >> if (READ_ONCE(p->__state) == TASK_WAKING) { > >> struct cfs_rq *cfs_rq = cfs_rq_of(se); > >> + u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > >> > >> - se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); > >> + if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > > > > You also need to test (se->exec_start !=0) here because the task might > > Hi, > > I don't understand when the another migration happend. Could you tell me in more detail? se->exec_start is update when the task becomes current. 
You can have the sequence:

  task TA runs on CPU0
      TA's se->exec_start = xxxx
  TA is put back into the rb tree, waiting for its next slice while another task runs
  CPU1 pulls TA, which migrates to CPU1
      migrate_task_rq_fair() runs with TA's se->exec_start == xxxx
      TA's se->exec_start = 0
  TA is put into the rb tree of CPU1, waiting to run on CPU1
  CPU2 pulls TA, which migrates to CPU2
      migrate_task_rq_fair() runs with TA's se->exec_start == 0
      TA's se->exec_start = 0

> > I think the next migration will happend after the wakee task enqueued, but at this time > the p->__state isn't TASK_WAKING, p->__state already be changed to TASK_RUNNING at ttwu_do_wakeup(). > > If such a migration exists, Previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" maybe > perform multiple times,wouldn't it go wrong in this way?

The vruntime has been updated at enqueue time, but exec_start has not.

> > > migrate another time before being scheduled. You should create a > > helper function like below and use it in both place > > Ok, I will update at next version. > > > Thanks, > ZhangQiao. > > > > > static inline bool entity_long_sleep(se) > > { > > struct cfs_rq *cfs_rq; > > u64 sleep_time; > > > > if (se->exec_start == 0) > > return false; > > > > cfs_rq = cfs_rq_of(se); > > sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; > > if ((s64)sleep_time > 60LL * NSEC_PER_SEC) > > return true; > > > > return false; > > } > > > > > >> + se->vruntime = -sched_sleeper_credit(se); > >> + else > >> + se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); > >> } > >> > >> if (!task_on_rq_migrating(p)) { > >> > >> > >> > >> Thanks. > >> Zhang Qiao. > >> > >>> > >>>> > >>>> Thanks, > >>>> Roman. > >>>> > >>>> > >>>> > >>>> Amazon Development Center Germany GmbH > >>>> Krausenstr. 38 > >>>> 10117 Berlin > >>>> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss > >>>> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B > >>>> Sitz: Berlin > >>>> Ust-ID: DE 289 237 879 > >>>> > >>>> > >>>> > >>> .
> >>> > > . > > ^ permalink raw reply [flat|nested] 14+ messages in thread
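The double-migration hazard discussed in this exchange can be replayed in a small userspace model (all names and types here are simplified stand-ins; `entity_long_sleep()` mirrors the helper Vincent sketches, with the clock passed in rather than read from the rq):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

#define NSEC_PER_SEC 1000000000LL

/* Minimal stand-in for the scheduling entity; only the relevant field. */
struct entity {
	u64 exec_start;
};

/* Zero exec_start means "just migrated, timestamp not meaningful here". */
static bool entity_long_sleep(const struct entity *se, u64 now)
{
	s64 sleep_time;

	if (se->exec_start == 0)
		return false;

	sleep_time = (s64)(now - se->exec_start);
	return sleep_time > 60LL * NSEC_PER_SEC;
}

/* migrate_task_rq_fair() resets exec_start on migration. */
static void migrate(struct entity *se)
{
	se->exec_start = 0;
}

/*
 * Replays the sequence from the mail: a task that genuinely slept long
 * is detected, but after one (or two back-to-back) migrations the zero
 * sentinel keeps it from being treated as stale.
 */
static bool double_migration_safe(void)
{
	struct entity se = { .exec_start = 5ULL * NSEC_PER_SEC };
	u64 now = 100ULL * NSEC_PER_SEC;

	if (!entity_long_sleep(&se, now))	/* slept ~95 s: stale */
		return false;

	migrate(&se);				/* CPU0 -> CPU1 */
	if (entity_long_sleep(&se, now))	/* must not look stale */
		return false;

	migrate(&se);				/* CPU1 -> CPU2, before running */
	return !entity_long_sleep(&se, now);
}
```

Without the `exec_start == 0` early return, the second migration would see a ~100-second "sleep" computed against another CPU's clock.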
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-03-02 14:55 ` Vincent Guittot @ 2023-03-03 6:51 ` Zhang Qiao 2023-03-03 8:32 ` Vincent Guittot 0 siblings, 1 reply; 14+ messages in thread From: Zhang Qiao @ 2023-03-03 6:51 UTC (permalink / raw) To: Vincent Guittot Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli 在 2023/3/2 22:55, Vincent Guittot 写道: > On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote: >> >> >> >> 在 2023/3/2 21:34, Vincent Guittot 写道: >>> On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote: >>>> >>>> >>>> >>>> 在 2023/2/27 22:37, Vincent Guittot 写道: >>>>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: >>>>>> >>>>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: >>>>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: >>>>>>>> What scares me, though, is that I've got a message from the test robot >>>>>>>> that this commit drammatically affected hackbench results, see the quote >>>>>>>> below. I expected the commit not to affect any benchmarks. >>>>>>>> >>>>>>>> Any idea what could have caused this change? >>>>>>> >>>>>>> Hmm, It's most probably because se->exec_start is reset after a >>>>>>> migration and the condition becomes true for newly migrated task >>>>>>> whereas its vruntime should be after min_vruntime. >>>>>>> >>>>>>> We have missed this condition >>>>>> >>>>>> Makes sense to me. >>>>>> >>>>>> But what would then be the reliable way to detect a sched_entity which >>>>>> has slept for long and risks overflowing in .vruntime comparison? >>>>> >>>>> For now I don't have a better idea than adding the same check in >>>>> migrate_task_rq_fair() >>>> >>>> Hi, Vincent, >>>> I fixed this condition as you said, and the test results are as follows. 
>>>> >>>> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100 >>>> version1: v6.2 >>>> version2: v6.2 + commit 829c1651e9c4 >>>> version3: v6.2 + commit 829c1651e9c4 + this patch >>>> >>>> ------------------------------------------------- >>>> version1 version2 version3 >>>> test1 81.0 118.1 82.1 >>>> test2 82.1 116.9 80.3 >>>> test3 83.2 103.9 83.3 >>>> avg(s) 82.1 113.0 81.9 >>>> >>>> ------------------------------------------------- >>>> After deal with the task migration case, the hackbench result has restored. >>>> >>>> The patch as follow, how does this look? >>>> >>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>>> index ff4dbbae3b10..3a88d20fd29e 100644 >>>> --- a/kernel/sched/fair.c >>>> +++ b/kernel/sched/fair.c >>>> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se) >>>> #endif >>>> } >>>> >>>> +static inline u64 sched_sleeper_credit(struct sched_entity *se) >>>> +{ >>>> + >>>> + unsigned long thresh; >>>> + >>>> + if (se_is_idle(se)) >>>> + thresh = sysctl_sched_min_granularity; >>>> + else >>>> + thresh = sysctl_sched_latency; >>>> + >>>> + /* >>>> + * Halve their sleep time's effect, to allow >>>> + * for a gentler effect of sleepers: >>>> + */ >>>> + if (sched_feat(GENTLE_FAIR_SLEEPERS)) >>>> + thresh >>= 1; >>>> + >>>> + return thresh; >>>> +} >>>> + >>>> static void >>>> place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >>>> { >>>> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >>>> vruntime += sched_vslice(cfs_rq, se); >>>> >>>> /* sleeps up to a single latency don't count. 
*/ >>>> - if (!initial) { >>>> - unsigned long thresh; >>>> - >>>> - if (se_is_idle(se)) >>>> - thresh = sysctl_sched_min_granularity; >>>> - else >>>> - thresh = sysctl_sched_latency; >>>> - >>>> - /* >>>> - * Halve their sleep time's effect, to allow >>>> - * for a gentler effect of sleepers: >>>> - */ >>>> - if (sched_feat(GENTLE_FAIR_SLEEPERS)) >>>> - thresh >>= 1; >>>> - >>>> - vruntime -= thresh; >>>> - } >>>> + if (!initial) >>>> + vruntime -= sched_sleeper_credit(se); >>>> >>>> /* >>>> * Pull vruntime of the entity being placed to the base level of >>>> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) >>>> * inversed due to s64 overflow. >>>> */ >>>> sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; >>>> - if ((s64)sleep_time > 60LL * NSEC_PER_SEC) >>>> + if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC) >>>> se->vruntime = vruntime; >>>> else >>>> se->vruntime = max_vruntime(se->vruntime, vruntime); >>>> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu) >>>> */ >>>> if (READ_ONCE(p->__state) == TASK_WAKING) { >>>> struct cfs_rq *cfs_rq = cfs_rq_of(se); >>>> + u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; >>>> >>>> - se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); >>>> + if ((s64)sleep_time > 60LL * NSEC_PER_SEC) >>> >>> You also need to test (se->exec_start !=0) here because the task might >> >> Hi, >> >> I don't understand when the another migration happend. Could you tell me in more detail? > > se->exec_start is update when the task becomes current. 
> > You can have the sequence: > > task TA runs on CPU0 > TA's se->exec_start = xxxx > TA is put back into the rb tree waiting for next slice while another > task is running > CPU1 pulls TA which migrates on CPU1 > migrate_task_rq_fair() w/ TA's se->exec_start == xxxx > TA's se->exec_start = 0 > TA is put into the rb tree of CPU1 waiting to run on CPU1 > CPU2 pulls TA which migrates on CPU2 > migrate_task_rq_fair() w/ TA's se->exec_start == 0 > TA's se->exec_start = 0 Hi, Vincent, yes, you're right, such a sequence does exist. But at that point, p->__state != TASK_WAKING. I have a question: is there a case where "p->se.exec_start == 0 && p->__state == TASK_WAKING"? I analyzed the code and concluded that this case doesn't exist; is that right? Thanks. ZhangQiao. > >> >> I think the next migration will happend after the wakee task enqueued, but at this time >> the p->__state isn't TASK_WAKING, p->__state already be changed to TASK_RUNNING at ttwu_do_wakeup(). >> >> If such a migration exists, Previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" maybe >> perform multiple times,wouldn't it go wrong in this way? > > the vruntime have been updated when enqueued but not exec_start > >> >>> migrate another time before being scheduled. You should create a >>> helper function like below and use it in both place >> >> Ok, I will update at next version. >> >> >> Thanks, >> ZhangQiao. >> >>> >>> static inline bool entity_long_sleep(se) >>> { >>> struct cfs_rq *cfs_rq; >>> u64 sleep_time; >>> >>> if (se->exec_start == 0) >>> return false; >>> >>> cfs_rq = cfs_rq_of(se); >>> sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start; >>> if ((s64)sleep_time > 60LL * NSEC_PER_SEC) >>> return true; >>> >>> return false; >>> } >>> >>> >>>> + se->vruntime = -sched_sleeper_credit(se); >>>> + else >>>> + se->vruntime -= u64_u32_load(cfs_rq->min_vruntime); >>>> } >>>> >>>> if (!task_on_rq_migrating(p)) { >>>> >>>> >>>> >>>> Thanks. >>>> Zhang Qiao.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed 2023-03-03 6:51 ` Zhang Qiao @ 2023-03-03 8:32 ` Vincent Guittot 0 siblings, 0 replies; 14+ messages in thread From: Vincent Guittot @ 2023-03-03 8:32 UTC (permalink / raw) To: Zhang Qiao Cc: Roman Kagan, Peter Zijlstra, linux-kernel, Valentin Schneider, Ben Segall, Waiman Long, Steven Rostedt, Mel Gorman, Dietmar Eggemann, Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli On Fri, 3 Mar 2023 at 07:51, Zhang Qiao <zhangqiao22@huawei.com> wrote: > > > > 在 2023/3/2 22:55, Vincent Guittot 写道: > > On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote: > >> > >> > >> > >> 在 2023/3/2 21:34, Vincent Guittot 写道: > >>> On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote: > >>>> > >>>> > >>>> > >>>> 在 2023/2/27 22:37, Vincent Guittot 写道: > >>>>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote: > >>>>>> > >>>>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote: > >>>>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote: > >>>>>>>> What scares me, though, is that I've got a message from the test robot > >>>>>>>> that this commit drammatically affected hackbench results, see the quote > >>>>>>>> below. I expected the commit not to affect any benchmarks. > >>>>>>>> > >>>>>>>> Any idea what could have caused this change? > >>>>>>> > >>>>>>> Hmm, It's most probably because se->exec_start is reset after a > >>>>>>> migration and the condition becomes true for newly migrated task > >>>>>>> whereas its vruntime should be after min_vruntime. > >>>>>>> > >>>>>>> We have missed this condition > >>>>>> > >>>>>> Makes sense to me. > >>>>>> > >>>>>> But what would then be the reliable way to detect a sched_entity which > >>>>>> has slept for long and risks overflowing in .vruntime comparison? 
> >>>>>
> >>>>> For now I don't have a better idea than adding the same check in
> >>>>> migrate_task_rq_fair()
> >>>>
> >>>> Hi, Vincent,
> >>>> I fixed this condition as you said, and the test results are as follows.
> >>>>
> >>>> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
> >>>> version1: v6.2
> >>>> version2: v6.2 + commit 829c1651e9c4
> >>>> version3: v6.2 + commit 829c1651e9c4 + this patch
> >>>>
> >>>> -------------------------------------------------
> >>>>           version1    version2    version3
> >>>> test1     81.0        118.1       82.1
> >>>> test2     82.1        116.9       80.3
> >>>> test3     83.2        103.9       83.3
> >>>> avg(s)    82.1        113.0       81.9
> >>>> -------------------------------------------------
> >>>>
> >>>> After dealing with the task migration case, the hackbench result is restored.
> >>>>
> >>>> The patch is as follows; how does this look?
> >>>>
> >>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>>> index ff4dbbae3b10..3a88d20fd29e 100644
> >>>> --- a/kernel/sched/fair.c
> >>>> +++ b/kernel/sched/fair.c
> >>>> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >>>>  #endif
> >>>>  }
> >>>>
> >>>> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
> >>>> +{
> >>>> +	unsigned long thresh;
> >>>> +
> >>>> +	if (se_is_idle(se))
> >>>> +		thresh = sysctl_sched_min_granularity;
> >>>> +	else
> >>>> +		thresh = sysctl_sched_latency;
> >>>> +
> >>>> +	/*
> >>>> +	 * Halve their sleep time's effect, to allow
> >>>> +	 * for a gentler effect of sleepers:
> >>>> +	 */
> >>>> +	if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >>>> +		thresh >>= 1;
> >>>> +
> >>>> +	return thresh;
> >>>> +}
> >>>> +
> >>>>  static void
> >>>>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>  {
> >>>> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>  	vruntime += sched_vslice(cfs_rq, se);
> >>>>
> >>>>  	/* sleeps up to a single latency don't count. */
> >>>> -	if (!initial) {
> >>>> -		unsigned long thresh;
> >>>> -
> >>>> -		if (se_is_idle(se))
> >>>> -			thresh = sysctl_sched_min_granularity;
> >>>> -		else
> >>>> -			thresh = sysctl_sched_latency;
> >>>> -
> >>>> -		/*
> >>>> -		 * Halve their sleep time's effect, to allow
> >>>> -		 * for a gentler effect of sleepers:
> >>>> -		 */
> >>>> -		if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >>>> -			thresh >>= 1;
> >>>> -
> >>>> -		vruntime -= thresh;
> >>>> -	}
> >>>> +	if (!initial)
> >>>> +		vruntime -= sched_sleeper_credit(se);
> >>>>
> >>>>  	/*
> >>>>  	 * Pull vruntime of the entity being placed to the base level of
> >>>> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>  	 * inversed due to s64 overflow.
> >>>>  	 */
> >>>>  	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>> -	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>> +	if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>>  		se->vruntime = vruntime;
> >>>>  	else
> >>>>  		se->vruntime = max_vruntime(se->vruntime, vruntime);
> >>>> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
> >>>>  	 */
> >>>>  	if (READ_ONCE(p->__state) == TASK_WAKING) {
> >>>>  		struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >>>> +		u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>>
> >>>> -		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >>>> +		if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>
> >>> You also need to test (se->exec_start != 0) here because the task might
> >>
> >> Hi,
> >>
> >> I don't understand when the other migration would happen. Could you tell me in more detail?
> >
> > se->exec_start is updated when the task becomes current.
> >
> > You can have the sequence:
> >
> > task TA runs on CPU0
> > TA's se->exec_start = xxxx
> > TA is put back into the rb tree waiting for next slice while another
> > task is running
> > CPU1 pulls TA which migrates on CPU1
> > migrate_task_rq_fair() w/ TA's se->exec_start == xxxx
> > TA's se->exec_start = 0
> > TA is put into the rb tree of CPU1 waiting to run on CPU1
> > CPU2 pulls TA which migrates on CPU2
> > migrate_task_rq_fair() w/ TA's se->exec_start == 0
> > TA's se->exec_start = 0

> Hi, Vincent,
>
> Yes, you're right, such a sequence does exist. But at this point, p->__state != TASK_WAKING.
>
> I have a question: is there a case where "p->se.exec_start == 0 && p->__state == TASK_WAKING"?
> I analyzed the code and concluded that this case doesn't exist; is that right?

Yes, you're right. Your proposal is enough.

Thanks

> Thanks.
> ZhangQiao.
>
> >> I think the next migration will happen after the wakee task is enqueued, but at that time
> >> p->__state isn't TASK_WAKING; p->__state has already been changed to TASK_RUNNING in ttwu_do_wakeup().
> >>
> >> If such a migration existed, the previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" might
> >> be performed multiple times; wouldn't it go wrong in this way?
> >
> > the vruntime has been updated when enqueued, but not exec_start
> >
> >>> migrate another time before being scheduled. You should create a
> >>> helper function like below and use it in both places
> >>
> >> Ok, I will update in the next version.
> >>
> >> Thanks,
> >> ZhangQiao.
> >>
> >>>
> >>> static inline bool entity_long_sleep(se)
> >>> {
> >>>         struct cfs_rq *cfs_rq;
> >>>         u64 sleep_time;
> >>>
> >>>         if (se->exec_start == 0)
> >>>                 return false;
> >>>
> >>>         cfs_rq = cfs_rq_of(se);
> >>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>         if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>                 return true;
> >>>
> >>>         return false;
> >>> }
> >>>
> >>>> +		se->vruntime = -sched_sleeper_credit(se);
> >>>> +	else
> >>>> +		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >>>>  	}
> >>>>
> >>>>  	if (!task_on_rq_migrating(p)) {
> >>>>
> >>>> Thanks.
> >>>> Zhang Qiao.
> >>>>
> >>>>>> Thanks,
> >>>>>> Roman.
end of thread, other threads:[~2023-03-03  8:33 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09 19:31 [PATCH v3] sched/fair: sanitize vruntime of entity being placed Roman Kagan
2023-02-21  9:38 ` Vincent Guittot
2023-02-21 16:57   ` Roman Kagan
2023-02-21 17:26     ` Vincent Guittot
2023-02-27  8:42       ` Roman Kagan
2023-02-27 14:37         ` Vincent Guittot
2023-02-27 17:00           ` Dietmar Eggemann
2023-02-27 17:15             ` Vincent Guittot
2023-03-02  9:36           ` Zhang Qiao
2023-03-02 13:34             ` Vincent Guittot
2023-03-02 14:29               ` Zhang Qiao
2023-03-02 14:55                 ` Vincent Guittot
2023-03-03  6:51                   ` Zhang Qiao
2023-03-03  8:32                     ` Vincent Guittot