From: Ye Xiaolong <xiaolong.ye@intel.com>
To: lkp@lists.01.org
Subject: Re: [lkp-robot] [sched] f432b34f82: will-it-scale.per_thread_ops -12.6% regression
Date: Fri, 22 Jun 2018 15:34:43 +0800 [thread overview]
Message-ID: <20180622073443.GS11011@yexl-desktop> (raw)
In-Reply-To: <CAKfTPtDpx4iebk4aAQYAst6pmhfK-Sa-sfD41b49O3V6gDuddg@mail.gmail.com>
Hi,
On 06/20, Vincent Guittot wrote:
>Hi,
>
>On Wed, 20 Jun 2018 at 09:38, kernel test robot <xiaolong.ye@intel.com> wrote:
>>
>>
>> Greeting,
>>
>> FYI, we noticed a -12.6% regression of will-it-scale.per_thread_ops due to commit:
>>
>>
>> commit: f432b34f825d49999b8ad5417cff4b5f104b74d4 ("sched: use pelt for scale_rt_capacity()")
>> https://git.linaro.org/people/vincent.guittot/kernel.git sched-rt-utilization
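For readers unfamiliar with PELT (Per-Entity Load Tracking), which the commit above switches scale_rt_capacity() to: it tracks load/utilization as a geometrically decaying average, with the decay factor chosen so a contribution loses half its weight every 32 periods. A toy sketch of that decay (illustrative only, not the kernel's fixed-point implementation):

```python
# Toy model of PELT-style geometric decay: the decay factor y is chosen
# so that y**32 == 1/2, i.e. a contribution halves every 32 periods.
Y = 0.5 ** (1 / 32)

def decay(util, idle_periods):
    """Decay a utilization value over a number of idle periods."""
    return util * (Y ** idle_periods)

# A fully-busy signal (1024 in kernel fixed-point terms) halves after
# 32 idle periods and quarters after 64.
print(round(decay(1024, 32)))  # -> 512
print(round(decay(1024, 64)))  # -> 256
```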
>>
>> in testcase: will-it-scale
>> on test machine: 8 threads Ivy Bridge with 16G memory
>> with following parameters:
>>
>> nr_task: 100%
>> mode: thread
>> test: sched_yield
>> cpufreq_governor: performance
>>
>> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
>> test-url: https://github.com/antonblanchard/will-it-scale
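The sched_yield testcase amounts to each worker spinning on sched_yield() and counting completed iterations; a minimal Python sketch of that per-thread loop (names are illustrative, not will-it-scale's actual C source):

```python
import os
import threading
import time

def worker(counts, idx, stop):
    # Spin on sched_yield(), counting completed calls, as the
    # will-it-scale sched_yield testcase does in each thread.
    n = 0
    while not stop.is_set():
        os.sched_yield()
        n += 1
    counts[idx] = n

def run(nr_threads=2, seconds=0.2):
    counts = [0] * nr_threads
    stop = threading.Event()
    threads = [threading.Thread(target=worker, args=(counts, i, stop))
               for i in range(nr_threads)]
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    return sum(counts)  # aggregate per_thread_ops-style figure
```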
>
>Thanks for the report.
>How many times was the test run? I can see some high stdev values, which
>suggest instability in the test results.
Hmm, you're right, the stdev of the reported will-it-scale.per_thread_ops is high. We ran the test 4 times;
the results are quite unstable, so we need to check further.
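For reference, the %stddev figure in these tables is the sample standard deviation expressed as a percentage of the mean across repeat runs, and with only 4 runs a single outlier can dominate it. A quick sketch (the run values below are made up for illustration, not from this report):

```python
import statistics

def pct_stddev(samples):
    """Sample stddev as a percentage of the mean, as in LKP's %stddev column."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)

# Four hypothetical per_thread_ops runs: one slow outlier inflates %stddev
runs = [2_190_000, 2_180_000, 2_200_000, 1_700_000]
print(round(pct_stddev(runs), 1))
```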
Thanks,
Xiaolong
>
>>
>> In addition to that, the commit also has significant impact on the following tests:
>>
>> +------------------+-----------------------------------------------------------------+
>> | testcase: change | unixbench: unixbench.score 7.4% improvement |
>> | test machine | 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory |
>> | test parameters | nr_task=100% |
>> | | runtime=300s |
>> | | test=execl |
>> +------------------+-----------------------------------------------------------------+
>>
>
>Regards,
>Vincent
>
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>>
>> To reproduce:
>>
>> git clone https://github.com/intel/lkp-tests.git
>> cd lkp-tests
>> bin/lkp install job.yaml # job file is attached in this email
>> bin/lkp run job.yaml
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>> gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2016-08-31.cgz/lkp-ivb-d01/sched_yield/will-it-scale
>>
>> commit:
>> 3cd6eb61ec ("cpufreq/schedutil: take into account interrupt")
>> f432b34f82 ("sched: use pelt for scale_rt_capacity()")
>>
>> 3cd6eb61ecbc03a6 f432b34f825d49999b8ad5417c
>> ---------------- --------------------------
>> %stddev %change %stddev
>> \ | \
>> 2193002 -12.6% 1916998 ą 10% will-it-scale.per_thread_ops
>> 6932284 ą 2% +7144.3% 5.022e+08 ą 71% will-it-scale.time.involuntary_context_switches
>> 792.00 -10.8% 706.75 ą 8% will-it-scale.time.percent_of_cpu_this_job_got
>> 1842 -9.9% 1660 ą 7% will-it-scale.time.system_time
>> 544.32 -13.9% 468.66 ą 11% will-it-scale.time.user_time
>> 17544019 -12.6% 15335987 ą 10% will-it-scale.workload
>> 0.64 +10.5 11.10 ą 69% mpstat.cpu.idle%
>> 39211 +4177.4% 1677224 ą 70% vmstat.system.cs
>> 290602 ą 7% -24.2% 220391 ą 23% softirqs.RCU
>> 23897 ą 4% +160.0% 62125 ą 39% softirqs.SCHED
>> 20490 +3.3% 21172 proc-vmstat.nr_active_anon
>> 6128 +0.9% 6183 proc-vmstat.nr_mapped
>> 3480 ą 3% +9.0% 3792 ą 5% proc-vmstat.nr_shmem
>> 20490 +3.3% 21172 proc-vmstat.nr_zone_active_anon
>> 1173 ą 17% +40.4% 1648 ą 21% proc-vmstat.pgactivate
>> 321823 +1.3% 325849 proc-vmstat.pgalloc_normal
>> 317059 +1.3% 321126 proc-vmstat.pgfree
>> 3680 -10.4% 3298 ą 8% turbostat.Avg_MHz
>> 113297 ą 2% +2190.9% 2595491 ą 36% turbostat.C1
>> 0.04 +0.8 0.87 ą 36% turbostat.C1%
>> 211.75 ą 9% +584.7% 1449 ą 83% turbostat.C1E
>> 162.25 ą 19% +1010.6% 1802 ą 76% turbostat.C3
>> 7843 ą 10% +3025.2% 245131 ą 77% turbostat.C6
>> 0.30 ą 10% +9.5 9.76 ą 77% turbostat.C6%
>> 0.11 ą 10% +9616.3% 10.45 ą 73% turbostat.CPU%c1
>> 991835 +2010.0% 20928148 ą 36% cpuidle.C1.time
>> 114701 ą 2% +2165.3% 2598349 ą 36% cpuidle.C1.usage
>> 28432 ą 20% +557.1% 186822 ą 78% cpuidle.C1E.time
>> 275.75 ą 14% +451.9% 1522 ą 78% cpuidle.C1E.usage
>> 54932 ą 18% +697.3% 438003 ą 66% cpuidle.C3.time
>> 209.50 ą 18% +784.4% 1852 ą 73% cpuidle.C3.usage
>> 8498054 +2689.5% 2.371e+08 ą 76% cpuidle.C6.time
>> 9209 ą 2% +2576.7% 246513 ą 76% cpuidle.C6.usage
>> 1106 ą 12% +3290.7% 37501 ą 39% cpuidle.POLL.time
>> 345.25 ą 22% +2197.0% 7930 ą 37% cpuidle.POLL.usage
>> 9.444e+11 -57.4% 4.026e+11 ą 32% perf-stat.branch-instructions
>> 2.27 -0.1 2.16 ą 3% perf-stat.branch-miss-rate%
>> 2.14e+10 -59.7% 8.632e+09 ą 29% perf-stat.branch-misses
>> 5.06 ą 16% +7.3 12.37 ą 52% perf-stat.cache-miss-rate%
>> 11908086 +4181.7% 5.099e+08 ą 70% perf-stat.context-switches
>> 8.855e+12 -59.5% 3.587e+12 ą 29% perf-stat.cpu-cycles
>> 34115 ą 9% -70.0% 10248 ą 42% perf-stat.cpu-migrations
>> 3.556e+10 ą 2% -62.3% 1.342e+10 ą 30% perf-stat.dTLB-load-misses
>> 1.4e+12 -57.6% 5.941e+11 ą 31% perf-stat.dTLB-loads
>> 0.00 ą 36% +0.0 0.03 ą115% perf-stat.dTLB-store-miss-rate%
>> 9.282e+11 -57.9% 3.905e+11 ą 31% perf-stat.dTLB-stores
>> 52.07 ą 5% -11.3 40.80 ą 22% perf-stat.iTLB-load-miss-rate%
>> 5.605e+09 ą 9% -73.2% 1.503e+09 ą 14% perf-stat.iTLB-load-misses
>> 5.136e+09 ą 3% -53.4% 2.395e+09 ą 47% perf-stat.iTLB-loads
>> 4.548e+12 -57.7% 1.925e+12 ą 31% perf-stat.instructions
>> 818.96 ą 9% +56.1% 1278 ą 26% perf-stat.instructions-per-iTLB-miss
>> 259253 -49.2% 131782 ą 45% perf-stat.path-length
>> 14.08 ą 2% -2.7 11.40 ą 6% perf-profile.calltrace.cycles-pp.update_curr.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield
>> 40.73 -2.6 38.12 ą 4% perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 39.59 -2.4 37.15 ą 4% perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 48.35 -2.4 45.91 ą 4% perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 25.93 -2.2 23.74 ą 5% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
>> 23.12 -1.6 21.55 ą 4% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
>> 4.20 -0.6 3.58 ą 5% perf-profile.calltrace.cycles-pp.__calc_delta.update_curr.pick_next_task_fair.__schedule.schedule
>> 2.72 ą 3% -0.4 2.28 ą 8% perf-profile.calltrace.cycles-pp.pick_next_task_fair
>> 1.69 ą 2% -0.3 1.38 ą 5% perf-profile.calltrace.cycles-pp.yield_task_fair
>> 1.62 ą 4% -0.3 1.35 perf-profile.calltrace.cycles-pp.update_curr
>> 0.64 ą 2% -0.1 0.55 ą 6% perf-profile.calltrace.cycles-pp.do_syscall_64
>> 0.44 ą 58% +0.3 0.73 ą 17% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield
>> 1.21 ą 3% +0.9 2.10 ą 37% perf-profile.calltrace.cycles-pp.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 40.76 -2.6 38.17 ą 4% perf-profile.children.cycles-pp.schedule
>> 39.83 -2.5 37.34 ą 4% perf-profile.children.cycles-pp.__schedule
>> 49.28 -2.5 46.80 ą 4% perf-profile.children.cycles-pp.__x64_sys_sched_yield
>> 15.74 ą 2% -2.5 13.29 ą 2% perf-profile.children.cycles-pp.update_curr
>> 25.93 -2.2 23.75 ą 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
>> 25.84 -2.0 23.84 ą 4% perf-profile.children.cycles-pp.pick_next_task_fair
>> 4.69 -0.7 4.04 ą 4% perf-profile.children.cycles-pp.__calc_delta
>> 0.79 ą 3% +0.2 1.04 ą 10% perf-profile.children.cycles-pp.clear_buddies
>> 25.93 -2.2 23.75 ą 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
>> 9.70 ą 4% -1.8 7.95 ą 5% perf-profile.self.cycles-pp.update_curr
>> 9.44 ą 2% -0.9 8.59 ą 6% perf-profile.self.cycles-pp.__schedule
>> 4.69 -0.7 4.04 ą 4% perf-profile.self.cycles-pp.__calc_delta
>> 2.54 ą 2% -0.5 2.08 ą 10% perf-profile.self.cycles-pp.do_sched_yield
>> 2.83 ą 2% -0.5 2.37 ą 8% perf-profile.self.cycles-pp.yield_task_fair
>> 2.78 ą 5% -0.3 2.52 ą 6% perf-profile.self.cycles-pp.__x64_sys_sched_yield
>> 0.79 ą 3% +0.2 1.04 ą 10% perf-profile.self.cycles-pp.clear_buddies
>> 141060 -52.6% 66832 ą 56% sched_debug.cfs_rq:/.exec_clock.min
>> 2868 ą 20% +976.5% 30878 ą 59% sched_debug.cfs_rq:/.exec_clock.stddev
>> 1200492 -16.8% 998858 ą 12% sched_debug.cfs_rq:/.min_vruntime.avg
>> 1151085 -53.5% 535255 ą 56% sched_debug.cfs_rq:/.min_vruntime.min
>> 26141 ą 21% +882.4% 256804 ą 54% sched_debug.cfs_rq:/.min_vruntime.stddev
>> 1.23 ą 8% -14.8% 1.05 ą 7% sched_debug.cfs_rq:/.nr_spread_over.avg
>> 26137 ą 21% +882.5% 256802 ą 54% sched_debug.cfs_rq:/.spread0.stddev
>> 843.54 ą 2% -49.4% 427.08 ą 49% sched_debug.cfs_rq:/.util_avg.min
>> 88.78 ą 21% +166.9% 236.92 ą 33% sched_debug.cfs_rq:/.util_avg.stddev
>> 1.02 ą 8% -31.6% 0.70 ą 31% sched_debug.cpu.clock.stddev
>> 1.02 ą 8% -31.6% 0.70 ą 31% sched_debug.cpu.clock_task.stddev
>> 94.08 -36.2% 60.00 ą 55% sched_debug.cpu.cpu_load[0].min
>> 94.42 -35.6% 60.79 ą 42% sched_debug.cpu.cpu_load[1].min
>> 175.12 ą 8% +18.1% 206.83 ą 3% sched_debug.cpu.cpu_load[2].max
>> 95.75 -43.4% 54.21 ą 42% sched_debug.cpu.cpu_load[2].min
>> 26.63 ą 18% +75.2% 46.64 ą 24% sched_debug.cpu.cpu_load[2].stddev
>> 156.75 ą 6% +29.2% 202.46 ą 6% sched_debug.cpu.cpu_load[3].max
>> 96.79 -53.2% 45.29 ą 45% sched_debug.cpu.cpu_load[3].min
>> 20.25 ą 15% +135.2% 47.63 ą 29% sched_debug.cpu.cpu_load[3].stddev
>> 141.08 ą 4% +39.2% 196.33 ą 8% sched_debug.cpu.cpu_load[4].max
>> 97.54 -59.3% 39.67 ą 51% sched_debug.cpu.cpu_load[4].min
>> 14.80 ą 12% +219.8% 47.34 ą 33% sched_debug.cpu.cpu_load[4].stddev
>> 1170938 +2870.5% 34782808 ą 57% sched_debug.cpu.nr_switches.avg
>> 4811815 ą 26% +3077.4% 1.529e+08 ą 49% sched_debug.cpu.nr_switches.max
>> 58201 ą 15% -72.5% 15998 ą 99% sched_debug.cpu.nr_switches.min
>> 1633456 ą 20% +3373.7% 56741908 ą 62% sched_debug.cpu.nr_switches.stddev
>> 3.097e+08 -52.9% 1.459e+08 ą 58% sched_debug.cpu.sched_count.min
>> 7185897 ą 19% +826.3% 66559915 ą 58% sched_debug.cpu.sched_count.stddev
>> 12004 ą 2% +2153.0% 270459 ą 36% sched_debug.cpu.sched_goidle.avg
>> 64148 ą 16% +2818.1% 1871943 ą 35% sched_debug.cpu.sched_goidle.max
>> 21941 ą 10% +2722.7% 619331 ą 35% sched_debug.cpu.sched_goidle.stddev
>> 7011 ą 17% -67.7% 2265 ą 59% sched_debug.cpu.ttwu_count.min
>> 5763 ą 18% -73.5% 1527 ą 76% sched_debug.cpu.ttwu_local.min
>> 3.055e+08 ą 2% -53.1% 1.433e+08 ą 58% sched_debug.cpu.yld_count.min
>> 8644751 ą 18% +684.3% 67800425 ą 57% sched_debug.cpu.yld_count.stddev
>>
>>
>>
>> will-it-scale.time.involuntary_context_switches
>>
>> 2.5e+09 +-+---------------------------------------------------------------+
>> | |
>> | |
>> 2e+09 +-+ O |
>> | |
>> | |
>> 1.5e+09 +-+ O |
>> | |
>> 1e+09 +-+ O O O O |
>> | O |
>> | |
>> 5e+08 +-O O |
>> O O O O O O O O O O O |
>> | O O O O O |
>> 0 +-+---------------------------------------------------------------+
>>
>>
>> [*] bisect-good sample
>> [O] bisect-bad sample
>>
>> ***************************************************************************************************
>> nhm-white: 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory
>> =========================================================================================
>> compiler/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>> gcc-7/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/300s/nhm-white/execl/unixbench
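For context, unixbench's execl test measures how quickly a process can replace its image via the exec family of calls. A rough Python sketch of the fork+exec loop it stresses (illustrative only, not unixbench's actual source, and assuming /bin/true exists on the system):

```python
import os
import time

def execl_loop(duration=0.2):
    # Repeatedly fork a child whose image is replaced via execv(),
    # exercising the same exec path the unixbench "execl" test stresses.
    end = time.monotonic() + duration
    count = 0
    while time.monotonic() < end:
        pid = os.fork()
        if pid == 0:
            os.execv("/bin/true", ["true"])  # child never returns here
        os.waitpid(pid, 0)
        count += 1
    return count
```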
>>
>> commit:
>> 3cd6eb61ec ("cpufreq/schedutil: take into account interrupt")
>> f432b34f82 ("sched: use pelt for scale_rt_capacity()")
>>
>> 3cd6eb61ecbc03a6 f432b34f825d49999b8ad5417c
>> ---------------- --------------------------
>> %stddev %change %stddev
>> \ | \
>> 4301 +7.4% 4619 unixbench.score
>> 4180477 -35.5% 2695696 ą 3% unixbench.time.involuntary_context_switches
>> 2.583e+08 +9.0% 2.817e+08 unixbench.time.minor_page_faults
>> 1655 +5.7% 1750 unixbench.time.system_time
>> 362.72 +4.4% 378.68 unixbench.time.user_time
>> 3279113 ą 3% +48.3% 4861315 ą 4% unixbench.time.voluntary_context_switches
>> 6252322 ą 3% +7.3% 6709345 ą 3% unixbench.workload
>> 1001980 ą 3% -43.5% 566495 ą 7% interrupts.CAL:Function_call_interrupts
>> 19.30 ą 7% -4.0 15.26 ą 12% mpstat.cpu.idle%
>> 56649 -7.5% 52419 vmstat.system.cs
>> 14780 -9.1% 13434 vmstat.system.in
>> 2.016e+08 +9.0% 2.196e+08 proc-vmstat.numa_hit
>> 2.016e+08 +9.0% 2.196e+08 proc-vmstat.numa_local
>> 5989 +6.4% 6370 proc-vmstat.pgactivate
>> 2.071e+08 +9.0% 2.258e+08 proc-vmstat.pgalloc_normal
>> 2.61e+08 +9.3% 2.852e+08 proc-vmstat.pgfault
>> 2.07e+08 +9.0% 2.258e+08 proc-vmstat.pgfree
>> 1.219e+08 ą 2% -31.2% 83948754 ą 5% cpuidle.C1.time
>> 5089762 ą 2% +15.7% 5888875 ą 3% cpuidle.C1.usage
>> 87816352 ą 3% -59.2% 35819088 ą 8% cpuidle.C1E.time
>> 2025352 ą 2% -57.4% 863049 ą 7% cpuidle.C1E.usage
>> 430285 ą 14% -35.1% 279134 ą 17% cpuidle.C3.usage
>> 1967162 ą 2% -11.4% 1742236 ą 3% cpuidle.POLL.time
>> 114825 ą 3% +21.4% 139387 ą 3% cpuidle.POLL.usage
>> 2401 ą 2% +9.7% 2635 ą 2% turbostat.Avg_MHz
>> 5089676 ą 2% +15.7% 5888833 ą 3% turbostat.C1
>> 4.50 -1.4 3.10 ą 4% turbostat.C1%
>> 2025326 ą 2% -57.4% 863042 ą 7% turbostat.C1E
>> 3.24 -1.9 1.32 ą 7% turbostat.C1E%
>> 430222 ą 14% -35.1% 279119 ą 17% turbostat.C3
>> 10.11 ą 3% -37.3% 6.33 ą 8% turbostat.CPU%c1
>> 6014414 ą 3% -15.0% 5110077 ą 3% turbostat.IRQ
>> 6.982e+11 +7.4% 7.499e+11 perf-stat.branch-instructions
>> 3.524e+10 +6.2% 3.741e+10 perf-stat.branch-misses
>> 19295137 -7.8% 17797227 ą 2% perf-stat.context-switches
>> 6.455e+12 +9.9% 7.092e+12 perf-stat.cpu-cycles
>> 4273261 -48.8% 2188451 ą 5% perf-stat.cpu-migrations
>> 1.352e+12 +7.1% 1.448e+12 perf-stat.dTLB-loads
>> 6.033e+11 +6.4% 6.42e+11 ą 2% perf-stat.dTLB-stores
>> 2.725e+09 ą 2% +6.6% 2.906e+09 ą 2% perf-stat.iTLB-load-misses
>> 3.492e+12 +7.0% 3.737e+12 perf-stat.iTLB-loads
>> 3.489e+12 +7.0% 3.735e+12 perf-stat.instructions
>> 0.54 -2.6% 0.53 perf-stat.ipc
>> 2.48e+08 +9.0% 2.704e+08 perf-stat.minor-faults
>> 2.48e+08 +9.0% 2.704e+08 perf-stat.page-faults
>> 35944 ą 28% +74.2% 62632 ą 27% sched_debug.cfs_rq:/.MIN_vruntime.avg
>> 287559 ą 28% +59.1% 457421 ą 30% sched_debug.cfs_rq:/.MIN_vruntime.max
>> 95101 ą 28% +63.3% 155341 ą 28% sched_debug.cfs_rq:/.MIN_vruntime.stddev
>> 35944 ą 28% +74.2% 62632 ą 27% sched_debug.cfs_rq:/.max_vruntime.avg
>> 287559 ą 28% +59.1% 457421 ą 30% sched_debug.cfs_rq:/.max_vruntime.max
>> 95101 ą 28% +63.3% 155341 ą 28% sched_debug.cfs_rq:/.max_vruntime.stddev
>> 790964 +14.9% 908896 sched_debug.cfs_rq:/.min_vruntime.avg
>> 827175 ą 2% +14.3% 945176 ą 2% sched_debug.cfs_rq:/.min_vruntime.max
>> 723861 ą 4% +13.4% 820615 ą 4% sched_debug.cfs_rq:/.min_vruntime.min
>> 705.87 ą 18% +24.0% 875.53 ą 9% sched_debug.cfs_rq:/.util_avg.avg
>> 461.75 ą 27% +43.7% 663.40 ą 14% sched_debug.cfs_rq:/.util_est_enqueued.avg
>> 193562 ą 15% -51.0% 94833 ą 62% sched_debug.cpu.avg_idle.min
>> 0.77 ą 10% +32.6% 1.02 ą 19% sched_debug.cpu.clock.stddev
>> 0.76 ą 10% +33.3% 1.02 ą 19% sched_debug.cpu.clock_task.stddev
>> 74.29 +11.8% 83.08 ą 9% sched_debug.cpu.cpu_load[3].min
>> 1123755 -8.6% 1027327 sched_debug.cpu.nr_switches.avg
>> 1197214 ą 2% -8.1% 1100171 sched_debug.cpu.nr_switches.max
>> 795.50 ą 17% -74.2% 205.42 ą 20% sched_debug.cpu.nr_uninterruptible.max
>> -1575 -86.9% -207.17 sched_debug.cpu.nr_uninterruptible.min
>> 751.08 ą 40% -81.4% 139.62 ą 18% sched_debug.cpu.nr_uninterruptible.stddev
>> 1116957 -8.6% 1020656 sched_debug.cpu.sched_count.avg
>> 1190399 ą 2% -8.2% 1092751 sched_debug.cpu.sched_count.max
>> 304608 ą 2% -26.8% 223083 sched_debug.cpu.ttwu_local.avg
>> 337393 ą 3% -26.3% 248739 ą 2% sched_debug.cpu.ttwu_local.max
>> 264918 ą 4% -25.0% 198599 ą 4% sched_debug.cpu.ttwu_local.min
>> 21128 ą 24% -23.7% 16117 ą 10% sched_debug.cpu.ttwu_local.stddev
>> 10.90 ą 16% -4.1 6.85 ą 10% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>> 10.90 ą 16% -4.1 6.85 ą 10% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>> 10.90 ą 16% -4.1 6.84 ą 10% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>> 10.08 ą 15% -3.9 6.15 ą 12% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>> 9.78 ą 15% -3.8 5.95 ą 13% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>> 12.20 ą 7% -3.6 8.60 ą 11% perf-profile.calltrace.cycles-pp.secondary_startup_64
>> 0.86 ą 5% -0.5 0.36 ą100% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork
>> 0.90 ą 4% -0.3 0.65 ą 17% perf-profile.calltrace.cycles-pp.ret_from_fork
>> 0.90 ą 4% -0.3 0.65 ą 17% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
>> 0.68 ą 3% +0.1 0.75 perf-profile.calltrace.cycles-pp.mark_page_accessed.unmap_page_range.unmap_vmas.exit_mmap.mmput
>> 1.21 ą 3% +0.1 1.33 ą 7% perf-profile.calltrace.cycles-pp.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
>> 1.00 ą 6% +0.1 1.13 ą 9% perf-profile.calltrace.cycles-pp.do_open_execat.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 0.97 ą 6% +0.1 1.10 ą 10% perf-profile.calltrace.cycles-pp.do_filp_open.do_open_execat.do_execveat_common.__x64_sys_execve.do_syscall_64
>> 1.51 ą 3% +0.1 1.65 ą 4% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 0.93 ą 8% +0.2 1.08 ą 10% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_open_execat.do_execveat_common.__x64_sys_execve
>> 1.23 ą 3% +0.2 1.40 ą 6% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 1.23 ą 5% +0.2 1.40 ą 4% perf-profile.calltrace.cycles-pp.walk_component.path_lookupat.filename_lookup.do_faccessat.do_syscall_64
>> 1.20 ą 5% +0.2 1.38 ą 6% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 0.99 ą 10% +0.2 1.18 ą 9% perf-profile.calltrace.cycles-pp.perf_event_mmap.mmap_region.do_mmap.vm_mmap_pgoff.elf_map
>> 2.09 ą 5% +0.2 2.29 ą 6% perf-profile.calltrace.cycles-pp.wp_page_copy.do_wp_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
>> 0.26 ą100% +0.3 0.56 ą 5% perf-profile.calltrace.cycles-pp.vma_interval_tree_insert.__vma_adjust.__split_vma.mprotect_fixup.do_mprotect_pkey
>> 2.71 ą 4% +0.3 3.03 ą 5% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.elf_map.load_elf_binary.search_binary_handler
>> 4.43 ą 3% +0.3 4.76 ą 2% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 0.47 ą 58% +0.3 0.81 ą 7% perf-profile.calltrace.cycles-pp.unlink_file_vma.free_pgtables.exit_mmap.mmput.flush_old_exec
>> 4.82 +0.4 5.18 ą 3% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 1.71 ą 8% +0.4 2.06 ą 4% perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.mmput.flush_old_exec.load_elf_binary
>> 3.83 ą 3% +0.4 4.20 ą 2% perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
>> 4.92 +0.4 5.33 ą 4% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 0.12 ą173% +0.5 0.62 ą 12% perf-profile.calltrace.cycles-pp.__vma_adjust.__split_vma.do_munmap.vm_munmap.elf_map
>> 12.06 ą 5% +0.8 12.91 perf-profile.calltrace.cycles-pp.exit_mmap.mmput.flush_old_exec.load_elf_binary.search_binary_handler
>> 12.10 ą 5% +0.9 12.98 perf-profile.calltrace.cycles-pp.mmput.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common
>> 12.57 ą 5% +0.9 13.49 ą 2% perf-profile.calltrace.cycles-pp.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve
>> 27.55 ą 2% +1.8 29.34 ą 3% perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64
>> 27.65 ą 2% +1.8 29.45 ą 3% perf-profile.calltrace.cycles-pp.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
>> 38.27 ą 2% +2.3 40.61 ą 2% perf-profile.calltrace.cycles-pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>> 38.27 ą 2% +2.3 40.62 ą 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>> 38.27 ą 2% +2.4 40.62 ą 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.execve
>> 38.21 ą 2% +2.4 40.56 ą 2% perf-profile.calltrace.cycles-pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>> 40.44 ą 2% +2.4 42.83 ą 2% perf-profile.calltrace.cycles-pp.execve
>> 10.90 ą 16% -4.1 6.85 ą 10% perf-profile.children.cycles-pp.start_secondary
>> 12.21 ą 7% -3.6 8.61 ą 11% perf-profile.children.cycles-pp.do_idle
>> 12.20 ą 7% -3.6 8.60 ą 11% perf-profile.children.cycles-pp.secondary_startup_64
>> 12.20 ą 7% -3.6 8.60 ą 11% perf-profile.children.cycles-pp.cpu_startup_entry
>> 11.26 ą 7% -3.5 7.75 ą 13% perf-profile.children.cycles-pp.cpuidle_enter_state
>> 10.93 ą 7% -3.5 7.47 ą 13% perf-profile.children.cycles-pp.intel_idle
>> 0.86 ą 5% -0.3 0.60 ą 22% perf-profile.children.cycles-pp.smpboot_thread_fn
>> 0.91 ą 4% -0.3 0.65 ą 17% perf-profile.children.cycles-pp.ret_from_fork
>> 0.90 ą 4% -0.3 0.65 ą 17% perf-profile.children.cycles-pp.kthread
>> 0.28 ą 34% -0.2 0.04 ą108% perf-profile.children.cycles-pp.ksys_write
>> 0.27 ą 38% -0.2 0.04 ą108% perf-profile.children.cycles-pp.vfs_write
>> 0.27 ą 39% -0.2 0.04 ą108% perf-profile.children.cycles-pp.__vfs_write
>> 0.25 ą 44% -0.2 0.03 ą100% perf-profile.children.cycles-pp.__generic_file_write_iter
>> 0.25 ą 44% -0.2 0.03 ą100% perf-profile.children.cycles-pp.generic_file_write_iter
>> 0.25 ą 43% -0.2 0.03 ą105% perf-profile.children.cycles-pp.generic_perform_write
>> 0.54 ą 22% -0.2 0.35 ą 17% perf-profile.children.cycles-pp.cpu_stopper_thread
>> 0.57 ą 8% -0.2 0.40 ą 21% perf-profile.children.cycles-pp.radix_tree_next_chunk
>> 1.16 ą 4% -0.1 1.05 ą 8% perf-profile.children.cycles-pp.__might_sleep
>> 0.45 ą 4% -0.1 0.35 ą 25% perf-profile.children.cycles-pp.stop_one_cpu
>> 0.41 ą 4% -0.1 0.31 ą 24% perf-profile.children.cycles-pp.up_write
>> 0.16 ą 26% -0.1 0.07 ą 38% perf-profile.children.cycles-pp.smp_call_function_single
>> 0.26 ą 16% -0.1 0.17 ą 14% perf-profile.children.cycles-pp.free_ldt_pgtables
>> 0.14 ą 23% -0.1 0.08 ą 60% perf-profile.children.cycles-pp.wake_up_q
>> 0.08 ą 20% +0.0 0.11 ą 16% perf-profile.children.cycles-pp.selinux_bprm_set_creds
>> 0.06 ą 58% +0.0 0.09 ą 13% perf-profile.children.cycles-pp.switch_mm
>> 0.78 ą 3% +0.0 0.83 ą 3% perf-profile.children.cycles-pp.mark_page_accessed
>> 0.03 ą100% +0.0 0.07 ą 14% perf-profile.children.cycles-pp.mntput_no_expire
>> 0.00 +0.1 0.05 perf-profile.children.cycles-pp.__calc_delta
>> 0.15 ą 19% +0.1 0.20 ą 19% perf-profile.children.cycles-pp.mem_cgroup_uncharge_list
>> 0.01 ą173% +0.1 0.07 ą 23% perf-profile.children.cycles-pp.task_work_add
>> 0.00 +0.1 0.06 ą 20% perf-profile.children.cycles-pp.file_ra_state_init
>> 0.13 ą 9% +0.1 0.20 ą 15% perf-profile.children.cycles-pp.may_open
>> 0.10 ą 21% +0.1 0.18 ą 18% perf-profile.children.cycles-pp.security_file_alloc
>> 0.07 ą 27% +0.1 0.15 ą 46% perf-profile.children.cycles-pp.may_expand_vm
>> 0.00 +0.1 0.07 ą 30% perf-profile.children.cycles-pp.selinux_task_getsecid
>> 0.12 ą 29% +0.1 0.19 ą 25% perf-profile.children.cycles-pp.lookup_memtype
>> 0.08 ą 23% +0.1 0.16 ą 20% perf-profile.children.cycles-pp.selinux_file_alloc_security
>> 0.07 ą 58% +0.1 0.16 ą 9% perf-profile.children.cycles-pp.should_failslab
>> 0.28 ą 8% +0.1 0.37 ą 6% perf-profile.children.cycles-pp.get_empty_filp
>> 0.41 ą 18% +0.1 0.54 ą 20% perf-profile.children.cycles-pp.native_flush_tlb_one_user
>> 1.53 ą 3% +0.1 1.66 ą 4% perf-profile.children.cycles-pp.do_sys_open
>> 0.56 ą 5% +0.1 0.69 ą 5% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>> 0.20 ą 19% +0.2 0.35 ą 16% perf-profile.children.cycles-pp.mem_cgroup_commit_charge
>> 1.78 ą 4% +0.2 1.94 ą 5% perf-profile.children.cycles-pp.walk_component
>> 0.94 ą 11% +0.2 1.10 ą 4% perf-profile.children.cycles-pp.irq_exit
>> 0.61 ą 16% +0.2 0.78 ą 7% perf-profile.children.cycles-pp.free_pgd_range
>> 0.50 ą 12% +0.2 0.70 ą 19% perf-profile.children.cycles-pp.expand_downwards
>> 2.10 ą 5% +0.2 2.30 ą 5% perf-profile.children.cycles-pp.wp_page_copy
>> 1.02 ą 12% +0.2 1.24 ą 4% perf-profile.children.cycles-pp.vma_interval_tree_insert
>> 2.73 ą 3% +0.2 2.95 ą 4% perf-profile.children.cycles-pp.__vma_adjust
>> 1.00 ą 8% +0.2 1.25 ą 6% perf-profile.children.cycles-pp.unlink_file_vma
>> 2.19 ą 6% +0.4 2.56 ą 5% perf-profile.children.cycles-pp.free_pgtables
>> 4.93 +0.4 5.34 ą 4% perf-profile.children.cycles-pp.ksys_mmap_pgoff
>> 3.11 ą 7% +0.4 3.54 ą 5% perf-profile.children.cycles-pp.do_filp_open
>> 3.02 ą 7% +0.5 3.49 ą 5% perf-profile.children.cycles-pp.path_openat
>> 6.21 +0.6 6.84 ą 3% perf-profile.children.cycles-pp.mmap_region
>> 7.17 +0.6 7.80 ą 2% perf-profile.children.cycles-pp.do_mmap
>> 7.91 +0.7 8.61 ą 2% perf-profile.children.cycles-pp.vm_mmap_pgoff
>> 12.07 ą 5% +0.8 12.92 perf-profile.children.cycles-pp.exit_mmap
>> 12.11 ą 5% +0.9 12.99 perf-profile.children.cycles-pp.mmput
>> 12.58 ą 5% +0.9 13.49 ą 2% perf-profile.children.cycles-pp.flush_old_exec
>> 27.57 ą 2% +1.8 29.38 ą 3% perf-profile.children.cycles-pp.load_elf_binary
>> 27.69 ą 2% +1.8 29.51 ą 3% perf-profile.children.cycles-pp.search_binary_handler
>> 38.45 ą 2% +2.3 40.77 ą 2% perf-profile.children.cycles-pp.__x64_sys_execve
>> 38.38 ą 2% +2.3 40.71 ą 2% perf-profile.children.cycles-pp.do_execveat_common
>> 40.44 ą 2% +2.4 42.83 ą 2% perf-profile.children.cycles-pp.execve
>> 55.45 ą 2% +2.9 58.30 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>> 55.36 ą 2% +2.9 58.23 perf-profile.children.cycles-pp.do_syscall_64
>> 10.91 ą 7% -3.4 7.46 ą 13% perf-profile.self.cycles-pp.intel_idle
>> 0.55 ą 11% -0.2 0.39 ą 23% perf-profile.self.cycles-pp.radix_tree_next_chunk
>> 0.41 ą 4% -0.1 0.30 ą 27% perf-profile.self.cycles-pp.up_write
>> 0.14 ą 34% -0.1 0.04 ą107% perf-profile.self.cycles-pp.smp_call_function_single
>> 0.14 ą 40% -0.1 0.08 ą 46% perf-profile.self.cycles-pp.unmap_vmas
>> 0.19 ą 10% -0.0 0.15 ą 4% perf-profile.self.cycles-pp.unmapped_area_topdown
>> 0.05 ą 58% +0.0 0.08 ą 5% perf-profile.self.cycles-pp.exit_mmap
>> 0.01 ą173% +0.0 0.05 ą 9% perf-profile.self.cycles-pp.selinux_bprm_set_creds
>> 0.03 ą100% +0.0 0.07 ą 14% perf-profile.self.cycles-pp.mntput_no_expire
>> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.__calc_delta
>> 0.06 ą 62% +0.1 0.12 ą 15% perf-profile.self.cycles-pp.remove_vma
>> 0.00 +0.1 0.06 ą 20% perf-profile.self.cycles-pp.file_ra_state_init
>> 0.03 ą100% +0.1 0.10 ą 21% perf-profile.self.cycles-pp.mem_cgroup_uncharge_list
>> 0.07 ą 22% +0.1 0.15 ą 46% perf-profile.self.cycles-pp.may_expand_vm
>> 0.14 ą 25% +0.1 0.22 ą 8% perf-profile.self.cycles-pp.cpumask_any_but
>> 0.08 ą 36% +0.1 0.16 ą 11% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
>> 0.07 ą 58% +0.1 0.15 ą 5% perf-profile.self.cycles-pp.should_failslab
>> 0.20 ą 28% +0.1 0.32 ą 6% perf-profile.self.cycles-pp.load_elf_binary
>> 0.15 ą 36% +0.1 0.27 ą 6% perf-profile.self.cycles-pp.free_pgd_range
>> 0.41 ą 18% +0.1 0.54 ą 20% perf-profile.self.cycles-pp.native_flush_tlb_one_user
>> 0.96 ą 5% +0.2 1.13 ą 8% perf-profile.self.cycles-pp.kmem_cache_alloc
>> 1.00 ą 13% +0.2 1.23 ą 4% perf-profile.self.cycles-pp.vma_interval_tree_insert
>> 3.35 ą 8% +0.5 3.80 ą 8% perf-profile.self.cycles-pp.unmap_page_range
>>
>>
>>
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>>
>> Thanks,
>> Xiaolong
2018-06-20 7:34 [lkp-robot] [sched] f432b34f82: will-it-scale.per_thread_ops -12.6% regression kernel test robot
2018-06-20 12:05 ` Vincent Guittot
2018-06-22 7:34 ` Ye Xiaolong [this message]