From: Ye Xiaolong <xiaolong.ye@intel.com>
To: lkp@lists.01.org
Subject: Re: [lkp-robot] [sched] f432b34f82: will-it-scale.per_thread_ops -12.6% regression
Date: Fri, 22 Jun 2018 15:34:43 +0800	[thread overview]
Message-ID: <20180622073443.GS11011@yexl-desktop> (raw)
In-Reply-To: <CAKfTPtDpx4iebk4aAQYAst6pmhfK-Sa-sfD41b49O3V6gDuddg@mail.gmail.com>


Hi,

On 06/20, Vincent Guittot wrote:
>Hi,
>
>On Wed, 20 Jun 2018 at 09:38, kernel test robot <xiaolong.ye@intel.com> wrote:
>>
>>
>> Greeting,
>>
>> FYI, we noticed a -12.6% regression of will-it-scale.per_thread_ops due to commit:
>>
>>
>> commit: f432b34f825d49999b8ad5417cff4b5f104b74d4 ("sched: use pelt for scale_rt_capacity()")
>> https://git.linaro.org/people/vincent.guittot/kernel.git sched-rt-utilization
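>>
>> The gist of the change: derive the capacity lost to non-CFS work from
>> the PELT utilization signals of the rt, dl and irq classes instead of
>> the old rq->rt_avg accounting. A rough sketch of that approach,
>> modeled on the form the patch later took upstream (the branch under
>> test here may differ in detail):
>>
>>         /*
>>          * Sketch: capacity left over for CFS once PELT-tracked
>>          * RT/DL/IRQ pressure is subtracted -- not the literal patch.
>>          */
>>         static unsigned long scale_rt_capacity(int cpu)
>>         {
>>                 struct rq *rq = cpu_rq(cpu);
>>                 unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
>>                 unsigned long used, irq;
>>
>>                 irq = cpu_util_irq(rq);                 /* PELT irq util */
>>                 if (unlikely(irq >= max))
>>                         return 1;
>>
>>                 used  = READ_ONCE(rq->avg_rt.util_avg); /* PELT rt util */
>>                 used += READ_ONCE(rq->avg_dl.util_avg); /* PELT dl util */
>>                 if (unlikely(used >= max))
>>                         return 1;
>>
>>                 /* scale what is left by the non-irq fraction of time */
>>                 return scale_irq_capacity(max - used, irq, max);
>>         }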
>>
>> in testcase: will-it-scale
>> on test machine: 8 threads Ivy Bridge with 16G memory
>> with following parameters:
>>
>>         nr_task: 100%
>>         mode: thread
>>         test: sched_yield
>>         cpufreq_governor: performance
>>
>> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process- and thread-based variants of each test in order to see any differences between the two.
>> test-url: https://github.com/antonblanchard/will-it-scale
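>>
>> The sched_yield testcase boils down to each worker thread calling
>> sched_yield() in a tight loop while the harness samples a per-thread
>> iteration counter. A minimal self-contained sketch of that pattern
>> (illustrative only, not the actual will-it-scale source):
>>
>>         /* yield in a loop; sample the counter to estimate ops/sec */
>>         #include <pthread.h>
>>         #include <sched.h>
>>         #include <stdio.h>
>>         #include <unistd.h>
>>
>>         static volatile unsigned long long iterations;
>>
>>         static void *worker(void *arg)
>>         {
>>                 (void)arg;
>>                 for (;;) {
>>                         sched_yield();          /* the syscall under test */
>>                         iterations++;
>>                 }
>>                 return NULL;
>>         }
>>
>>         int main(void)
>>         {
>>                 pthread_t t;
>>
>>                 pthread_create(&t, NULL, worker, NULL);
>>                 sleep(5);                       /* sampling window */
>>                 printf("ops/sec: %llu\n", iterations / 5);
>>                 return 0;
>>         }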
>
>Thanks for the report.
>How many times was the test run? I can see some high stdev values which
>suggest instability in the test results.

Hmm, you're right, the stdev of the reported will-it-scale.per_thread_ops is high. We've run the test 4 times
and the results are quite unstable; we need to check further.
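
For context, the "± N%" columns in the report below read as the sample
standard deviation expressed as a percentage of the mean across runs
(that is an assumption about the %stddev headers, not taken from the
lkp source). A minimal sketch of that computation over 4 runs, with
made-up per_thread_ops numbers:

        /* %stddev = 100 * stddev / mean, over the per-run results */
        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
                /* hypothetical per_thread_ops from 4 runs */
                double runs[] = { 2.10e6, 1.70e6, 2.00e6, 1.87e6 };
                int i, n = sizeof(runs) / sizeof(runs[0]);
                double sum = 0.0, var = 0.0, mean;

                for (i = 0; i < n; i++)
                        sum += runs[i];
                mean = sum / n;

                for (i = 0; i < n; i++)
                        var += (runs[i] - mean) * (runs[i] - mean);
                var /= n - 1;                   /* sample variance */

                printf("%%stddev = %.1f%%\n", 100.0 * sqrt(var) / mean);
                return 0;
        }

With only 4 runs a single outlier moves that number a lot, which is why
more repetitions are needed before trusting the -12.6% delta.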


Thanks,
Xiaolong

>
>>
>> In addition to that, the commit also has significant impact on the following tests:
>>
>> +------------------+-----------------------------------------------------------------+
>> | testcase: change | unixbench: unixbench.score 7.4% improvement                     |
>> | test machine     | 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory |
>> | test parameters  | nr_task=100%                                                    |
>> |                  | runtime=300s                                                    |
>> |                  | test=execl                                                      |
>> +------------------+-----------------------------------------------------------------+
>>
>
>Regards,
>Vincent
>
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>>
>> To reproduce:
>>
>>         git clone https://github.com/intel/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>>   gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2016-08-31.cgz/lkp-ivb-d01/sched_yield/will-it-scale
>>
>> commit:
>>   3cd6eb61ec ("cpufreq/schedutil: take into account interrupt")
>>   f432b34f82 ("sched: use pelt for scale_rt_capacity()")
>>
>> 3cd6eb61ecbc03a6 f432b34f825d49999b8ad5417c
>> ---------------- --------------------------
>>          %stddev     %change         %stddev
>>              \          |                \
>>    2193002           -12.6%    1916998 ± 10%  will-it-scale.per_thread_ops
>>    6932284 ±  2%   +7144.3%  5.022e+08 ± 71%  will-it-scale.time.involuntary_context_switches
>>     792.00           -10.8%     706.75 ±  8%  will-it-scale.time.percent_of_cpu_this_job_got
>>       1842            -9.9%       1660 ±  7%  will-it-scale.time.system_time
>>     544.32           -13.9%     468.66 ± 11%  will-it-scale.time.user_time
>>   17544019           -12.6%   15335987 ± 10%  will-it-scale.workload
>>       0.64           +10.5       11.10 ± 69%  mpstat.cpu.idle%
>>      39211         +4177.4%    1677224 ± 70%  vmstat.system.cs
>>     290602 ±  7%     -24.2%     220391 ± 23%  softirqs.RCU
>>      23897 ±  4%    +160.0%      62125 ± 39%  softirqs.SCHED
>>      20490            +3.3%      21172        proc-vmstat.nr_active_anon
>>       6128            +0.9%       6183        proc-vmstat.nr_mapped
>>       3480 ±  3%      +9.0%       3792 ±  5%  proc-vmstat.nr_shmem
>>      20490            +3.3%      21172        proc-vmstat.nr_zone_active_anon
>>       1173 ± 17%     +40.4%       1648 ± 21%  proc-vmstat.pgactivate
>>     321823            +1.3%     325849        proc-vmstat.pgalloc_normal
>>     317059            +1.3%     321126        proc-vmstat.pgfree
>>       3680           -10.4%       3298 ±  8%  turbostat.Avg_MHz
>>     113297 ±  2%   +2190.9%    2595491 ± 36%  turbostat.C1
>>       0.04            +0.8        0.87 ± 36%  turbostat.C1%
>>     211.75 ±  9%    +584.7%       1449 ± 83%  turbostat.C1E
>>     162.25 ± 19%   +1010.6%       1802 ± 76%  turbostat.C3
>>       7843 ± 10%   +3025.2%     245131 ± 77%  turbostat.C6
>>       0.30 ± 10%      +9.5        9.76 ± 77%  turbostat.C6%
>>       0.11 ± 10%   +9616.3%      10.45 ± 73%  turbostat.CPU%c1
>>     991835         +2010.0%   20928148 ± 36%  cpuidle.C1.time
>>     114701 ±  2%   +2165.3%    2598349 ± 36%  cpuidle.C1.usage
>>      28432 ± 20%    +557.1%     186822 ± 78%  cpuidle.C1E.time
>>     275.75 ± 14%    +451.9%       1522 ± 78%  cpuidle.C1E.usage
>>      54932 ± 18%    +697.3%     438003 ± 66%  cpuidle.C3.time
>>     209.50 ± 18%    +784.4%       1852 ± 73%  cpuidle.C3.usage
>>    8498054         +2689.5%  2.371e+08 ± 76%  cpuidle.C6.time
>>       9209 ±  2%   +2576.7%     246513 ± 76%  cpuidle.C6.usage
>>       1106 ± 12%   +3290.7%      37501 ± 39%  cpuidle.POLL.time
>>     345.25 ± 22%   +2197.0%       7930 ± 37%  cpuidle.POLL.usage
>>  9.444e+11           -57.4%  4.026e+11 ± 32%  perf-stat.branch-instructions
>>       2.27            -0.1        2.16 ±  3%  perf-stat.branch-miss-rate%
>>   2.14e+10           -59.7%  8.632e+09 ± 29%  perf-stat.branch-misses
>>       5.06 ± 16%      +7.3       12.37 ± 52%  perf-stat.cache-miss-rate%
>>   11908086         +4181.7%  5.099e+08 ± 70%  perf-stat.context-switches
>>  8.855e+12           -59.5%  3.587e+12 ± 29%  perf-stat.cpu-cycles
>>      34115 ±  9%     -70.0%      10248 ± 42%  perf-stat.cpu-migrations
>>  3.556e+10 ±  2%     -62.3%  1.342e+10 ± 30%  perf-stat.dTLB-load-misses
>>    1.4e+12           -57.6%  5.941e+11 ± 31%  perf-stat.dTLB-loads
>>       0.00 ± 36%      +0.0        0.03 ±115%  perf-stat.dTLB-store-miss-rate%
>>  9.282e+11           -57.9%  3.905e+11 ± 31%  perf-stat.dTLB-stores
>>      52.07 ±  5%     -11.3       40.80 ± 22%  perf-stat.iTLB-load-miss-rate%
>>  5.605e+09 ±  9%     -73.2%  1.503e+09 ± 14%  perf-stat.iTLB-load-misses
>>  5.136e+09 ±  3%     -53.4%  2.395e+09 ± 47%  perf-stat.iTLB-loads
>>  4.548e+12           -57.7%  1.925e+12 ± 31%  perf-stat.instructions
>>     818.96 ±  9%     +56.1%       1278 ± 26%  perf-stat.instructions-per-iTLB-miss
>>     259253           -49.2%     131782 ± 45%  perf-stat.path-length
>>      14.08 ±  2%      -2.7       11.40 ±  6%  perf-profile.calltrace.cycles-pp.update_curr.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield
>>      40.73            -2.6       38.12 ±  4%  perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>      39.59            -2.4       37.15 ±  4%  perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>      48.35            -2.4       45.91 ±  4%  perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>      25.93            -2.2       23.74 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
>>      23.12            -1.6       21.55 ±  4%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
>>       4.20            -0.6        3.58 ±  5%  perf-profile.calltrace.cycles-pp.__calc_delta.update_curr.pick_next_task_fair.__schedule.schedule
>>       2.72 ±  3%      -0.4        2.28 ±  8%  perf-profile.calltrace.cycles-pp.pick_next_task_fair
>>       1.69 ±  2%      -0.3        1.38 ±  5%  perf-profile.calltrace.cycles-pp.yield_task_fair
>>       1.62 ±  4%      -0.3        1.35        perf-profile.calltrace.cycles-pp.update_curr
>>       0.64 ±  2%      -0.1        0.55 ±  6%  perf-profile.calltrace.cycles-pp.do_syscall_64
>>       0.44 ± 58%      +0.3        0.73 ± 17%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid.pick_next_task_fair.__schedule.schedule.__x64_sys_sched_yield
>>       1.21 ±  3%      +0.9        2.10 ± 37%  perf-profile.calltrace.cycles-pp.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>      40.76            -2.6       38.17 ±  4%  perf-profile.children.cycles-pp.schedule
>>      39.83            -2.5       37.34 ±  4%  perf-profile.children.cycles-pp.__schedule
>>      49.28            -2.5       46.80 ±  4%  perf-profile.children.cycles-pp.__x64_sys_sched_yield
>>      15.74 ±  2%      -2.5       13.29 ±  2%  perf-profile.children.cycles-pp.update_curr
>>      25.93            -2.2       23.75 ±  5%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>>      25.84            -2.0       23.84 ±  4%  perf-profile.children.cycles-pp.pick_next_task_fair
>>       4.69            -0.7        4.04 ±  4%  perf-profile.children.cycles-pp.__calc_delta
>>       0.79 ±  3%      +0.2        1.04 ± 10%  perf-profile.children.cycles-pp.clear_buddies
>>      25.93            -2.2       23.75 ±  5%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>>       9.70 ±  4%      -1.8        7.95 ±  5%  perf-profile.self.cycles-pp.update_curr
>>       9.44 ±  2%      -0.9        8.59 ±  6%  perf-profile.self.cycles-pp.__schedule
>>       4.69            -0.7        4.04 ±  4%  perf-profile.self.cycles-pp.__calc_delta
>>       2.54 ±  2%      -0.5        2.08 ± 10%  perf-profile.self.cycles-pp.do_sched_yield
>>       2.83 ±  2%      -0.5        2.37 ±  8%  perf-profile.self.cycles-pp.yield_task_fair
>>       2.78 ±  5%      -0.3        2.52 ±  6%  perf-profile.self.cycles-pp.__x64_sys_sched_yield
>>       0.79 ±  3%      +0.2        1.04 ± 10%  perf-profile.self.cycles-pp.clear_buddies
>>     141060           -52.6%      66832 ± 56%  sched_debug.cfs_rq:/.exec_clock.min
>>       2868 ± 20%    +976.5%      30878 ± 59%  sched_debug.cfs_rq:/.exec_clock.stddev
>>    1200492           -16.8%     998858 ± 12%  sched_debug.cfs_rq:/.min_vruntime.avg
>>    1151085           -53.5%     535255 ± 56%  sched_debug.cfs_rq:/.min_vruntime.min
>>      26141 ± 21%    +882.4%     256804 ± 54%  sched_debug.cfs_rq:/.min_vruntime.stddev
>>       1.23 ±  8%     -14.8%       1.05 ±  7%  sched_debug.cfs_rq:/.nr_spread_over.avg
>>      26137 ± 21%    +882.5%     256802 ± 54%  sched_debug.cfs_rq:/.spread0.stddev
>>     843.54 ±  2%     -49.4%     427.08 ± 49%  sched_debug.cfs_rq:/.util_avg.min
>>      88.78 ± 21%    +166.9%     236.92 ± 33%  sched_debug.cfs_rq:/.util_avg.stddev
>>       1.02 ±  8%     -31.6%       0.70 ± 31%  sched_debug.cpu.clock.stddev
>>       1.02 ±  8%     -31.6%       0.70 ± 31%  sched_debug.cpu.clock_task.stddev
>>      94.08           -36.2%      60.00 ± 55%  sched_debug.cpu.cpu_load[0].min
>>      94.42           -35.6%      60.79 ± 42%  sched_debug.cpu.cpu_load[1].min
>>     175.12 ±  8%     +18.1%     206.83 ±  3%  sched_debug.cpu.cpu_load[2].max
>>      95.75           -43.4%      54.21 ± 42%  sched_debug.cpu.cpu_load[2].min
>>      26.63 ± 18%     +75.2%      46.64 ± 24%  sched_debug.cpu.cpu_load[2].stddev
>>     156.75 ±  6%     +29.2%     202.46 ±  6%  sched_debug.cpu.cpu_load[3].max
>>      96.79           -53.2%      45.29 ± 45%  sched_debug.cpu.cpu_load[3].min
>>      20.25 ± 15%    +135.2%      47.63 ± 29%  sched_debug.cpu.cpu_load[3].stddev
>>     141.08 ±  4%     +39.2%     196.33 ±  8%  sched_debug.cpu.cpu_load[4].max
>>      97.54           -59.3%      39.67 ± 51%  sched_debug.cpu.cpu_load[4].min
>>      14.80 ± 12%    +219.8%      47.34 ± 33%  sched_debug.cpu.cpu_load[4].stddev
>>    1170938         +2870.5%   34782808 ± 57%  sched_debug.cpu.nr_switches.avg
>>    4811815 ± 26%   +3077.4%  1.529e+08 ± 49%  sched_debug.cpu.nr_switches.max
>>      58201 ± 15%     -72.5%      15998 ± 99%  sched_debug.cpu.nr_switches.min
>>    1633456 ± 20%   +3373.7%   56741908 ± 62%  sched_debug.cpu.nr_switches.stddev
>>  3.097e+08           -52.9%  1.459e+08 ± 58%  sched_debug.cpu.sched_count.min
>>    7185897 ± 19%    +826.3%   66559915 ± 58%  sched_debug.cpu.sched_count.stddev
>>      12004 ±  2%   +2153.0%     270459 ± 36%  sched_debug.cpu.sched_goidle.avg
>>      64148 ± 16%   +2818.1%    1871943 ± 35%  sched_debug.cpu.sched_goidle.max
>>      21941 ± 10%   +2722.7%     619331 ± 35%  sched_debug.cpu.sched_goidle.stddev
>>       7011 ± 17%     -67.7%       2265 ± 59%  sched_debug.cpu.ttwu_count.min
>>       5763 ± 18%     -73.5%       1527 ± 76%  sched_debug.cpu.ttwu_local.min
>>  3.055e+08 ±  2%     -53.1%  1.433e+08 ± 58%  sched_debug.cpu.yld_count.min
>>    8644751 ± 18%    +684.3%   67800425 ± 57%  sched_debug.cpu.yld_count.stddev
>>
>>
>>
>>                     will-it-scale.time.involuntary_context_switches
>>
>>   2.5e+09 +-+---------------------------------------------------------------+
>>           |                                                                 |
>>           |                                                                 |
>>     2e+09 +-+      O                                                        |
>>           |                                                                 |
>>           |                                                                 |
>>   1.5e+09 +-+                         O                                     |
>>           |                                                                 |
>>     1e+09 +-+    O         O        O                       O               |
>>           |                    O                                            |
>>           |                                                                 |
>>     5e+08 +-O                O                                              |
>>           O    O      O O         O        O    O   O    O    O O           |
>>           |                              O   O    O    O           O        |
>>         0 +-+---------------------------------------------------------------+
>>
>>
>> [*] bisect-good sample
>> [O] bisect-bad  sample
>>
>> ***************************************************************************************************
>> nhm-white: 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory
>> =========================================================================================
>> compiler/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>>   gcc-7/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/300s/nhm-white/execl/unixbench
>>
>> commit:
>>   3cd6eb61ec ("cpufreq/schedutil: take into account interrupt")
>>   f432b34f82 ("sched: use pelt for scale_rt_capacity()")
>>
>> 3cd6eb61ecbc03a6 f432b34f825d49999b8ad5417c
>> ---------------- --------------------------
>>          %stddev     %change         %stddev
>>              \          |                \
>>       4301            +7.4%       4619        unixbench.score
>>    4180477           -35.5%    2695696 ±  3%  unixbench.time.involuntary_context_switches
>>  2.583e+08            +9.0%  2.817e+08        unixbench.time.minor_page_faults
>>       1655            +5.7%       1750        unixbench.time.system_time
>>     362.72            +4.4%     378.68        unixbench.time.user_time
>>    3279113 ±  3%     +48.3%    4861315 ±  4%  unixbench.time.voluntary_context_switches
>>    6252322 ±  3%      +7.3%    6709345 ±  3%  unixbench.workload
>>    1001980 ±  3%     -43.5%     566495 ±  7%  interrupts.CAL:Function_call_interrupts
>>      19.30 ±  7%      -4.0       15.26 ± 12%  mpstat.cpu.idle%
>>      56649            -7.5%      52419        vmstat.system.cs
>>      14780            -9.1%      13434        vmstat.system.in
>>  2.016e+08            +9.0%  2.196e+08        proc-vmstat.numa_hit
>>  2.016e+08            +9.0%  2.196e+08        proc-vmstat.numa_local
>>       5989            +6.4%       6370        proc-vmstat.pgactivate
>>  2.071e+08            +9.0%  2.258e+08        proc-vmstat.pgalloc_normal
>>   2.61e+08            +9.3%  2.852e+08        proc-vmstat.pgfault
>>   2.07e+08            +9.0%  2.258e+08        proc-vmstat.pgfree
>>  1.219e+08 ±  2%     -31.2%   83948754 ±  5%  cpuidle.C1.time
>>    5089762 ±  2%     +15.7%    5888875 ±  3%  cpuidle.C1.usage
>>   87816352 ±  3%     -59.2%   35819088 ±  8%  cpuidle.C1E.time
>>    2025352 ±  2%     -57.4%     863049 ±  7%  cpuidle.C1E.usage
>>     430285 ± 14%     -35.1%     279134 ± 17%  cpuidle.C3.usage
>>    1967162 ±  2%     -11.4%    1742236 ±  3%  cpuidle.POLL.time
>>     114825 ±  3%     +21.4%     139387 ±  3%  cpuidle.POLL.usage
>>       2401 ±  2%      +9.7%       2635 ±  2%  turbostat.Avg_MHz
>>    5089676 ±  2%     +15.7%    5888833 ±  3%  turbostat.C1
>>       4.50            -1.4        3.10 ±  4%  turbostat.C1%
>>    2025326 ±  2%     -57.4%     863042 ±  7%  turbostat.C1E
>>       3.24            -1.9        1.32 ±  7%  turbostat.C1E%
>>     430222 ± 14%     -35.1%     279119 ± 17%  turbostat.C3
>>      10.11 ±  3%     -37.3%       6.33 ±  8%  turbostat.CPU%c1
>>    6014414 ±  3%     -15.0%    5110077 ±  3%  turbostat.IRQ
>>  6.982e+11            +7.4%  7.499e+11        perf-stat.branch-instructions
>>  3.524e+10            +6.2%  3.741e+10        perf-stat.branch-misses
>>   19295137            -7.8%   17797227 ±  2%  perf-stat.context-switches
>>  6.455e+12            +9.9%  7.092e+12        perf-stat.cpu-cycles
>>    4273261           -48.8%    2188451 ±  5%  perf-stat.cpu-migrations
>>  1.352e+12            +7.1%  1.448e+12        perf-stat.dTLB-loads
>>  6.033e+11            +6.4%   6.42e+11 ±  2%  perf-stat.dTLB-stores
>>  2.725e+09 ±  2%      +6.6%  2.906e+09 ±  2%  perf-stat.iTLB-load-misses
>>  3.492e+12            +7.0%  3.737e+12        perf-stat.iTLB-loads
>>  3.489e+12            +7.0%  3.735e+12        perf-stat.instructions
>>       0.54            -2.6%       0.53        perf-stat.ipc
>>   2.48e+08            +9.0%  2.704e+08        perf-stat.minor-faults
>>   2.48e+08            +9.0%  2.704e+08        perf-stat.page-faults
>>      35944 ± 28%     +74.2%      62632 ± 27%  sched_debug.cfs_rq:/.MIN_vruntime.avg
>>     287559 ± 28%     +59.1%     457421 ± 30%  sched_debug.cfs_rq:/.MIN_vruntime.max
>>      95101 ± 28%     +63.3%     155341 ± 28%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
>>      35944 ± 28%     +74.2%      62632 ± 27%  sched_debug.cfs_rq:/.max_vruntime.avg
>>     287559 ± 28%     +59.1%     457421 ± 30%  sched_debug.cfs_rq:/.max_vruntime.max
>>      95101 ± 28%     +63.3%     155341 ± 28%  sched_debug.cfs_rq:/.max_vruntime.stddev
>>     790964           +14.9%     908896        sched_debug.cfs_rq:/.min_vruntime.avg
>>     827175 ±  2%     +14.3%     945176 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
>>     723861 ±  4%     +13.4%     820615 ±  4%  sched_debug.cfs_rq:/.min_vruntime.min
>>     705.87 ± 18%     +24.0%     875.53 ±  9%  sched_debug.cfs_rq:/.util_avg.avg
>>     461.75 ± 27%     +43.7%     663.40 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>>     193562 ± 15%     -51.0%      94833 ± 62%  sched_debug.cpu.avg_idle.min
>>       0.77 ± 10%     +32.6%       1.02 ± 19%  sched_debug.cpu.clock.stddev
>>       0.76 ± 10%     +33.3%       1.02 ± 19%  sched_debug.cpu.clock_task.stddev
>>      74.29           +11.8%      83.08 ±  9%  sched_debug.cpu.cpu_load[3].min
>>    1123755            -8.6%    1027327        sched_debug.cpu.nr_switches.avg
>>    1197214 ±  2%      -8.1%    1100171        sched_debug.cpu.nr_switches.max
>>     795.50 ± 17%     -74.2%     205.42 ± 20%  sched_debug.cpu.nr_uninterruptible.max
>>      -1575           -86.9%    -207.17        sched_debug.cpu.nr_uninterruptible.min
>>     751.08 ± 40%     -81.4%     139.62 ± 18%  sched_debug.cpu.nr_uninterruptible.stddev
>>    1116957            -8.6%    1020656        sched_debug.cpu.sched_count.avg
>>    1190399 ±  2%      -8.2%    1092751        sched_debug.cpu.sched_count.max
>>     304608 ±  2%     -26.8%     223083        sched_debug.cpu.ttwu_local.avg
>>     337393 ±  3%     -26.3%     248739 ±  2%  sched_debug.cpu.ttwu_local.max
>>     264918 ±  4%     -25.0%     198599 ±  4%  sched_debug.cpu.ttwu_local.min
>>      21128 ± 24%     -23.7%      16117 ± 10%  sched_debug.cpu.ttwu_local.stddev
>>      10.90 ± 16%      -4.1        6.85 ± 10%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>>      10.90 ± 16%      -4.1        6.85 ± 10%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>>      10.90 ± 16%      -4.1        6.84 ± 10%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>      10.08 ± 15%      -3.9        6.15 ± 12%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>       9.78 ± 15%      -3.8        5.95 ± 13%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>>      12.20 ±  7%      -3.6        8.60 ± 11%  perf-profile.calltrace.cycles-pp.secondary_startup_64
>>       0.86 ±  5%      -0.5        0.36 ±100%  perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork
>>       0.90 ±  4%      -0.3        0.65 ± 17%  perf-profile.calltrace.cycles-pp.ret_from_fork
>>       0.90 ±  4%      -0.3        0.65 ± 17%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
>>       0.68 ±  3%      +0.1        0.75        perf-profile.calltrace.cycles-pp.mark_page_accessed.unmap_page_range.unmap_vmas.exit_mmap.mmput
>>       1.21 ±  3%      +0.1        1.33 ±  7%  perf-profile.calltrace.cycles-pp.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
>>       1.00 ±  6%      +0.1        1.13 ±  9%  perf-profile.calltrace.cycles-pp.do_open_execat.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       0.97 ±  6%      +0.1        1.10 ± 10%  perf-profile.calltrace.cycles-pp.do_filp_open.do_open_execat.do_execveat_common.__x64_sys_execve.do_syscall_64
>>       1.51 ±  3%      +0.1        1.65 ±  4%  perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       0.93 ±  8%      +0.2        1.08 ± 10%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_open_execat.do_execveat_common.__x64_sys_execve
>>       1.23 ±  3%      +0.2        1.40 ±  6%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       1.23 ±  5%      +0.2        1.40 ±  4%  perf-profile.calltrace.cycles-pp.walk_component.path_lookupat.filename_lookup.do_faccessat.do_syscall_64
>>       1.20 ±  5%      +0.2        1.38 ±  6%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       0.99 ± 10%      +0.2        1.18 ±  9%  perf-profile.calltrace.cycles-pp.perf_event_mmap.mmap_region.do_mmap.vm_mmap_pgoff.elf_map
>>       2.09 ±  5%      +0.2        2.29 ±  6%  perf-profile.calltrace.cycles-pp.wp_page_copy.do_wp_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
>>       0.26 ±100%      +0.3        0.56 ±  5%  perf-profile.calltrace.cycles-pp.vma_interval_tree_insert.__vma_adjust.__split_vma.mprotect_fixup.do_mprotect_pkey
>>       2.71 ±  4%      +0.3        3.03 ±  5%  perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.elf_map.load_elf_binary.search_binary_handler
>>       4.43 ±  3%      +0.3        4.76 ±  2%  perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       0.47 ± 58%      +0.3        0.81 ±  7%  perf-profile.calltrace.cycles-pp.unlink_file_vma.free_pgtables.exit_mmap.mmput.flush_old_exec
>>       4.82            +0.4        5.18 ±  3%  perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       1.71 ±  8%      +0.4        2.06 ±  4%  perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.mmput.flush_old_exec.load_elf_binary
>>       3.83 ±  3%      +0.4        4.20 ±  2%  perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
>>       4.92            +0.4        5.33 ±  4%  perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>       0.12 ±173%      +0.5        0.62 ± 12%  perf-profile.calltrace.cycles-pp.__vma_adjust.__split_vma.do_munmap.vm_munmap.elf_map
>>      12.06 ±  5%      +0.8       12.91        perf-profile.calltrace.cycles-pp.exit_mmap.mmput.flush_old_exec.load_elf_binary.search_binary_handler
>>      12.10 ±  5%      +0.9       12.98        perf-profile.calltrace.cycles-pp.mmput.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common
>>      12.57 ±  5%      +0.9       13.49 ±  2%  perf-profile.calltrace.cycles-pp.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve
>>      27.55 ±  2%      +1.8       29.34 ±  3%  perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64
>>      27.65 ±  2%      +1.8       29.45 ±  3%  perf-profile.calltrace.cycles-pp.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>      38.27 ±  2%      +2.3       40.61 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>>      38.27 ±  2%      +2.3       40.62 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>>      38.27 ±  2%      +2.4       40.62 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.execve
>>      38.21 ±  2%      +2.4       40.56 ±  2%  perf-profile.calltrace.cycles-pp.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe.execve
>>      40.44 ±  2%      +2.4       42.83 ±  2%  perf-profile.calltrace.cycles-pp.execve
>>      10.90 ± 16%      -4.1        6.85 ± 10%  perf-profile.children.cycles-pp.start_secondary
>>      12.21 ±  7%      -3.6        8.61 ± 11%  perf-profile.children.cycles-pp.do_idle
>>      12.20 ±  7%      -3.6        8.60 ± 11%  perf-profile.children.cycles-pp.secondary_startup_64
>>      12.20 ±  7%      -3.6        8.60 ± 11%  perf-profile.children.cycles-pp.cpu_startup_entry
>>      11.26 ±  7%      -3.5        7.75 ± 13%  perf-profile.children.cycles-pp.cpuidle_enter_state
>>      10.93 ±  7%      -3.5        7.47 ± 13%  perf-profile.children.cycles-pp.intel_idle
>>       0.86 ±  5%      -0.3        0.60 ± 22%  perf-profile.children.cycles-pp.smpboot_thread_fn
>>       0.91 ±  4%      -0.3        0.65 ± 17%  perf-profile.children.cycles-pp.ret_from_fork
>>       0.90 ±  4%      -0.3        0.65 ± 17%  perf-profile.children.cycles-pp.kthread
>>       0.28 ± 34%      -0.2        0.04 ±108%  perf-profile.children.cycles-pp.ksys_write
>>       0.27 ± 38%      -0.2        0.04 ±108%  perf-profile.children.cycles-pp.vfs_write
>>       0.27 ± 39%      -0.2        0.04 ±108%  perf-profile.children.cycles-pp.__vfs_write
>>       0.25 ± 44%      -0.2        0.03 ±100%  perf-profile.children.cycles-pp.__generic_file_write_iter
>>       0.25 ± 44%      -0.2        0.03 ±100%  perf-profile.children.cycles-pp.generic_file_write_iter
>>       0.25 ± 43%      -0.2        0.03 ±105%  perf-profile.children.cycles-pp.generic_perform_write
>>       0.54 ± 22%      -0.2        0.35 ± 17%  perf-profile.children.cycles-pp.cpu_stopper_thread
>>       0.57 ±  8%      -0.2        0.40 ± 21%  perf-profile.children.cycles-pp.radix_tree_next_chunk
>>       1.16 ±  4%      -0.1        1.05 ±  8%  perf-profile.children.cycles-pp.__might_sleep
>>       0.45 ±  4%      -0.1        0.35 ± 25%  perf-profile.children.cycles-pp.stop_one_cpu
>>       0.41 ±  4%      -0.1        0.31 ± 24%  perf-profile.children.cycles-pp.up_write
>>       0.16 ± 26%      -0.1        0.07 ± 38%  perf-profile.children.cycles-pp.smp_call_function_single
>>       0.26 ± 16%      -0.1        0.17 ± 14%  perf-profile.children.cycles-pp.free_ldt_pgtables
>>       0.14 ± 23%      -0.1        0.08 ± 60%  perf-profile.children.cycles-pp.wake_up_q
>>       0.08 ± 20%      +0.0        0.11 ± 16%  perf-profile.children.cycles-pp.selinux_bprm_set_creds
>>       0.06 ± 58%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.switch_mm
>>       0.78 ±  3%      +0.0        0.83 ±  3%  perf-profile.children.cycles-pp.mark_page_accessed
>>       0.03 ±100%      +0.0        0.07 ± 14%  perf-profile.children.cycles-pp.mntput_no_expire
>>       0.00            +0.1        0.05        perf-profile.children.cycles-pp.__calc_delta
>>       0.15 ± 19%      +0.1        0.20 ± 19%  perf-profile.children.cycles-pp.mem_cgroup_uncharge_list
>>       0.01 ±173%      +0.1        0.07 ± 23%  perf-profile.children.cycles-pp.task_work_add
>>       0.00            +0.1        0.06 ± 20%  perf-profile.children.cycles-pp.file_ra_state_init
>>       0.13 ±  9%      +0.1        0.20 ± 15%  perf-profile.children.cycles-pp.may_open
>>       0.10 ± 21%      +0.1        0.18 ± 18%  perf-profile.children.cycles-pp.security_file_alloc
>>       0.07 ± 27%      +0.1        0.15 ± 46%  perf-profile.children.cycles-pp.may_expand_vm
>>       0.00            +0.1        0.07 ± 30%  perf-profile.children.cycles-pp.selinux_task_getsecid
>>       0.12 ± 29%      +0.1        0.19 ± 25%  perf-profile.children.cycles-pp.lookup_memtype
>>       0.08 ± 23%      +0.1        0.16 ± 20%  perf-profile.children.cycles-pp.selinux_file_alloc_security
>>       0.07 ± 58%      +0.1        0.16 ±  9%  perf-profile.children.cycles-pp.should_failslab
>>       0.28 ±  8%      +0.1        0.37 ±  6%  perf-profile.children.cycles-pp.get_empty_filp
>>       0.41 ± 18%      +0.1        0.54 ± 20%  perf-profile.children.cycles-pp.native_flush_tlb_one_user
>>       1.53 ±  3%      +0.1        1.66 ±  4%  perf-profile.children.cycles-pp.do_sys_open
>>       0.56 ±  5%      +0.1        0.69 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>>       0.20 ± 19%      +0.2        0.35 ± 16%  perf-profile.children.cycles-pp.mem_cgroup_commit_charge
>>       1.78 ±  4%      +0.2        1.94 ±  5%  perf-profile.children.cycles-pp.walk_component
>>       0.94 ± 11%      +0.2        1.10 ±  4%  perf-profile.children.cycles-pp.irq_exit
>>       0.61 ± 16%      +0.2        0.78 ±  7%  perf-profile.children.cycles-pp.free_pgd_range
>>       0.50 ± 12%      +0.2        0.70 ± 19%  perf-profile.children.cycles-pp.expand_downwards
>>       2.10 ±  5%      +0.2        2.30 ±  5%  perf-profile.children.cycles-pp.wp_page_copy
>>       1.02 ± 12%      +0.2        1.24 ±  4%  perf-profile.children.cycles-pp.vma_interval_tree_insert
>>       2.73 ±  3%      +0.2        2.95 ±  4%  perf-profile.children.cycles-pp.__vma_adjust
>>       1.00 ±  8%      +0.2        1.25 ±  6%  perf-profile.children.cycles-pp.unlink_file_vma
>>       2.19 ±  6%      +0.4        2.56 ±  5%  perf-profile.children.cycles-pp.free_pgtables
>>       4.93            +0.4        5.34 ±  4%  perf-profile.children.cycles-pp.ksys_mmap_pgoff
>>       3.11 ±  7%      +0.4        3.54 ±  5%  perf-profile.children.cycles-pp.do_filp_open
>>       3.02 ±  7%      +0.5        3.49 ±  5%  perf-profile.children.cycles-pp.path_openat
>>       6.21            +0.6        6.84 ±  3%  perf-profile.children.cycles-pp.mmap_region
>>       7.17            +0.6        7.80 ±  2%  perf-profile.children.cycles-pp.do_mmap
>>       7.91            +0.7        8.61 ±  2%  perf-profile.children.cycles-pp.vm_mmap_pgoff
>>      12.07 ±  5%      +0.8       12.92        perf-profile.children.cycles-pp.exit_mmap
>>      12.11 ±  5%      +0.9       12.99        perf-profile.children.cycles-pp.mmput
>>      12.58 ±  5%      +0.9       13.49 ±  2%  perf-profile.children.cycles-pp.flush_old_exec
>>      27.57 ±  2%      +1.8       29.38 ±  3%  perf-profile.children.cycles-pp.load_elf_binary
>>      27.69 ±  2%      +1.8       29.51 ±  3%  perf-profile.children.cycles-pp.search_binary_handler
>>      38.45 ±  2%      +2.3       40.77 ±  2%  perf-profile.children.cycles-pp.__x64_sys_execve
>>      38.38 ±  2%      +2.3       40.71 ±  2%  perf-profile.children.cycles-pp.do_execveat_common
>>      40.44 ±  2%      +2.4       42.83 ±  2%  perf-profile.children.cycles-pp.execve
>>      55.45 ±  2%      +2.9       58.30        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>>      55.36 ±  2%      +2.9       58.23        perf-profile.children.cycles-pp.do_syscall_64
>>      10.91 ±  7%      -3.4        7.46 ± 13%  perf-profile.self.cycles-pp.intel_idle
>>       0.55 ± 11%      -0.2        0.39 ± 23%  perf-profile.self.cycles-pp.radix_tree_next_chunk
>>       0.41 ±  4%      -0.1        0.30 ± 27%  perf-profile.self.cycles-pp.up_write
>>       0.14 ± 34%      -0.1        0.04 ±107%  perf-profile.self.cycles-pp.smp_call_function_single
>>       0.14 ± 40%      -0.1        0.08 ± 46%  perf-profile.self.cycles-pp.unmap_vmas
>>       0.19 ± 10%      -0.0        0.15 ±  4%  perf-profile.self.cycles-pp.unmapped_area_topdown
>>       0.05 ± 58%      +0.0        0.08 ±  5%  perf-profile.self.cycles-pp.exit_mmap
>>       0.01 ±173%      +0.0        0.05 ±  9%  perf-profile.self.cycles-pp.selinux_bprm_set_creds
>>       0.03 ±100%      +0.0        0.07 ± 14%  perf-profile.self.cycles-pp.mntput_no_expire
>>       0.00            +0.1        0.05        perf-profile.self.cycles-pp.__calc_delta
>>       0.06 ± 62%      +0.1        0.12 ± 15%  perf-profile.self.cycles-pp.remove_vma
>>       0.00            +0.1        0.06 ± 20%  perf-profile.self.cycles-pp.file_ra_state_init
>>       0.03 ±100%      +0.1        0.10 ± 21%  perf-profile.self.cycles-pp.mem_cgroup_uncharge_list
>>       0.07 ± 22%      +0.1        0.15 ± 46%  perf-profile.self.cycles-pp.may_expand_vm
>>       0.14 ± 25%      +0.1        0.22 ±  8%  perf-profile.self.cycles-pp.cpumask_any_but
>>       0.08 ± 36%      +0.1        0.16 ± 11%  perf-profile.self.cycles-pp.mem_cgroup_commit_charge
>>       0.07 ± 58%      +0.1        0.15 ±  5%  perf-profile.self.cycles-pp.should_failslab
>>       0.20 ± 28%      +0.1        0.32 ±  6%  perf-profile.self.cycles-pp.load_elf_binary
>>       0.15 ± 36%      +0.1        0.27 ±  6%  perf-profile.self.cycles-pp.free_pgd_range
>>       0.41 ± 18%      +0.1        0.54 ± 20%  perf-profile.self.cycles-pp.native_flush_tlb_one_user
>>       0.96 ±  5%      +0.2        1.13 ±  8%  perf-profile.self.cycles-pp.kmem_cache_alloc
>>       1.00 ± 13%      +0.2        1.23 ±  4%  perf-profile.self.cycles-pp.vma_interval_tree_insert
>>       3.35 ±  8%      +0.5        3.80 ±  8%  perf-profile.self.cycles-pp.unmap_page_range
>>
>>
>>
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>>
>> Thanks,
>> Xiaolong

Thread overview: 3+ messages
2018-06-20  7:34 [lkp-robot] [sched] f432b34f82: will-it-scale.per_thread_ops -12.6% regression kernel test robot
2018-06-20 12:05 ` Vincent Guittot
2018-06-22  7:34   ` Ye Xiaolong [this message]
