Re: [drm/mgag200] 90f479ae51: vm-scalability.median -18.8% regression

* Re: [drm/mgag200] 90f479ae51: vm-scalability.median -18.8% regression
       [not found] <20190729095155.GP22106@shao2-debian>
@ 2019-07-30 17:50 ` Thomas Zimmermann
  2019-07-30 18:12   ` Daniel Vetter
  2019-08-04 18:39   ` Thomas Zimmermann
  0 siblings, 2 replies; 61+ messages in thread
From: Thomas Zimmermann @ 2019-07-30 17:50 UTC (permalink / raw)
  To: kernel test robot, Noralf Trønnes, Daniel Vetter
  Cc: Stephen Rothwell, lkp, dri-devel

[-- Attachment #1.1.1: Type: text/plain, Size: 59680 bytes --]

Am 29.07.19 um 11:51 schrieb kernel test robot:
> Greeting,
> 
> FYI, we noticed a -18.8% regression of vm-scalability.median due to commit:>
> 
> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git master

Daniel, Noralf, we may have to revert this patch.

I expected some change in display performance, but not in VM. Since it's
a server chipset, probably no one cares much about display performance.
So that seemed like a good trade-off for re-using shared code.

Part of the patch set is that the generic fb emulation now maps and
unmaps the fbdev BO when updating the screen. I guess that's the cause
of the performance regression. And it should be visible with other
drivers as well if they use a shadow FB for fbdev emulation.

The thing is that we'd need another generic fbdev emulation for ast and
mgag200 that handles this issue properly.

Best regards
Thomas

> 
> in testcase: vm-scalability
> on test machine: 288 threads Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with 80G memory
> with following parameters:
> 
> 	runtime: 300s
> 	size: 8T
> 	test: anon-cow-seq-hugetlb
> 	cpufreq_governor: performance
> 
> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> 
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>   gcc-7/performance/x86_64-rhel-7.6/debian-x86_64-2019-05-14.cgz/300s/8T/lkp-knm01/anon-cow-seq-hugetlb/vm-scalability
> 
> commit: 
>   f1f8555dfb ("drm/bochs: Use shadow buffer for bochs framebuffer console")
>   90f479ae51 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
> 
> f1f8555dfb9a70a2 90f479ae51afa45efab97afdde9 
> ---------------- --------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>           2:4          -50%            :4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>            :4           25%           1:4     dmesg.WARNING:at_ip___perf_sw_event/0x
>            :4           25%           1:4     dmesg.WARNING:at_ip__fsnotify_parent/0x
>          %stddev     %change         %stddev
>              \          |                \  
>      43955 ±  2%     -18.8%      35691        vm-scalability.median
>       0.06 ±  7%    +193.0%       0.16 ±  2%  vm-scalability.median_stddev
>   14906559 ±  2%     -17.9%   12237079        vm-scalability.throughput
>      87651 ±  2%     -17.4%      72374        vm-scalability.time.involuntary_context_switches
>    2086168           -23.6%    1594224        vm-scalability.time.minor_page_faults
>      15082 ±  2%     -10.4%      13517        vm-scalability.time.percent_of_cpu_this_job_got
>      29987            -8.9%      27327        vm-scalability.time.system_time
>      15755           -12.4%      13795        vm-scalability.time.user_time
>     122011           -19.3%      98418        vm-scalability.time.voluntary_context_switches
>  3.034e+09           -23.6%  2.318e+09        vm-scalability.workload
>     242478 ± 12%     +68.5%     408518 ± 23%  cpuidle.POLL.time
>       2788 ± 21%    +117.4%       6062 ± 26%  cpuidle.POLL.usage
>      56653 ± 10%     +64.4%      93144 ± 20%  meminfo.Mapped
>     120392 ±  7%     +14.0%     137212 ±  4%  meminfo.Shmem
>      47221 ± 11%     +77.1%      83634 ± 22%  numa-meminfo.node0.Mapped
>     120465 ±  7%     +13.9%     137205 ±  4%  numa-meminfo.node0.Shmem
>    2885513           -16.5%    2409384        numa-numastat.node0.local_node
>    2885471           -16.5%    2409354        numa-numastat.node0.numa_hit
>      11813 ± 11%     +76.3%      20824 ± 22%  numa-vmstat.node0.nr_mapped
>      30096 ±  7%     +13.8%      34238 ±  4%  numa-vmstat.node0.nr_shmem
>      43.72 ±  2%      +5.5       49.20        mpstat.cpu.all.idle%
>       0.03 ±  4%      +0.0        0.05 ±  6%  mpstat.cpu.all.soft%
>      19.51            -2.4       17.08        mpstat.cpu.all.usr%
>       1012            -7.9%     932.75        turbostat.Avg_MHz
>      32.38 ± 10%     +25.8%      40.73        turbostat.CPU%c1
>     145.51            -3.1%     141.01        turbostat.PkgWatt
>      15.09           -19.2%      12.19        turbostat.RAMWatt
>      43.50 ±  2%     +13.2%      49.25        vmstat.cpu.id
>      18.75 ±  2%     -13.3%      16.25 ±  2%  vmstat.cpu.us
>     152.00 ±  2%      -9.5%     137.50        vmstat.procs.r
>       4800           -13.1%       4173        vmstat.system.cs
>     156170           -11.9%     137594        slabinfo.anon_vma.active_objs
>       3395           -11.9%       2991        slabinfo.anon_vma.active_slabs
>     156190           -11.9%     137606        slabinfo.anon_vma.num_objs
>       3395           -11.9%       2991        slabinfo.anon_vma.num_slabs
>       1716 ±  5%     +11.5%       1913 ±  8%  slabinfo.dmaengine-unmap-16.active_objs
>       1716 ±  5%     +11.5%       1913 ±  8%  slabinfo.dmaengine-unmap-16.num_objs
>       1767 ±  2%     -19.0%       1431 ±  2%  slabinfo.hugetlbfs_inode_cache.active_objs
>       1767 ±  2%     -19.0%       1431 ±  2%  slabinfo.hugetlbfs_inode_cache.num_objs
>       3597 ±  5%     -16.4%       3006 ±  3%  slabinfo.skbuff_ext_cache.active_objs
>       3597 ±  5%     -16.4%       3006 ±  3%  slabinfo.skbuff_ext_cache.num_objs
>    1330122           -23.6%    1016557        proc-vmstat.htlb_buddy_alloc_success
>      77214 ±  3%      +6.4%      82128 ±  2%  proc-vmstat.nr_active_anon
>      67277            +2.9%      69246        proc-vmstat.nr_anon_pages
>     218.50 ±  3%     -10.6%     195.25        proc-vmstat.nr_dirtied
>     288628            +1.4%     292755        proc-vmstat.nr_file_pages
>     360.50            -2.7%     350.75        proc-vmstat.nr_inactive_file
>      14225 ±  9%     +63.8%      23304 ± 20%  proc-vmstat.nr_mapped
>      30109 ±  7%     +13.8%      34259 ±  4%  proc-vmstat.nr_shmem
>      99870            -1.3%      98597        proc-vmstat.nr_slab_unreclaimable
>     204.00 ±  4%     -12.1%     179.25        proc-vmstat.nr_written
>      77214 ±  3%      +6.4%      82128 ±  2%  proc-vmstat.nr_zone_active_anon
>     360.50            -2.7%     350.75        proc-vmstat.nr_zone_inactive_file
>       8810 ± 19%     -66.1%       2987 ± 42%  proc-vmstat.numa_hint_faults
>       8810 ± 19%     -66.1%       2987 ± 42%  proc-vmstat.numa_hint_faults_local
>    2904082           -16.4%    2427026        proc-vmstat.numa_hit
>    2904081           -16.4%    2427025        proc-vmstat.numa_local
>  6.828e+08           -23.5%  5.221e+08        proc-vmstat.pgalloc_normal
>    2900008           -17.2%    2400195        proc-vmstat.pgfault
>  6.827e+08           -23.5%   5.22e+08        proc-vmstat.pgfree
>  1.635e+10           -17.0%  1.357e+10        perf-stat.i.branch-instructions
>       1.53 ±  4%      -0.1        1.45 ±  3%  perf-stat.i.branch-miss-rate%
>  2.581e+08 ±  3%     -20.5%  2.051e+08 ±  2%  perf-stat.i.branch-misses
>      12.66            +1.1       13.78        perf-stat.i.cache-miss-rate%
>   72720849           -12.0%   63958986        perf-stat.i.cache-misses
>  5.766e+08           -18.6%  4.691e+08        perf-stat.i.cache-references
>       4674 ±  2%     -13.0%       4064        perf-stat.i.context-switches
>       4.29           +12.5%       4.83        perf-stat.i.cpi
>  2.573e+11            -7.4%  2.383e+11        perf-stat.i.cpu-cycles
>     231.35           -21.5%     181.56        perf-stat.i.cpu-migrations
>       3522            +4.4%       3677        perf-stat.i.cycles-between-cache-misses
>       0.09 ± 13%      +0.0        0.12 ±  5%  perf-stat.i.iTLB-load-miss-rate%
>  5.894e+10           -15.8%  4.961e+10        perf-stat.i.iTLB-loads
>  5.901e+10           -15.8%  4.967e+10        perf-stat.i.instructions
>       1291 ± 14%     -21.8%       1010        perf-stat.i.instructions-per-iTLB-miss
>       0.24           -11.0%       0.21        perf-stat.i.ipc
>       9476           -17.5%       7821        perf-stat.i.minor-faults
>       9478           -17.5%       7821        perf-stat.i.page-faults
>       9.76            -3.6%       9.41        perf-stat.overall.MPKI
>       1.59 ±  4%      -0.1        1.52        perf-stat.overall.branch-miss-rate%
>      12.61            +1.1       13.71        perf-stat.overall.cache-miss-rate%
>       4.38           +10.5%       4.83        perf-stat.overall.cpi
>       3557            +5.3%       3747        perf-stat.overall.cycles-between-cache-misses
>       0.08 ± 12%      +0.0        0.10        perf-stat.overall.iTLB-load-miss-rate%
>       1268 ± 15%     -23.0%     976.22        perf-stat.overall.instructions-per-iTLB-miss
>       0.23            -9.5%       0.21        perf-stat.overall.ipc
>       5815            +9.7%       6378        perf-stat.overall.path-length
>  1.634e+10           -17.5%  1.348e+10        perf-stat.ps.branch-instructions
>  2.595e+08 ±  3%     -21.2%  2.043e+08 ±  2%  perf-stat.ps.branch-misses
>   72565205           -12.2%   63706339        perf-stat.ps.cache-misses
>  5.754e+08           -19.2%  4.646e+08        perf-stat.ps.cache-references
>       4640 ±  2%     -12.5%       4060        perf-stat.ps.context-switches
>  2.581e+11            -7.5%  2.387e+11        perf-stat.ps.cpu-cycles
>     229.91           -22.0%     179.42        perf-stat.ps.cpu-migrations
>  5.889e+10           -16.3%  4.927e+10        perf-stat.ps.iTLB-loads
>  5.899e+10           -16.3%  4.938e+10        perf-stat.ps.instructions
>       9388           -18.2%       7677        perf-stat.ps.minor-faults
>       9389           -18.2%       7677        perf-stat.ps.page-faults
>  1.764e+13           -16.2%  1.479e+13        perf-stat.total.instructions
>      46803 ±  3%     -18.8%      37982 ±  6%  sched_debug.cfs_rq:/.exec_clock.min
>       5320 ±  3%     +23.7%       6581 ±  3%  sched_debug.cfs_rq:/.exec_clock.stddev
>       6737 ± 14%     +58.1%      10649 ± 10%  sched_debug.cfs_rq:/.load.avg
>     587978 ± 17%     +58.2%     930382 ±  9%  sched_debug.cfs_rq:/.load.max
>      46952 ± 16%     +64.8%      77388 ± 11%  sched_debug.cfs_rq:/.load.stddev
>       7.12 ±  4%     +49.1%      10.62 ±  6%  sched_debug.cfs_rq:/.load_avg.avg
>     474.40 ± 23%     +67.5%     794.60 ± 10%  sched_debug.cfs_rq:/.load_avg.max
>      37.70 ± 11%     +74.8%      65.90 ±  9%  sched_debug.cfs_rq:/.load_avg.stddev
>   13424269 ±  4%     -15.6%   11328098 ±  2%  sched_debug.cfs_rq:/.min_vruntime.avg
>   15411275 ±  3%     -12.4%   13505072 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
>    7939295 ±  6%     -17.5%    6551322 ±  7%  sched_debug.cfs_rq:/.min_vruntime.min
>      21.44 ±  7%     -56.1%       9.42 ±  4%  sched_debug.cfs_rq:/.nr_spread_over.avg
>     117.45 ± 11%     -60.6%      46.30 ± 14%  sched_debug.cfs_rq:/.nr_spread_over.max
>      19.33 ±  8%     -66.4%       6.49 ±  9%  sched_debug.cfs_rq:/.nr_spread_over.stddev
>       4.32 ± 15%     +84.4%       7.97 ±  3%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>     353.85 ± 29%    +118.8%     774.35 ± 11%  sched_debug.cfs_rq:/.runnable_load_avg.max
>      27.30 ± 24%    +118.5%      59.64 ±  9%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
>       6729 ± 14%     +58.2%      10644 ± 10%  sched_debug.cfs_rq:/.runnable_weight.avg
>     587978 ± 17%     +58.2%     930382 ±  9%  sched_debug.cfs_rq:/.runnable_weight.max
>      46950 ± 16%     +64.8%      77387 ± 11%  sched_debug.cfs_rq:/.runnable_weight.stddev
>    5305069 ±  4%     -17.4%    4380376 ±  7%  sched_debug.cfs_rq:/.spread0.avg
>    7328745 ±  3%      -9.9%    6600897 ±  3%  sched_debug.cfs_rq:/.spread0.max
>    2220837 ±  4%     +55.8%    3460596 ±  5%  sched_debug.cpu.avg_idle.avg
>    4590666 ±  9%     +76.8%    8117037 ± 15%  sched_debug.cpu.avg_idle.max
>     485052 ±  7%     +80.3%     874679 ± 10%  sched_debug.cpu.avg_idle.stddev
>     561.50 ± 26%     +37.7%     773.30 ± 15%  sched_debug.cpu.clock.stddev
>     561.50 ± 26%     +37.7%     773.30 ± 15%  sched_debug.cpu.clock_task.stddev
>       3.20 ± 10%    +109.6%       6.70 ±  3%  sched_debug.cpu.cpu_load[0].avg
>     309.10 ± 20%    +150.3%     773.75 ± 12%  sched_debug.cpu.cpu_load[0].max
>      21.02 ± 14%    +160.8%      54.80 ±  9%  sched_debug.cpu.cpu_load[0].stddev
>       3.19 ±  8%    +109.8%       6.70 ±  3%  sched_debug.cpu.cpu_load[1].avg
>     299.75 ± 19%    +158.0%     773.30 ± 12%  sched_debug.cpu.cpu_load[1].max
>      20.32 ± 12%    +168.7%      54.62 ±  9%  sched_debug.cpu.cpu_load[1].stddev
>       3.20 ±  8%    +109.1%       6.69 ±  4%  sched_debug.cpu.cpu_load[2].avg
>     288.90 ± 20%    +167.0%     771.40 ± 12%  sched_debug.cpu.cpu_load[2].max
>      19.70 ± 12%    +175.4%      54.27 ±  9%  sched_debug.cpu.cpu_load[2].stddev
>       3.16 ±  8%    +110.9%       6.66 ±  6%  sched_debug.cpu.cpu_load[3].avg
>     275.50 ± 24%    +178.4%     766.95 ± 12%  sched_debug.cpu.cpu_load[3].max
>      18.92 ± 15%    +184.2%      53.77 ± 10%  sched_debug.cpu.cpu_load[3].stddev
>       3.08 ±  8%    +115.7%       6.65 ±  7%  sched_debug.cpu.cpu_load[4].avg
>     263.55 ± 28%    +188.7%     760.85 ± 12%  sched_debug.cpu.cpu_load[4].max
>      18.03 ± 18%    +196.6%      53.46 ± 11%  sched_debug.cpu.cpu_load[4].stddev
>      14543            -9.6%      13150        sched_debug.cpu.curr->pid.max
>       5293 ± 16%     +74.7%       9248 ± 11%  sched_debug.cpu.load.avg
>     587978 ± 17%     +58.2%     930382 ±  9%  sched_debug.cpu.load.max
>      40887 ± 19%     +78.3%      72891 ±  9%  sched_debug.cpu.load.stddev
>    1141679 ±  4%     +56.9%    1790907 ±  5%  sched_debug.cpu.max_idle_balance_cost.avg
>    2432100 ±  9%     +72.6%    4196779 ± 13%  sched_debug.cpu.max_idle_balance_cost.max
>     745656           +29.3%     964170 ±  5%  sched_debug.cpu.max_idle_balance_cost.min
>     239032 ±  9%     +81.9%     434806 ± 10%  sched_debug.cpu.max_idle_balance_cost.stddev
>       0.00 ± 27%     +92.1%       0.00 ± 31%  sched_debug.cpu.next_balance.stddev
>       1030 ±  4%     -10.4%     924.00 ±  2%  sched_debug.cpu.nr_switches.min
>       0.04 ± 26%    +139.0%       0.09 ± 41%  sched_debug.cpu.nr_uninterruptible.avg
>     830.35 ±  6%     -12.0%     730.50 ±  2%  sched_debug.cpu.sched_count.min
>     912.00 ±  2%      -9.5%     825.38        sched_debug.cpu.ttwu_count.avg
>     433.05 ±  3%     -19.2%     350.05 ±  3%  sched_debug.cpu.ttwu_count.min
>     160.70 ±  3%     -12.5%     140.60 ±  4%  sched_debug.cpu.ttwu_local.min
>       9072 ± 11%     -36.4%       5767 ±  8%  softirqs.CPU1.RCU
>      12769 ±  5%     +15.3%      14718 ±  3%  softirqs.CPU101.SCHED
>      13198           +11.5%      14717 ±  3%  softirqs.CPU102.SCHED
>      12981 ±  4%     +13.9%      14788 ±  3%  softirqs.CPU105.SCHED
>      13486 ±  3%     +11.8%      15071 ±  4%  softirqs.CPU111.SCHED
>      12794 ±  4%     +14.1%      14601 ±  9%  softirqs.CPU112.SCHED
>      12999 ±  4%     +10.1%      14314 ±  4%  softirqs.CPU115.SCHED
>      12844 ±  4%     +10.6%      14202 ±  2%  softirqs.CPU120.SCHED
>      13336 ±  3%      +9.4%      14585 ±  3%  softirqs.CPU122.SCHED
>      12639 ±  4%     +20.2%      15195        softirqs.CPU123.SCHED
>      13040 ±  5%     +15.2%      15024 ±  5%  softirqs.CPU126.SCHED
>      13123           +15.1%      15106 ±  5%  softirqs.CPU127.SCHED
>       9188 ±  6%     -35.7%       5911 ±  2%  softirqs.CPU13.RCU
>      13054 ±  3%     +13.1%      14761 ±  5%  softirqs.CPU130.SCHED
>      13158 ±  2%     +13.9%      14985 ±  5%  softirqs.CPU131.SCHED
>      12797 ±  6%     +13.5%      14524 ±  3%  softirqs.CPU133.SCHED
>      12452 ±  5%     +14.8%      14297        softirqs.CPU134.SCHED
>      13078 ±  3%     +10.4%      14439 ±  3%  softirqs.CPU138.SCHED
>      12617 ±  2%     +14.5%      14442 ±  5%  softirqs.CPU139.SCHED
>      12974 ±  3%     +13.7%      14752 ±  4%  softirqs.CPU142.SCHED
>      12579 ±  4%     +19.1%      14983 ±  3%  softirqs.CPU143.SCHED
>       9122 ± 24%     -44.6%       5053 ±  5%  softirqs.CPU144.RCU
>      13366 ±  2%     +11.1%      14848 ±  3%  softirqs.CPU149.SCHED
>      13246 ±  2%     +22.0%      16162 ±  7%  softirqs.CPU150.SCHED
>      13452 ±  3%     +20.5%      16210 ±  7%  softirqs.CPU151.SCHED
>      13507           +10.1%      14869        softirqs.CPU156.SCHED
>      13808 ±  3%      +9.2%      15079 ±  4%  softirqs.CPU157.SCHED
>      13442 ±  2%     +13.4%      15248 ±  4%  softirqs.CPU160.SCHED
>      13311           +12.1%      14920 ±  2%  softirqs.CPU162.SCHED
>      13544 ±  3%      +8.5%      14695 ±  4%  softirqs.CPU163.SCHED
>      13648 ±  3%     +11.2%      15179 ±  2%  softirqs.CPU166.SCHED
>      13404 ±  4%     +12.5%      15079 ±  3%  softirqs.CPU168.SCHED
>      13421 ±  6%     +16.0%      15568 ±  8%  softirqs.CPU169.SCHED
>      13115 ±  3%     +23.1%      16139 ± 10%  softirqs.CPU171.SCHED
>      13424 ±  6%     +10.4%      14822 ±  3%  softirqs.CPU175.SCHED
>      13274 ±  3%     +13.7%      15087 ±  9%  softirqs.CPU185.SCHED
>      13409 ±  3%     +12.3%      15063 ±  3%  softirqs.CPU190.SCHED
>      13181 ±  7%     +13.4%      14946 ±  3%  softirqs.CPU196.SCHED
>      13578 ±  3%     +10.9%      15061        softirqs.CPU197.SCHED
>      13323 ±  5%     +24.8%      16627 ±  6%  softirqs.CPU198.SCHED
>      14072 ±  2%     +12.3%      15798 ±  7%  softirqs.CPU199.SCHED
>      12604 ± 13%     +17.9%      14865        softirqs.CPU201.SCHED
>      13380 ±  4%     +14.8%      15356 ±  3%  softirqs.CPU203.SCHED
>      13481 ±  8%     +14.2%      15390 ±  3%  softirqs.CPU204.SCHED
>      12921 ±  2%     +13.8%      14710 ±  3%  softirqs.CPU206.SCHED
>      13468           +13.0%      15218 ±  2%  softirqs.CPU208.SCHED
>      13253 ±  2%     +13.1%      14992        softirqs.CPU209.SCHED
>      13319 ±  2%     +14.3%      15225 ±  7%  softirqs.CPU210.SCHED
>      13673 ±  5%     +16.3%      15895 ±  3%  softirqs.CPU211.SCHED
>      13290           +17.0%      15556 ±  5%  softirqs.CPU212.SCHED
>      13455 ±  4%     +14.4%      15392 ±  3%  softirqs.CPU213.SCHED
>      13454 ±  4%     +14.3%      15377 ±  3%  softirqs.CPU215.SCHED
>      13872 ±  7%      +9.7%      15221 ±  5%  softirqs.CPU220.SCHED
>      13555 ±  4%     +17.3%      15896 ±  5%  softirqs.CPU222.SCHED
>      13411 ±  4%     +20.8%      16197 ±  6%  softirqs.CPU223.SCHED
>       8472 ± 21%     -44.8%       4680 ±  3%  softirqs.CPU224.RCU
>      13141 ±  3%     +16.2%      15265 ±  7%  softirqs.CPU225.SCHED
>      14084 ±  3%      +8.2%      15242 ±  2%  softirqs.CPU226.SCHED
>      13528 ±  4%     +11.3%      15063 ±  4%  softirqs.CPU228.SCHED
>      13218 ±  3%     +16.3%      15377 ±  4%  softirqs.CPU229.SCHED
>      14031 ±  4%     +10.2%      15467 ±  2%  softirqs.CPU231.SCHED
>      13770 ±  3%     +14.0%      15700 ±  3%  softirqs.CPU232.SCHED
>      13456 ±  3%     +12.3%      15105 ±  3%  softirqs.CPU233.SCHED
>      13137 ±  4%     +13.5%      14909 ±  3%  softirqs.CPU234.SCHED
>      13318 ±  2%     +14.7%      15280 ±  2%  softirqs.CPU235.SCHED
>      13690 ±  2%     +13.7%      15563 ±  7%  softirqs.CPU238.SCHED
>      13771 ±  5%     +20.8%      16634 ±  7%  softirqs.CPU241.SCHED
>      13317 ±  7%     +19.5%      15919 ±  9%  softirqs.CPU243.SCHED
>       8234 ± 16%     -43.9%       4616 ±  5%  softirqs.CPU244.RCU
>      13845 ±  6%     +13.0%      15643 ±  3%  softirqs.CPU244.SCHED
>      13179 ±  3%     +16.3%      15323        softirqs.CPU246.SCHED
>      13754           +12.2%      15438 ±  3%  softirqs.CPU248.SCHED
>      13769 ±  4%     +10.9%      15276 ±  2%  softirqs.CPU252.SCHED
>      13702           +10.5%      15147 ±  2%  softirqs.CPU254.SCHED
>      13315 ±  2%     +12.5%      14980 ±  3%  softirqs.CPU255.SCHED
>      13785 ±  3%     +12.9%      15568 ±  5%  softirqs.CPU256.SCHED
>      13307 ±  3%     +15.0%      15298 ±  3%  softirqs.CPU257.SCHED
>      13864 ±  3%     +10.5%      15313 ±  2%  softirqs.CPU259.SCHED
>      13879 ±  2%     +11.4%      15465        softirqs.CPU261.SCHED
>      13815           +13.6%      15687 ±  5%  softirqs.CPU264.SCHED
>     119574 ±  2%     +11.8%     133693 ± 11%  softirqs.CPU266.TIMER
>      13688           +10.9%      15180 ±  6%  softirqs.CPU267.SCHED
>      11716 ±  4%     +19.3%      13974 ±  8%  softirqs.CPU27.SCHED
>      13866 ±  3%     +13.7%      15765 ±  4%  softirqs.CPU271.SCHED
>      13887 ±  5%     +12.5%      15621        softirqs.CPU272.SCHED
>      13383 ±  3%     +19.8%      16031 ±  2%  softirqs.CPU274.SCHED
>      13347           +14.1%      15232 ±  3%  softirqs.CPU275.SCHED
>      12884 ±  2%     +21.0%      15593 ±  4%  softirqs.CPU276.SCHED
>      13131 ±  5%     +13.4%      14891 ±  5%  softirqs.CPU277.SCHED
>      12891 ±  2%     +19.2%      15371 ±  4%  softirqs.CPU278.SCHED
>      13313 ±  4%     +13.0%      15049 ±  2%  softirqs.CPU279.SCHED
>      13514 ±  3%     +10.2%      14897 ±  2%  softirqs.CPU280.SCHED
>      13501 ±  3%     +13.7%      15346        softirqs.CPU281.SCHED
>      13261           +17.5%      15577        softirqs.CPU282.SCHED
>       8076 ± 15%     -43.7%       4546 ±  5%  softirqs.CPU283.RCU
>      13686 ±  3%     +12.6%      15413 ±  2%  softirqs.CPU284.SCHED
>      13439 ±  2%      +9.2%      14670 ±  4%  softirqs.CPU285.SCHED
>       8878 ±  9%     -35.4%       5735 ±  4%  softirqs.CPU35.RCU
>      11690 ±  2%     +13.6%      13274 ±  5%  softirqs.CPU40.SCHED
>      11714 ±  2%     +19.3%      13975 ± 13%  softirqs.CPU41.SCHED
>      11763           +12.5%      13239 ±  4%  softirqs.CPU45.SCHED
>      11662 ±  2%      +9.4%      12757 ±  3%  softirqs.CPU46.SCHED
>      11805 ±  2%      +9.3%      12902 ±  2%  softirqs.CPU50.SCHED
>      12158 ±  3%     +12.3%      13655 ±  8%  softirqs.CPU55.SCHED
>      11716 ±  4%      +8.8%      12751 ±  3%  softirqs.CPU58.SCHED
>      11922 ±  2%      +9.9%      13100 ±  4%  softirqs.CPU64.SCHED
>       9674 ± 17%     -41.8%       5625 ±  6%  softirqs.CPU66.RCU
>      11818           +12.0%      13237        softirqs.CPU66.SCHED
>     124682 ±  7%      -6.1%     117088 ±  5%  softirqs.CPU66.TIMER
>       8637 ±  9%     -34.0%       5700 ±  7%  softirqs.CPU70.RCU
>      11624 ±  2%     +11.0%      12901 ±  2%  softirqs.CPU70.SCHED
>      12372 ±  2%     +13.2%      14003 ±  3%  softirqs.CPU71.SCHED
>       9949 ± 25%     -33.9%       6574 ± 31%  softirqs.CPU72.RCU
>      10392 ± 26%     -35.1%       6745 ± 35%  softirqs.CPU73.RCU
>      12766 ±  3%     +11.1%      14188 ±  3%  softirqs.CPU76.SCHED
>      12611 ±  2%     +18.8%      14984 ±  5%  softirqs.CPU78.SCHED
>      12786 ±  3%     +17.9%      15079 ±  7%  softirqs.CPU79.SCHED
>      11947 ±  4%      +9.7%      13103 ±  4%  softirqs.CPU8.SCHED
>      13379 ±  7%     +11.8%      14962 ±  4%  softirqs.CPU83.SCHED
>      13438 ±  5%      +9.7%      14738 ±  2%  softirqs.CPU84.SCHED
>      12768           +19.4%      15241 ±  6%  softirqs.CPU88.SCHED
>       8604 ± 13%     -39.3%       5222 ±  3%  softirqs.CPU89.RCU
>      13077 ±  2%     +17.1%      15308 ±  7%  softirqs.CPU89.SCHED
>      11887 ±  3%     +20.1%      14272 ±  5%  softirqs.CPU9.SCHED
>      12723 ±  3%     +11.3%      14165 ±  4%  softirqs.CPU90.SCHED
>       8439 ± 12%     -38.9%       5153 ±  4%  softirqs.CPU91.RCU
>      13429 ±  3%     +10.3%      14806 ±  2%  softirqs.CPU95.SCHED
>      12852 ±  4%     +10.3%      14174 ±  5%  softirqs.CPU96.SCHED
>      13010 ±  2%     +14.4%      14888 ±  5%  softirqs.CPU97.SCHED
>    2315644 ±  4%     -36.2%    1477200 ±  4%  softirqs.RCU
>       1572 ± 10%     +63.9%       2578 ± 39%  interrupts.CPU0.NMI:Non-maskable_interrupts
>       1572 ± 10%     +63.9%       2578 ± 39%  interrupts.CPU0.PMI:Performance_monitoring_interrupts
>     252.00 ± 11%     -35.2%     163.25 ± 13%  interrupts.CPU104.RES:Rescheduling_interrupts
>       2738 ± 24%     +52.4%       4173 ± 19%  interrupts.CPU105.NMI:Non-maskable_interrupts
>       2738 ± 24%     +52.4%       4173 ± 19%  interrupts.CPU105.PMI:Performance_monitoring_interrupts
>     245.75 ± 19%     -31.0%     169.50 ±  7%  interrupts.CPU105.RES:Rescheduling_interrupts
>     228.75 ± 13%     -24.7%     172.25 ± 19%  interrupts.CPU106.RES:Rescheduling_interrupts
>       2243 ± 15%     +66.3%       3730 ± 35%  interrupts.CPU113.NMI:Non-maskable_interrupts
>       2243 ± 15%     +66.3%       3730 ± 35%  interrupts.CPU113.PMI:Performance_monitoring_interrupts
>       2703 ± 31%     +67.0%       4514 ± 33%  interrupts.CPU118.NMI:Non-maskable_interrupts
>       2703 ± 31%     +67.0%       4514 ± 33%  interrupts.CPU118.PMI:Performance_monitoring_interrupts
>       2613 ± 25%     +42.2%       3715 ± 24%  interrupts.CPU121.NMI:Non-maskable_interrupts
>       2613 ± 25%     +42.2%       3715 ± 24%  interrupts.CPU121.PMI:Performance_monitoring_interrupts
>     311.50 ± 23%     -47.7%     163.00 ±  9%  interrupts.CPU122.RES:Rescheduling_interrupts
>     266.75 ± 19%     -31.6%     182.50 ± 15%  interrupts.CPU124.RES:Rescheduling_interrupts
>     293.75 ± 33%     -32.3%     198.75 ± 19%  interrupts.CPU125.RES:Rescheduling_interrupts
>       2601 ± 36%     +43.2%       3724 ± 29%  interrupts.CPU127.NMI:Non-maskable_interrupts
>       2601 ± 36%     +43.2%       3724 ± 29%  interrupts.CPU127.PMI:Performance_monitoring_interrupts
>       2258 ± 21%     +68.2%       3797 ± 29%  interrupts.CPU13.NMI:Non-maskable_interrupts
>       2258 ± 21%     +68.2%       3797 ± 29%  interrupts.CPU13.PMI:Performance_monitoring_interrupts
>       3338 ± 29%     +54.6%       5160 ±  9%  interrupts.CPU139.NMI:Non-maskable_interrupts
>       3338 ± 29%     +54.6%       5160 ±  9%  interrupts.CPU139.PMI:Performance_monitoring_interrupts
>     219.50 ± 27%     -23.0%     169.00 ± 21%  interrupts.CPU139.RES:Rescheduling_interrupts
>     290.25 ± 25%     -32.5%     196.00 ± 11%  interrupts.CPU14.RES:Rescheduling_interrupts
>     243.50 ±  4%     -16.0%     204.50 ± 12%  interrupts.CPU140.RES:Rescheduling_interrupts
>       1797 ± 15%    +135.0%       4223 ± 46%  interrupts.CPU147.NMI:Non-maskable_interrupts
>       1797 ± 15%    +135.0%       4223 ± 46%  interrupts.CPU147.PMI:Performance_monitoring_interrupts
>       2537 ± 22%     +89.6%       4812 ± 28%  interrupts.CPU15.NMI:Non-maskable_interrupts
>       2537 ± 22%     +89.6%       4812 ± 28%  interrupts.CPU15.PMI:Performance_monitoring_interrupts
>     292.25 ± 34%     -33.9%     193.25 ±  6%  interrupts.CPU15.RES:Rescheduling_interrupts
>     424.25 ± 37%     -58.5%     176.25 ± 14%  interrupts.CPU158.RES:Rescheduling_interrupts
>     312.50 ± 42%     -54.2%     143.00 ± 18%  interrupts.CPU159.RES:Rescheduling_interrupts
>     725.00 ±118%     -75.7%     176.25 ± 14%  interrupts.CPU163.RES:Rescheduling_interrupts
>       2367 ±  6%     +59.9%       3786 ± 24%  interrupts.CPU177.NMI:Non-maskable_interrupts
>       2367 ±  6%     +59.9%       3786 ± 24%  interrupts.CPU177.PMI:Performance_monitoring_interrupts
>     239.50 ± 30%     -46.6%     128.00 ± 14%  interrupts.CPU179.RES:Rescheduling_interrupts
>     320.75 ± 15%     -24.0%     243.75 ± 20%  interrupts.CPU20.RES:Rescheduling_interrupts
>     302.50 ± 17%     -47.2%     159.75 ±  8%  interrupts.CPU200.RES:Rescheduling_interrupts
>       2166 ±  5%     +92.0%       4157 ± 40%  interrupts.CPU207.NMI:Non-maskable_interrupts
>       2166 ±  5%     +92.0%       4157 ± 40%  interrupts.CPU207.PMI:Performance_monitoring_interrupts
>     217.00 ± 11%     -34.6%     142.00 ± 12%  interrupts.CPU214.RES:Rescheduling_interrupts
>       2610 ± 36%     +47.4%       3848 ± 35%  interrupts.CPU215.NMI:Non-maskable_interrupts
>       2610 ± 36%     +47.4%       3848 ± 35%  interrupts.CPU215.PMI:Performance_monitoring_interrupts
>       2046 ± 13%    +118.6%       4475 ± 43%  interrupts.CPU22.NMI:Non-maskable_interrupts
>       2046 ± 13%    +118.6%       4475 ± 43%  interrupts.CPU22.PMI:Performance_monitoring_interrupts
>     289.50 ± 28%     -41.1%     170.50 ±  8%  interrupts.CPU22.RES:Rescheduling_interrupts
>       2232 ±  6%     +33.0%       2970 ± 24%  interrupts.CPU221.NMI:Non-maskable_interrupts
>       2232 ±  6%     +33.0%       2970 ± 24%  interrupts.CPU221.PMI:Performance_monitoring_interrupts
>       4552 ± 12%     -27.6%       3295 ± 15%  interrupts.CPU222.NMI:Non-maskable_interrupts
>       4552 ± 12%     -27.6%       3295 ± 15%  interrupts.CPU222.PMI:Performance_monitoring_interrupts
>       2013 ± 15%     +80.9%       3641 ± 27%  interrupts.CPU226.NMI:Non-maskable_interrupts
>       2013 ± 15%     +80.9%       3641 ± 27%  interrupts.CPU226.PMI:Performance_monitoring_interrupts
>       2575 ± 49%     +67.1%       4302 ± 34%  interrupts.CPU227.NMI:Non-maskable_interrupts
>       2575 ± 49%     +67.1%       4302 ± 34%  interrupts.CPU227.PMI:Performance_monitoring_interrupts
>     248.00 ± 36%     -36.3%     158.00 ± 19%  interrupts.CPU228.RES:Rescheduling_interrupts
>       2441 ± 24%     +43.0%       3490 ± 30%  interrupts.CPU23.NMI:Non-maskable_interrupts
>       2441 ± 24%     +43.0%       3490 ± 30%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
>     404.25 ± 69%     -65.5%     139.50 ± 17%  interrupts.CPU236.RES:Rescheduling_interrupts
>     566.50 ± 40%     -73.6%     149.50 ± 31%  interrupts.CPU237.RES:Rescheduling_interrupts
>     243.50 ± 26%     -37.1%     153.25 ± 21%  interrupts.CPU248.RES:Rescheduling_interrupts
>     258.25 ± 12%     -53.5%     120.00 ± 18%  interrupts.CPU249.RES:Rescheduling_interrupts
>       2888 ± 27%     +49.4%       4313 ± 30%  interrupts.CPU253.NMI:Non-maskable_interrupts
>       2888 ± 27%     +49.4%       4313 ± 30%  interrupts.CPU253.PMI:Performance_monitoring_interrupts
>       2468 ± 44%     +67.3%       4131 ± 37%  interrupts.CPU256.NMI:Non-maskable_interrupts
>       2468 ± 44%     +67.3%       4131 ± 37%  interrupts.CPU256.PMI:Performance_monitoring_interrupts
>     425.00 ± 59%     -60.3%     168.75 ± 34%  interrupts.CPU258.RES:Rescheduling_interrupts
>       1859 ± 16%    +106.3%       3834 ± 44%  interrupts.CPU268.NMI:Non-maskable_interrupts
>       1859 ± 16%    +106.3%       3834 ± 44%  interrupts.CPU268.PMI:Performance_monitoring_interrupts
>       2684 ± 28%     +61.2%       4326 ± 36%  interrupts.CPU269.NMI:Non-maskable_interrupts
>       2684 ± 28%     +61.2%       4326 ± 36%  interrupts.CPU269.PMI:Performance_monitoring_interrupts
>       2171 ±  6%    +108.8%       4533 ± 20%  interrupts.CPU270.NMI:Non-maskable_interrupts
>       2171 ±  6%    +108.8%       4533 ± 20%  interrupts.CPU270.PMI:Performance_monitoring_interrupts
>       2262 ± 14%     +61.8%       3659 ± 37%  interrupts.CPU273.NMI:Non-maskable_interrupts
>       2262 ± 14%     +61.8%       3659 ± 37%  interrupts.CPU273.PMI:Performance_monitoring_interrupts
>       2203 ± 11%     +50.7%       3320 ± 38%  interrupts.CPU279.NMI:Non-maskable_interrupts
>       2203 ± 11%     +50.7%       3320 ± 38%  interrupts.CPU279.PMI:Performance_monitoring_interrupts
>       2433 ± 17%     +52.9%       3721 ± 25%  interrupts.CPU280.NMI:Non-maskable_interrupts
>       2433 ± 17%     +52.9%       3721 ± 25%  interrupts.CPU280.PMI:Performance_monitoring_interrupts
>       2778 ± 33%     +63.1%       4531 ± 36%  interrupts.CPU283.NMI:Non-maskable_interrupts
>       2778 ± 33%     +63.1%       4531 ± 36%  interrupts.CPU283.PMI:Performance_monitoring_interrupts
>     331.75 ± 32%     -39.8%     199.75 ± 17%  interrupts.CPU29.RES:Rescheduling_interrupts
>       2178 ± 22%     +53.9%       3353 ± 31%  interrupts.CPU3.NMI:Non-maskable_interrupts
>       2178 ± 22%     +53.9%       3353 ± 31%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
>     298.50 ± 30%     -39.7%     180.00 ±  6%  interrupts.CPU34.RES:Rescheduling_interrupts
>       2490 ±  3%     +58.7%       3953 ± 28%  interrupts.CPU35.NMI:Non-maskable_interrupts
>       2490 ±  3%     +58.7%       3953 ± 28%  interrupts.CPU35.PMI:Performance_monitoring_interrupts
>     270.50 ± 24%     -31.1%     186.25 ±  3%  interrupts.CPU36.RES:Rescheduling_interrupts
>       2493 ±  7%     +57.0%       3915 ± 27%  interrupts.CPU43.NMI:Non-maskable_interrupts
>       2493 ±  7%     +57.0%       3915 ± 27%  interrupts.CPU43.PMI:Performance_monitoring_interrupts
>     286.75 ± 36%     -32.4%     193.75 ±  7%  interrupts.CPU45.RES:Rescheduling_interrupts
>     259.00 ± 12%     -23.6%     197.75 ± 13%  interrupts.CPU46.RES:Rescheduling_interrupts
>     244.00 ± 21%     -35.6%     157.25 ± 11%  interrupts.CPU47.RES:Rescheduling_interrupts
>     230.00 ±  7%     -21.3%     181.00 ± 11%  interrupts.CPU48.RES:Rescheduling_interrupts
>     281.00 ± 13%     -27.4%     204.00 ± 15%  interrupts.CPU53.RES:Rescheduling_interrupts
>     256.75 ±  5%     -18.4%     209.50 ± 12%  interrupts.CPU54.RES:Rescheduling_interrupts
>       2433 ±  9%     +68.4%       4098 ± 35%  interrupts.CPU58.NMI:Non-maskable_interrupts
>       2433 ±  9%     +68.4%       4098 ± 35%  interrupts.CPU58.PMI:Performance_monitoring_interrupts
>     316.00 ± 25%     -41.4%     185.25 ± 13%  interrupts.CPU59.RES:Rescheduling_interrupts
>       2703 ± 38%     +56.0%       4217 ± 31%  interrupts.CPU60.NMI:Non-maskable_interrupts
>       2703 ± 38%     +56.0%       4217 ± 31%  interrupts.CPU60.PMI:Performance_monitoring_interrupts
>       2425 ± 16%     +39.9%       3394 ± 27%  interrupts.CPU61.NMI:Non-maskable_interrupts
>       2425 ± 16%     +39.9%       3394 ± 27%  interrupts.CPU61.PMI:Performance_monitoring_interrupts
>       2388 ± 18%     +69.5%       4047 ± 29%  interrupts.CPU66.NMI:Non-maskable_interrupts
>       2388 ± 18%     +69.5%       4047 ± 29%  interrupts.CPU66.PMI:Performance_monitoring_interrupts
>       2322 ± 11%     +93.4%       4491 ± 35%  interrupts.CPU67.NMI:Non-maskable_interrupts
>       2322 ± 11%     +93.4%       4491 ± 35%  interrupts.CPU67.PMI:Performance_monitoring_interrupts
>     319.00 ± 40%     -44.7%     176.25 ±  9%  interrupts.CPU67.RES:Rescheduling_interrupts
>       2512 ±  8%     +28.1%       3219 ± 25%  interrupts.CPU70.NMI:Non-maskable_interrupts
>       2512 ±  8%     +28.1%       3219 ± 25%  interrupts.CPU70.PMI:Performance_monitoring_interrupts
>       2290 ± 39%     +78.7%       4094 ± 28%  interrupts.CPU74.NMI:Non-maskable_interrupts
>       2290 ± 39%     +78.7%       4094 ± 28%  interrupts.CPU74.PMI:Performance_monitoring_interrupts
>       2446 ± 40%     +94.8%       4764 ± 23%  interrupts.CPU75.NMI:Non-maskable_interrupts
>       2446 ± 40%     +94.8%       4764 ± 23%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
>     426.75 ± 61%     -67.7%     138.00 ±  8%  interrupts.CPU75.RES:Rescheduling_interrupts
>     192.50 ± 13%     +45.6%     280.25 ± 45%  interrupts.CPU76.RES:Rescheduling_interrupts
>     274.25 ± 34%     -42.2%     158.50 ± 34%  interrupts.CPU77.RES:Rescheduling_interrupts
>       2357 ±  9%     +73.0%       4078 ± 23%  interrupts.CPU78.NMI:Non-maskable_interrupts
>       2357 ±  9%     +73.0%       4078 ± 23%  interrupts.CPU78.PMI:Performance_monitoring_interrupts
>     348.50 ± 53%     -47.3%     183.75 ± 29%  interrupts.CPU80.RES:Rescheduling_interrupts
>       2650 ± 43%     +46.2%       3874 ± 36%  interrupts.CPU84.NMI:Non-maskable_interrupts
>       2650 ± 43%     +46.2%       3874 ± 36%  interrupts.CPU84.PMI:Performance_monitoring_interrupts
>       2235 ± 10%    +117.8%       4867 ± 10%  interrupts.CPU90.NMI:Non-maskable_interrupts
>       2235 ± 10%    +117.8%       4867 ± 10%  interrupts.CPU90.PMI:Performance_monitoring_interrupts
>       2606 ± 33%     +38.1%       3598 ± 21%  interrupts.CPU92.NMI:Non-maskable_interrupts
>       2606 ± 33%     +38.1%       3598 ± 21%  interrupts.CPU92.PMI:Performance_monitoring_interrupts
>     408.75 ± 58%     -56.8%     176.75 ± 25%  interrupts.CPU92.RES:Rescheduling_interrupts
>     399.00 ± 64%     -63.6%     145.25 ± 16%  interrupts.CPU93.RES:Rescheduling_interrupts
>     314.75 ± 36%     -44.2%     175.75 ± 13%  interrupts.CPU94.RES:Rescheduling_interrupts
>     191.00 ± 15%     -29.1%     135.50 ±  9%  interrupts.CPU97.RES:Rescheduling_interrupts
>      94.00 ±  8%     +50.0%     141.00 ± 12%  interrupts.IWI:IRQ_work_interrupts
>     841457 ±  7%     +16.6%     980751 ±  3%  interrupts.NMI:Non-maskable_interrupts
>     841457 ±  7%     +16.6%     980751 ±  3%  interrupts.PMI:Performance_monitoring_interrupts
>      12.75 ± 11%      -4.1        8.67 ± 31%  perf-profile.calltrace.cycles-pp.do_rw_once
>       1.02 ± 16%      -0.6        0.47 ± 59%  perf-profile.calltrace.cycles-pp.sched_clock.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter.do_idle
>       1.10 ± 15%      -0.4        0.66 ± 14%  perf-profile.calltrace.cycles-pp.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
>       1.05 ± 16%      -0.4        0.61 ± 14%  perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter
>       1.58 ±  4%      +0.3        1.91 ±  7%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       2.11 ±  4%      +0.5        2.60 ±  7%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.osq_lock.__mutex_lock.hugetlb_fault.handle_mm_fault
>       0.83 ± 26%      +0.5        1.32 ± 18%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.83 ± 26%      +0.5        1.32 ± 18%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.90 ±  5%      +0.6        2.45 ±  7%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page.copy_subpage
>       0.65 ± 62%      +0.6        1.20 ± 15%  perf-profile.calltrace.cycles-pp.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault
>       0.60 ± 62%      +0.6        1.16 ± 18%  perf-profile.calltrace.cycles-pp.free_huge_page.release_pages.tlb_flush_mmu.tlb_finish_mmu.exit_mmap
>       0.95 ± 17%      +0.6        1.52 ±  8%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner
>       0.61 ± 62%      +0.6        1.18 ± 18%  perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.exit_mmap.mmput
>       0.61 ± 62%      +0.6        1.19 ± 19%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.exit_mmap.mmput.do_exit.do_group_exit
>       0.61 ± 62%      +0.6        1.19 ± 19%  perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.exit_mmap.mmput.do_exit
>       0.64 ± 61%      +0.6        1.23 ± 18%  perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
>       0.64 ± 61%      +0.6        1.23 ± 18%  perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__x64_sys_exit_group
>       1.30 ±  9%      +0.6        1.92 ±  8%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock
>       0.19 ±173%      +0.7        0.89 ± 20%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_huge_page.release_pages.tlb_flush_mmu
>       0.19 ±173%      +0.7        0.90 ± 20%  perf-profile.calltrace.cycles-pp._raw_spin_lock.free_huge_page.release_pages.tlb_flush_mmu.tlb_finish_mmu
>       0.00            +0.8        0.77 ± 30%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page
>       0.00            +0.8        0.78 ± 30%  perf-profile.calltrace.cycles-pp._raw_spin_lock.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page
>       0.00            +0.8        0.79 ± 29%  perf-profile.calltrace.cycles-pp.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow
>       0.82 ± 67%      +0.9        1.72 ± 22%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_huge_page.hugetlb_cow.hugetlb_fault
>       0.84 ± 66%      +0.9        1.74 ± 20%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow
>       2.52 ±  6%      +0.9        3.44 ±  9%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page.copy_subpage.copy_user_huge_page
>       0.83 ± 67%      +0.9        1.75 ± 21%  perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault
>       0.84 ± 66%      +0.9        1.77 ± 20%  perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault
>       1.64 ± 12%      +1.0        2.67 ±  7%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock.hugetlb_fault
>       1.65 ± 45%      +1.3        2.99 ± 18%  perf-profile.calltrace.cycles-pp.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault
>       1.74 ± 13%      +1.4        3.16 ±  6%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock.hugetlb_fault.handle_mm_fault
>       2.56 ± 48%      +2.2        4.81 ± 19%  perf-profile.calltrace.cycles-pp.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault.__do_page_fault
>      12.64 ± 14%      +3.6       16.20 ±  8%  perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.hugetlb_fault.handle_mm_fault.__do_page_fault
>       2.97 ±  7%      +3.8        6.74 ±  9%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.copy_page.copy_subpage.copy_user_huge_page.hugetlb_cow
>      19.99 ±  9%      +4.1       24.05 ±  6%  perf-profile.calltrace.cycles-pp.hugetlb_cow.hugetlb_fault.handle_mm_fault.__do_page_fault.do_page_fault
>       1.37 ± 15%      -0.5        0.83 ± 13%  perf-profile.children.cycles-pp.sched_clock_cpu
>       1.31 ± 16%      -0.5        0.78 ± 13%  perf-profile.children.cycles-pp.sched_clock
>       1.29 ± 16%      -0.5        0.77 ± 13%  perf-profile.children.cycles-pp.native_sched_clock
>       1.80 ±  2%      -0.3        1.47 ± 10%  perf-profile.children.cycles-pp.task_tick_fair
>       0.73 ±  2%      -0.2        0.54 ± 11%  perf-profile.children.cycles-pp.update_curr
>       0.42 ± 17%      -0.2        0.27 ± 16%  perf-profile.children.cycles-pp.account_process_tick
>       0.73 ± 10%      -0.2        0.58 ±  9%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
>       0.27 ±  6%      -0.1        0.14 ± 14%  perf-profile.children.cycles-pp.__acct_update_integrals
>       0.27 ± 18%      -0.1        0.16 ± 13%  perf-profile.children.cycles-pp.rcu_segcblist_ready_cbs
>       0.40 ± 12%      -0.1        0.30 ± 14%  perf-profile.children.cycles-pp.__next_timer_interrupt
>       0.47 ±  7%      -0.1        0.39 ± 13%  perf-profile.children.cycles-pp.update_rq_clock
>       0.29 ± 12%      -0.1        0.21 ± 15%  perf-profile.children.cycles-pp.cpuidle_governor_latency_req
>       0.21 ±  7%      -0.1        0.14 ± 12%  perf-profile.children.cycles-pp.account_system_index_time
>       0.38 ±  2%      -0.1        0.31 ± 12%  perf-profile.children.cycles-pp.timerqueue_add
>       0.26 ± 11%      -0.1        0.20 ± 13%  perf-profile.children.cycles-pp.find_next_bit
>       0.23 ± 15%      -0.1        0.17 ± 15%  perf-profile.children.cycles-pp.rcu_dynticks_eqs_exit
>       0.14 ±  8%      -0.1        0.07 ± 14%  perf-profile.children.cycles-pp.account_user_time
>       0.17 ±  6%      -0.0        0.12 ± 10%  perf-profile.children.cycles-pp.cpuacct_charge
>       0.18 ± 20%      -0.0        0.13 ±  3%  perf-profile.children.cycles-pp.irq_work_tick
>       0.11 ± 13%      -0.0        0.07 ± 25%  perf-profile.children.cycles-pp.tick_sched_do_timer
>       0.12 ± 10%      -0.0        0.08 ± 15%  perf-profile.children.cycles-pp.get_cpu_device
>       0.07 ± 11%      -0.0        0.04 ± 58%  perf-profile.children.cycles-pp.raise_softirq
>       0.12 ±  3%      -0.0        0.09 ±  8%  perf-profile.children.cycles-pp.write
>       0.11 ± 13%      +0.0        0.14 ±  8%  perf-profile.children.cycles-pp.native_write_msr
>       0.09 ±  9%      +0.0        0.11 ±  7%  perf-profile.children.cycles-pp.finish_task_switch
>       0.10 ± 10%      +0.0        0.13 ±  5%  perf-profile.children.cycles-pp.schedule_idle
>       0.07 ±  6%      +0.0        0.10 ± 12%  perf-profile.children.cycles-pp.__read_nocancel
>       0.04 ± 58%      +0.0        0.07 ± 15%  perf-profile.children.cycles-pp.__free_pages_ok
>       0.06 ±  7%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.perf_read
>       0.07            +0.0        0.11 ± 14%  perf-profile.children.cycles-pp.perf_evsel__read_counter
>       0.07            +0.0        0.11 ± 13%  perf-profile.children.cycles-pp.cmd_stat
>       0.07            +0.0        0.11 ± 13%  perf-profile.children.cycles-pp.__run_perf_stat
>       0.07            +0.0        0.11 ± 13%  perf-profile.children.cycles-pp.process_interval
>       0.07            +0.0        0.11 ± 13%  perf-profile.children.cycles-pp.read_counters
>       0.07 ± 22%      +0.0        0.11 ± 19%  perf-profile.children.cycles-pp.__handle_mm_fault
>       0.07 ± 19%      +0.1        0.13 ±  8%  perf-profile.children.cycles-pp.rb_erase
>       0.03 ±100%      +0.1        0.09 ±  9%  perf-profile.children.cycles-pp.smp_call_function_single
>       0.01 ±173%      +0.1        0.08 ± 11%  perf-profile.children.cycles-pp.perf_event_read
>       0.00            +0.1        0.07 ± 13%  perf-profile.children.cycles-pp.__perf_event_read_value
>       0.00            +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
>       0.08 ± 17%      +0.1        0.15 ±  8%  perf-profile.children.cycles-pp.native_apic_msr_eoi_write
>       0.04 ±103%      +0.1        0.13 ± 58%  perf-profile.children.cycles-pp.shmem_getpage_gfp
>       0.38 ± 14%      +0.1        0.51 ±  6%  perf-profile.children.cycles-pp.run_timer_softirq
>       0.11 ±  4%      +0.3        0.37 ± 32%  perf-profile.children.cycles-pp.worker_thread
>       0.20 ±  5%      +0.3        0.48 ± 25%  perf-profile.children.cycles-pp.ret_from_fork
>       0.20 ±  4%      +0.3        0.48 ± 25%  perf-profile.children.cycles-pp.kthread
>       0.00            +0.3        0.29 ± 38%  perf-profile.children.cycles-pp.memcpy_erms
>       0.00            +0.3        0.29 ± 38%  perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
>       0.00            +0.3        0.31 ± 37%  perf-profile.children.cycles-pp.process_one_work
>       0.47 ± 48%      +0.4        0.91 ± 19%  perf-profile.children.cycles-pp.prep_new_huge_page
>       0.70 ± 29%      +0.5        1.16 ± 18%  perf-profile.children.cycles-pp.free_huge_page
>       0.73 ± 29%      +0.5        1.19 ± 18%  perf-profile.children.cycles-pp.tlb_flush_mmu
>       0.72 ± 29%      +0.5        1.18 ± 18%  perf-profile.children.cycles-pp.release_pages
>       0.73 ± 29%      +0.5        1.19 ± 18%  perf-profile.children.cycles-pp.tlb_finish_mmu
>       0.76 ± 27%      +0.5        1.23 ± 18%  perf-profile.children.cycles-pp.exit_mmap
>       0.77 ± 27%      +0.5        1.24 ± 18%  perf-profile.children.cycles-pp.mmput
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.children.cycles-pp.__x64_sys_exit_group
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.children.cycles-pp.do_group_exit
>       0.79 ± 26%      +0.5        1.27 ± 18%  perf-profile.children.cycles-pp.do_exit
>       1.28 ± 29%      +0.5        1.76 ±  9%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
>       0.77 ± 28%      +0.5        1.26 ± 13%  perf-profile.children.cycles-pp.alloc_fresh_huge_page
>       1.53 ± 15%      +0.7        2.26 ± 14%  perf-profile.children.cycles-pp.do_syscall_64
>       1.53 ± 15%      +0.7        2.27 ± 14%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>       1.13 ±  3%      +0.9        2.07 ± 14%  perf-profile.children.cycles-pp.interrupt_entry
>       0.79 ±  9%      +1.0        1.76 ±  5%  perf-profile.children.cycles-pp.perf_event_task_tick
>       1.71 ± 39%      +1.4        3.08 ± 16%  perf-profile.children.cycles-pp.alloc_surplus_huge_page
>       2.66 ± 42%      +2.3        4.94 ± 17%  perf-profile.children.cycles-pp.alloc_huge_page
>       2.89 ± 45%      +2.7        5.54 ± 18%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>       3.34 ± 35%      +2.7        6.02 ± 17%  perf-profile.children.cycles-pp._raw_spin_lock
>      12.77 ± 14%      +3.9       16.63 ±  7%  perf-profile.children.cycles-pp.mutex_spin_on_owner
>      20.12 ±  9%      +4.0       24.16 ±  6%  perf-profile.children.cycles-pp.hugetlb_cow
>      15.40 ± 10%      -3.6       11.84 ± 28%  perf-profile.self.cycles-pp.do_rw_once
>       4.02 ±  9%      -1.3        2.73 ± 30%  perf-profile.self.cycles-pp.do_access
>       2.00 ± 14%      -0.6        1.41 ± 13%  perf-profile.self.cycles-pp.cpuidle_enter_state
>       1.26 ± 16%      -0.5        0.74 ± 13%  perf-profile.self.cycles-pp.native_sched_clock
>       0.42 ± 17%      -0.2        0.27 ± 16%  perf-profile.self.cycles-pp.account_process_tick
>       0.27 ± 19%      -0.2        0.12 ± 17%  perf-profile.self.cycles-pp.timerqueue_del
>       0.53 ±  3%      -0.1        0.38 ± 11%  perf-profile.self.cycles-pp.update_curr
>       0.27 ±  6%      -0.1        0.14 ± 14%  perf-profile.self.cycles-pp.__acct_update_integrals
>       0.27 ± 18%      -0.1        0.16 ± 13%  perf-profile.self.cycles-pp.rcu_segcblist_ready_cbs
>       0.61 ±  4%      -0.1        0.51 ±  8%  perf-profile.self.cycles-pp.task_tick_fair
>       0.20 ±  8%      -0.1        0.12 ± 14%  perf-profile.self.cycles-pp.account_system_index_time
>       0.23 ± 15%      -0.1        0.16 ± 17%  perf-profile.self.cycles-pp.rcu_dynticks_eqs_exit
>       0.25 ± 11%      -0.1        0.18 ± 14%  perf-profile.self.cycles-pp.find_next_bit
>       0.10 ± 11%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.tick_sched_do_timer
>       0.29            -0.1        0.23 ± 11%  perf-profile.self.cycles-pp.timerqueue_add
>       0.12 ± 10%      -0.1        0.06 ± 17%  perf-profile.self.cycles-pp.account_user_time
>       0.22 ± 15%      -0.1        0.16 ±  6%  perf-profile.self.cycles-pp.scheduler_tick
>       0.17 ±  6%      -0.0        0.12 ± 10%  perf-profile.self.cycles-pp.cpuacct_charge
>       0.18 ± 20%      -0.0        0.13 ±  3%  perf-profile.self.cycles-pp.irq_work_tick
>       0.07 ± 13%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.update_process_times
>       0.12 ±  7%      -0.0        0.08 ± 15%  perf-profile.self.cycles-pp.get_cpu_device
>       0.07 ± 11%      -0.0        0.04 ± 58%  perf-profile.self.cycles-pp.raise_softirq
>       0.12 ± 11%      -0.0        0.09 ±  7%  perf-profile.self.cycles-pp.tick_nohz_get_sleep_length
>       0.11 ± 11%      +0.0        0.14 ±  6%  perf-profile.self.cycles-pp.native_write_msr
>       0.10 ±  5%      +0.1        0.15 ±  8%  perf-profile.self.cycles-pp.__remove_hrtimer
>       0.07 ± 23%      +0.1        0.13 ±  8%  perf-profile.self.cycles-pp.rb_erase
>       0.08 ± 17%      +0.1        0.15 ±  7%  perf-profile.self.cycles-pp.native_apic_msr_eoi_write
>       0.00            +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.smp_call_function_single
>       0.32 ± 17%      +0.1        0.42 ±  7%  perf-profile.self.cycles-pp.run_timer_softirq
>       0.22 ±  5%      +0.1        0.34 ±  4%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
>       0.45 ± 15%      +0.2        0.60 ± 12%  perf-profile.self.cycles-pp.rcu_irq_enter
>       0.31 ±  8%      +0.2        0.46 ± 16%  perf-profile.self.cycles-pp.irq_enter
>       0.29 ± 10%      +0.2        0.44 ± 16%  perf-profile.self.cycles-pp.apic_timer_interrupt
>       0.71 ± 30%      +0.2        0.92 ±  8%  perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
>       0.00            +0.3        0.28 ± 37%  perf-profile.self.cycles-pp.memcpy_erms
>       1.12 ±  3%      +0.9        2.02 ± 15%  perf-profile.self.cycles-pp.interrupt_entry
>       0.79 ±  9%      +0.9        1.73 ±  5%  perf-profile.self.cycles-pp.perf_event_task_tick
>       2.49 ± 45%      +2.1        4.55 ± 20%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>      10.95 ± 15%      +2.7       13.61 ±  8%  perf-profile.self.cycles-pp.mutex_spin_on_owner
> 
> 
>                                                                                 
>                                vm-scalability.throughput                        
>                                                                                 
>   1.6e+07 +-+---------------------------------------------------------------+   
>           |..+.+    +..+.+..+.+.   +.      +..+.+..+.+..+.+..+.+..+    +    |   
>   1.4e+07 +-+  :    :  O      O    O                           O            |   
>   1.2e+07 O-+O O  O O    O  O    O    O O  O  O    O    O    O      O  O O  O   
>           |     :   :                           O    O    O       O         |   
>     1e+07 +-+   :  :                                                        |   
>           |     :  :                                                        |   
>     8e+06 +-+   :  :                                                        |   
>           |      : :                                                        |   
>     6e+06 +-+    : :                                                        |   
>     4e+06 +-+    : :                                                        |   
>           |      ::                                                         |   
>     2e+06 +-+     :                                                         |   
>           |       :                                                         |   
>         0 +-+---------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                          vm-scalability.time.minor_page_faults                  
>                                                                                 
>   2.5e+06 +-+---------------------------------------------------------------+   
>           |                                                                 |   
>           |..+.+    +..+.+..+.+..+.+..+.+..  .+.  .+.+..+.+..+.+..+.+..+    |   
>     2e+06 +-+  :    :                      +.   +.                          |   
>           O  O O: O O  O O  O O  O O                    O      O            |   
>           |     :   :                 O O  O  O O  O O    O  O    O O  O O  O   
>   1.5e+06 +-+   :  :                                                        |   
>           |     :  :                                                        |   
>     1e+06 +-+    : :                                                        |   
>           |      : :                                                        |   
>           |      : :                                                        |   
>    500000 +-+    : :                                                        |   
>           |       :                                                         |   
>           |       :                                                         |   
>         0 +-+---------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                                 vm-scalability.workload                         
>                                                                                 
>   3.5e+09 +-+---------------------------------------------------------------+   
>           | .+.                      .+.+..                        .+..     |   
>     3e+09 +-+  +    +..+.+..+.+..+.+.      +..+.+..+.+..+.+..+.+..+    +    |   
>           |    :    :       O O                                O            |   
>   2.5e+09 O-+O O: O O  O O       O O  O    O            O                   |   
>           |     :   :                   O     O O  O O    O  O    O O  O O  O   
>     2e+09 +-+   :  :                                                        |   
>           |     :  :                                                        |   
>   1.5e+09 +-+    : :                                                        |   
>           |      : :                                                        |   
>     1e+09 +-+    : :                                                        |   
>           |      : :                                                        |   
>     5e+08 +-+     :                                                         |   
>           |       :                                                         |   
>         0 +-+---------------------------------------------------------------+   
>                                                                                 
>                                                                                 
> [*] bisect-good sample
> [O] bisect-bad  sample
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Rong Chen
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread