Hi

On 30.07.19 at 20:12, Daniel Vetter wrote:
> On Tue, Jul 30, 2019 at 7:50 PM Thomas Zimmermann wrote:
>> On 29.07.19 at 11:51, kernel test robot wrote:
>>> Greeting,
>>>
>>> FYI, we noticed a -18.8% regression of vm-scalability.median due to commit:
>>>
>>>
>>> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git master
>>
>> Daniel, Noralf, we may have to revert this patch.
>>
>> I expected some change in display performance, but not in VM. Since it's
>> a server chipset, probably no one cares much about display performance.
>> So that seemed like a good trade-off for re-using shared code.
>>
>> Part of the patch set is that the generic fb emulation now maps and
>> unmaps the fbdev BO when updating the screen. I guess that's the cause
>> of the performance regression. And it should be visible with other
>> drivers as well if they use a shadow FB for fbdev emulation.
>
> For fbcon we shouldn't need to do any maps/unmaps at all, this is for the
> fbdev mmap support only. If the testcase mentioned here tests fbdev
> mmap handling it's pretty badly misnamed :-) And as long as you don't
> have an fbdev mmap there shouldn't be any impact at all.

The ast and mgag200 have only a few MiB of VRAM, so we have to get the
fbdev BO out of VRAM when it's not being displayed. While it is not
mapped, it can be evicted to make room for X, etc.

To make this work, the BO's memory is mapped and unmapped in
drm_fb_helper_dirty_work() around the update from the shadow FB. [1]
That fbdev mapping is established on each screen update, more or less.
As far as I understand (not yet verified), this causes the performance
regression in the VM code.

The original code in mgag200 kept the fbdev BO kmapped while it was
being displayed, [2] and the drawing code only mapped it when necessary
(i.e., when it was not being displayed). [3]

I think this could be added for VRAM helpers as well, but it's still a
workaround and non-VRAM drivers might also run into such a performance
regression if they use the fbdev's shadow fb.

Noralf mentioned that there are plans for other DRM clients besides the
console. They would run into similar problems as well.

>> The thing is that we'd need another generic fbdev emulation for ast and
>> mgag200 that handles this issue properly.
>
> Yeah I don't think we want to jump the gun here. If you can try to
> repro locally and profile where we're wasting cpu time I hope that
> should shed a light on what's going wrong here.

I don't have much time ATM and I'm not even officially at work until
late Aug. I'd send you the revert and investigate later. I agree that
using generic fbdev emulation would be preferable.

Best regards
Thomas

[1] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_fb_helper.c?id=90f479ae51afa45efab97afdde9b94b9660dd3e4#n419
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/mgag200/mgag200_mode.c?h=v5.2#n897
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/mgag200/mgag200_fb.c?h=v5.2#n75

> -Daniel
>
>>
>> Best regards
>> Thomas
>>
>>>
>>> in testcase: vm-scalability
>>> on test machine: 288 threads Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with 80G memory
>>> with following parameters:
>>>
>>> runtime: 300s
>>> size: 8T
>>> test: anon-cow-seq-hugetlb
>>> cpufreq_governor: performance
>>>
>>> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
>>> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>>>
>>>
>>>
>>> Details are as below:
>>> -------------------------------------------------------------------------------------------------->
>>>
>>>
>>> To reproduce:
>>>
>>>         git clone https://github.com/intel/lkp-tests.git
>>>         cd lkp-tests
>>>         bin/lkp install job.yaml  # job file is attached in this email
>>>         bin/lkp run     job.yaml
>>>
>>> =========================================================================================
>>> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>>>   gcc-7/performance/x86_64-rhel-7.6/debian-x86_64-2019-05-14.cgz/300s/8T/lkp-knm01/anon-cow-seq-hugetlb/vm-scalability
>>>
>>> commit:
>>>   f1f8555dfb ("drm/bochs: Use shadow buffer for bochs framebuffer console")
>>>   90f479ae51 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
>>>
>>> f1f8555dfb9a70a2 90f479ae51afa45efab97afdde9
>>> ---------------- ---------------------------
>>>        fail:runs  %reproduction    fail:runs
>>>            |             |             |
>>>           2:4          -50%            :4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>>>            :4           25%           1:4     dmesg.WARNING:at_ip___perf_sw_event/0x
>>>            :4           25%           1:4     dmesg.WARNING:at_ip__fsnotify_parent/0x
>>>          %stddev     %change         %stddev
>>>              \          |                \
>>>      43955 ±  2%     -18.8%      35691        vm-scalability.median
>>>       0.06 ±  7%    +193.0%       0.16 ±  2%  vm-scalability.median_stddev
>>>   14906559 ±  2%     -17.9%   12237079        vm-scalability.throughput
>>>      87651 ±  2%     -17.4%      72374        vm-scalability.time.involuntary_context_switches
>>>    2086168           -23.6%    1594224        vm-scalability.time.minor_page_faults
>>>      15082 ±  2%     -10.4%      13517        vm-scalability.time.percent_of_cpu_this_job_got
>>>      29987            -8.9%      27327        vm-scalability.time.system_time
>>>      15755           -12.4%      13795        vm-scalability.time.user_time
>>>     122011           -19.3%      98418        vm-scalability.time.voluntary_context_switches
>>>  3.034e+09           -23.6%  2.318e+09        vm-scalability.workload
>>>     242478 ± 12%     +68.5%     408518 ± 23%  cpuidle.POLL.time
>>>       2788 ± 21%    +117.4%       6062 ± 26%  cpuidle.POLL.usage
>>>
56653 ± 10% +64.4% 93144 ± 20% meminfo.Mapped >>> 120392 ± 7% +14.0% 137212 ± 4% meminfo.Shmem >>> 47221 ± 11% +77.1% 83634 ± 22% numa-meminfo.node0.Mapped >>> 120465 ± 7% +13.9% 137205 ± 4% numa-meminfo.node0.Shmem >>> 2885513 -16.5% 2409384 numa-numastat.node0.local_node >>> 2885471 -16.5% 2409354 numa-numastat.node0.numa_hit >>> 11813 ± 11% +76.3% 20824 ± 22% numa-vmstat.node0.nr_mapped >>> 30096 ± 7% +13.8% 34238 ± 4% numa-vmstat.node0.nr_shmem >>> 43.72 ± 2% +5.5 49.20 mpstat.cpu.all.idle% >>> 0.03 ± 4% +0.0 0.05 ± 6% mpstat.cpu.all.soft% >>> 19.51 -2.4 17.08 mpstat.cpu.all.usr% >>> 1012 -7.9% 932.75 turbostat.Avg_MHz >>> 32.38 ± 10% +25.8% 40.73 turbostat.CPU%c1 >>> 145.51 -3.1% 141.01 turbostat.PkgWatt >>> 15.09 -19.2% 12.19 turbostat.RAMWatt >>> 43.50 ± 2% +13.2% 49.25 vmstat.cpu.id >>> 18.75 ± 2% -13.3% 16.25 ± 2% vmstat.cpu.us >>> 152.00 ± 2% -9.5% 137.50 vmstat.procs.r >>> 4800 -13.1% 4173 vmstat.system.cs >>> 156170 -11.9% 137594 slabinfo.anon_vma.active_objs >>> 3395 -11.9% 2991 slabinfo.anon_vma.active_slabs >>> 156190 -11.9% 137606 slabinfo.anon_vma.num_objs >>> 3395 -11.9% 2991 slabinfo.anon_vma.num_slabs >>> 1716 ± 5% +11.5% 1913 ± 8% slabinfo.dmaengine-unmap-16.active_objs >>> 1716 ± 5% +11.5% 1913 ± 8% slabinfo.dmaengine-unmap-16.num_objs >>> 1767 ± 2% -19.0% 1431 ± 2% slabinfo.hugetlbfs_inode_cache.active_objs >>> 1767 ± 2% -19.0% 1431 ± 2% slabinfo.hugetlbfs_inode_cache.num_objs >>> 3597 ± 5% -16.4% 3006 ± 3% slabinfo.skbuff_ext_cache.active_objs >>> 3597 ± 5% -16.4% 3006 ± 3% slabinfo.skbuff_ext_cache.num_objs >>> 1330122 -23.6% 1016557 proc-vmstat.htlb_buddy_alloc_success >>> 77214 ± 3% +6.4% 82128 ± 2% proc-vmstat.nr_active_anon >>> 67277 +2.9% 69246 proc-vmstat.nr_anon_pages >>> 218.50 ± 3% -10.6% 195.25 proc-vmstat.nr_dirtied >>> 288628 +1.4% 292755 proc-vmstat.nr_file_pages >>> 360.50 -2.7% 350.75 proc-vmstat.nr_inactive_file >>> 14225 ± 9% +63.8% 23304 ± 20% proc-vmstat.nr_mapped >>> 30109 ± 7% +13.8% 34259 ± 4% proc-vmstat.nr_shmem >>> 
99870 -1.3% 98597 proc-vmstat.nr_slab_unreclaimable >>> 204.00 ± 4% -12.1% 179.25 proc-vmstat.nr_written >>> 77214 ± 3% +6.4% 82128 ± 2% proc-vmstat.nr_zone_active_anon >>> 360.50 -2.7% 350.75 proc-vmstat.nr_zone_inactive_file >>> 8810 ± 19% -66.1% 2987 ± 42% proc-vmstat.numa_hint_faults >>> 8810 ± 19% -66.1% 2987 ± 42% proc-vmstat.numa_hint_faults_local >>> 2904082 -16.4% 2427026 proc-vmstat.numa_hit >>> 2904081 -16.4% 2427025 proc-vmstat.numa_local >>> 6.828e+08 -23.5% 5.221e+08 proc-vmstat.pgalloc_normal >>> 2900008 -17.2% 2400195 proc-vmstat.pgfault >>> 6.827e+08 -23.5% 5.22e+08 proc-vmstat.pgfree >>> 1.635e+10 -17.0% 1.357e+10 perf-stat.i.branch-instructions >>> 1.53 ± 4% -0.1 1.45 ± 3% perf-stat.i.branch-miss-rate% >>> 2.581e+08 ± 3% -20.5% 2.051e+08 ± 2% perf-stat.i.branch-misses >>> 12.66 +1.1 13.78 perf-stat.i.cache-miss-rate% >>> 72720849 -12.0% 63958986 perf-stat.i.cache-misses >>> 5.766e+08 -18.6% 4.691e+08 perf-stat.i.cache-references >>> 4674 ± 2% -13.0% 4064 perf-stat.i.context-switches >>> 4.29 +12.5% 4.83 perf-stat.i.cpi >>> 2.573e+11 -7.4% 2.383e+11 perf-stat.i.cpu-cycles >>> 231.35 -21.5% 181.56 perf-stat.i.cpu-migrations >>> 3522 +4.4% 3677 perf-stat.i.cycles-between-cache-misses >>> 0.09 ± 13% +0.0 0.12 ± 5% perf-stat.i.iTLB-load-miss-rate% >>> 5.894e+10 -15.8% 4.961e+10 perf-stat.i.iTLB-loads >>> 5.901e+10 -15.8% 4.967e+10 perf-stat.i.instructions >>> 1291 ± 14% -21.8% 1010 perf-stat.i.instructions-per-iTLB-miss >>> 0.24 -11.0% 0.21 perf-stat.i.ipc >>> 9476 -17.5% 7821 perf-stat.i.minor-faults >>> 9478 -17.5% 7821 perf-stat.i.page-faults >>> 9.76 -3.6% 9.41 perf-stat.overall.MPKI >>> 1.59 ± 4% -0.1 1.52 perf-stat.overall.branch-miss-rate% >>> 12.61 +1.1 13.71 perf-stat.overall.cache-miss-rate% >>> 4.38 +10.5% 4.83 perf-stat.overall.cpi >>> 3557 +5.3% 3747 perf-stat.overall.cycles-between-cache-misses >>> 0.08 ± 12% +0.0 0.10 perf-stat.overall.iTLB-load-miss-rate% >>> 1268 ± 15% -23.0% 976.22 perf-stat.overall.instructions-per-iTLB-miss >>> 
0.23 -9.5% 0.21 perf-stat.overall.ipc >>> 5815 +9.7% 6378 perf-stat.overall.path-length >>> 1.634e+10 -17.5% 1.348e+10 perf-stat.ps.branch-instructions >>> 2.595e+08 ± 3% -21.2% 2.043e+08 ± 2% perf-stat.ps.branch-misses >>> 72565205 -12.2% 63706339 perf-stat.ps.cache-misses >>> 5.754e+08 -19.2% 4.646e+08 perf-stat.ps.cache-references >>> 4640 ± 2% -12.5% 4060 perf-stat.ps.context-switches >>> 2.581e+11 -7.5% 2.387e+11 perf-stat.ps.cpu-cycles >>> 229.91 -22.0% 179.42 perf-stat.ps.cpu-migrations >>> 5.889e+10 -16.3% 4.927e+10 perf-stat.ps.iTLB-loads >>> 5.899e+10 -16.3% 4.938e+10 perf-stat.ps.instructions >>> 9388 -18.2% 7677 perf-stat.ps.minor-faults >>> 9389 -18.2% 7677 perf-stat.ps.page-faults >>> 1.764e+13 -16.2% 1.479e+13 perf-stat.total.instructions >>> 46803 ± 3% -18.8% 37982 ± 6% sched_debug.cfs_rq:/.exec_clock.min >>> 5320 ± 3% +23.7% 6581 ± 3% sched_debug.cfs_rq:/.exec_clock.stddev >>> 6737 ± 14% +58.1% 10649 ± 10% sched_debug.cfs_rq:/.load.avg >>> 587978 ± 17% +58.2% 930382 ± 9% sched_debug.cfs_rq:/.load.max >>> 46952 ± 16% +64.8% 77388 ± 11% sched_debug.cfs_rq:/.load.stddev >>> 7.12 ± 4% +49.1% 10.62 ± 6% sched_debug.cfs_rq:/.load_avg.avg >>> 474.40 ± 23% +67.5% 794.60 ± 10% sched_debug.cfs_rq:/.load_avg.max >>> 37.70 ± 11% +74.8% 65.90 ± 9% sched_debug.cfs_rq:/.load_avg.stddev >>> 13424269 ± 4% -15.6% 11328098 ± 2% sched_debug.cfs_rq:/.min_vruntime.avg >>> 15411275 ± 3% -12.4% 13505072 ± 2% sched_debug.cfs_rq:/.min_vruntime.max >>> 7939295 ± 6% -17.5% 6551322 ± 7% sched_debug.cfs_rq:/.min_vruntime.min >>> 21.44 ± 7% -56.1% 9.42 ± 4% sched_debug.cfs_rq:/.nr_spread_over.avg >>> 117.45 ± 11% -60.6% 46.30 ± 14% sched_debug.cfs_rq:/.nr_spread_over.max >>> 19.33 ± 8% -66.4% 6.49 ± 9% sched_debug.cfs_rq:/.nr_spread_over.stddev >>> 4.32 ± 15% +84.4% 7.97 ± 3% sched_debug.cfs_rq:/.runnable_load_avg.avg >>> 353.85 ± 29% +118.8% 774.35 ± 11% sched_debug.cfs_rq:/.runnable_load_avg.max >>> 27.30 ± 24% +118.5% 59.64 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.stddev 
>>> 6729 ± 14% +58.2% 10644 ± 10% sched_debug.cfs_rq:/.runnable_weight.avg >>> 587978 ± 17% +58.2% 930382 ± 9% sched_debug.cfs_rq:/.runnable_weight.max >>> 46950 ± 16% +64.8% 77387 ± 11% sched_debug.cfs_rq:/.runnable_weight.stddev >>> 5305069 ± 4% -17.4% 4380376 ± 7% sched_debug.cfs_rq:/.spread0.avg >>> 7328745 ± 3% -9.9% 6600897 ± 3% sched_debug.cfs_rq:/.spread0.max >>> 2220837 ± 4% +55.8% 3460596 ± 5% sched_debug.cpu.avg_idle.avg >>> 4590666 ± 9% +76.8% 8117037 ± 15% sched_debug.cpu.avg_idle.max >>> 485052 ± 7% +80.3% 874679 ± 10% sched_debug.cpu.avg_idle.stddev >>> 561.50 ± 26% +37.7% 773.30 ± 15% sched_debug.cpu.clock.stddev >>> 561.50 ± 26% +37.7% 773.30 ± 15% sched_debug.cpu.clock_task.stddev >>> 3.20 ± 10% +109.6% 6.70 ± 3% sched_debug.cpu.cpu_load[0].avg >>> 309.10 ± 20% +150.3% 773.75 ± 12% sched_debug.cpu.cpu_load[0].max >>> 21.02 ± 14% +160.8% 54.80 ± 9% sched_debug.cpu.cpu_load[0].stddev >>> 3.19 ± 8% +109.8% 6.70 ± 3% sched_debug.cpu.cpu_load[1].avg >>> 299.75 ± 19% +158.0% 773.30 ± 12% sched_debug.cpu.cpu_load[1].max >>> 20.32 ± 12% +168.7% 54.62 ± 9% sched_debug.cpu.cpu_load[1].stddev >>> 3.20 ± 8% +109.1% 6.69 ± 4% sched_debug.cpu.cpu_load[2].avg >>> 288.90 ± 20% +167.0% 771.40 ± 12% sched_debug.cpu.cpu_load[2].max >>> 19.70 ± 12% +175.4% 54.27 ± 9% sched_debug.cpu.cpu_load[2].stddev >>> 3.16 ± 8% +110.9% 6.66 ± 6% sched_debug.cpu.cpu_load[3].avg >>> 275.50 ± 24% +178.4% 766.95 ± 12% sched_debug.cpu.cpu_load[3].max >>> 18.92 ± 15% +184.2% 53.77 ± 10% sched_debug.cpu.cpu_load[3].stddev >>> 3.08 ± 8% +115.7% 6.65 ± 7% sched_debug.cpu.cpu_load[4].avg >>> 263.55 ± 28% +188.7% 760.85 ± 12% sched_debug.cpu.cpu_load[4].max >>> 18.03 ± 18% +196.6% 53.46 ± 11% sched_debug.cpu.cpu_load[4].stddev >>> 14543 -9.6% 13150 sched_debug.cpu.curr->pid.max >>> 5293 ± 16% +74.7% 9248 ± 11% sched_debug.cpu.load.avg >>> 587978 ± 17% +58.2% 930382 ± 9% sched_debug.cpu.load.max >>> 40887 ± 19% +78.3% 72891 ± 9% sched_debug.cpu.load.stddev >>> 1141679 ± 4% +56.9% 1790907 ± 
5% sched_debug.cpu.max_idle_balance_cost.avg >>> 2432100 ± 9% +72.6% 4196779 ± 13% sched_debug.cpu.max_idle_balance_cost.max >>> 745656 +29.3% 964170 ± 5% sched_debug.cpu.max_idle_balance_cost.min >>> 239032 ± 9% +81.9% 434806 ± 10% sched_debug.cpu.max_idle_balance_cost.stddev >>> 0.00 ± 27% +92.1% 0.00 ± 31% sched_debug.cpu.next_balance.stddev >>> 1030 ± 4% -10.4% 924.00 ± 2% sched_debug.cpu.nr_switches.min >>> 0.04 ± 26% +139.0% 0.09 ± 41% sched_debug.cpu.nr_uninterruptible.avg >>> 830.35 ± 6% -12.0% 730.50 ± 2% sched_debug.cpu.sched_count.min >>> 912.00 ± 2% -9.5% 825.38 sched_debug.cpu.ttwu_count.avg >>> 433.05 ± 3% -19.2% 350.05 ± 3% sched_debug.cpu.ttwu_count.min >>> 160.70 ± 3% -12.5% 140.60 ± 4% sched_debug.cpu.ttwu_local.min >>> 9072 ± 11% -36.4% 5767 ± 8% softirqs.CPU1.RCU >>> 12769 ± 5% +15.3% 14718 ± 3% softirqs.CPU101.SCHED >>> 13198 +11.5% 14717 ± 3% softirqs.CPU102.SCHED >>> 12981 ± 4% +13.9% 14788 ± 3% softirqs.CPU105.SCHED >>> 13486 ± 3% +11.8% 15071 ± 4% softirqs.CPU111.SCHED >>> 12794 ± 4% +14.1% 14601 ± 9% softirqs.CPU112.SCHED >>> 12999 ± 4% +10.1% 14314 ± 4% softirqs.CPU115.SCHED >>> 12844 ± 4% +10.6% 14202 ± 2% softirqs.CPU120.SCHED >>> 13336 ± 3% +9.4% 14585 ± 3% softirqs.CPU122.SCHED >>> 12639 ± 4% +20.2% 15195 softirqs.CPU123.SCHED >>> 13040 ± 5% +15.2% 15024 ± 5% softirqs.CPU126.SCHED >>> 13123 +15.1% 15106 ± 5% softirqs.CPU127.SCHED >>> 9188 ± 6% -35.7% 5911 ± 2% softirqs.CPU13.RCU >>> 13054 ± 3% +13.1% 14761 ± 5% softirqs.CPU130.SCHED >>> 13158 ± 2% +13.9% 14985 ± 5% softirqs.CPU131.SCHED >>> 12797 ± 6% +13.5% 14524 ± 3% softirqs.CPU133.SCHED >>> 12452 ± 5% +14.8% 14297 softirqs.CPU134.SCHED >>> 13078 ± 3% +10.4% 14439 ± 3% softirqs.CPU138.SCHED >>> 12617 ± 2% +14.5% 14442 ± 5% softirqs.CPU139.SCHED >>> 12974 ± 3% +13.7% 14752 ± 4% softirqs.CPU142.SCHED >>> 12579 ± 4% +19.1% 14983 ± 3% softirqs.CPU143.SCHED >>> 9122 ± 24% -44.6% 5053 ± 5% softirqs.CPU144.RCU >>> 13366 ± 2% +11.1% 14848 ± 3% softirqs.CPU149.SCHED >>> 13246 ± 2% +22.0% 
16162 ± 7% softirqs.CPU150.SCHED >>> 13452 ± 3% +20.5% 16210 ± 7% softirqs.CPU151.SCHED >>> 13507 +10.1% 14869 softirqs.CPU156.SCHED >>> 13808 ± 3% +9.2% 15079 ± 4% softirqs.CPU157.SCHED >>> 13442 ± 2% +13.4% 15248 ± 4% softirqs.CPU160.SCHED >>> 13311 +12.1% 14920 ± 2% softirqs.CPU162.SCHED >>> 13544 ± 3% +8.5% 14695 ± 4% softirqs.CPU163.SCHED >>> 13648 ± 3% +11.2% 15179 ± 2% softirqs.CPU166.SCHED >>> 13404 ± 4% +12.5% 15079 ± 3% softirqs.CPU168.SCHED >>> 13421 ± 6% +16.0% 15568 ± 8% softirqs.CPU169.SCHED >>> 13115 ± 3% +23.1% 16139 ± 10% softirqs.CPU171.SCHED >>> 13424 ± 6% +10.4% 14822 ± 3% softirqs.CPU175.SCHED >>> 13274 ± 3% +13.7% 15087 ± 9% softirqs.CPU185.SCHED >>> 13409 ± 3% +12.3% 15063 ± 3% softirqs.CPU190.SCHED >>> 13181 ± 7% +13.4% 14946 ± 3% softirqs.CPU196.SCHED >>> 13578 ± 3% +10.9% 15061 softirqs.CPU197.SCHED >>> 13323 ± 5% +24.8% 16627 ± 6% softirqs.CPU198.SCHED >>> 14072 ± 2% +12.3% 15798 ± 7% softirqs.CPU199.SCHED >>> 12604 ± 13% +17.9% 14865 softirqs.CPU201.SCHED >>> 13380 ± 4% +14.8% 15356 ± 3% softirqs.CPU203.SCHED >>> 13481 ± 8% +14.2% 15390 ± 3% softirqs.CPU204.SCHED >>> 12921 ± 2% +13.8% 14710 ± 3% softirqs.CPU206.SCHED >>> 13468 +13.0% 15218 ± 2% softirqs.CPU208.SCHED >>> 13253 ± 2% +13.1% 14992 softirqs.CPU209.SCHED >>> 13319 ± 2% +14.3% 15225 ± 7% softirqs.CPU210.SCHED >>> 13673 ± 5% +16.3% 15895 ± 3% softirqs.CPU211.SCHED >>> 13290 +17.0% 15556 ± 5% softirqs.CPU212.SCHED >>> 13455 ± 4% +14.4% 15392 ± 3% softirqs.CPU213.SCHED >>> 13454 ± 4% +14.3% 15377 ± 3% softirqs.CPU215.SCHED >>> 13872 ± 7% +9.7% 15221 ± 5% softirqs.CPU220.SCHED >>> 13555 ± 4% +17.3% 15896 ± 5% softirqs.CPU222.SCHED >>> 13411 ± 4% +20.8% 16197 ± 6% softirqs.CPU223.SCHED >>> 8472 ± 21% -44.8% 4680 ± 3% softirqs.CPU224.RCU >>> 13141 ± 3% +16.2% 15265 ± 7% softirqs.CPU225.SCHED >>> 14084 ± 3% +8.2% 15242 ± 2% softirqs.CPU226.SCHED >>> 13528 ± 4% +11.3% 15063 ± 4% softirqs.CPU228.SCHED >>> 13218 ± 3% +16.3% 15377 ± 4% softirqs.CPU229.SCHED >>> 14031 ± 4% +10.2% 15467 ± 
2% softirqs.CPU231.SCHED >>> 13770 ± 3% +14.0% 15700 ± 3% softirqs.CPU232.SCHED >>> 13456 ± 3% +12.3% 15105 ± 3% softirqs.CPU233.SCHED >>> 13137 ± 4% +13.5% 14909 ± 3% softirqs.CPU234.SCHED >>> 13318 ± 2% +14.7% 15280 ± 2% softirqs.CPU235.SCHED >>> 13690 ± 2% +13.7% 15563 ± 7% softirqs.CPU238.SCHED >>> 13771 ± 5% +20.8% 16634 ± 7% softirqs.CPU241.SCHED >>> 13317 ± 7% +19.5% 15919 ± 9% softirqs.CPU243.SCHED >>> 8234 ± 16% -43.9% 4616 ± 5% softirqs.CPU244.RCU >>> 13845 ± 6% +13.0% 15643 ± 3% softirqs.CPU244.SCHED >>> 13179 ± 3% +16.3% 15323 softirqs.CPU246.SCHED >>> 13754 +12.2% 15438 ± 3% softirqs.CPU248.SCHED >>> 13769 ± 4% +10.9% 15276 ± 2% softirqs.CPU252.SCHED >>> 13702 +10.5% 15147 ± 2% softirqs.CPU254.SCHED >>> 13315 ± 2% +12.5% 14980 ± 3% softirqs.CPU255.SCHED >>> 13785 ± 3% +12.9% 15568 ± 5% softirqs.CPU256.SCHED >>> 13307 ± 3% +15.0% 15298 ± 3% softirqs.CPU257.SCHED >>> 13864 ± 3% +10.5% 15313 ± 2% softirqs.CPU259.SCHED >>> 13879 ± 2% +11.4% 15465 softirqs.CPU261.SCHED >>> 13815 +13.6% 15687 ± 5% softirqs.CPU264.SCHED >>> 119574 ± 2% +11.8% 133693 ± 11% softirqs.CPU266.TIMER >>> 13688 +10.9% 15180 ± 6% softirqs.CPU267.SCHED >>> 11716 ± 4% +19.3% 13974 ± 8% softirqs.CPU27.SCHED >>> 13866 ± 3% +13.7% 15765 ± 4% softirqs.CPU271.SCHED >>> 13887 ± 5% +12.5% 15621 softirqs.CPU272.SCHED >>> 13383 ± 3% +19.8% 16031 ± 2% softirqs.CPU274.SCHED >>> 13347 +14.1% 15232 ± 3% softirqs.CPU275.SCHED >>> 12884 ± 2% +21.0% 15593 ± 4% softirqs.CPU276.SCHED >>> 13131 ± 5% +13.4% 14891 ± 5% softirqs.CPU277.SCHED >>> 12891 ± 2% +19.2% 15371 ± 4% softirqs.CPU278.SCHED >>> 13313 ± 4% +13.0% 15049 ± 2% softirqs.CPU279.SCHED >>> 13514 ± 3% +10.2% 14897 ± 2% softirqs.CPU280.SCHED >>> 13501 ± 3% +13.7% 15346 softirqs.CPU281.SCHED >>> 13261 +17.5% 15577 softirqs.CPU282.SCHED >>> 8076 ± 15% -43.7% 4546 ± 5% softirqs.CPU283.RCU >>> 13686 ± 3% +12.6% 15413 ± 2% softirqs.CPU284.SCHED >>> 13439 ± 2% +9.2% 14670 ± 4% softirqs.CPU285.SCHED >>> 8878 ± 9% -35.4% 5735 ± 4% softirqs.CPU35.RCU >>> 
11690 ± 2% +13.6% 13274 ± 5% softirqs.CPU40.SCHED >>> 11714 ± 2% +19.3% 13975 ± 13% softirqs.CPU41.SCHED >>> 11763 +12.5% 13239 ± 4% softirqs.CPU45.SCHED >>> 11662 ± 2% +9.4% 12757 ± 3% softirqs.CPU46.SCHED >>> 11805 ± 2% +9.3% 12902 ± 2% softirqs.CPU50.SCHED >>> 12158 ± 3% +12.3% 13655 ± 8% softirqs.CPU55.SCHED >>> 11716 ± 4% +8.8% 12751 ± 3% softirqs.CPU58.SCHED >>> 11922 ± 2% +9.9% 13100 ± 4% softirqs.CPU64.SCHED >>> 9674 ± 17% -41.8% 5625 ± 6% softirqs.CPU66.RCU >>> 11818 +12.0% 13237 softirqs.CPU66.SCHED >>> 124682 ± 7% -6.1% 117088 ± 5% softirqs.CPU66.TIMER >>> 8637 ± 9% -34.0% 5700 ± 7% softirqs.CPU70.RCU >>> 11624 ± 2% +11.0% 12901 ± 2% softirqs.CPU70.SCHED >>> 12372 ± 2% +13.2% 14003 ± 3% softirqs.CPU71.SCHED >>> 9949 ± 25% -33.9% 6574 ± 31% softirqs.CPU72.RCU >>> 10392 ± 26% -35.1% 6745 ± 35% softirqs.CPU73.RCU >>> 12766 ± 3% +11.1% 14188 ± 3% softirqs.CPU76.SCHED >>> 12611 ± 2% +18.8% 14984 ± 5% softirqs.CPU78.SCHED >>> 12786 ± 3% +17.9% 15079 ± 7% softirqs.CPU79.SCHED >>> 11947 ± 4% +9.7% 13103 ± 4% softirqs.CPU8.SCHED >>> 13379 ± 7% +11.8% 14962 ± 4% softirqs.CPU83.SCHED >>> 13438 ± 5% +9.7% 14738 ± 2% softirqs.CPU84.SCHED >>> 12768 +19.4% 15241 ± 6% softirqs.CPU88.SCHED >>> 8604 ± 13% -39.3% 5222 ± 3% softirqs.CPU89.RCU >>> 13077 ± 2% +17.1% 15308 ± 7% softirqs.CPU89.SCHED >>> 11887 ± 3% +20.1% 14272 ± 5% softirqs.CPU9.SCHED >>> 12723 ± 3% +11.3% 14165 ± 4% softirqs.CPU90.SCHED >>> 8439 ± 12% -38.9% 5153 ± 4% softirqs.CPU91.RCU >>> 13429 ± 3% +10.3% 14806 ± 2% softirqs.CPU95.SCHED >>> 12852 ± 4% +10.3% 14174 ± 5% softirqs.CPU96.SCHED >>> 13010 ± 2% +14.4% 14888 ± 5% softirqs.CPU97.SCHED >>> 2315644 ± 4% -36.2% 1477200 ± 4% softirqs.RCU >>> 1572 ± 10% +63.9% 2578 ± 39% interrupts.CPU0.NMI:Non-maskable_interrupts >>> 1572 ± 10% +63.9% 2578 ± 39% interrupts.CPU0.PMI:Performance_monitoring_interrupts >>> 252.00 ± 11% -35.2% 163.25 ± 13% interrupts.CPU104.RES:Rescheduling_interrupts >>> 2738 ± 24% +52.4% 4173 ± 19% 
interrupts.CPU105.NMI:Non-maskable_interrupts >>> 2738 ± 24% +52.4% 4173 ± 19% interrupts.CPU105.PMI:Performance_monitoring_interrupts >>> 245.75 ± 19% -31.0% 169.50 ± 7% interrupts.CPU105.RES:Rescheduling_interrupts >>> 228.75 ± 13% -24.7% 172.25 ± 19% interrupts.CPU106.RES:Rescheduling_interrupts >>> 2243 ± 15% +66.3% 3730 ± 35% interrupts.CPU113.NMI:Non-maskable_interrupts >>> 2243 ± 15% +66.3% 3730 ± 35% interrupts.CPU113.PMI:Performance_monitoring_interrupts >>> 2703 ± 31% +67.0% 4514 ± 33% interrupts.CPU118.NMI:Non-maskable_interrupts >>> 2703 ± 31% +67.0% 4514 ± 33% interrupts.CPU118.PMI:Performance_monitoring_interrupts >>> 2613 ± 25% +42.2% 3715 ± 24% interrupts.CPU121.NMI:Non-maskable_interrupts >>> 2613 ± 25% +42.2% 3715 ± 24% interrupts.CPU121.PMI:Performance_monitoring_interrupts >>> 311.50 ± 23% -47.7% 163.00 ± 9% interrupts.CPU122.RES:Rescheduling_interrupts >>> 266.75 ± 19% -31.6% 182.50 ± 15% interrupts.CPU124.RES:Rescheduling_interrupts >>> 293.75 ± 33% -32.3% 198.75 ± 19% interrupts.CPU125.RES:Rescheduling_interrupts >>> 2601 ± 36% +43.2% 3724 ± 29% interrupts.CPU127.NMI:Non-maskable_interrupts >>> 2601 ± 36% +43.2% 3724 ± 29% interrupts.CPU127.PMI:Performance_monitoring_interrupts >>> 2258 ± 21% +68.2% 3797 ± 29% interrupts.CPU13.NMI:Non-maskable_interrupts >>> 2258 ± 21% +68.2% 3797 ± 29% interrupts.CPU13.PMI:Performance_monitoring_interrupts >>> 3338 ± 29% +54.6% 5160 ± 9% interrupts.CPU139.NMI:Non-maskable_interrupts >>> 3338 ± 29% +54.6% 5160 ± 9% interrupts.CPU139.PMI:Performance_monitoring_interrupts >>> 219.50 ± 27% -23.0% 169.00 ± 21% interrupts.CPU139.RES:Rescheduling_interrupts >>> 290.25 ± 25% -32.5% 196.00 ± 11% interrupts.CPU14.RES:Rescheduling_interrupts >>> 243.50 ± 4% -16.0% 204.50 ± 12% interrupts.CPU140.RES:Rescheduling_interrupts >>> 1797 ± 15% +135.0% 4223 ± 46% interrupts.CPU147.NMI:Non-maskable_interrupts >>> 1797 ± 15% +135.0% 4223 ± 46% interrupts.CPU147.PMI:Performance_monitoring_interrupts >>> 2537 ± 22% +89.6% 4812 ± 
28% interrupts.CPU15.NMI:Non-maskable_interrupts >>> 2537 ± 22% +89.6% 4812 ± 28% interrupts.CPU15.PMI:Performance_monitoring_interrupts >>> 292.25 ± 34% -33.9% 193.25 ± 6% interrupts.CPU15.RES:Rescheduling_interrupts >>> 424.25 ± 37% -58.5% 176.25 ± 14% interrupts.CPU158.RES:Rescheduling_interrupts >>> 312.50 ± 42% -54.2% 143.00 ± 18% interrupts.CPU159.RES:Rescheduling_interrupts >>> 725.00 ±118% -75.7% 176.25 ± 14% interrupts.CPU163.RES:Rescheduling_interrupts >>> 2367 ± 6% +59.9% 3786 ± 24% interrupts.CPU177.NMI:Non-maskable_interrupts >>> 2367 ± 6% +59.9% 3786 ± 24% interrupts.CPU177.PMI:Performance_monitoring_interrupts >>> 239.50 ± 30% -46.6% 128.00 ± 14% interrupts.CPU179.RES:Rescheduling_interrupts >>> 320.75 ± 15% -24.0% 243.75 ± 20% interrupts.CPU20.RES:Rescheduling_interrupts >>> 302.50 ± 17% -47.2% 159.75 ± 8% interrupts.CPU200.RES:Rescheduling_interrupts >>> 2166 ± 5% +92.0% 4157 ± 40% interrupts.CPU207.NMI:Non-maskable_interrupts >>> 2166 ± 5% +92.0% 4157 ± 40% interrupts.CPU207.PMI:Performance_monitoring_interrupts >>> 217.00 ± 11% -34.6% 142.00 ± 12% interrupts.CPU214.RES:Rescheduling_interrupts >>> 2610 ± 36% +47.4% 3848 ± 35% interrupts.CPU215.NMI:Non-maskable_interrupts >>> 2610 ± 36% +47.4% 3848 ± 35% interrupts.CPU215.PMI:Performance_monitoring_interrupts >>> 2046 ± 13% +118.6% 4475 ± 43% interrupts.CPU22.NMI:Non-maskable_interrupts >>> 2046 ± 13% +118.6% 4475 ± 43% interrupts.CPU22.PMI:Performance_monitoring_interrupts >>> 289.50 ± 28% -41.1% 170.50 ± 8% interrupts.CPU22.RES:Rescheduling_interrupts >>> 2232 ± 6% +33.0% 2970 ± 24% interrupts.CPU221.NMI:Non-maskable_interrupts >>> 2232 ± 6% +33.0% 2970 ± 24% interrupts.CPU221.PMI:Performance_monitoring_interrupts >>> 4552 ± 12% -27.6% 3295 ± 15% interrupts.CPU222.NMI:Non-maskable_interrupts >>> 4552 ± 12% -27.6% 3295 ± 15% interrupts.CPU222.PMI:Performance_monitoring_interrupts >>> 2013 ± 15% +80.9% 3641 ± 27% interrupts.CPU226.NMI:Non-maskable_interrupts >>> 2013 ± 15% +80.9% 3641 ± 27% 
interrupts.CPU226.PMI:Performance_monitoring_interrupts >>> 2575 ± 49% +67.1% 4302 ± 34% interrupts.CPU227.NMI:Non-maskable_interrupts >>> 2575 ± 49% +67.1% 4302 ± 34% interrupts.CPU227.PMI:Performance_monitoring_interrupts >>> 248.00 ± 36% -36.3% 158.00 ± 19% interrupts.CPU228.RES:Rescheduling_interrupts >>> 2441 ± 24% +43.0% 3490 ± 30% interrupts.CPU23.NMI:Non-maskable_interrupts >>> 2441 ± 24% +43.0% 3490 ± 30% interrupts.CPU23.PMI:Performance_monitoring_interrupts >>> 404.25 ± 69% -65.5% 139.50 ± 17% interrupts.CPU236.RES:Rescheduling_interrupts >>> 566.50 ± 40% -73.6% 149.50 ± 31% interrupts.CPU237.RES:Rescheduling_interrupts >>> 243.50 ± 26% -37.1% 153.25 ± 21% interrupts.CPU248.RES:Rescheduling_interrupts >>> 258.25 ± 12% -53.5% 120.00 ± 18% interrupts.CPU249.RES:Rescheduling_interrupts >>> 2888 ± 27% +49.4% 4313 ± 30% interrupts.CPU253.NMI:Non-maskable_interrupts >>> 2888 ± 27% +49.4% 4313 ± 30% interrupts.CPU253.PMI:Performance_monitoring_interrupts >>> 2468 ± 44% +67.3% 4131 ± 37% interrupts.CPU256.NMI:Non-maskable_interrupts >>> 2468 ± 44% +67.3% 4131 ± 37% interrupts.CPU256.PMI:Performance_monitoring_interrupts >>> 425.00 ± 59% -60.3% 168.75 ± 34% interrupts.CPU258.RES:Rescheduling_interrupts >>> 1859 ± 16% +106.3% 3834 ± 44% interrupts.CPU268.NMI:Non-maskable_interrupts >>> 1859 ± 16% +106.3% 3834 ± 44% interrupts.CPU268.PMI:Performance_monitoring_interrupts >>> 2684 ± 28% +61.2% 4326 ± 36% interrupts.CPU269.NMI:Non-maskable_interrupts >>> 2684 ± 28% +61.2% 4326 ± 36% interrupts.CPU269.PMI:Performance_monitoring_interrupts >>> 2171 ± 6% +108.8% 4533 ± 20% interrupts.CPU270.NMI:Non-maskable_interrupts >>> 2171 ± 6% +108.8% 4533 ± 20% interrupts.CPU270.PMI:Performance_monitoring_interrupts >>> 2262 ± 14% +61.8% 3659 ± 37% interrupts.CPU273.NMI:Non-maskable_interrupts >>> 2262 ± 14% +61.8% 3659 ± 37% interrupts.CPU273.PMI:Performance_monitoring_interrupts >>> 2203 ± 11% +50.7% 3320 ± 38% interrupts.CPU279.NMI:Non-maskable_interrupts >>> 2203 ± 11% +50.7% 
3320 ± 38% interrupts.CPU279.PMI:Performance_monitoring_interrupts >>> 2433 ± 17% +52.9% 3721 ± 25% interrupts.CPU280.NMI:Non-maskable_interrupts >>> 2433 ± 17% +52.9% 3721 ± 25% interrupts.CPU280.PMI:Performance_monitoring_interrupts >>> 2778 ± 33% +63.1% 4531 ± 36% interrupts.CPU283.NMI:Non-maskable_interrupts >>> 2778 ± 33% +63.1% 4531 ± 36% interrupts.CPU283.PMI:Performance_monitoring_interrupts >>> 331.75 ± 32% -39.8% 199.75 ± 17% interrupts.CPU29.RES:Rescheduling_interrupts >>> 2178 ± 22% +53.9% 3353 ± 31% interrupts.CPU3.NMI:Non-maskable_interrupts >>> 2178 ± 22% +53.9% 3353 ± 31% interrupts.CPU3.PMI:Performance_monitoring_interrupts >>> 298.50 ± 30% -39.7% 180.00 ± 6% interrupts.CPU34.RES:Rescheduling_interrupts >>> 2490 ± 3% +58.7% 3953 ± 28% interrupts.CPU35.NMI:Non-maskable_interrupts >>> 2490 ± 3% +58.7% 3953 ± 28% interrupts.CPU35.PMI:Performance_monitoring_interrupts >>> 270.50 ± 24% -31.1% 186.25 ± 3% interrupts.CPU36.RES:Rescheduling_interrupts >>> 2493 ± 7% +57.0% 3915 ± 27% interrupts.CPU43.NMI:Non-maskable_interrupts >>> 2493 ± 7% +57.0% 3915 ± 27% interrupts.CPU43.PMI:Performance_monitoring_interrupts >>> 286.75 ± 36% -32.4% 193.75 ± 7% interrupts.CPU45.RES:Rescheduling_interrupts >>> 259.00 ± 12% -23.6% 197.75 ± 13% interrupts.CPU46.RES:Rescheduling_interrupts >>> 244.00 ± 21% -35.6% 157.25 ± 11% interrupts.CPU47.RES:Rescheduling_interrupts >>> 230.00 ± 7% -21.3% 181.00 ± 11% interrupts.CPU48.RES:Rescheduling_interrupts >>> 281.00 ± 13% -27.4% 204.00 ± 15% interrupts.CPU53.RES:Rescheduling_interrupts >>> 256.75 ± 5% -18.4% 209.50 ± 12% interrupts.CPU54.RES:Rescheduling_interrupts >>> 2433 ± 9% +68.4% 4098 ± 35% interrupts.CPU58.NMI:Non-maskable_interrupts >>> 2433 ± 9% +68.4% 4098 ± 35% interrupts.CPU58.PMI:Performance_monitoring_interrupts >>> 316.00 ± 25% -41.4% 185.25 ± 13% interrupts.CPU59.RES:Rescheduling_interrupts >>> 2703 ± 38% +56.0% 4217 ± 31% interrupts.CPU60.NMI:Non-maskable_interrupts >>> 2703 ± 38% +56.0% 4217 ± 31% 
interrupts.CPU60.PMI:Performance_monitoring_interrupts >>> 2425 ± 16% +39.9% 3394 ± 27% interrupts.CPU61.NMI:Non-maskable_interrupts >>> 2425 ± 16% +39.9% 3394 ± 27% interrupts.CPU61.PMI:Performance_monitoring_interrupts >>> 2388 ± 18% +69.5% 4047 ± 29% interrupts.CPU66.NMI:Non-maskable_interrupts >>> 2388 ± 18% +69.5% 4047 ± 29% interrupts.CPU66.PMI:Performance_monitoring_interrupts >>> 2322 ± 11% +93.4% 4491 ± 35% interrupts.CPU67.NMI:Non-maskable_interrupts >>> 2322 ± 11% +93.4% 4491 ± 35% interrupts.CPU67.PMI:Performance_monitoring_interrupts >>> 319.00 ± 40% -44.7% 176.25 ± 9% interrupts.CPU67.RES:Rescheduling_interrupts >>> 2512 ± 8% +28.1% 3219 ± 25% interrupts.CPU70.NMI:Non-maskable_interrupts >>> 2512 ± 8% +28.1% 3219 ± 25% interrupts.CPU70.PMI:Performance_monitoring_interrupts >>> 2290 ± 39% +78.7% 4094 ± 28% interrupts.CPU74.NMI:Non-maskable_interrupts >>> 2290 ± 39% +78.7% 4094 ± 28% interrupts.CPU74.PMI:Performance_monitoring_interrupts >>> 2446 ± 40% +94.8% 4764 ± 23% interrupts.CPU75.NMI:Non-maskable_interrupts >>> 2446 ± 40% +94.8% 4764 ± 23% interrupts.CPU75.PMI:Performance_monitoring_interrupts >>> 426.75 ± 61% -67.7% 138.00 ± 8% interrupts.CPU75.RES:Rescheduling_interrupts >>> 192.50 ± 13% +45.6% 280.25 ± 45% interrupts.CPU76.RES:Rescheduling_interrupts >>> 274.25 ± 34% -42.2% 158.50 ± 34% interrupts.CPU77.RES:Rescheduling_interrupts >>> 2357 ± 9% +73.0% 4078 ± 23% interrupts.CPU78.NMI:Non-maskable_interrupts >>> 2357 ± 9% +73.0% 4078 ± 23% interrupts.CPU78.PMI:Performance_monitoring_interrupts >>> 348.50 ± 53% -47.3% 183.75 ± 29% interrupts.CPU80.RES:Rescheduling_interrupts >>> 2650 ± 43% +46.2% 3874 ± 36% interrupts.CPU84.NMI:Non-maskable_interrupts >>> 2650 ± 43% +46.2% 3874 ± 36% interrupts.CPU84.PMI:Performance_monitoring_interrupts >>> 2235 ± 10% +117.8% 4867 ± 10% interrupts.CPU90.NMI:Non-maskable_interrupts >>> 2235 ± 10% +117.8% 4867 ± 10% interrupts.CPU90.PMI:Performance_monitoring_interrupts >>> 2606 ± 33% +38.1% 3598 ± 21% 
interrupts.CPU92.NMI:Non-maskable_interrupts >>> 2606 ± 33% +38.1% 3598 ± 21% interrupts.CPU92.PMI:Performance_monitoring_interrupts >>> 408.75 ± 58% -56.8% 176.75 ± 25% interrupts.CPU92.RES:Rescheduling_interrupts >>> 399.00 ± 64% -63.6% 145.25 ± 16% interrupts.CPU93.RES:Rescheduling_interrupts >>> 314.75 ± 36% -44.2% 175.75 ± 13% interrupts.CPU94.RES:Rescheduling_interrupts >>> 191.00 ± 15% -29.1% 135.50 ± 9% interrupts.CPU97.RES:Rescheduling_interrupts >>> 94.00 ± 8% +50.0% 141.00 ± 12% interrupts.IWI:IRQ_work_interrupts >>> 841457 ± 7% +16.6% 980751 ± 3% interrupts.NMI:Non-maskable_interrupts >>> 841457 ± 7% +16.6% 980751 ± 3% interrupts.PMI:Performance_monitoring_interrupts >>> 12.75 ± 11% -4.1 8.67 ± 31% perf-profile.calltrace.cycles-pp.do_rw_once >>> 1.02 ± 16% -0.6 0.47 ± 59% perf-profile.calltrace.cycles-pp.sched_clock.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter.do_idle >>> 1.10 ± 15% -0.4 0.66 ± 14% perf-profile.calltrace.cycles-pp.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry >>> 1.05 ± 16% -0.4 0.61 ± 14% perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock.sched_clock_cpu.cpuidle_enter_state.cpuidle_enter >>> 1.58 ± 4% +0.3 1.91 ± 7% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page >>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe >>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe >>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe >>> 2.11 ± 4% +0.5 2.60 ± 7% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.osq_lock.__mutex_lock.hugetlb_fault.handle_mm_fault >>> 0.83 ± 26% +0.5 1.32 ± 18% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe >>> 
>>> 0.83 ± 26% +0.5 1.32 ± 18% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>> 1.90 ± 5% +0.6 2.45 ± 7% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page.copy_subpage
>>> 0.65 ± 62% +0.6 1.20 ± 15% perf-profile.calltrace.cycles-pp.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault
>>> 0.60 ± 62% +0.6 1.16 ± 18% perf-profile.calltrace.cycles-pp.free_huge_page.release_pages.tlb_flush_mmu.tlb_finish_mmu.exit_mmap
>>> 0.95 ± 17% +0.6 1.52 ± 8% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner
>>> 0.61 ± 62% +0.6 1.18 ± 18% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.exit_mmap.mmput
>>> 0.61 ± 62% +0.6 1.19 ± 19% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.exit_mmap.mmput.do_exit.do_group_exit
>>> 0.61 ± 62% +0.6 1.19 ± 19% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.exit_mmap.mmput.do_exit
>>> 0.64 ± 61% +0.6 1.23 ± 18% perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
>>> 0.64 ± 61% +0.6 1.23 ± 18% perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__x64_sys_exit_group
>>> 1.30 ± 9% +0.6 1.92 ± 8% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock
>>> 0.19 ±173% +0.7 0.89 ± 20% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_huge_page.release_pages.tlb_flush_mmu
>>> 0.19 ±173% +0.7 0.90 ± 20% perf-profile.calltrace.cycles-pp._raw_spin_lock.free_huge_page.release_pages.tlb_flush_mmu.tlb_finish_mmu
>>> 0.00 +0.8 0.77 ± 30% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page
>>> 0.00 +0.8 0.78 ± 30% perf-profile.calltrace.cycles-pp._raw_spin_lock.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page
>>> 0.00 +0.8 0.79 ± 29% perf-profile.calltrace.cycles-pp.prep_new_huge_page.alloc_fresh_huge_page.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow
>>> 0.82 ± 67% +0.9 1.72 ± 22% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_huge_page.hugetlb_cow.hugetlb_fault
>>> 0.84 ± 66% +0.9 1.74 ± 20% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow
>>> 2.52 ± 6% +0.9 3.44 ± 9% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.copy_page.copy_subpage.copy_user_huge_page
>>> 0.83 ± 67% +0.9 1.75 ± 21% perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault
>>> 0.84 ± 66% +0.9 1.77 ± 20% perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault
>>> 1.64 ± 12% +1.0 2.67 ± 7% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock.hugetlb_fault
>>> 1.65 ± 45% +1.3 2.99 ± 18% perf-profile.calltrace.cycles-pp.alloc_surplus_huge_page.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault
>>> 1.74 ± 13% +1.4 3.16 ± 6% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.mutex_spin_on_owner.__mutex_lock.hugetlb_fault.handle_mm_fault
>>> 2.56 ± 48% +2.2 4.81 ± 19% perf-profile.calltrace.cycles-pp.alloc_huge_page.hugetlb_cow.hugetlb_fault.handle_mm_fault.__do_page_fault
>>> 12.64 ± 14% +3.6 16.20 ± 8% perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.hugetlb_fault.handle_mm_fault.__do_page_fault
>>> 2.97 ± 7% +3.8 6.74 ± 9% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.copy_page.copy_subpage.copy_user_huge_page.hugetlb_cow
>>> 19.99 ± 9% +4.1 24.05 ± 6% perf-profile.calltrace.cycles-pp.hugetlb_cow.hugetlb_fault.handle_mm_fault.__do_page_fault.do_page_fault
>>> 1.37 ± 15% -0.5 0.83 ± 13% perf-profile.children.cycles-pp.sched_clock_cpu
>>> 1.31 ± 16% -0.5 0.78 ± 13% perf-profile.children.cycles-pp.sched_clock
>>> 1.29 ± 16% -0.5 0.77 ± 13% perf-profile.children.cycles-pp.native_sched_clock
>>> 1.80 ± 2% -0.3 1.47 ± 10% perf-profile.children.cycles-pp.task_tick_fair
>>> 0.73 ± 2% -0.2 0.54 ± 11% perf-profile.children.cycles-pp.update_curr
>>> 0.42 ± 17% -0.2 0.27 ± 16% perf-profile.children.cycles-pp.account_process_tick
>>> 0.73 ± 10% -0.2 0.58 ± 9% perf-profile.children.cycles-pp.rcu_sched_clock_irq
>>> 0.27 ± 6% -0.1 0.14 ± 14% perf-profile.children.cycles-pp.__acct_update_integrals
>>> 0.27 ± 18% -0.1 0.16 ± 13% perf-profile.children.cycles-pp.rcu_segcblist_ready_cbs
>>> 0.40 ± 12% -0.1 0.30 ± 14% perf-profile.children.cycles-pp.__next_timer_interrupt
>>> 0.47 ± 7% -0.1 0.39 ± 13% perf-profile.children.cycles-pp.update_rq_clock
>>> 0.29 ± 12% -0.1 0.21 ± 15% perf-profile.children.cycles-pp.cpuidle_governor_latency_req
>>> 0.21 ± 7% -0.1 0.14 ± 12% perf-profile.children.cycles-pp.account_system_index_time
>>> 0.38 ± 2% -0.1 0.31 ± 12% perf-profile.children.cycles-pp.timerqueue_add
>>> 0.26 ± 11% -0.1 0.20 ± 13% perf-profile.children.cycles-pp.find_next_bit
>>> 0.23 ± 15% -0.1 0.17 ± 15% perf-profile.children.cycles-pp.rcu_dynticks_eqs_exit
>>> 0.14 ± 8% -0.1 0.07 ± 14% perf-profile.children.cycles-pp.account_user_time
>>> 0.17 ± 6% -0.0 0.12 ± 10% perf-profile.children.cycles-pp.cpuacct_charge
>>> 0.18 ± 20% -0.0 0.13 ± 3% perf-profile.children.cycles-pp.irq_work_tick
>>> 0.11 ± 13% -0.0 0.07 ± 25% perf-profile.children.cycles-pp.tick_sched_do_timer
>>> 0.12 ± 10% -0.0 0.08 ± 15% perf-profile.children.cycles-pp.get_cpu_device
>>> 0.07 ± 11% -0.0 0.04 ± 58% perf-profile.children.cycles-pp.raise_softirq
>>> 0.12 ± 3% -0.0 0.09 ± 8% perf-profile.children.cycles-pp.write
>>> 0.11 ± 13% +0.0 0.14 ± 8% perf-profile.children.cycles-pp.native_write_msr
>>> 0.09 ± 9% +0.0 0.11 ± 7% perf-profile.children.cycles-pp.finish_task_switch
>>> 0.10 ± 10% +0.0 0.13 ± 5% perf-profile.children.cycles-pp.schedule_idle
>>> 0.07 ± 6% +0.0 0.10 ± 12% perf-profile.children.cycles-pp.__read_nocancel
>>> 0.04 ± 58% +0.0 0.07 ± 15% perf-profile.children.cycles-pp.__free_pages_ok
>>> 0.06 ± 7% +0.0 0.09 ± 13% perf-profile.children.cycles-pp.perf_read
>>> 0.07 +0.0 0.11 ± 14% perf-profile.children.cycles-pp.perf_evsel__read_counter
>>> 0.07 +0.0 0.11 ± 13% perf-profile.children.cycles-pp.cmd_stat
>>> 0.07 +0.0 0.11 ± 13% perf-profile.children.cycles-pp.__run_perf_stat
>>> 0.07 +0.0 0.11 ± 13% perf-profile.children.cycles-pp.process_interval
>>> 0.07 +0.0 0.11 ± 13% perf-profile.children.cycles-pp.read_counters
>>> 0.07 ± 22% +0.0 0.11 ± 19% perf-profile.children.cycles-pp.__handle_mm_fault
>>> 0.07 ± 19% +0.1 0.13 ± 8% perf-profile.children.cycles-pp.rb_erase
>>> 0.03 ±100% +0.1 0.09 ± 9% perf-profile.children.cycles-pp.smp_call_function_single
>>> 0.01 ±173% +0.1 0.08 ± 11% perf-profile.children.cycles-pp.perf_event_read
>>> 0.00 +0.1 0.07 ± 13% perf-profile.children.cycles-pp.__perf_event_read_value
>>> 0.00 +0.1 0.07 ± 7% perf-profile.children.cycles-pp.__intel_pmu_enable_all
>>> 0.08 ± 17% +0.1 0.15 ± 8% perf-profile.children.cycles-pp.native_apic_msr_eoi_write
>>> 0.04 ±103% +0.1 0.13 ± 58% perf-profile.children.cycles-pp.shmem_getpage_gfp
>>> 0.38 ± 14% +0.1 0.51 ± 6% perf-profile.children.cycles-pp.run_timer_softirq
>>> 0.11 ± 4% +0.3 0.37 ± 32% perf-profile.children.cycles-pp.worker_thread
>>> 0.20 ± 5% +0.3 0.48 ± 25% perf-profile.children.cycles-pp.ret_from_fork
>>> 0.20 ± 4% +0.3 0.48 ± 25% perf-profile.children.cycles-pp.kthread
>>> 0.00 +0.3 0.29 ± 38% perf-profile.children.cycles-pp.memcpy_erms
>>> 0.00 +0.3 0.29 ± 38% perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
>>> 0.00 +0.3 0.31 ± 37% perf-profile.children.cycles-pp.process_one_work
>>> 0.47 ± 48% +0.4 0.91 ± 19% perf-profile.children.cycles-pp.prep_new_huge_page
>>> 0.70 ± 29% +0.5 1.16 ± 18% perf-profile.children.cycles-pp.free_huge_page
>>> 0.73 ± 29% +0.5 1.19 ± 18% perf-profile.children.cycles-pp.tlb_flush_mmu
>>> 0.72 ± 29% +0.5 1.18 ± 18% perf-profile.children.cycles-pp.release_pages
>>> 0.73 ± 29% +0.5 1.19 ± 18% perf-profile.children.cycles-pp.tlb_finish_mmu
>>> 0.76 ± 27% +0.5 1.23 ± 18% perf-profile.children.cycles-pp.exit_mmap
>>> 0.77 ± 27% +0.5 1.24 ± 18% perf-profile.children.cycles-pp.mmput
>>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.children.cycles-pp.__x64_sys_exit_group
>>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.children.cycles-pp.do_group_exit
>>> 0.79 ± 26% +0.5 1.27 ± 18% perf-profile.children.cycles-pp.do_exit
>>> 1.28 ± 29% +0.5 1.76 ± 9% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
>>> 0.77 ± 28% +0.5 1.26 ± 13% perf-profile.children.cycles-pp.alloc_fresh_huge_page
>>> 1.53 ± 15% +0.7 2.26 ± 14% perf-profile.children.cycles-pp.do_syscall_64
>>> 1.53 ± 15% +0.7 2.27 ± 14% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>>> 1.13 ± 3% +0.9 2.07 ± 14% perf-profile.children.cycles-pp.interrupt_entry
>>> 0.79 ± 9% +1.0 1.76 ± 5% perf-profile.children.cycles-pp.perf_event_task_tick
>>> 1.71 ± 39% +1.4 3.08 ± 16% perf-profile.children.cycles-pp.alloc_surplus_huge_page
>>> 2.66 ± 42% +2.3 4.94 ± 17% perf-profile.children.cycles-pp.alloc_huge_page
>>> 2.89 ± 45% +2.7 5.54 ± 18% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>>> 3.34 ± 35% +2.7 6.02 ± 17% perf-profile.children.cycles-pp._raw_spin_lock
>>> 12.77 ± 14% +3.9 16.63 ± 7% perf-profile.children.cycles-pp.mutex_spin_on_owner
>>> 20.12 ± 9% +4.0 24.16 ± 6% perf-profile.children.cycles-pp.hugetlb_cow
>>> 15.40 ± 10% -3.6 11.84 ± 28% perf-profile.self.cycles-pp.do_rw_once
>>> 4.02 ± 9% -1.3 2.73 ± 30% perf-profile.self.cycles-pp.do_access
>>> 2.00 ± 14% -0.6 1.41 ± 13% perf-profile.self.cycles-pp.cpuidle_enter_state
>>> 1.26 ± 16% -0.5 0.74 ± 13% perf-profile.self.cycles-pp.native_sched_clock
>>> 0.42 ± 17% -0.2 0.27 ± 16% perf-profile.self.cycles-pp.account_process_tick
>>> 0.27 ± 19% -0.2 0.12 ± 17% perf-profile.self.cycles-pp.timerqueue_del
>>> 0.53 ± 3% -0.1 0.38 ± 11% perf-profile.self.cycles-pp.update_curr
>>> 0.27 ± 6% -0.1 0.14 ± 14% perf-profile.self.cycles-pp.__acct_update_integrals
>>> 0.27 ± 18% -0.1 0.16 ± 13% perf-profile.self.cycles-pp.rcu_segcblist_ready_cbs
>>> 0.61 ± 4% -0.1 0.51 ± 8% perf-profile.self.cycles-pp.task_tick_fair
>>> 0.20 ± 8% -0.1 0.12 ± 14% perf-profile.self.cycles-pp.account_system_index_time
>>> 0.23 ± 15% -0.1 0.16 ± 17% perf-profile.self.cycles-pp.rcu_dynticks_eqs_exit
>>> 0.25 ± 11% -0.1 0.18 ± 14% perf-profile.self.cycles-pp.find_next_bit
>>> 0.10 ± 11% -0.1 0.03 ±100% perf-profile.self.cycles-pp.tick_sched_do_timer
>>> 0.29 -0.1 0.23 ± 11% perf-profile.self.cycles-pp.timerqueue_add
>>> 0.12 ± 10% -0.1 0.06 ± 17% perf-profile.self.cycles-pp.account_user_time
>>> 0.22 ± 15% -0.1 0.16 ± 6% perf-profile.self.cycles-pp.scheduler_tick
>>> 0.17 ± 6% -0.0 0.12 ± 10% perf-profile.self.cycles-pp.cpuacct_charge
>>> 0.18 ± 20% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.irq_work_tick
>>> 0.07 ± 13% -0.0 0.03 ±100% perf-profile.self.cycles-pp.update_process_times
>>> 0.12 ± 7% -0.0 0.08 ± 15% perf-profile.self.cycles-pp.get_cpu_device
>>> 0.07 ± 11% -0.0 0.04 ± 58% perf-profile.self.cycles-pp.raise_softirq
>>> 0.12 ± 11% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.tick_nohz_get_sleep_length
>>> 0.11 ± 11% +0.0 0.14 ± 6% perf-profile.self.cycles-pp.native_write_msr
>>> 0.10 ± 5% +0.1 0.15 ± 8% perf-profile.self.cycles-pp.__remove_hrtimer
>>> 0.07 ± 23% +0.1 0.13 ± 8% perf-profile.self.cycles-pp.rb_erase
>>> 0.08 ± 17% +0.1 0.15 ± 7% perf-profile.self.cycles-pp.native_apic_msr_eoi_write
>>> 0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp.smp_call_function_single
>>> 0.32 ± 17% +0.1 0.42 ± 7% perf-profile.self.cycles-pp.run_timer_softirq
>>> 0.22 ± 5% +0.1 0.34 ± 4% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
>>> 0.45 ± 15% +0.2 0.60 ± 12% perf-profile.self.cycles-pp.rcu_irq_enter
>>> 0.31 ± 8% +0.2 0.46 ± 16% perf-profile.self.cycles-pp.irq_enter
>>> 0.29 ± 10% +0.2 0.44 ± 16% perf-profile.self.cycles-pp.apic_timer_interrupt
>>> 0.71 ± 30% +0.2 0.92 ± 8% perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
>>> 0.00 +0.3 0.28 ± 37% perf-profile.self.cycles-pp.memcpy_erms
>>> 1.12 ± 3% +0.9 2.02 ± 15% perf-profile.self.cycles-pp.interrupt_entry
>>> 0.79 ± 9% +0.9 1.73 ± 5% perf-profile.self.cycles-pp.perf_event_task_tick
>>> 2.49 ± 45% +2.1 4.55 ± 20% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>>> 10.95 ± 15% +2.7 13.61 ± 8% perf-profile.self.cycles-pp.mutex_spin_on_owner
>>>
>>>
>>> [ASCII plot: vm-scalability.throughput, y-axis 0 to 1.6e+07]
>>>
>>> [ASCII plot: vm-scalability.time.minor_page_faults, y-axis 0 to 2.5e+06]
>>>
>>> [ASCII plot: vm-scalability.workload, y-axis 0 to 3.5e+09]
>>>
>>> [*] bisect-good sample
>>> [O] bisect-bad sample
>>>
>>>
>>> Disclaimer:
>>> Results have been estimated based on internal Intel analysis and are provided
>>> for informational purposes only. Any difference in system hardware or software
>>> design or configuration may affect actual performance.
>>>
>>>
>>> Thanks,
>>> Rong Chen
>>>
>>
>> --
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
>> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
>> HRB 21284 (AG Nürnberg)
>>
>
>

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)