From: "Huang, Ying"
To: "Kirill A. Shutemov"
Cc: kernel test robot, Rik van Riel, Michal Hocko, LKML, Minchan Kim, Vinayak Menon, Mel Gorman, Andrew Morton, Linus Torvalds
Subject: Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression
Date: Wed, 08 Jun 2016 15:21:56 +0800
Message-ID: <87a8iw5enf.fsf@yhuang-dev.intel.com>
In-Reply-To: <20160606095136.GA79951@black.fi.intel.com> (Kirill A. Shutemov's message of "Mon, 6 Jun 2016 12:51:36 +0300")
References: <20160606022724.GA26227@yexl-desktop> <20160606095136.GA79951@black.fi.intel.com>

"Kirill A. Shutemov" writes:

> On Mon, Jun 06, 2016 at 10:27:24AM +0800, kernel test robot wrote:
>>
>> FYI, we noticed a -6.3% regression of unixbench.score due to commit:
>>
>> commit 5c0a85fad949212b3e059692deecdeed74ae7ec7 ("mm: make faultaround produce old ptes")
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: unixbench
>> on test machine: lituya: 16 threads Haswell High-end Desktop (i7-5960X 3.0G) with 16G memory
>> with following parameters: cpufreq_governor=performance/nr_task=1/test=shell8
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench
>>
>> commit:
>>   4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
>>   5c0a85fad949212b3e059692deecdeed74ae7ec7
>>
>> 4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de
>> ---------------- --------------------------
>>        fail:runs  %reproduction    fail:runs
>>            |            |              |
>>          3:4          -75%             :4     kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
>>          %stddev     %change          %stddev
>>              \          |                 \
>>      14321 ±  0%      -6.3%      13425 ±  0%  unixbench.score
>>    1996897 ±  0%      -6.1%    1874635 ±  0%  unixbench.time.involuntary_context_switches
>>  1.721e+08 ±  0%      -6.2%  1.613e+08 ±  0%  unixbench.time.minor_page_faults
>>     758.65 ±  0%      -3.0%     735.86 ±  0%  unixbench.time.system_time
>>     387.66 ±  0%      +5.4%     408.49 ±  0%  unixbench.time.user_time
>>    5950278 ±  0%      -6.2%    5583456 ±  0%  unixbench.time.voluntary_context_switches
>
> That's weird.
>
> I don't understand why the change would reduce the number of minor faults.
> It should stay the same on x86-64.
>
> Rise of user_time is puzzling too.

unixbench runs in fixed-time mode. That is, the total time to run unixbench is fixed, but the amount of work done varies. So the change in minor_page_faults may only reflect the change in the amount of work done.
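To make that concrete, here is a quick back-of-the-envelope check (my own arithmetic, just dividing the counters above by unixbench.score to normalize them to the work done; "parent" and "commit" below are only shorthand for 4b50bcc7eda4d3cc and 5c0a85fad949212b):

    # Normalize the absolute counters by unixbench.score
    # (the work done within the fixed run time).
    # Numbers are copied from the table above.
    parent = {"score": 14321, "minor_page_faults": 1.721e8,
              "voluntary_context_switches": 5950278}
    commit = {"score": 13425, "minor_page_faults": 1.613e8,
              "voluntary_context_switches": 5583456}

    for key in ("minor_page_faults", "voluntary_context_switches"):
        print(key, parent[key] / parent["score"], commit[key] / commit["score"])
    # minor_page_faults:          ~12017 vs. ~12015 per score unit
    # voluntary_context_switches: ~415.5 vs. ~415.9 per score unit

Both per-work rates are essentially unchanged between the two commits, which is consistent with the absolute counts simply tracking the amount of work done.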
> Hm. Is it reproducible? Across reboots?

Yes. LKP runs every benchmark in a freshly rebooted system (rebooted via kexec). We ran the test 3 times each for the commit and its parent, and the result is quite stable: the standard deviation (in percent) is near 0 across the runs.

Here is another comparison, this time with profile data:

=========================================================================================
compiler/cpufreq_governor/debug-setup/kconfig/nr_task/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/profile/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench

commit:
  4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
  5c0a85fad949212b3e059692deecdeed74ae7ec7

4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
     14056 ±  0%      -6.3%      13172 ±  0%  unixbench.score
   6464046 ±  0%      -6.1%    6071922 ±  0%  unixbench.time.involuntary_context_switches
 5.555e+08 ±  0%      -6.2%  5.211e+08 ±  0%  unixbench.time.minor_page_faults
      2537 ±  0%      -3.2%       2455 ±  0%  unixbench.time.system_time
      1284 ±  0%      +5.8%       1359 ±  0%  unixbench.time.user_time
  19192611 ±  0%      -6.2%   18010830 ±  0%  unixbench.time.voluntary_context_switches
   7709931 ±  0%     -11.0%    6860574 ±  0%  cpuidle.C1-HSW.usage
      6900 ±  1%     -43.9%       3871 ±  0%  proc-vmstat.nr_active_file
     40813 ±  1%     -77.9%       9015 ±114%  softirqs.NET_RX
    111331 ±  1%     -13.3%      96503 ±  0%  meminfo.Active
     27603 ±  1%     -43.9%      15486 ±  0%  meminfo.Active(file)
     93169 ±  0%      -5.8%      87766 ±  0%  vmstat.system.cs
     19768 ±  0%      -1.7%      19437 ±  0%  vmstat.system.in
      6.22 ±  0%     +10.3%       6.86 ±  0%  turbostat.CPU%c3
      0.02 ± 20%     -85.7%       0.00 ±141%  turbostat.Pkg%pc3
     68.99 ±  0%      -1.7%      67.84 ±  0%  turbostat.PkgWatt
      1.38 ±  5%     -42.0%       0.80 ±  5%  perf-profile.cycles-pp.page_remove_rmap.unmap_page_range.unmap_single_vma.unmap_vmas.exit_mmap
      0.83 ±  4%     +28.8%       1.07 ± 21%  perf-profile.cycles-pp.release_pages.free_pages_and_swap_cache.tlb_flush_mmu_free.tlb_finish_mmu.exit_mmap
      1.55 ±  3%     -10.6%       1.38 ±  2%  perf-profile.cycles-pp.unmap_single_vma.unmap_vmas.exit_mmap.mmput.flush_old_exec
      1.59 ±  3%      -9.8%       1.44 ±  3%  perf-profile.cycles-pp.unmap_vmas.exit_mmap.mmput.flush_old_exec.load_elf_binary
    389.00 ±  0%     +32.1%     514.00 ±  8%  slabinfo.file_lock_cache.active_objs
    389.00 ±  0%     +32.1%     514.00 ±  8%  slabinfo.file_lock_cache.num_objs
      7075 ±  3%     -17.7%       5823 ±  7%  slabinfo.pid.active_objs
      7075 ±  3%     -17.7%       5823 ±  7%  slabinfo.pid.num_objs
      0.67 ± 34%     +86.4%       1.24 ± 30%  sched_debug.cfs_rq:/.runnable_load_avg.min
     -9013 ± -1%     +14.4%     -10315 ± -9%  sched_debug.cfs_rq:/.spread0.avg
     83127 ±  5%     +16.9%      97163 ±  8%  sched_debug.cpu.avg_idle.min
     17777 ± 16%     +66.6%      29608 ± 22%  sched_debug.cpu.curr->pid.avg
     50223 ± 10%     +49.3%      74974 ±  0%  sched_debug.cpu.curr->pid.max
     22281 ± 13%     +51.8%      33816 ±  6%  sched_debug.cpu.curr->pid.stddev
    251.79 ±  5%     -13.8%     217.15 ±  5%  sched_debug.cpu.nr_uninterruptible.max
   -261.12 ± -2%     -13.4%    -226.03 ± -1%  sched_debug.cpu.nr_uninterruptible.min
    221.14 ±  3%     -14.7%     188.60 ±  1%  sched_debug.cpu.nr_uninterruptible.stddev
  1.94e+11 ±  0%      -5.8%  1.827e+11 ±  0%  perf-stat.L1-dcache-load-misses
 3.496e+12 ±  0%      -6.5%  3.268e+12 ±  0%  perf-stat.L1-dcache-loads
 2.262e+12 ±  1%      -5.5%  2.137e+12 ±  0%  perf-stat.L1-dcache-stores
 9.711e+10 ±  0%      -3.7%  9.353e+10 ±  0%  perf-stat.L1-icache-load-misses
 8.051e+08 ±  0%      -8.8%  7.343e+08 ±  1%  perf-stat.LLC-load-misses
 7.184e+10 ±  1%      -5.6%   6.78e+10 ±  0%  perf-stat.LLC-loads
 5.867e+08 ±  2%      -7.0%  5.456e+08 ±  0%  perf-stat.LLC-store-misses
 1.524e+10 ±  1%      -5.6%  1.438e+10 ±  0%  perf-stat.LLC-stores
 2.711e+12 ±  0%      -6.3%  2.539e+12 ±  0%  perf-stat.branch-instructions
 5.948e+10 ±  0%      -3.9%  5.715e+10 ±  0%  perf-stat.branch-load-misses
 2.715e+12 ±  0%      -6.4%  2.542e+12 ±  0%  perf-stat.branch-loads
 5.947e+10 ±  0%      -3.9%  5.713e+10 ±  0%  perf-stat.branch-misses
 1.448e+09 ±  0%      -9.3%  1.313e+09 ±  1%  perf-stat.cache-misses
 1.931e+11 ±  0%      -5.8%  1.818e+11 ±  0%  perf-stat.cache-references
  58882705 ±  0%      -5.8%   55467522 ±  0%  perf-stat.context-switches
  17037466 ±  0%      -6.1%   15999111 ±  0%  perf-stat.cpu-migrations
 6.732e+09 ±  1%     +90.7%   1.284e+10 ±  0%  perf-stat.dTLB-load-misses
 3.474e+12 ±  0%      -6.6%  3.245e+12 ±  0%  perf-stat.dTLB-loads
 1.215e+09 ±  0%      -5.5%  1.149e+09 ±  0%  perf-stat.dTLB-store-misses
 2.286e+12 ±  0%      -5.8%  2.153e+12 ±  0%  perf-stat.dTLB-stores
 3.511e+09 ±  0%     +20.4%  4.226e+09 ±  0%  perf-stat.iTLB-load-misses
 2.317e+09 ±  0%      -6.8%   2.16e+09 ±  0%  perf-stat.iTLB-loads
 1.343e+13 ±  0%      -6.0%  1.263e+13 ±  0%  perf-stat.instructions
 5.504e+08 ±  0%      -6.2%  5.163e+08 ±  0%  perf-stat.minor-faults
  8.09e+08 ±  1%      -9.0%   7.36e+08 ±  1%  perf-stat.node-loads
 5.932e+08 ±  0%      -8.7%  5.417e+08 ±  1%  perf-stat.node-stores
 5.504e+08 ±  0%      -6.2%  5.163e+08 ±  0%  perf-stat.page-faults

Best Regards,
Huang, Ying