From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751827AbdKJGJs (ORCPT ); Fri, 10 Nov 2017 01:09:48 -0500
Received: from mga02.intel.com ([134.134.136.20]:46849 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750995AbdKJGJr (ORCPT ); Fri, 10 Nov 2017 01:09:47 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,372,1505804400"; d="scan'208";a="1242374819"
Subject: Re: [01/18] x86/asm/64: Remove the restore_c_regs_and_iret label
To: Andrew Lutomirski, X86 ML
Cc: Borislav Petkov, "linux-kernel@vger.kernel.org", Brian Gerst, Dave Hansen, Linus Torvalds
References: <480b1efbf2152c090512eca4db8b5894019c0535.1509006199.git.luto@kernel.org>
From: kemi
Message-ID: <88a7a33f-6d44-4b34-e702-f15456bc276d@intel.com>
Date: Fri, 10 Nov 2017 14:08:04 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <480b1efbf2152c090512eca4db8b5894019c0535.1509006199.git.luto@kernel.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

LKP-tools reports some performance regressions and improvements for this
patch series when tested on an Intel Atom processor. I am posting the data
here for your reference.

Branch: x86/entry_consolidation
Commit id:
    base: 50da9d439392fdd91601d36e7f05728265bff262
    head: 69af865668fdb86a95e4e948b1f48b2689d60b73
Benchmark suite: will-it-scale
Download link: https://github.com/antonblanchard/will-it-scale/tree/master/tests
Metrics:
    will-it-scale.per_process_ops=processes/nr_cpu
    will-it-scale.per_thread_ops=threads/nr_cpu
tbox: lkp-avoton3 (nr_cpu=8, memory=16G)
CPU: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz

Performance regression with will-it-scale benchmark suite:
testcase          base      change  head      metric
eventfd1          1505677   -5.9%   1416132   will-it-scale.per_process_ops
                  1352716   -3.0%   1311943   will-it-scale.per_thread_ops
lseek2            7306698   -4.3%   6991473   will-it-scale.per_process_ops
                  4906388   -3.6%   4730531   will-it-scale.per_thread_ops
lseek1            7355365   -4.2%   7046224   will-it-scale.per_process_ops
                  4928961   -3.7%   4748791   will-it-scale.per_thread_ops
getppid1          8479806   -4.1%   8129026   will-it-scale.per_process_ops
                  8515252   -4.1%   8162076   will-it-scale.per_thread_ops
lock1             1054249   -3.2%   1020895   will-it-scale.per_process_ops
                  989145    -2.6%   963578    will-it-scale.per_thread_ops
dup1              2675825   -3.0%   2596257   will-it-scale.per_process_ops
futex3            4986520   -2.8%   4846640   will-it-scale.per_process_ops
                  5009388   -2.7%   4875126   will-it-scale.per_thread_ops
futex4            3932936   -2.0%   3854240   will-it-scale.per_process_ops
                  3950138   -2.0%   3872615   will-it-scale.per_thread_ops
futex1            2941886   -1.8%   2888912   will-it-scale.per_process_ops
futex2            2500203   -1.6%   2461065   will-it-scale.per_process_ops
                  1534692   -2.3%   1499532   will-it-scale.per_thread_ops
malloc1           61314     -1.0%   60725     will-it-scale.per_process_ops
                  19996     -1.5%   19688     will-it-scale.per_thread_ops

Performance improvement with will-it-scale benchmark suite:
testcase          base      change  head      metric
context_switch1   176376    +1.6%   179152    will-it-scale.per_process_ops
                  180703    +1.9%   184209    will-it-scale.per_thread_ops
page_fault2       179716    +2.5%   184272    will-it-scale.per_process_ops
                  146890    +2.8%   150989    will-it-scale.per_thread_ops
page_fault3       666953    +3.7%   691735    will-it-scale.per_process_ops
                  464641    +5.0%   487952    will-it-scale.per_thread_ops
unix1             483094    +4.4%   504201    will-it-scale.per_process_ops
                  450055    +7.5%   483637    will-it-scale.per_thread_ops
read2             575887    +5.0%   604440    will-it-scale.per_process_ops
                  500319    +5.2%   526361    will-it-scale.per_thread_ops
poll1             4614597   +5.4%   4864022   will-it-scale.per_process_ops
                  3981551   +5.8%   4213409   will-it-scale.per_thread_ops
pwrite2           383344    +5.7%   405151    will-it-scale.per_process_ops
                  367006    +5.0%   385209    will-it-scale.per_thread_ops
sched_yield       3011191   +6.0%   3191710   will-it-scale.per_process_ops
                  3024171   +6.1%   3208197   will-it-scale.per_thread_ops
pipe1             755487    +6.2%   802622    will-it-scale.per_process_ops
                  705136    +8.8%   766950    will-it-scale.per_thread_ops
pwrite3           422850    +6.6%   450660    will-it-scale.per_process_ops
                  413370    +3.7%   428704    will-it-scale.per_thread_ops
readseek1         972102    +6.7%   1036852   will-it-scale.per_process_ops
                  844877    +6.6%   900686    will-it-scale.per_thread_ops
pwrite1           981310    +6.8%   1047809   will-it-scale.per_process_ops
                  944421    +5.7%   998472    will-it-scale.per_thread_ops
pread2            444743    +6.9%   475332    will-it-scale.per_process_ops
                  430299    +6.1%   456718    will-it-scale.per_thread_ops
writeseek1        849520    +7.0%   908672    will-it-scale.per_process_ops
                  746978    +9.3%   816372    will-it-scale.per_thread_ops
pread3            1108949   +7.2%   1189021   will-it-scale.per_process_ops
                  1088521   +5.5%   1148522   will-it-scale.per_thread_ops
mmap1             207314    +7.3%   222442    will-it-scale.per_process_ops
                  82533     +6.9%   88199     will-it-scale.per_thread_ops
writeseek3        377973    +7.4%   405853    will-it-scale.per_process_ops
                  333156    +11.4%  371100    will-it-scale.per_thread_ops
open2             266217    +7.6%   286335    will-it-scale.per_process_ops
                  208208    +6.6%   222052    will-it-scale.per_thread_ops
unlink2           54774     +7.7%   59013     will-it-scale.per_process_ops
                  53792     +7.0%   57584     will-it-scale.per_thread_ops
poll2             257458    +8.0%   278072    will-it-scale.per_process_ops
                  153400    +8.4%   166256    will-it-scale.per_thread_ops
posix_semaphore1  19898603  +8.3%   21552049  will-it-scale.per_process_ops
                  19797092  +8.4%   21458395  will-it-scale.per_thread_ops
pthread_mutex2    35871102  +8.4%   38868017  will-it-scale.per_process_ops
                  21506625  +8.4%   23312550  will-it-scale.per_thread_ops
mmap2             154242    +8.5%   167348    will-it-scale.per_process_ops
                  62234     +7.4%   66841     will-it-scale.per_thread_ops
unlink1           31487     +9.3%   34404     will-it-scale.per_process_ops
                  31607     +8.5%   34285     will-it-scale.per_thread_ops
open1             280301    +9.9%   307995    will-it-scale.per_process_ops
                  213863    +7.8%   230585    will-it-scale.per_thread_ops
signal1           355247    +11.2%  394875    will-it-scale.per_process_ops
                  176973    +9.7%   194160    will-it-scale.per_thread_ops

============================================================================================

Branch: x86/entry_consolidation
Commit id:
    base: 50da9d439392fdd91601d36e7f05728265bff262
    head: 69af865668fdb86a95e4e948b1f48b2689d60b73
Benchmark suite: unixbench
Download link: https://github.com/kdlucas/byte-unixbench.git
tbox: lkp-avoton2 (nr_cpu=8, memory=16G)
CPU: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz

Performance regression with unixbench benchmark suite:
testcase          base      change  head      metric
syscall           1206      -4.2%   1155      unixbench.score
pipe              4851      -1.5%   4779      unixbench.score
execl             498.83    -1.2%   492.90    unixbench.score

Performance improvement with unixbench benchmark suite:
testcase          base      change  head      metric
fsdisk            2150      +2.7%   2208      unixbench.score

============================================================================================

On 2017-10-26 16:26, Andrew Lutomirski wrote:
> The only user was the 64-bit opportunistic SYSRET failure path, and
> that path didn't really need it.  This change makes the
> opportunistic SYSRET code a bit more straightforward and gets rid of
> the label.
>
> Signed-off-by: Andy Lutomirski
> Reviewed-by: Borislav Petkov
> ---
>  arch/x86/entry/entry_64.S | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 49167258d587..afe1f403fa0e 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -245,7 +245,6 @@ entry_SYSCALL64_slow_path:
>  	call	do_syscall_64		/* returns with IRQs disabled */
>
>  return_from_SYSCALL_64:
> -	RESTORE_EXTRA_REGS
>  	TRACE_IRQS_IRETQ		/* we're about to change IF */
>
>  	/*
> @@ -314,6 +313,7 @@ return_from_SYSCALL_64:
>  	 */
>  syscall_return_via_sysret:
>  	/* rcx and r11 are already restored (see code above) */
> +	RESTORE_EXTRA_REGS
>  	RESTORE_C_REGS_EXCEPT_RCX_R11
>  	movq	RSP(%rsp), %rsp
>  	UNWIND_HINT_EMPTY
> @@ -321,7 +321,7 @@ syscall_return_via_sysret:
>
>  opportunistic_sysret_failed:
>  	SWAPGS
> -	jmp	restore_c_regs_and_iret
> +	jmp	restore_regs_and_iret
>  END(entry_SYSCALL_64)
>
>  ENTRY(stub_ptregs_64)
> @@ -638,7 +638,6 @@ retint_kernel:
>   */
>  GLOBAL(restore_regs_and_iret)
>  	RESTORE_EXTRA_REGS
> -restore_c_regs_and_iret:
>  	RESTORE_C_REGS
>  	REMOVE_PT_GPREGS_FROM_STACK 8
>  	INTERRUPT_RETURN
>
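
For anyone who wants to cross-check the numbers: the "change" column in the
tables above appears to be the relative difference between head and base,
i.e. (head - base) / base. Here is a minimal sketch in Python that recomputes
it for a few rows copied from the report. The helper name pct_change is made
up for illustration and is not part of LKP-tools, and the formula is an
assumption inferred from the data, not taken from the LKP-tools source.

```python
# Hypothetical helper (not part of LKP-tools): relative change of head
# versus base, in percent. The formula is an assumption inferred from the
# reported numbers.
def pct_change(base: float, head: float) -> float:
    return (head - base) / base * 100.0


# (testcase, metric, base, head, reported change) copied from the report.
rows = [
    ("eventfd1", "will-it-scale.per_process_ops", 1505677, 1416132, -5.9),
    ("signal1", "will-it-scale.per_process_ops", 355247, 394875, 11.2),
    ("syscall", "unixbench.score", 1206, 1155, -4.2),
    ("fsdisk", "unixbench.score", 2150, 2208, 2.7),
]

for name, metric, base, head, reported in rows:
    computed = round(pct_change(base, head), 1)
    print(f"{name:10s} {metric:32s} {computed:+.1f}% (reported {reported:+.1f}%)")
```

The four sampled rows reproduce the reported percentages after rounding to
one decimal place, which is consistent with the assumed formula.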