From: Waiman Long <longman@redhat.com>
To: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
Cc: linux-kernel@vger.kernel.org, Waiman Long <longman@redhat.com>
Subject: [PATCH] x86, locking: Inline *_unlock_bh & *_unlock_irqrestore
Date: Thu, 2 Feb 2017 13:36:29 -0500
Message-Id: <1486060589-31572-1-git-send-email-longman@redhat.com>

For spinlocks and read-write locks, the *_unlock() and *_unlock_irq()
functions are inlined if !PREEMPT, whereas *_unlock_bh() and
*_unlock_irqrestore() are not, as inlining them would increase the size
of the kernel binary.

Given that the PV qspinlock unlock call is made through a callee-saved
function pointer, the unlock function is essentially a leaf function
under all circumstances if !PREEMPT. Inlining it enables the compiler
to do much better optimization around the unlock call sites. The same
applies to the read-write locks.

To unleash this additional performance, all these unlock functions are
now inlined for the x86-64 kernel, where kernel size is usually less of
a concern. For 32-bit kernels, it is assumed that the focus will be a
bit more on kernel size optimization, so those functions are left out
of line.

With the 4.9.6 kernel source and the gcc 4.8.5-11 compiler, the text
size of the base vmlinux binary increased from 8462235 bytes to 8527918
bytes, an increase of 0.78%. On a 2-socket, 24-core, 48-thread system,
the performance of the AIM7 high_systime workload (1000 users) increased
from 209818.43 jobs/min to 213333.33 jobs/min, an increase of 1.68%.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 arch/x86/Kconfig | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7b6fd68..a0fbed7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -23,6 +23,12 @@ config X86_64
 	depends on 64BIT
 	# Options that are inherently 64-bit kernel only:
 	select ARCH_HAS_GIGANTIC_PAGE
+	select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPT
+	select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPT
+	select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPT
+	select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPT
+	select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPT
+	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT
 	select ARCH_SUPPORTS_INT128
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
--
1.8.3.1
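
A note on the mechanism being toggled: each ARCH_INLINE_*_UNLOCK_*
option selected above enables, roughly, the corresponding
CONFIG_INLINE_*_UNLOCK_* option in kernel/Kconfig.locks, which in turn
makes include/linux/spinlock_api_smp.h (and rwlock_api_smp.h) map the
_raw_*_unlock_*() API onto its static inline __raw_*_unlock_*() body
instead of the out-of-line copy in kernel/locking/spinlock.c. Below is
a minimal, self-contained userspace sketch of that pattern; the macro
CONFIG_INLINE_UNLOCK_DEMO and all demo_* identifiers are illustrative,
not the kernel's own.

/*
 * Simplified analogue of the kernel's unlock-inlining switch (see
 * kernel/Kconfig.locks, include/linux/spinlock_api_smp.h and
 * kernel/locking/spinlock.c).  All names here are made up for the demo.
 *
 * Build either way:
 *   cc -O2 -DCONFIG_INLINE_UNLOCK_DEMO demo.c   # inlined unlock
 *   cc -O2 demo.c                               # out-of-line unlock
 */
#include <stdio.h>

struct demo_lock {
	int locked;
};

/* The "real" unlock body, always available for inlining. */
static inline void __demo_unlock(struct demo_lock *lock)
{
	lock->locked = 0;	/* stands in for do_raw_spin_unlock() etc. */
}

#ifdef CONFIG_INLINE_UNLOCK_DEMO
/*
 * ARCH_INLINE_*_UNLOCK_* selected: the API call maps straight onto the
 * inline body, so the compiler can optimize across the call site.
 */
#define demo_unlock(lock)	__demo_unlock(lock)
#else
/*
 * Otherwise the API call stays a real out-of-line function, which
 * keeps the binary smaller at the cost of a call/return per unlock.
 */
static void demo_unlock(struct demo_lock *lock)
{
	__demo_unlock(lock);
}
#endif

int main(void)
{
	struct demo_lock lock = { .locked = 1 };

	demo_unlock(&lock);
	printf("locked = %d\n", lock.locked);
	return 0;
}

Building with -DCONFIG_INLINE_UNLOCK_DEMO lets the compiler fold the
unlock body into its callers, mirroring what the selects above do for
the real unlock paths on x86-64 when !PREEMPT.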