From mboxrd@z Thu Jan 1 00:00:00 1970
From: Waiman Long
Subject: Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support
Date: Wed, 29 Oct 2014 15:05:52 -0400
Message-ID: <54513A90.3080108__33532.9350470149$1414609669$gmane$org@hp.com>
References: <1413483040-58399-1-git-send-email-Waiman.Long@hp.com>
 <1413483040-58399-10-git-send-email-Waiman.Long@hp.com>
 <20141024085437.GV21513@worktop.programming.kicks-ass.net>
 <544E830C.6070307@hp.com>
 <20141027180439.GL3337@twins.programming.kicks-ass.net>
 <544EB79E.6020200@hp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) id 1XjYZH-0000s1-Bm
 for xen-devel@lists.xenproject.org; Wed, 29 Oct 2014 19:06:03 +0000
In-Reply-To: <544EB79E.6020200@hp.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Peter Zijlstra
Cc: linux-arch@vger.kernel.org, Raghavendra K T, Oleg Nesterov,
 kvm@vger.kernel.org, Scott J Norton, x86@kernel.org, Paolo Bonzini,
 linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org,
 Ingo Molnar, David Vrabel, "H. Peter Anvin", xen-devel@lists.xenproject.org,
 Thomas Gleixner, "Paul E. McKenney", Linus Torvalds, Boris Ostrovsky,
 Douglas Hatch
List-Id: xen-devel@lists.xenproject.org

On 10/27/2014 05:22 PM, Waiman Long wrote:
> On 10/27/2014 02:04 PM, Peter Zijlstra wrote:
>> On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote:
>>> On 10/24/2014 04:54 AM, Peter Zijlstra wrote:
>>>> On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote:
>>>>
>>>>> Since enabling paravirt spinlock will disable unlock function
>>>>> inlining, a jump label can be added to the unlock function without
>>>>> adding patch sites all over the kernel.
>>>> But you don't have to.
>>>> My patches allowed for the inline to remain, again reducing the
>>>> overhead of enabling PV spinlocks while running on a real machine.
>>>>
>>>> Look at:
>>>>
>>>> http://lkml.kernel.org/r/20140615130154.213923590@chello.nl
>>>>
>>>> In particular this hunk:
>>>>
>>>> Index: linux-2.6/arch/x86/kernel/paravirt_patch_64.c
>>>> ===================================================================
>>>> --- linux-2.6.orig/arch/x86/kernel/paravirt_patch_64.c
>>>> +++ linux-2.6/arch/x86/kernel/paravirt_patch_64.c
>>>> @@ -22,6 +22,10 @@ DEF_NATIVE(pv_cpu_ops, swapgs, "swapgs")
>>>>  DEF_NATIVE(, mov32, "mov %edi, %eax");
>>>>  DEF_NATIVE(, mov64, "mov %rdi, %rax");
>>>>
>>>> +#if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUE_SPINLOCK)
>>>> +DEF_NATIVE(pv_lock_ops, queue_unlock, "movb $0, (%rdi)");
>>>> +#endif
>>>> +
>>>>  unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len)
>>>>  {
>>>>  	return paravirt_patch_insns(insnbuf, len,
>>>> @@ -61,6 +65,9 @@ unsigned native_patch(u8 type, u16 clobb
>>>>  		PATCH_SITE(pv_cpu_ops, clts);
>>>>  		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
>>>>  		PATCH_SITE(pv_cpu_ops, wbinvd);
>>>> +#if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUE_SPINLOCK)
>>>> +		PATCH_SITE(pv_lock_ops, queue_unlock);
>>>> +#endif
>>>>
>>>>  	patch_site:
>>>>  		ret = paravirt_patch_insns(ibuf, len, start, end);
>>>>
>>>>
>>>> That makes sure to overwrite the callee-saved call to
>>>> pv_lock_ops::queue_unlock with the immediate asm "movb $0, (%rdi)".
>>>>
>>>> Therefore you can retain the inlined unlock with hardly any overhead
>>>> at all (there might be some NOP padding). On PV it reverts to a
>>>> callee-saved function call.
>>> My concern is that spin_unlock() can be called in many places,
>>> including loadable kernel modules. Can the paravirt_patch_ident_32()
>>> function patch all of them in a reasonable time? What about a kernel
>>> module loaded later at run time?
>> modules should be fine, see arch/x86/kernel/module.c:module_finalize()
>> -> apply_paravirt().
>>
>> Also note that the 'default' text is an indirect call into the paravirt
>> ops table which routes to the 'right' function, so even if the text
>> patching were 'late', calls would 'work' as expected, just slower.
>
> Thanks for letting me know about that. I had this concern because your
> patch didn't change the current configuration of disabling unlock
> inlining when paravirt_spinlock is enabled. With that, I think it is
> worthwhile to reduce the performance delta between the PV and non-PV
> kernels on bare metal.

I am sorry to report that the unlock call-site patching code doesn't work
in a virtual guest. Your pvqspinlock patch did the patching
unconditionally, even in a virtual guest. I added a check for
paravirt_spinlocks_enabled, but it turned out that some spin_unlock()
calls seem to happen before paravirt_spinlocks_enabled is set. As a
result, some call sites were still patched, resulting in missed wakeups
and a system hang. At this point, I am going to leave that change out of
my patch set until we can figure out a better way of doing it.

-Longman