From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753483AbbGNVIO (ORCPT ); Tue, 14 Jul 2015 17:08:14 -0400
Received: from userp1040.oracle.com ([156.151.31.81]:29873 "EHLO
	userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752712AbbGNVIN (ORCPT );
	Tue, 14 Jul 2015 17:08:13 -0400
Message-ID: <55A579FD.6030000@oracle.com>
Date: Tue, 14 Jul 2015 17:07:09 -0400
From: Boris Ostrovsky
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Thomas Gleixner
CC: Yanmin Zhang, Joerg Roedel, Peter Zijlstra, LKML, xiao jin,
	Peter Anvin, xen-devel, Borislav Petkov, Ingo Molnar
Subject: Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down
References: <20150705170530.849428850@linutronix.de> <20150705171102.063519515@linutronix.de> <55A51F10.7010407@oracle.com> <55A532C2.4080306@oracle.com> <55A56B48.4060605@oracle.com>
In-Reply-To:
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/14/2015 04:15 PM, Thomas Gleixner wrote:
> On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
>> On 07/14/2015 01:32 PM, Thomas Gleixner wrote:
>>> On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
>>>> On 07/14/2015 11:44 AM, Thomas Gleixner wrote:
>>>>> On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
>>>>>>> Prevent allocation and freeing of interrupt descriptors accross cpu
>>>>>>> hotplug.
>>>>>> This breaks Xen guests that allocate interrupt descriptors in
>>>>>> .cpu_up().
>>>>> And where exactly does XEN allocate those descriptors?
>>>> xen_cpu_up()
>>>>     xen_setup_timer()
>>>>         bind_virq_to_irqhandler()
>>>>             bind_virq_to_irq()
>>>>                 xen_allocate_irq_dynamic()
>>>>                     xen_allocate_irqs_dynamic()
>>>>                         irq_alloc_descs()
>>>>
>>>>
>>>> There is also a similar pass via xen_cpu_up() -> xen_smp_intr_init()
>>> Sigh.
>>>
>>>>>
>>>>>> Any chance this locking can be moved into arch code?
>>>>> No.
>>> The issue here is that all architectures need that protection and just
>>> Xen does irq allocations in cpu_up.
>>>
>>> So moving that protection into architecture code is not really an
>>> option.
>>>
>>>>>> Otherwise we will need to have something like arch_post_cpu_up()
>>>>>> after the lock is released.
>>> I'm not sure, that this will work. You probably want to do this in the
>>> cpu prepare stage, i.e. before calling __cpu_up().
>> For PV guests (the ones that use xen_cpu_up()) it will work either before or
>> after __cpu_up(). At least my (somewhat limited) testing didn't show any
>> problems so far.
>>
>> However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
>> see that xen_smp_intr_init() needs to be called before native_cpu_up() but
>> xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
>> called after.
>>
>> I think I can split xen_init_lock_cpu() so that the part that needs to be
>> called after will avoid going into irq core code. And then the rest will go
>> into arch_cpu_prepare().
> I think we should revisit this for 4.3. For 4.2 we can do the trivial
> variant and move the locking in native_cpu_up() and x86 only. x86 was
> the only arch on which such wreckage has been seen in the wild, but we
> should have that protection for all archs in the long run.
>
> Patch below should fix the issue.

Thanks! Most of my tests passed, I had a couple of failures but I will
need to see whether they are related to this patch.

-boris