From: Thomas Gleixner <tglx@linutronix.de>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>,
	Joerg Roedel <jroedel@suse.de>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	xiao jin <jin.xiao@intel.com>, Peter Anvin <hpa@zytor.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Borislav Petkov <bp@suse.de>, Ingo Molnar <mingo@kernel.org>
Subject: Re: [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down
Date: Tue, 14 Jul 2015 22:15:51 +0200 (CEST)
Message-ID: <alpine.DEB.2.11.1507142211390.18576__46493.5520584278$1436905097$gmane$org@nanos>
In-Reply-To: <55A56B48.4060605@oracle.com>

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> On 07/14/2015 01:32 PM, Thomas Gleixner wrote:
> > On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > > On 07/14/2015 11:44 AM, Thomas Gleixner wrote:
> > > > On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > > > > > > Prevent allocation and freeing of interrupt descriptors across cpu
> > > > > > hotplug.
> > > > > This breaks Xen guests that allocate interrupt descriptors in
> > > > > .cpu_up().
> > > > And where exactly does XEN allocate those descriptors?
> > > xen_cpu_up()
> > >      xen_setup_timer()
> > >          bind_virq_to_irqhandler()
> > >              bind_virq_to_irq()
> > >                  xen_allocate_irq_dynamic()
> > >                      xen_allocate_irqs_dynamic()
> > >                          irq_alloc_descs()
> > > 
> > > 
> > > > There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()
> > Sigh.
> >   
> > > >    
> > > > > Any chance this locking can be moved into arch code?
> > > > No.
> > The issue here is that all architectures need that protection and just
> > Xen does irq allocations in cpu_up.
> > 
> > So moving that protection into architecture code is not really an
> > option.
> > 
> > > > > Otherwise we will need to have something like arch_post_cpu_up()
> > > > > after the lock is released.
> > > I'm not sure that this will work. You probably want to do this in the
> > > cpu prepare stage, i.e. before calling __cpu_up().
> 
> For PV guests (the ones that use xen_cpu_up()) it will work either before or
> after __cpu_up(). At least my (somewhat limited) testing didn't show any
> problems so far.
> 
> However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
> see that xen_smp_intr_init() needs to be called before native_cpu_up() but
> xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
> called after.
> 
> I think I can split xen_init_lock_cpu() so that the part that needs to be
> called after will avoid going into irq core code. And then the rest will go
> into arch_cpu_prepare().

I think we should revisit this for 4.3. For 4.2 we can do the trivial
variant and move the locking into native_cpu_up(), i.e. x86 only. x86 is
the only arch on which such wreckage has been seen in the wild, but we
should have that protection for all archs in the long run.

Patch below should fix the issue.

Thanks,

	tglx
---
commit d4a969314077914a623f3e2c5120cd2ef31aba30
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue Jul 14 22:03:57 2015 +0200

    genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now
    
    Boris reported that the sparse_irq protection around __cpu_up() in the
    generic code causes a regression on Xen. Xen allocates interrupt
    descriptors in the xen_cpu_up() function, so it deadlocks on the
    sparse_irq_lock.
    
    There is no simple fix for this and we really should have the
    protection for all architectures, but for now the only solution is to
    move it to x86 where actual wreckage due to the lack of protection has
    been observed.
    
    Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Fixes: a89941816726 'hotplug: Prevent alloc/free of irq descriptors during cpu up/down'
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: xiao jin <jin.xiao@intel.com>
    Cc: Joerg Roedel <jroedel@suse.de>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
    Cc: xen-devel <xen-devel@lists.xenproject.org>

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d3010aa79daf..b1f3ed9c7a9e 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -992,8 +992,17 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 
 	common_cpu_up(cpu, tidle);
 
+	/*
+	 * We have to walk the irq descriptors to setup the vector
+	 * space for the cpu which comes online.  Prevent irq
+	 * alloc/free across the bringup.
+	 */
+	irq_lock_sparse();
+
 	err = do_boot_cpu(apicid, cpu, tidle);
+
 	if (err) {
+		irq_unlock_sparse();
 		pr_err("do_boot_cpu failed(%d) to wakeup CPU#%u\n", err, cpu);
 		return -EIO;
 	}
@@ -1011,6 +1020,8 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 		touch_nmi_watchdog();
 	}
 
+	irq_unlock_sparse();
+
 	return 0;
 }
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6a374544d495..5644ec5582b9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -527,18 +527,9 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen)
 		goto out_notify;
 	}
 
-	/*
-	 * Some architectures have to walk the irq descriptors to
-	 * setup the vector space for the cpu which comes online.
-	 * Prevent irq alloc/free across the bringup.
-	 */
-	irq_lock_sparse();
-
 	/* Arch-specific enabling code. */
 	ret = __cpu_up(cpu, idle);
 
-	irq_unlock_sparse();
-
 	if (ret != 0)
 		goto out_notify;
 	BUG_ON(!cpu_online(cpu));
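
The native_cpu_up() hunk has two exits once the lock is taken, the
early error return and the normal return, and both have to drop the
lock. A minimal userspace sketch of that shape follows. The model_*
names are hypothetical, the lock is reduced to a depth counter, and
model_do_boot_cpu() simply hands back the error it is given.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lock model: a depth counter instead of a real mutex. */
static int model_lock_depth;

static void model_irq_lock_sparse(void)
{
	model_lock_depth++;
}

static void model_irq_unlock_sparse(void)
{
	/* Unlocking a lock which is not held is a bug in the caller. */
	assert(model_lock_depth > 0);
	model_lock_depth--;
}

/* Hypothetical stand-in for do_boot_cpu(): returns boot_err unchanged. */
static int model_do_boot_cpu(int boot_err)
{
	return boot_err;
}

/* Same control flow as the patched native_cpu_up(), heavily reduced. */
static int model_native_cpu_up(int boot_err)
{
	int err;

	model_irq_lock_sparse();

	err = model_do_boot_cpu(boot_err);
	if (err) {
		/* Early exit: the lock has to be dropped here as well. */
		model_irq_unlock_sparse();
		return -EIO;
	}

	/* ... the real code waits here for the cpu to come online ... */

	model_irq_unlock_sparse();
	return 0;
}
```

Calling model_native_cpu_up() with a zero or a nonzero boot_err leaves
model_lock_depth at 0 in both cases, which is the invariant the two
irq_unlock_sparse() calls in the patch are there to preserve.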
