Message-ID: <55A51F10.7010407@oracle.com>
Date: Tue, 14 Jul 2015 10:39:12 -0400
From: Boris Ostrovsky
To: Thomas Gleixner, LKML
CC: Ingo Molnar, Peter Zijlstra, Peter Anvin, xiao jin, Joerg Roedel,
    Borislav Petkov, Yanmin Zhang, xen-devel
Subject: Re: [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down
References: <20150705170530.849428850@linutronix.de>
 <20150705171102.063519515@linutronix.de>
In-Reply-To: <20150705171102.063519515@linutronix.de>

On 07/05/2015 01:12 PM, Thomas Gleixner wrote:
> When a cpu goes up, some architectures (e.g. x86) have to walk the irq
> space to set up the vector space for the cpu. While this needs extra
> protection at the architecture level, we can avoid a few race
> conditions by preventing the concurrent allocation/free of irq
> descriptors and the associated data.
>
> When a cpu goes down, it moves the interrupts which are targeted to
> this cpu away by reassigning the affinities. While this happens,
> interrupts can be allocated and freed, which opens a can of race
> conditions in the code which reassigns the affinities, because
> interrupt descriptors might be freed underneath.
>
> Example:
>
>  CPU1                               CPU2
>  cpu_up/down
>   irq_desc = irq_to_desc(irq);
>                                     remove_from_radix_tree(desc);
>   raw_spin_lock(&desc->lock);
>                                     free(desc);
>
> We could protect the irq descriptors with RCU, but that would require
> a full-tree change of all accesses to interrupt descriptors. But
> fortunately these kinds of race conditions are limited to a few
> things like cpu hotplug. The normal setup/teardown is very well
> serialized. So the simpler and obvious solution is:
>
> Prevent allocation and freeing of interrupt descriptors across cpu
> hotplug.

This breaks Xen guests, which allocate interrupt descriptors in
.cpu_up(). Any chance this locking can be moved into arch code?
Otherwise we will need something like arch_post_cpu_up(), called after
the lock is released (rough sketch below).

(The patch doesn't appear to have any side effects for the down path,
since Xen guests deallocate their descriptors in __cpu_die().)
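Something along these lines is what I have in mind. Note that
arch_post_cpu_up() is made up for illustration; it doesn't exist
today, and the default would just be an empty weak stub:

/* Hypothetical: a weak hook that architectures (or Xen) can override. */
void __weak arch_post_cpu_up(unsigned int cpu)
{
}

static int _cpu_up(unsigned int cpu, int tasks_frozen)
{
	...
	irq_lock_sparse();

	/* Arch-specific enabling code. */
	ret = __cpu_up(cpu, idle);

	irq_unlock_sparse();

	if (ret != 0)
		goto out_notify;

	/*
	 * The sparse lock has been dropped, so a Xen guest could
	 * allocate its irq descriptors here instead of in .cpu_up().
	 */
	arch_post_cpu_up(cpu);
	...
}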
-boris

>
> Signed-off-by: Thomas Gleixner
> ---
>  include/linux/irqdesc.h |    7 ++++++-
>  kernel/cpu.c            |   21 ++++++++++++++++++++-
>  kernel/irq/internals.h  |    4 ----
>  3 files changed, 26 insertions(+), 6 deletions(-)
>
> Index: tip/include/linux/irqdesc.h
> ===================================================================
> --- tip.orig/include/linux/irqdesc.h
> +++ tip/include/linux/irqdesc.h
> @@ -90,7 +90,12 @@ struct irq_desc {
>  	const char		*name;
>  } ____cacheline_internodealigned_in_smp;
>  
> -#ifndef CONFIG_SPARSE_IRQ
> +#ifdef CONFIG_SPARSE_IRQ
> +extern void irq_lock_sparse(void);
> +extern void irq_unlock_sparse(void);
> +#else
> +static inline void irq_lock_sparse(void) { }
> +static inline void irq_unlock_sparse(void) { }
>  extern struct irq_desc irq_desc[NR_IRQS];
>  #endif
>
> Index: tip/kernel/cpu.c
> ===================================================================
> --- tip.orig/kernel/cpu.c
> +++ tip/kernel/cpu.c
> @@ -392,13 +392,19 @@ static int __ref _cpu_down(unsigned int
>  	smpboot_park_threads(cpu);
>  
>  	/*
> -	 * So now all preempt/rcu users must observe !cpu_active().
> +	 * Prevent irq alloc/free while the dying cpu reorganizes the
> +	 * interrupt affinities.
>  	 */
> +	irq_lock_sparse();
>  
> +	/*
> +	 * So now all preempt/rcu users must observe !cpu_active().
> +	 */
>  	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
>  	if (err) {
>  		/* CPU didn't die: tell everyone.  Can't complain. */
>  		cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> +		irq_unlock_sparse();
>  		goto out_release;
>  	}
>  	BUG_ON(cpu_online(cpu));
> @@ -415,6 +421,9 @@ static int __ref _cpu_down(unsigned int
>  	smp_mb(); /* Read from cpu_dead_idle before __cpu_die(). */
>  	per_cpu(cpu_dead_idle, cpu) = false;
>  
> +	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
> +	irq_unlock_sparse();
> +
>  	hotplug_cpu__broadcast_tick_pull(cpu);
>  	/* This actually kills the CPU. */
>  	__cpu_die(cpu);
> @@ -517,8 +526,18 @@ static int _cpu_up(unsigned int cpu, int
>  		goto out_notify;
>  	}
>  
> +	/*
> +	 * Some architectures have to walk the irq descriptors to
> +	 * setup the vector space for the cpu which comes online.
> +	 * Prevent irq alloc/free across the bringup.
> +	 */
> +	irq_lock_sparse();
> +
>  	/* Arch-specific enabling code. */
>  	ret = __cpu_up(cpu, idle);
> +
> +	irq_unlock_sparse();
> +
>  	if (ret != 0)
>  		goto out_notify;
>  	BUG_ON(!cpu_online(cpu));
> Index: tip/kernel/irq/internals.h
> ===================================================================
> --- tip.orig/kernel/irq/internals.h
> +++ tip/kernel/irq/internals.h
> @@ -76,12 +76,8 @@ extern void unmask_threaded_irq(struct i
>  
>  #ifdef CONFIG_SPARSE_IRQ
>  static inline void irq_mark_irq(unsigned int irq) { }
> -extern void irq_lock_sparse(void);
> -extern void irq_unlock_sparse(void);
>  #else
>  extern void irq_mark_irq(unsigned int irq);
> -static inline void irq_lock_sparse(void) { }
> -static inline void irq_unlock_sparse(void) { }
>  #endif
>  
>  extern void init_kstat_irqs(struct irq_desc *desc, int node, int nr);
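P.S. For anyone following along: irq_lock_sparse() and
irq_unlock_sparse() already exist in kernel/irq/irqdesc.c; this patch
only moves the declarations so that kernel/cpu.c can see them. If I'm
reading irqdesc.c right, they are thin wrappers around the existing
sparse_irq_lock mutex, roughly:

/* kernel/irq/irqdesc.c, CONFIG_SPARSE_IRQ=y */
static DEFINE_MUTEX(sparse_irq_lock);

/* Serializes irq descriptor alloc/free against walkers of the sparse tree. */
void irq_lock_sparse(void)
{
	mutex_lock(&sparse_irq_lock);
}

void irq_unlock_sparse(void)
{
	mutex_unlock(&sparse_irq_lock);
}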