From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCH v2 46/52] xen, balloon: Fix CPU hotplug callback registration
Date: Fri, 14 Feb 2014 11:49:15 -0500
Message-ID: <52FE490B.8000908__41432.8448511243$1392396646$gmane$org@oracle.com>
References: <20140214074750.22701.47330.stgit@srivatsabhat.in.ibm.com>
 <20140214075935.22701.71000.stgit@srivatsabhat.in.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1WELwT-0002Rd-TJ for xen-devel@lists.xenproject.org;
 Fri, 14 Feb 2014 16:48:46 +0000
In-Reply-To: <20140214075935.22701.71000.stgit@srivatsabhat.in.ibm.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Srivatsa S. Bhat"
Cc: linux-arch@vger.kernel.org, ego@linux.vnet.ibm.com, walken@google.com,
 linux@arm.linux.org.uk, akpm@linux-foundation.org, peterz@infradead.org,
 rusty@rustcorp.com.au, rjw@rjwysocki.net, oleg@redhat.com,
 linux-kernel@vger.kernel.org, paulus@samba.org, David Vrabel,
 tj@kernel.org, xen-devel@lists.xenproject.org, tglx@linutronix.de,
 paulmck@linux.vnet.ibm.com, mingo@kernel.org
List-Id: xen-devel@lists.xenproject.org

On 02/14/2014 02:59 AM, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
>
> 	get_online_cpus();
>
> 	for_each_online_cpu(cpu)
> 		init_cpu(cpu);
>
> 	register_cpu_notifier(&foobar_cpu_notifier);
>
> 	put_online_cpus();
>
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).
>
> Interestingly, the balloon code in xen can actually prevent double
> initialization and hence can use the following simplified form of callback
> registration:
>
> 	register_cpu_notifier(&foobar_cpu_notifier);
>
> 	get_online_cpus();
>
> 	for_each_online_cpu(cpu)
> 		init_cpu(cpu);
>
> 	put_online_cpus();
>
> A hotplug operation that occurs between registering the notifier and calling
> get_online_cpus(), won't disrupt anything, because the code takes care to
> perform the memory allocations only once.
>
> So reorganize the balloon code in xen this way to fix the deadlock with
> callback registration.
>
> Cc: Konrad Rzeszutek Wilk
> Cc: Boris Ostrovsky
> Cc: David Vrabel
> Cc: Ingo Molnar
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: Srivatsa S. Bhat
> ---
>
>  drivers/xen/balloon.c |   35 +++++++++++++++++++++++------------
>  1 file changed, 23 insertions(+), 12 deletions(-)

This looks exactly like the earlier version (i.e., the notifier is still
kept registered on allocation failure and the commit message doesn't
exactly reflect the change).
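
To illustrate the first point, here is a rough userspace sketch of an error
path that drops the notifier again when an allocation fails. It is not
kernel code and has not been tested against the tree: the *_stub helpers,
NR_CPUS_SKETCH, fail_on_cpu and balloon_init_sketch() are all made-up
stand-ins for the real calls in the patch. The unregister is done after the
put so that, presumably, the lock ordering the patch is trying to fix is not
reintroduced.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4	/* hypothetical: pretend CPUs 0..3 are online */

/* Userspace stand-ins for the kernel state touched by the patch. */
void *scratch_page[NR_CPUS_SKETCH];	/* models per_cpu(balloon_scratch_page, cpu) */
bool notifier_registered;		/* models whether balloon_cpu_notifier is registered */
int hotplug_refcount;			/* models get_online_cpus()/put_online_cpus() nesting */
int fail_on_cpu = -1;			/* hypothetical knob: force a failure for this CPU */

static void register_cpu_notifier_stub(void)   { notifier_registered = true; }
static void unregister_cpu_notifier_stub(void) { notifier_registered = false; }
static void get_online_cpus_stub(void)         { hotplug_refcount++; }
static void put_online_cpus_stub(void)         { hotplug_refcount--; }

/* Same shape as alloc_balloon_scratch_page() in the patch: allocate once per CPU. */
static int alloc_scratch_page_stub(int cpu)
{
	if (scratch_page[cpu] != NULL)
		return 0;

	if (cpu == fail_on_cpu)
		return -ENOMEM;		/* simulated allocation failure */

	scratch_page[cpu] = malloc(4096);
	if (scratch_page[cpu] == NULL)
		return -ENOMEM;

	return 0;
}

/* Register first, initialize online CPUs, and undo the registration on failure. */
int balloon_init_sketch(void)
{
	int cpu, ret = 0;

	register_cpu_notifier_stub();

	get_online_cpus_stub();
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		ret = alloc_scratch_page_stub(cpu);
		if (ret)
			break;
	}
	put_online_cpus_stub();

	/* Unregister only after the put; already allocated pages are left alone here. */
	if (ret)
		unregister_cpu_notifier_stub();

	return ret;
}
```

With fail_on_cpu set to a valid CPU number, balloon_init_sketch() returns
-ENOMEM and leaves no notifier behind; with the default of -1 it returns 0
and the notifier stays registered. Whether pages already allocated should
also be freed on the error path is a separate question that the sketch does
not try to answer.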
-boris

>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 37d06ea..afe1a3f 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -592,19 +592,29 @@ static void __init balloon_add_region(unsigned long start_pfn,
>  	}
>  }
>
> +static int alloc_balloon_scratch_page(int cpu)
> +{
> +	if (per_cpu(balloon_scratch_page, cpu) != NULL)
> +		return 0;
> +
> +	per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> +	if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> +		pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +
>  static int balloon_cpu_notify(struct notifier_block *self,
>  				    unsigned long action, void *hcpu)
>  {
>  	int cpu = (long)hcpu;
>  	switch (action) {
>  	case CPU_UP_PREPARE:
> -		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> -			break;
> -		per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> -		if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> -			pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
> +		if (alloc_balloon_scratch_page(cpu))
>  			return NOTIFY_BAD;
> -		}
>  		break;
>  	default:
>  		break;
> @@ -624,15 +634,16 @@ static int __init balloon_init(void)
>  		return -ENODEV;
>
>  	if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> -		for_each_online_cpu(cpu)
> -		{
> -			per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
> -			if (per_cpu(balloon_scratch_page, cpu) == NULL) {
> -				pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
> +		register_cpu_notifier(&balloon_cpu_notifier);
> +
> +		get_online_cpus();
> +		for_each_online_cpu(cpu) {
> +			if (alloc_balloon_scratch_page(cpu)) {
> +				put_online_cpus();
>  				return -ENOMEM;
>  			}
>  		}
> -		register_cpu_notifier(&balloon_cpu_notifier);
> +		put_online_cpus();
>  	}
>
>  	pr_info("Initialising balloon driver\n");
>