From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH tip/core/rcu 02/20] x86: Use common
 outgoing-CPU-notification code
Date: Wed, 4 Mar 2015 07:25:15 -0800
Message-ID: <20150304152515.GL15405__22112.2430092239$1425482863$gmane$org@linux.vnet.ibm.com>
References: <1425404595-17816-1-git-send-email-paulmck@linux.vnet.ibm.com>
	<1425404595-17816-2-git-send-email-paulmck@linux.vnet.ibm.com>
	<54F608C4.40405@oracle.com>
	<20150303194223.GR15405@linux.vnet.ibm.com>
	<54F615D3.2040802@oracle.com>
	<20150303212647.GZ15405@linux.vnet.ibm.com>
	<54F6307A.8040003@oracle.com>
	<20150303223151.GC15405@linux.vnet.ibm.com>
	<20150304144336.GA8225@linux.vnet.ibm.com>
	<54F71CCF.9050509@oracle.com>
Reply-To: paulmck@linux.vnet.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <paulmck@linux.vnet.ibm.com>) id 1YTBAq-0008N1-Rv
	for xen-devel@lists.xenproject.org; Wed, 04 Mar 2015 15:25:25 +0000
Received: from /spool/local
	by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
	Violators will be prosecuted
	for <xen-devel@lists.xenproject.org> from <paulmck@linux.vnet.ibm.com>;
	Wed, 4 Mar 2015 08:25:21 -0700
Received: from b03cxnp07028.gho.boulder.ibm.com
	(b03cxnp07028.gho.boulder.ibm.com [9.17.130.15])
	by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id AC1FD3E4004C
	for <xen-devel@lists.xenproject.org>;
	Wed,  4 Mar 2015 08:25:19 -0700 (MST)
Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85])
	by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with
	ESMTP id t24FNibj32571454
	for <xen-devel@lists.xenproject.org>; Wed, 4 Mar 2015 08:23:44 -0700
Received: from d03av05.boulder.ibm.com (localhost [127.0.0.1])
	by d03av05.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP
	id t24FPG0a028931
	for <xen-devel@lists.xenproject.org>; Wed, 4 Mar 2015 08:25:19 -0700
Content-Disposition: inline
In-Reply-To: <54F71CCF.9050509@oracle.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: tglx@linutronix.de, laijs@cn.fujitsu.com, bobby.prani@gmail.com, peterz@infradead.org, fweisbec@gmail.com, dvhart@linux.intel.com, x86@kernel.org, oleg@redhat.com, linux-kernel@vger.kernel.org, rostedt@goodmis.org, josh@joshtriplett.org, dhowells@redhat.com, edumazet@google.com, mathieu.desnoyers@efficios.com, David Vrabel <david.vrabel@citrix.com>, dipankar@in.ibm.com, xen-devel@lists.xenproject.org, akpm@linux-foundation.org, mingo@kernel.org
List-Id: xen-devel@lists.xenproject.org

On Wed, Mar 04, 2015 at 09:55:11AM -0500, Boris Ostrovsky wrote:
> On 03/04/2015 09:43 AM, Paul E. McKenney wrote:
> >On Tue, Mar 03, 2015 at 02:31:51PM -0800, Paul E. McKenney wrote:
> >>On Tue, Mar 03, 2015 at 05:06:50PM -0500, Boris Ostrovsky wrote:
> >>>On 03/03/2015 04:26 PM, Paul E. McKenney wrote:
> >>>>On Tue, Mar 03, 2015 at 03:13:07PM -0500, Boris Ostrovsky wrote:
> >>>>>On 03/03/2015 02:42 PM, Paul E. McKenney wrote:
> >>>>>>On Tue, Mar 03, 2015 at 02:17:24PM -0500, Boris Ostrovsky wrote:
> >>>>>>>On 03/03/2015 12:42 PM, Paul E. McKenney wrote:
> >>>>>>>>  }
> >>>>>>>>@@ -511,7 +508,8 @@ static void xen_cpu_die(unsigned int cpu)
> >>>>>>>>  		schedule_timeout(HZ/10);
> >>>>>>>>  	}
> >>>>>>>>-	cpu_die_common(cpu);
> >>>>>>>>+	(void)cpu_wait_death(cpu, 5);
> >>>>>>>>+	/* FIXME: Are the below calls really safe in case of timeout? */
> >>>>>>>Not for HVM guests (PV guests will only reach this point after
> >>>>>>>target cpu has been marked as down by the hypervisor).
> >>>>>>>
> >>>>>>>We need at least to have a message similar to what native_cpu_die()
> >>>>>>>prints on cpu_wait_death() failure. And I think we should not call
> >>>>>>>the two routines below (three, actually --- there is also
> >>>>>>>xen_teardown_timer() below, which is not part of the diff).
> >>>>>>>
> >>>>>>>-boris
> >>>>>>>
> >>>>>>>
> >>>>>>>>  	xen_smp_intr_free(cpu);
> >>>>>>>>  	xen_uninit_lock_cpu(cpu);
> >>>>>>So something like this, then?
> >>>>>>
> >>>>>>	if (cpu_wait_death(cpu, 5)) {
> >>>>>>		xen_smp_intr_free(cpu);
> >>>>>>		xen_uninit_lock_cpu(cpu);
> >>>>>>		xen_teardown_timer(cpu);
> >>>>>>	}
> >>>>>	else
> >>>>>		pr_err("CPU %u didn't die...\n", cpu);
> >>>>>
> >>>>>
> >>>>>>Easy change for me to make if so!
> >>>>>>
> >>>>>>Or do I need some other check for HVM-vs.-PV guests, and, if so, what
> >>>>>>would that check be?  And also if so, is it OK to online a PV guest's
> >>>>>>CPU that timed out during its previous offline?
> >>>>>I believe PV VCPUs will always be CPU_DEAD by the time we get here
> >>>>>since we are (indirectly) waiting for this in the loop at the
> >>>>>beginning of xen_cpu_die():
> >>>>>
> >>>>>'while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu,
> >>>>>NULL))' will exit only after 'HYPERVISOR_vcpu_op(VCPUOP_down,
> >>>>>smp_processor_id()' in xen_play_dead(). Which happens after
> >>>>>play_dead_common() has marked the cpu as CPU_DEAD.
> >>>>>
> >>>>>So no test is needed.
> >>>>OK, so I have the following patch on top of my previous patch, which
> >>>>I will merge if testing goes well.  So if a CPU times out going offline,
> >>>>the above three functions will not be called, the "didn't die" message
> >>>>will be printed, and any future attempt to online that CPU will fail.
> >>>>Is that the correct semantics?
> >>>Yes.
> >>>
> >>>I am not sure whether not ever onlining the CPU is the best outcome
> >>>but then I don't think trying to online it again with all interrupts
> >>>and such still set up will work well. And it's an improvement over
> >>>what we have now anyway (with current code we may clean up things
> >>>for a non-dead cpu).
> >>Another strategy is to key off of the return value of cpu_check_up_prepare().
> >>If it returns -EBUSY, then the outgoing CPU finished up after the
> >>surviving CPU timed out.  The CPU trying to bring the new CPU online
> >>could (in theory, anyway) do the xen_smp_intr_free(), xen_uninit_lock_cpu(),
> >>and xen_teardown_timer() at that point.
> >And the code for this, in xen_cpu_up(), might look something like the
> >following:
> >
> >	rc = cpu_check_up_prepare(cpu);
> >	if (rc && rc != -EBUSY)
> >		return rc;
> >	if (rc == -EBUSY) {
> >		xen_smp_intr_free(cpu);
> >		xen_uninit_lock_cpu(cpu);
> >		xen_teardown_timer(cpu);
> >	}
> >
> >The idea is that we detect when the CPU eventually took itself offline,
> >but only did so after the surviving CPU timed out.  (Of course, it
> >would probably be best to put those three statements into a small
> >function that is called from both places.)
> >
> >I have no idea whether this approach would really work, especially given
> >your earlier statement that CPU_DEAD happens early on.  But I offer it in
> >case it is helpful or sparks some better idea.
> 
> Let me test this, I think it may work.
> 
> In the meantime, it turned out that HVM guests are broken by this
> patch (with or without the changes that we've been discussing), because
> HVM CPUs die with
> 
> static void xen_hvm_cpu_die(unsigned int cpu)
> {
>         xen_cpu_die(cpu);
>         native_cpu_die(cpu);
> }
> 
> Which means that cpu_wait_death() is called twice, and the second call
> moves the CPU to CPU_BROKEN.

Yikes!  I did miss this one.  :-(
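
For concreteness, here is a rough userspace model of why the double call
hurts.  This is only a sketch, not the kernel code: every model_* name is
made up for illustration, the timed wait is collapsed into a single check,
and MODEL_CPU_POST_DEAD is my assumption about what a successful wait
leaves behind (anything other than "dead" gives the same result).

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum model_state {
	MODEL_CPU_ONLINE,
	MODEL_CPU_DEAD,		/* Outgoing CPU has reported its own death. */
	MODEL_CPU_POST_DEAD,	/* Assumed: survivor has consumed that report. */
	MODEL_CPU_BROKEN,	/* Survivor gave up waiting. */
};

#define MODEL_NR_CPUS 4
static enum model_state model_hotplug_state[MODEL_NR_CPUS];

/* Outgoing CPU's side: announce that it is dead. */
static void model_report_death(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_hotplug_state[cpu] = MODEL_CPU_DEAD;
}

/*
 * Surviving CPU's side.  The real function sleeps until a timeout;
 * this model checks once and treats "not dead" as a timeout.
 */
static bool model_cpu_wait_death(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	if (model_hotplug_state[cpu] == MODEL_CPU_DEAD) {
		/* First caller consumes the death report. */
		model_hotplug_state[cpu] = MODEL_CPU_POST_DEAD;
		return true;
	}
	/* Any later caller no longer sees "dead", so it gives up. */
	model_hotplug_state[cpu] = MODEL_CPU_BROKEN;
	return false;
}
```

Calling model_report_death(1) and then model_cpu_wait_death(1) twice, as
the xen_cpu_die()/native_cpu_die() pair effectively does, returns true and
then false, and leaves CPU 1 in MODEL_CPU_BROKEN.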

> The simple solution is to stop calling native_cpu_die() above but
> I'd like to use common code in native_cpu_die(). I'll see if I can
> carve it out without too much damage to x86.

Very good, thank you!  I look forward to seeing your patch.
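
In case it helps while you are in there, the "small function that is
called from both places" idea quoted below might be shaped roughly like
the following userspace sketch.  All sketch_* names are hypothetical, the
counter stands in for xen_smp_intr_free(), xen_uninit_lock_cpu(), and
xen_teardown_timer(), and the rc argument stands in for the return value
of cpu_check_up_prepare().  Whether this is workable for Xen is for you
to judge.

```c
/* Userspace sketch only; all sketch_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4
static int sketch_cleanup_count[SKETCH_NR_CPUS];

/* Stand-in for the three Xen teardown calls, kept in one place. */
static void sketch_xen_cpu_cleanup(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_cleanup_count[cpu]++;
}

/* Offline path: clean up only if the outgoing CPU died in time. */
static void sketch_xen_cpu_die(unsigned int cpu, bool died_in_time)
{
	if (died_in_time)
		sketch_xen_cpu_cleanup(cpu);
	/* Otherwise the real code would complain and leave things set up. */
}

/* Online path: -EBUSY is taken to mean "died, but after the timeout". */
static int sketch_xen_cpu_up(unsigned int cpu, int rc)
{
	if (rc && rc != -EBUSY)
		return rc;
	if (rc == -EBUSY)
		sketch_xen_cpu_cleanup(cpu);	/* Deferred cleanup. */
	return 0;
}
```

With this shape, a CPU that times out in sketch_xen_cpu_die() gets its
cleanup exactly once, on the next sketch_xen_cpu_up() that sees -EBUSY,
and any other nonzero rc is passed back without touching anything.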

							Thanx, Paul

> Thanks.
> -boris
> 
> 
> >
> >							Thanx, Paul
> >
> >>But I must defer to you on this sort of thing.
> >>
> >>							Thanx, Paul
> >>
> >>>Thanks.
> >>>-boris
> >>>
> >>>
> >>>>							Thanx, Paul
> >>>>
> >>>>------------------------------------------------------------------------
> >>>>
> >>>>diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> >>>>index e2c7389c58c5..f2a06ff0614d 100644
> >>>>--- a/arch/x86/xen/smp.c
> >>>>+++ b/arch/x86/xen/smp.c
> >>>>@@ -508,12 +508,13 @@ static void xen_cpu_die(unsigned int cpu)
> >>>>  		schedule_timeout(HZ/10);
> >>>>  	}
> >>>>-	(void)cpu_wait_death(cpu, 5);
> >>>>-	/* FIXME: Are the below calls really safe in case of timeout? */
> >>>>-
> >>>>-	xen_smp_intr_free(cpu);
> >>>>-	xen_uninit_lock_cpu(cpu);
> >>>>-	xen_teardown_timer(cpu);
> >>>>+	if (cpu_wait_death(cpu, 5)) {
> >>>>+		xen_smp_intr_free(cpu);
> >>>>+		xen_uninit_lock_cpu(cpu);
> >>>>+		xen_teardown_timer(cpu);
> >>>>+	} else {
> >>>>+		pr_err("CPU %u didn't die...\n", cpu);
> >>>>+	}
> >>>>  }
> >>>>  static void xen_play_dead(void) /* used only with HOTPLUG_CPU */
> >>>>
>