From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
	dvhart@linux.intel.com, fweisbec@gmail.com, oleg@redhat.com,
	bobby.prani@gmail.com, x86@kernel.org,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	David Vrabel <david.vrabel@citrix.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH tip/core/rcu 02/20] x86: Use common outgoing-CPU-notification code
Date: Tue, 3 Mar 2015 14:31:51 -0800
Message-ID: <20150303223151.GC15405@linux.vnet.ibm.com>
In-Reply-To: <54F6307A.8040003@oracle.com>

On Tue, Mar 03, 2015 at 05:06:50PM -0500, Boris Ostrovsky wrote:
> On 03/03/2015 04:26 PM, Paul E. McKenney wrote:
> >On Tue, Mar 03, 2015 at 03:13:07PM -0500, Boris Ostrovsky wrote:
> >>On 03/03/2015 02:42 PM, Paul E. McKenney wrote:
> >>>On Tue, Mar 03, 2015 at 02:17:24PM -0500, Boris Ostrovsky wrote:
> >>>>On 03/03/2015 12:42 PM, Paul E. McKenney wrote:
> >>>>>  }
> >>>>>@@ -511,7 +508,8 @@ static void xen_cpu_die(unsigned int cpu)
> >>>>>  		schedule_timeout(HZ/10);
> >>>>>  	}
> >>>>>-	cpu_die_common(cpu);
> >>>>>+	(void)cpu_wait_death(cpu, 5);
> >>>>>+	/* FIXME: Are the below calls really safe in case of timeout? */
> >>>>
> >>>>Not for HVM guests (PV guests will only reach this point after
> >>>>target cpu has been marked as down by the hypervisor).
> >>>>
> >>>>We need at least to have a message similar to what native_cpu_die()
> >>>>prints on cpu_wait_death() failure. And I think we should not call
> >>>>the two routines below (three, actually --- there is also
> >>>>xen_teardown_timer() below, which is not part of the diff).
> >>>>
> >>>>-boris
> >>>>
> >>>>
> >>>>>  	xen_smp_intr_free(cpu);
> >>>>>  	xen_uninit_lock_cpu(cpu);
> >>>So something like this, then?
> >>>
> >>>	if (cpu_wait_death(cpu, 5)) {
> >>>		xen_smp_intr_free(cpu);
> >>>		xen_uninit_lock_cpu(cpu);
> >>>		xen_teardown_timer(cpu);
> >>>	}
> >>	else
> >>		pr_err("CPU %u didn't die...\n", cpu);
> >>
> >>
> >>>Easy change for me to make if so!
> >>>
> >>>Or do I need some other check for HVM-vs.-PV guests, and, if so, what
> >>>would that check be?  And also if so, is it OK to online a PV guest's
> >>>CPU that timed out during its previous offline?
> >>
> >>I believe PV VCPUs will always be CPU_DEAD by the time we get here
> >>since we are (indirectly) waiting for this in the loop at the
> >>beginning of xen_cpu_die():
> >>
> >>'while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu,
> >>NULL))' will exit only after 'HYPERVISOR_vcpu_op(VCPUOP_down,
> >>smp_processor_id()' in xen_play_dead(). Which happens after
> >>play_dead_common() has marked the cpu as CPU_DEAD.
> >>
> >>So no test is needed.
> >OK, so I have the following patch on top of my previous patch, which
> >I will merge if testing goes well.  So if a CPU times out going offline,
> >the above three functions will not be called, the "didn't die" message
> >will be printed, and any future attempt to online that CPU will fail.
> >Is that the correct semantics?
> 
> Yes.
> 
> I am not sure whether not ever onlining the CPU is the best outcome
> but then I don't think trying to online it again with all interrupts
> and such still set up will work well. And it's an improvement over
> what we have now anyway (with current code we may clean up things
> for a non-dead cpu).

Another strategy is to key off of the return value of cpu_check_up_prepare().
If it returns -EBUSY, then the outgoing CPU finished up after the
surviving CPU timed out.  The CPU trying to bring the new CPU online
could (in theory, anyway) do the xen_smp_intr_free(), xen_uninit_lock_cpu(),
and xen_teardown_timer() at that point.
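
To make that alternative concrete, here is a rough userspace sketch of the
control flow, not kernel code. Every model_* name is hypothetical, and
model_teardown() merely stands in for the three Xen cleanup calls. The only
behavior taken from the discussion above is that -EBUSY means the outgoing
CPU finished dying after the survivor's wait timed out. Returning -EAGAIN
for a CPU that never finished, and retrying the online after the deferred
cleanup, are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of how an offline attempt ended. */
enum model_state {
	MODEL_DEAD_IN_TIME,	/* CPU died before the survivor's wait expired */
	MODEL_DEAD_LATE,	/* wait timed out, CPU finished dying afterwards */
	MODEL_STILL_DYING,	/* wait timed out, CPU never finished */
};

struct model_cpu {
	enum model_state state;
	bool torn_down;		/* true once the cleanup stand-in has run */
};

/* Stand-in for xen_smp_intr_free(), xen_uninit_lock_cpu(), xen_teardown_timer(). */
static void model_teardown(struct model_cpu *c)
{
	assert(!c->torn_down);	/* cleanup must run at most once per offline */
	c->torn_down = true;
}

/* Stand-in for cpu_check_up_prepare(); only the -EBUSY case comes from the thread. */
static int model_check_up_prepare(const struct model_cpu *c)
{
	switch (c->state) {
	case MODEL_DEAD_IN_TIME:
		return 0;
	case MODEL_DEAD_LATE:
		return -EBUSY;
	default:
		return -EAGAIN;	/* assumption: refuse to online this CPU */
	}
}

/* Offline path: clean up only if the CPU died in time, else complain. */
static void model_cpu_die(struct model_cpu *c, unsigned int cpu)
{
	if (c->state == MODEL_DEAD_IN_TIME)
		model_teardown(c);
	else
		printf("CPU %u didn't die...\n", cpu);
}

/* Online path: on -EBUSY, run the cleanup that the offline path skipped. */
static int model_cpu_up(struct model_cpu *c)
{
	int rc = model_check_up_prepare(c);

	if (rc == -EBUSY) {
		model_teardown(c);		/* deferred cleanup */
		c->state = MODEL_DEAD_IN_TIME;	/* now looks like a clean death */
		rc = 0;				/* assumption: safe to proceed */
	}
	return rc;
}
```

In this model a CPU that died late skips teardown in model_cpu_die(), gets
it in model_cpu_up(), and then comes online, while a CPU that never finished
dying stays offline with its resources untouched. Whether the real teardown
functions can safely run from the onlining path is exactly the open question.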

But I must defer to you on this sort of thing.

							Thanx, Paul

> Thanks.
> -boris
> 
> 
> >
> >							Thanx, Paul
> >
> >------------------------------------------------------------------------
> >
> >diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> >index e2c7389c58c5..f2a06ff0614d 100644
> >--- a/arch/x86/xen/smp.c
> >+++ b/arch/x86/xen/smp.c
> >@@ -508,12 +508,13 @@ static void xen_cpu_die(unsigned int cpu)
> >  		schedule_timeout(HZ/10);
> >  	}
> >-	(void)cpu_wait_death(cpu, 5);
> >-	/* FIXME: Are the below calls really safe in case of timeout? */
> >-
> >-	xen_smp_intr_free(cpu);
> >-	xen_uninit_lock_cpu(cpu);
> >-	xen_teardown_timer(cpu);
> >+	if (cpu_wait_death(cpu, 5)) {
> >+		xen_smp_intr_free(cpu);
> >+		xen_uninit_lock_cpu(cpu);
> >+		xen_teardown_timer(cpu);
> >+	} else {
> >+		pr_err("CPU %u didn't die...\n", cpu);
> >+	}
> >  }
> >  static void xen_play_dead(void) /* used only with HOTPLUG_CPU */
> >
> 

