From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1033066AbbKEIKJ (ORCPT ); Thu, 5 Nov 2015 03:10:09 -0500 Received: from mail.bmw-carit.de ([62.245.222.98]:53538 "EHLO mail.bmw-carit.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030947AbbKEIKD (ORCPT ); Thu, 5 Nov 2015 03:10:03 -0500 X-CTCH-RefID: str=0001.0A0C0202.563B0ED6.00B9,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0 Subject: Re: [PATCH] smpboot: Add smpboot state variables instead of reusing CPU hotplug states To: , , "Paul E. McKenney" References: <1444908764-9057-1-git-send-email-daniel.wagner@bmw-carit.de> CC: Thomas Gleixner , Peter Zijlstra From: Daniel Wagner Message-ID: <563B0ED4.2090607@bmw-carit.de> Date: Thu, 5 Nov 2015 09:09:56 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <1444908764-9057-1-git-send-email-daniel.wagner@bmw-carit.de> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Paul, I guess this patch got the summer conference period treatment. ACK, NACK, completely STUPID idea? cheers, daniel On 10/15/2015 01:32 PM, Daniel Wagner wrote: > The cpu hotplug state machine in smpboot.c is reusing the states from > cpu.h. That is confusing when it comes to the CPU_DEAD_FROZEN usage. > Paul explained to me that he was in need of an additional state > for destinguishing between a CPU error states. For this he just > picked CPU_DEAD_FROZEN. > > 8038dad7e888581266c76df15d70ca457a3c5910 smpboot: Add common code for notification from dying CPU > 2a442c9c6453d3d043dfd89f2e03a1deff8a6f06 x86: Use common outgoing-CPU-notification code > > Instead of reusing the states, let's add new definition inside > the smpboot.c file with explenation what those states > mean. Thanks Paul for providing them. > > Signed-off-by: Daniel Wagner > Cc: Thomas Gleixner > Cc: "Paul E. McKenney" > Cc: Peter Zijlstra > Cc: xen-devel@lists.xenproject.org > Cc: linux-kernel@vger.kernel.org > --- > arch/x86/xen/smp.c | 4 +-- > include/linux/cpu.h | 3 +- > kernel/smpboot.c | 82 ++++++++++++++++++++++++++++++++++++++++------------- > 3 files changed, 67 insertions(+), 22 deletions(-) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 3f4ebf0..804bf5c 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -495,7 +495,7 @@ static int xen_cpu_up(unsigned int cpu, struct task_struct *idle) > rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL); > BUG_ON(rc); > > - while (cpu_report_state(cpu) != CPU_ONLINE) > + while (!cpu_check_online(cpu)) > HYPERVISOR_sched_op(SCHEDOP_yield, NULL); > > return 0; > @@ -767,7 +767,7 @@ static int xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle) > * This can happen if CPU was offlined earlier and > * offlining timed out in common_cpu_die(). > */ > - if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) { > + if (cpu_check_timeout(cpu)) { > xen_smp_intr_free(cpu); > xen_uninit_lock_cpu(cpu); > } > diff --git a/include/linux/cpu.h b/include/linux/cpu.h > index 23c30bd..f78ab46 100644 > --- a/include/linux/cpu.h > +++ b/include/linux/cpu.h > @@ -284,7 +284,8 @@ void arch_cpu_idle_dead(void); > > DECLARE_PER_CPU(bool, cpu_dead_idle); > > -int cpu_report_state(int cpu); > +int cpu_check_online(int cpu); > +int cpu_check_timeout(int cpu); > int cpu_check_up_prepare(int cpu); > void cpu_set_state_online(int cpu); > #ifdef CONFIG_HOTPLUG_CPU > diff --git a/kernel/smpboot.c b/kernel/smpboot.c > index a818cbc..75e5724 100644 > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -371,19 +371,63 @@ int smpboot_update_cpumask_percpu_thread(struct smp_hotplug_thread *plug_thread, > } > EXPORT_SYMBOL_GPL(smpboot_update_cpumask_percpu_thread); > > +/* The CPU is offline, and its last offline operation was > + * successful and proceeded normally. (Or, alternatively, the > + * CPU never has come online, as this is the initial state.) > + */ > +#define CPUHP_POST_DEAD 0x01 > + > +/* The CPU is in the process of coming online. > + * Simple architectures can skip this state, and just invoke > + * cpu_set_state_online() unconditionally instead. > + */ > +#define CPUHP_UP_PREPARE 0x02 > + > +/* The CPU is now online. Simple architectures can skip this > + * state, and just invoke cpu_wait_death() and cpu_report_death() > + * unconditionally instead. > + */ > +#define CPUHP_ONLINE 0x03 > + > +/* The CPU has gone offline, so that it may now be safely > + * powered off (or whatever the architecture needs to do to it). > + */ > +#define CPUHP_DEAD 0x04 > + > +/* The CPU did not go offline in a timely fashion, if at all, > + * so it might need special processing at the next online (for > + * example, simply refusing to bring it online). > + */ > +#define CPUHP_BROKEN 0x05 > + > +/* The CPU eventually did go offline, but not in a timely > + * fashion. If some sort of reset operation is required before it > + * can be brought online, that reset operation needs to be carried > + * out at online time. (Or, again, the architecture might simply > + * refuse to bring it online.) > + */ > +#define CPUHP_TIMEOUT 0x06 > + > static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = ATOMIC_INIT(CPU_POST_DEAD); > > /* > * Called to poll specified CPU's state, for example, when waiting for > * a CPU to come online. > */ > -int cpu_report_state(int cpu) > +int cpu_check_online(int cpu) > +{ > + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == > + CPUHP_ONLINE; > +} > + > +int cpu_check_timeout(int cpu) > { > - return atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == > + CPUHP_TIMEOUT; > } > > /* > - * If CPU has died properly, set its state to CPU_UP_PREPARE and > + * If CPU has died properly, set its state to CPUHP_UP_PREPARE and > * return success. Otherwise, return -EBUSY if the CPU died after > * cpu_wait_death() timed out. And yet otherwise again, return -EAGAIN > * if cpu_wait_death() timed out and the CPU still hasn't gotten around > @@ -397,19 +441,19 @@ int cpu_report_state(int cpu) > int cpu_check_up_prepare(int cpu) > { > if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) { > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_UP_PREPARE); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_UP_PREPARE); > return 0; > } > > switch (atomic_read(&per_cpu(cpu_hotplug_state, cpu))) { > > - case CPU_POST_DEAD: > + case CPUHP_POST_DEAD: > > /* The CPU died properly, so just start it up again. */ > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_UP_PREPARE); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_UP_PREPARE); > return 0; > > - case CPU_DEAD_FROZEN: > + case CPUHP_TIMEOUT: > > /* > * Timeout during CPU death, so let caller know. > @@ -424,7 +468,7 @@ int cpu_check_up_prepare(int cpu) > */ > return -EBUSY; > > - case CPU_BROKEN: > + case CPUHP_BROKEN: > > /* > * The most likely reason we got here is that there was > @@ -452,7 +496,7 @@ int cpu_check_up_prepare(int cpu) > */ > void cpu_set_state_online(int cpu) > { > - (void)atomic_xchg(&per_cpu(cpu_hotplug_state, cpu), CPU_ONLINE); > + (void)atomic_xchg(&per_cpu(cpu_hotplug_state, cpu), CPUHP_ONLINE); > } > > #ifdef CONFIG_HOTPLUG_CPU > @@ -470,12 +514,12 @@ bool cpu_wait_death(unsigned int cpu, int seconds) > might_sleep(); > > /* The outgoing CPU will normally get done quite quickly. */ > - if (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == CPU_DEAD) > + if (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) == CPUHP_DEAD) > goto update_state; > udelay(5); > > /* But if the outgoing CPU dawdles, wait increasingly long times. */ > - while (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) != CPU_DEAD) { > + while (atomic_read(&per_cpu(cpu_hotplug_state, cpu)) != CPUHP_DEAD) { > schedule_timeout_uninterruptible(sleep_jf); > jf_left -= sleep_jf; > if (jf_left <= 0) > @@ -484,14 +528,14 @@ bool cpu_wait_death(unsigned int cpu, int seconds) > } > update_state: > oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > - if (oldstate == CPU_DEAD) { > + if (oldstate == CPUHP_DEAD) { > /* Outgoing CPU died normally, update state. */ > smp_mb(); /* atomic_read() before update. */ > - atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPU_POST_DEAD); > + atomic_set(&per_cpu(cpu_hotplug_state, cpu), CPUHP_POST_DEAD); > } else { > /* Outgoing CPU still hasn't died, set state accordingly. */ > if (atomic_cmpxchg(&per_cpu(cpu_hotplug_state, cpu), > - oldstate, CPU_BROKEN) != oldstate) > + oldstate, CPUHP_BROKEN) != oldstate) > goto update_state; > ret = false; > } > @@ -502,7 +546,7 @@ update_state: > * Called by the outgoing CPU to report its successful death. Return > * false if this report follows the surviving CPU's timing out. > * > - * A separate "CPU_DEAD_FROZEN" is used when the surviving CPU > + * A separate "CPUHP_TIMEOUT" is used when the surviving CPU > * timed out. This approach allows architectures to omit calls to > * cpu_check_up_prepare() and cpu_set_state_online() without defeating > * the next cpu_wait_death()'s polling loop. > @@ -515,13 +559,13 @@ bool cpu_report_death(void) > > do { > oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu)); > - if (oldstate != CPU_BROKEN) > - newstate = CPU_DEAD; > + if (oldstate != CPUHP_BROKEN) > + newstate = CPUHP_DEAD; > else > - newstate = CPU_DEAD_FROZEN; > + newstate = CPUHP_TIMEOUT; > } while (atomic_cmpxchg(&per_cpu(cpu_hotplug_state, cpu), > oldstate, newstate) != oldstate); > - return newstate == CPU_DEAD; > + return newstate == CPUHP_DEAD; > } > > #endif /* #ifdef CONFIG_HOTPLUG_CPU */ >