* [patch 00/40] CPU hotplug rework - episode I
@ 2013-01-31 15:44 Thomas Gleixner
  2013-01-31 12:11 ` [patch 01/40] smpboot: Allow selfparking per cpu threads Thomas Gleixner
                   ` (41 more replies)
  0 siblings, 42 replies; 67+ messages in thread
From: Thomas Gleixner @ 2013-01-31 15:44 UTC
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Rusty Russell, Paul McKenney,
	Srivatsa S. Bhat, Arjan van de Veen, Paul Turner,
	Richard Weinberger, Magnus Damm, Linus Torvalds, Andrew Morton

The current CPU hotplug implementation has become an increasing
nightmare full of races and undocumented behaviour. The main issue of
the current hotplug scheme is the completely asymmetric
startup/teardown process. The hotplug notifiers are mostly
undocumented and the CPU_* actions in lots of implementations seem to
be randomly chosen.

We had a long discussion in San Diego last year about reworking the
hotplug core into a fully symmetric state machine. After a few doomed
attempts to convert the existing code into a state machine, I finally
found a workable solution.

The following patch series implements a trivial array-based state
machine which replaces the existing steps in cpu_up/down. The
notifiers which must run on the hotplugged cpu are converted to a
callback array as well. This clearly documents the ordering of the
callbacks and also makes the asymmetric behaviour very obvious.
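
To illustrate the idea of an array-based state machine, here is a
minimal userspace sketch. It is not the code from this series: the
demo_* names, the four states and the trace buffer are all made up for
illustration. The array index is the state, bringup walks the array
upwards calling startup callbacks, and teardown walks it downwards
calling teardown callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states; the series defines its own set in its headers. */
enum demo_state { DEMO_OFFLINE, DEMO_PREP, DEMO_INIT, DEMO_ONLINE, DEMO_MAX };

struct demo_step {
	int (*startup)(unsigned int cpu);	/* called on the way up */
	int (*teardown)(unsigned int cpu);	/* called on the way down */
};

/* Trace buffer so the callback order can be inspected afterwards. */
static char demo_trace[32];
static size_t demo_trace_len;

static void demo_log(char c)
{
	assert(demo_trace_len + 1 < sizeof(demo_trace));
	demo_trace[demo_trace_len++] = c;
	demo_trace[demo_trace_len] = '\0';
}

static int prep_up(unsigned int cpu)   { (void)cpu; demo_log('P'); return 0; }
static int prep_down(unsigned int cpu) { (void)cpu; demo_log('p'); return 0; }
static int init_up(unsigned int cpu)   { (void)cpu; demo_log('I'); return 0; }
static int init_down(unsigned int cpu) { (void)cpu; demo_log('i'); return 0; }

/* The array index is the state; states without callbacks stay NULL. */
static const struct demo_step demo_steps[DEMO_MAX] = {
	[DEMO_PREP] = { prep_up, prep_down },
	[DEMO_INIT] = { init_up, init_down },
};

/* Walk upwards from *cur to target, running each startup callback. */
static int demo_cpu_up(unsigned int cpu, enum demo_state *cur,
		       enum demo_state target)
{
	while (*cur < target) {
		enum demo_state next = *cur + 1;

		if (demo_steps[next].startup) {
			int ret = demo_steps[next].startup(cpu);

			if (ret)
				return ret;
		}
		*cur = next;
	}
	return 0;
}

/* Walk downwards from *cur to target, running each teardown callback. */
static int demo_cpu_down(unsigned int cpu, enum demo_state *cur,
			 enum demo_state target)
{
	while (*cur > target) {
		if (demo_steps[*cur].teardown) {
			int ret = demo_steps[*cur].teardown(cpu);

			if (ret)
				return ret;
		}
		*cur = *cur - 1;
	}
	return 0;
}
```

Because the ordering lives in one table, a state whose startup has no
matching teardown (or vice versa) shows up as an obvious hole in the
array rather than being hidden in notifier priorities.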

This series converts the stop_machine thread to the smpboot
infrastructure, implements the core state machine and converts all
notifiers which have ordering constraints plus a randomly chosen bunch
of other notifiers to the state machine.

The runtime installed callbacks are immediately executed by the core
code on or on behalf of all cpus which have already reached the
corresponding state. A non-executing installer function is available
as well, to allow simple migration of the existing notifier maze.
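
The difference between the two installer variants can be sketched as
follows. This is an illustrative userspace model only; demo_install(),
demo_install_nocalls() and the fixed-size tables are hypothetical names
and not the interface of this series. The executing variant invokes the
callback for every cpu which has already reached the state, the
non-executing one only records it.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS	4
#define DEMO_NR_STATES	8

typedef int (*demo_cb)(unsigned int cpu);

static demo_cb demo_startup[DEMO_NR_STATES];	/* one slot per state */
static int demo_cpu_state[DEMO_NR_CPUS];	/* state each cpu has reached */

/* Non-executing installer: record the callback, call nothing. */
static void demo_install_nocalls(int state, demo_cb startup)
{
	assert(state >= 0 && state < DEMO_NR_STATES);
	demo_startup[state] = startup;
}

/*
 * Executing installer: record the callback and run it for every cpu
 * which is already at or beyond the state. Simplification: on failure
 * the slot is cleared again, but cpus which were already called are
 * not rolled back here.
 */
static int demo_install(int state, demo_cb startup)
{
	demo_install_nocalls(state, startup);

	for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (demo_cpu_state[cpu] >= state) {
			int ret = startup(cpu);

			if (ret) {
				demo_startup[state] = NULL;
				return ret;
			}
		}
	}
	return 0;
}

/* Example callback: remember which cpus it was invoked for. */
static unsigned int demo_seen_mask;

static int demo_mark(unsigned int cpu)
{
	demo_seen_mask |= 1u << cpu;
	return 0;
}
```

The non-executing variant mirrors what a plain notifier registration
does today, which is what makes a mechanical conversion possible before
each call site is audited.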

The diffstat of the complete series is appended below.

 36 files changed, 1300 insertions(+), 1179 deletions(-)

We add slightly more code at this stage (225 lines alone in a header
file), but most of the conversions are removing code and we have only
tackled about 30 of 130+ instances. Even with the current conversion
state, the resulting text size already shrinks.

Known issues:
The current series has a not yet resolved section mismatch issue with
the array callbacks which are already installed at compile time.

There is more work in the pipeline:

 - Convert all notifiers to the state machine callbacks

 - Analyze the asymmetric callbacks and fix them if possible or at
   least document why they need to be asymmetric.

 - Unify the low level bringup across the architectures
   (e.g. synchronization between boot and hotplugged cpus, common
   setups, scheduler exposure, etc.)

At the end hotplug should run through an array of callbacks on both
sides with explicit core synchronization points. The ordering should
look like this:

CPUHP_OFFLINE                   // Start state.
CPUHP_PREP_<hardware>           // Kick CPU into life / let it die
CPUHP_PREP_<datastructures>     // Get datastructures set up / freed.
CPUHP_PREP_<threads>            // Create threads for cpu
CPUHP_SYNC                      // Synchronization point
CPUHP_INIT_<hardware>           // Startup/teardown on the CPU (interrupts, timers ...)
CPUHP_SCHED_<stuff on CPU>      // Unpark/park per cpu local threads on the CPU.
CPUHP_ENABLE_<stuff_on_CPU>     // Enable/disable facilities
CPUHP_SYNC                      // Synchronization point
CPUHP_SCHED                     // Expose/remove CPU from general scheduler.
CPUHP_ONLINE                    // Final state

All PREP states can fail and the corresponding teardown callbacks are
invoked in the same way as they are invoked on offlining.
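
A failing PREP state and the resulting rollback can be sketched like
this. Again this is a hedged userspace model with made-up names
(demo_run_prep(), demo_fail_at and friends), not the code of this
series. When a startup callback fails, the teardown callbacks of the
steps which already succeeded are run in reverse order, exactly as they
would be on offlining.

```c
#include <assert.h>

#define DEMO_NR_PREP	3

static int demo_fail_at = -1;		/* step which should fail, -1 for none */
static int demo_prepared[DEMO_NR_PREP];	/* 1 while a step is set up */
static int demo_teardown_calls;		/* number of teardown invocations */

/* Hypothetical PREP startup: set the step up unless told to fail. */
static int demo_prep_startup(int step)
{
	if (step == demo_fail_at)
		return -1;
	demo_prepared[step] = 1;
	return 0;
}

/* Hypothetical PREP teardown: undo what the startup did. */
static void demo_prep_teardown(int step)
{
	assert(step >= 0 && step < DEMO_NR_PREP);
	demo_prepared[step] = 0;
	demo_teardown_calls++;
}

/* Run all PREP steps; on failure undo the completed ones in reverse. */
static int demo_run_prep(void)
{
	for (int step = 0; step < DEMO_NR_PREP; step++) {
		int ret = demo_prep_startup(step);

		if (ret) {
			/* The failed step itself did not complete. */
			while (--step >= 0)
				demo_prep_teardown(step);
			return ret;
		}
	}
	return 0;
}
```

The point of the model is that error handling needs no separate code
path: it reuses the teardown callbacks in the same order in which the
offline path would call them.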

The existing DOWN_PREPARE notifier has only two instances which
actually might prevent the CPU from going down: rcu_tree and
padata. We might need to keep them, but these can be explicitly
documented asymmetric states.

Quite a few of the ONLINE/DOWN_PREPARE notifiers are racy and need a
proper inspection. All other valid users of ONLINE/DOWN_PREPARE
notifiers should be put into the CPUHP_ENABLE state block and be
executed on the hotplugged CPU. I have not seen a single instance
(except scheduler) which needs to be executed before we remove the CPU
from the general scheduler itself.

This final design needs quite a bit of massaging of the current scheduler
code, but last time I discussed this with scheduler folks it seemed to
be doable with a reasonable effort. Other than that I don't see any
(un)real showstoppers on the horizon.

Thanks,

	tglx
---
 arch/arm/kernel/perf_event_cpu.c              |   28 -
 arch/arm/vfp/vfpmodule.c                      |   29 -
 arch/blackfin/kernel/perf_event.c             |   25 -
 arch/powerpc/perf/core-book3s.c               |   29 -
 arch/s390/kernel/perf_cpum_cf.c               |   37 -
 arch/s390/kernel/vtime.c                      |   18 
 arch/sh/kernel/perf_event.c                   |   22 
 arch/x86/kernel/apic/x2apic_cluster.c         |   80 +--
 arch/x86/kernel/cpu/perf_event.c              |   78 +--
 arch/x86/kernel/cpu/perf_event_amd.c          |    6 
 arch/x86/kernel/cpu/perf_event_amd_ibs.c      |   54 --
 arch/x86/kernel/cpu/perf_event_intel.c        |    6 
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |  109 +---
 arch/x86/kernel/tboot.c                       |   23 
 drivers/clocksource/arm_generic.c             |   40 -
 drivers/cpufreq/cpufreq_stats.c               |   55 --
 include/linux/cpu.h                           |   45 -
 include/linux/cpuhotplug.h                    |  207 ++++++++
 include/linux/perf_event.h                    |   21 
 include/linux/smpboot.h                       |    5 
 init/main.c                                   |   15 
 kernel/cpu.c                                  |  613 ++++++++++++++++++++++----
 kernel/events/core.c                          |   36 -
 kernel/hrtimer.c                              |   47 -
 kernel/profile.c                              |   92 +--
 kernel/rcutree.c                              |   95 +---
 kernel/sched/core.c                           |  251 ++++------
 kernel/sched/fair.c                           |   16 
 kernel/smp.c                                  |   50 --
 kernel/smpboot.c                              |   11 
 kernel/smpboot.h                              |    4 
 kernel/stop_machine.c                         |  154 ++----
 kernel/time/clockevents.c                     |   13 
 kernel/timer.c                                |   43 -
 kernel/workqueue.c                            |   80 +--
 virt/kvm/kvm_main.c                           |   42 -
 36 files changed, 1300 insertions(+), 1179 deletions(-)


Thread overview: 67+ messages
2013-01-31 15:44 [patch 00/40] CPU hotplug rework - episode I Thomas Gleixner
2013-01-31 12:11 ` [patch 01/40] smpboot: Allow selfparking per cpu threads Thomas Gleixner
2013-02-09  0:29   ` Paul E. McKenney
2013-02-14 17:46   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2013-01-31 12:11 ` [patch 02/40] stop_machine: Store task reference in a separate per cpu variable Thomas Gleixner
2013-02-09  0:33   ` Paul E. McKenney
2013-02-14 17:47   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2013-01-31 12:11 ` [patch 03/40] stop_machine: Use smpboot threads Thomas Gleixner
2013-02-09  0:39   ` Paul E. McKenney
2013-02-14 17:49   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2013-01-31 12:11 ` [patch 04/40] cpu: Restructure FROZEN state handling Thomas Gleixner
2013-02-09  0:52   ` Paul E. McKenney
2014-10-09 16:53   ` Borislav Petkov
2013-01-31 12:11 ` [patch 05/40] cpu: Restructure cpu_down code Thomas Gleixner
2013-02-09  0:49   ` Paul E. McKenney
2014-10-09 17:05   ` Borislav Petkov
2013-01-31 12:11 ` [patch 06/40] cpu: hotplug: Split out cpu down functions Thomas Gleixner
2013-02-09  0:54   ` Paul E. McKenney
2013-01-31 12:11 ` [patch 07/40] cpu: hotplug: Convert to a state machine for the control processor Thomas Gleixner
2013-02-11 20:09   ` Paul E. McKenney
2013-01-31 12:11 ` [patch 08/40] cpu: hotplug: Convert the hotplugged processor work to a state machine Thomas Gleixner
2013-02-11 20:17   ` Paul E. McKenney
2013-01-31 12:11 ` [patch 10/40] sched: Convert to state machine callbacks Thomas Gleixner
2013-02-11 23:46   ` Paul E. McKenney
2013-01-31 12:11 ` [patch 09/40] cpu: hotplug: Implement setup/removal interface Thomas Gleixner
2013-02-01 13:44   ` Hillf Danton
2013-02-01 13:52     ` Thomas Gleixner
2013-01-31 12:11 ` [patch 11/40] x86: uncore: Move teardown callback to CPU_DEAD Thomas Gleixner
2013-01-31 12:11 ` [patch 12/40] x86: uncore: Convert to hotplug state machine Thomas Gleixner
2013-01-31 12:11 ` [patch 13/40] perf: " Thomas Gleixner
2013-01-31 12:11 ` [patch 14/40] x86: perf: Convert the core to the " Thomas Gleixner
2013-01-31 12:11 ` [patch 16/40] blackfin: perf: Convert hotplug notifier to " Thomas Gleixner
2013-01-31 12:11 ` [patch 15/40] x86: perf: Convert AMD IBS to hotplug " Thomas Gleixner
2013-01-31 12:11 ` [patch 17/40] powerpc: perf: Convert book3s notifier to state machine callbacks Thomas Gleixner
2013-01-31 12:11 ` [patch 18/40] s390: perf: Convert the hotplug " Thomas Gleixner
2013-01-31 12:11 ` [patch 19/40] sh: perf: Convert the hotplug notifiers " Thomas Gleixner
2013-01-31 12:11 ` [patch 21/40] sched: Convert the migration callback to hotplug states Thomas Gleixner
2013-01-31 12:11 ` [patch 20/40] perf: Remove perf cpu notifier code Thomas Gleixner
2013-01-31 12:11 ` [patch 22/40] workqueue: Convert to state machine callbacks Thomas Gleixner
2013-01-31 12:11 ` [patch 23/40] cpufreq: Convert to hotplug state machine Thomas Gleixner
2013-01-31 12:11 ` [patch 24/40] arm64: Convert generic timers " Thomas Gleixner
2013-01-31 12:11 ` [patch 25/40] arm: Convert VFP hotplug notifiers to " Thomas Gleixner
2013-01-31 12:11 ` [patch 26/40] arm: perf: Convert to hotplug " Thomas Gleixner
2013-01-31 12:11 ` [patch 27/40] virt: Convert kvm hotplug to " Thomas Gleixner
2013-01-31 12:11 ` [patch 28/40] cpuhotplug: Remove CPU_STARTING notifier Thomas Gleixner
2013-01-31 12:11 ` [patch 29/40] s390: Convert vtime to hotplug state machine Thomas Gleixner
2013-01-31 12:11 ` [patch 30/40] x86: tboot: Convert " Thomas Gleixner
2013-01-31 12:11 ` [patch 31/40] sched: Convert fair nohz balancer " Thomas Gleixner
2013-01-31 12:11 ` [patch 33/40] hrtimer: Convert " Thomas Gleixner
2013-01-31 12:11 ` [patch 32/40] rcu: Convert rcutree " Thomas Gleixner
2013-02-12  0:01   ` Paul E. McKenney
2013-02-12 15:50     ` Paul E. McKenney
2013-01-31 12:11 ` [patch 34/40] cpuhotplug: Remove CPU_DYING notifier Thomas Gleixner
2013-01-31 12:11 ` [patch 35/40] timers: Convert to hotplug state machine Thomas Gleixner
2013-01-31 12:11 ` [patch 36/40] profile: Convert to " Thomas Gleixner
2013-01-31 12:11 ` [patch 37/40] x86: x2apic: Convert to cpu " Thomas Gleixner
2013-01-31 12:11 ` [patch 38/40] smp: Convert core to " Thomas Gleixner
2013-01-31 12:11 ` [patch 39/40] relayfs: Convert " Thomas Gleixner
2013-01-31 12:11 ` [patch 40/40] slab: " Thomas Gleixner
2013-01-31 20:23 ` [patch 00/40] CPU hotplug rework - episode I Andrew Morton
2013-01-31 21:48   ` Thomas Gleixner
2013-01-31 21:59     ` Linus Torvalds
2013-01-31 22:44       ` Thomas Gleixner
2013-01-31 22:55         ` Linus Torvalds
2013-02-01 10:51           ` Thomas Gleixner
2013-02-07  4:01             ` Rusty Russell
2013-02-09  0:28 ` Paul E. McKenney
