[PATCH v2 0/7] Parallel CPU bringup for x86_64

* [PATCH v2 0/7] Parallel CPU bringup for x86_64
@ 2021-12-14 12:32 David Woodhouse
  2021-12-14 12:32 ` [PATCH v2 1/7] x86/apic/x2apic: Fix parallel handling of cluster_mask David Woodhouse
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: David Woodhouse @ 2021-12-14 12:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H . Peter Anvin,
	Paolo Bonzini, Paul E . McKenney, linux-kernel, kvm, rcu, mimoja,
	hewenliang4, hushiyuan, luolongjun, hejingxian

This is a cut-down version of parallel CPU bringup for x86_64, which 
only does the INIT/SIPI/SIPI for APs in a CPUHP_BP_PARALLEL_DYN stage 
before the normal bringup to CPUHP_ONLINE happens sequentially.

Thus, we don't yet need any of the cleanups in RCU, TSC sync, topology 
updates, etc. — we only need to handle reentrancy through the real mode 
trampoline and the beginning of start_secondary() up to the point where 
it waits in wait_for_master_cpu().

This much is simple and sane enough to be merged, I think — modulo the
lack of sign-off on the patch that Thomas now claims not to remember
writing :)

This brings the 96-thread 2-socket Skylake startup time from 500ms to
100ms, which is a bit more modest than the 34ms we claimed before, but
still a nice win.

Further testing and analysis has shown us that allowing the APs to 
proceed from wait_from_master_cpu() in parallel is going to require a
bit more thought.

Once the APs reach smp_callin(), they call notify_cpu_starting() which 
walks through the states up to min(st->target, CPUHP_AP_ONLINE_IDLE). 
But if we allow the AP to get there when its target is one of the 
CPUHP_BP_PARALLEL_DYN states, that means that notify_cpu_starting() 
doesn't walk it through any states at all!

And then when the AP gets to the end of start_secondary() it ends up in 
cpu_startup_entry() which *sets* the state to CPUHP_AP_ONLINE_IDLE and 
thus has effectively *skipped* all the CPUHP_*_STARTING states.

The cheap answer is to explicitly walk to CPUHP_AP_ONLINE_IDLE but I 
don't want to let the APs *overtake* the target set for them by the 
overall CPUHP state machine.

So I think the better solution for further parallelisation is to make 
bringup_nonboot_cpus() bring all the APs to CPUHP_AP_ONLINE_IDLE in 
parallel, and *then* bring them to CPUHP_ONLINE. We will continue to 
play with that one and make sure the rest of the startup states are 
reentrant, in addition to the ones we've already fixed in 
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/parallel-5.16

v2: Only do do_cpu_up() for APs in parallel, nothing more. Drop half the
    fixes that aren't yet needed until we go further.

David Woodhouse (6):
      x86/apic/x2apic: Fix parallel handling of cluster_mask
      cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
      cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU
      x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
      x86/smpboot: Split up native_cpu_up into separate phases and document them
      x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel

Thomas Gleixner (1):
      x86/smpboot: Support parallel startup of secondary CPUs

 arch/x86/include/asm/realmode.h       |   3 +
 arch/x86/include/asm/smp.h            |   9 +-
 arch/x86/kernel/acpi/sleep.c          |   1 +
 arch/x86/kernel/apic/apic.c           |   2 +-
 arch/x86/kernel/apic/x2apic_cluster.c |  82 ++++++-----
 arch/x86/kernel/head_64.S             |  71 ++++++++++
 arch/x86/kernel/smpboot.c             | 251 +++++++++++++++++++++++++---------
 arch/x86/realmode/init.c              |   3 +
 arch/x86/realmode/rm/trampoline_64.S  |  14 ++
 include/linux/cpuhotplug.h            |   2 +
 include/linux/smpboot.h               |   7 +
 kernel/cpu.c                          |  27 +++-
 kernel/smpboot.c                      |   2 +-
 kernel/smpboot.h                      |   2 -
 14 files changed, 371 insertions(+), 105 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread