On Fri, 2023-03-24 at 14:57 +0100, Thomas Gleixner wrote:
>
> Why? Simply because of this:
>
>   BP                    AP                    state
>   kick()                                      BRINGUP_CPU
>                         startup()
>   sync()                sync()
>                         starting()            advances to AP_ONLINE
>   sync()                sync()
>   TSC_sync()            TSC_sync()
>   wait_for_online()     set_online()
>                         cpu_startup_entry()   AP_ONLINE_IDLE
>   wait_for_completion() complete()
>
> This works correctly today because bringup_cpu() does not modify state
> and expects the state to be advanced by the AP once the completion is
> done.
>
> So you _cannot_ just throw some magic dynamic states before BRINGUP_CPU
> and then expect that the state machine is consistent when the AP is
> allowed to run the starting callbacks in parallel.

Aha! I see. Yes, when the AP calls notify_cpu_starting(), which x86 does
from smp_callin(), the AP takes *itself* forward through the states
from there. That happens when the BP gets to do_wait_cpu_initialized().

So yes, the actual code in the existing series of patches is entirely
safe, but you're right that we do only want that *one* additional state
for parallelising the "kick AP" before CPUHP_BRINGUP_CPU. The rest need
to come afterwards and be handled differently.
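
To make the constraint concrete, here is a rough userspace toy model of the
ordering discussed above. It is only a sketch and not kernel code: every
toy_* name is hypothetical, TOY_KICK_AP is an invented stand-in for the one
extra state before CPUHP_BRINGUP_CPU, and the real hotplug state machine has
many more states and callbacks than this.

```c
#include <assert.h>

/*
 * Toy model, NOT kernel code. The states loosely follow the thread;
 * TOY_KICK_AP is a hypothetical name for the single extra state that
 * kicks the AP before the bringup step.
 */
enum toy_state {
	TOY_OFFLINE,
	TOY_KICK_AP,
	TOY_BRINGUP_CPU,
	TOY_AP_ONLINE,
	TOY_AP_ONLINE_IDLE,
};

struct toy_cpu {
	enum toy_state state;
};

/* BP side: kick the AP early. Only valid from the offline state. */
int toy_bp_kick(struct toy_cpu *cpu)
{
	assert(cpu);
	if (cpu->state != TOY_OFFLINE)
		return -1;
	cpu->state = TOY_KICK_AP;
	return 0;
}

/* BP side: reach the bringup step, then stop. Later states belong to the AP. */
int toy_bp_bringup(struct toy_cpu *cpu)
{
	assert(cpu);
	if (cpu->state != TOY_KICK_AP)
		return -1;
	cpu->state = TOY_BRINGUP_CPU;
	return 0;
}

/*
 * AP side: run the "starting" step and advance its own state.
 * Refused unless the BP has already brought the state to TOY_BRINGUP_CPU,
 * which is the consistency requirement the model is meant to show.
 */
int toy_ap_starting(struct toy_cpu *cpu)
{
	assert(cpu);
	if (cpu->state != TOY_BRINGUP_CPU)
		return -1;
	cpu->state = TOY_AP_ONLINE;
	return 0;
}

/* AP side: enter the idle loop once it is online. */
int toy_ap_idle(struct toy_cpu *cpu)
{
	assert(cpu);
	if (cpu->state != TOY_AP_ONLINE)
		return -1;
	cpu->state = TOY_AP_ONLINE_IDLE;
	return 0;
}
```

In this model, calling toy_ap_starting() right after toy_bp_kick() fails,
while the sequence kick, bringup, starting, idle succeeds. That mirrors the
point that the early kick can be its own state, but the AP must not advance
itself until the BP has reached the bringup state.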