On Wed, 2020-12-16 at 16:31 +0100, Thomas Gleixner wrote:
> But obviously the C-state in which the APs are waiting is not really
> relevant, as you demonstrated that the cost is due to INIT/SIPI even
> with spinwait, which is what I suspected.
> 
> OTOH, the advantage of INIT/SIPI is that the AP comes up in a well known
> state.

And once we parallelise the bringup we basically only incur the latency
of *one* INIT/SIPI instead of multiplying it by the number of CPUs, so
it isn't clear that there's any *disadvantage* to it. It's certainly a
lot simpler.
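
To put rough numbers on that, here is a back-of-envelope sketch. It is
only a model: the function names and every figure in it are made up for
illustration, not measured, and it assumes the per-CPU work left after
the wakeup is still done one CPU at a time.

```python
# Toy latency model for AP bringup. All names and numbers are
# hypothetical; nothing here is measured on real hardware.

def serial_bringup_ms(n_aps, init_sipi_ms, per_cpu_ms):
    """Wake and bring up each AP in turn: the INIT/SIPI wait is paid per AP."""
    return n_aps * (init_sipi_ms + per_cpu_ms)


def parallel_bringup_ms(n_aps, init_sipi_ms, per_cpu_ms):
    """Kick all APs together so the INIT/SIPI waits overlap.

    Assumption: the remaining per-CPU work is still serialised.
    """
    if n_aps == 0:
        return 0
    return init_sipi_ms + n_aps * per_cpu_ms


if __name__ == "__main__":
    n_aps = 95          # hypothetical: a 96-CPU box minus the boot CPU
    init_sipi_ms = 10   # hypothetical INIT/SIPI wait per wakeup
    per_cpu_ms = 1      # hypothetical remaining per-CPU bringup work
    print(serial_bringup_ms(n_aps, init_sipi_ms, per_cpu_ms))
    print(parallel_bringup_ms(n_aps, init_sipi_ms, per_cpu_ms))
```

In this model the INIT/SIPI term stops scaling with the CPU count, so
whatever is left to optimise afterwards is the per-CPU work itself.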

I think we should definitely start by implementing the parallel bringup
as you described it, and then see if there's still a problem left to be
solved.

We were working on a SIPI-avoiding patch set similar to the above, which
Johanna had just about got working the night before this one was
posted. But it looks like we should go back to the drawing board anyway,
rather than bother comparing the details of the two.