On Wed, 2020-12-16 at 16:31 +0100, Thomas Gleixner wrote:
> But obviously the C-state in which the APs are waiting is not really
> relevant, as you demonstrated that the cost is due to INIT/SIPI even
> with spinwait, which is what I suspected.
>
> OTOH, the advantage of INIT/SIPI is that the AP comes up in a well known
> state.

And once we parallelise the bringup we basically only incur the latency
of *one* INIT/SIPI instead of multiplying it by the number of CPUs, so
it isn't clear that there's any *disadvantage* to it. It's certainly a
lot simpler.

I think we should definitely start by implementing the parallel bringup
as you described it, and then see if there's still a problem left to be
solved.

We were working on a SIPI-avoiding patch set which is similar to the
above, which Johanna had just about got working the night before this
one was posted. But it looks like we should go back to the drawing
board anyway instead of bothering to compare the details of the two.
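
To put the latency argument above in concrete terms, here is a toy model.
It is only a sketch: the function names are made up for illustration, the
10 ms per-AP figure is a placeholder and not a measurement, and the
parallel case is idealised (it ignores the time to send the IPIs and any
per-AP work that stays serialised).

```python
def serial_bringup_ms(n_aps: int, init_sipi_ms: float) -> float:
    """Each AP is kicked only after the previous one is up, so latencies add."""
    return n_aps * init_sipi_ms


def parallel_bringup_ms(n_aps: int, init_sipi_ms: float) -> float:
    """All APs are kicked together and wait concurrently: pay the latency once."""
    return init_sipi_ms if n_aps > 0 else 0.0


if __name__ == "__main__":
    HYPOTHETICAL_INIT_SIPI_MS = 10.0  # placeholder value, not measured
    for n in (8, 96, 384):
        serial = serial_bringup_ms(n, HYPOTHETICAL_INIT_SIPI_MS)
        parallel = parallel_bringup_ms(n, HYPOTHETICAL_INIT_SIPI_MS)
        print(f"{n} APs: serial {serial} ms, parallel {parallel} ms")
```

In this model the serial cost grows linearly with the AP count while the
parallel cost stays flat, which is why the INIT/SIPI latency stops being
the interesting term once the bringup is parallelised.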