On 28/06/2018 23:13, speck for Dave Hansen wrote:
> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
>> +	/*
>> +	 * If SMT is force disabled and the APIC ID belongs to
>> +	 * a secondary thread, ignore it.
>> +	 */
>> +	if (apic_id_disabled(apicid)) {
>> +		pr_info_once("Ignoring secondary SMT threads\n");
>> +		return -EINVAL;
>> +	}
> Thomas, this boottime-disable stuff just ends up ignoring the
> hyperthread and leaves it alone, right?

Yes

>
> Some Intel folks pointed out a few problems with this. One is with
> machine checks. If one thread is booted and has CR4.MCE=1, but the
> other never gets booted and still has CR4.MCE=0, things go boom
> (everything goes to shutdown state) if a machine check happens.
>
> We've traditionally pretended that this does not happen because folks
> don't tend to turn off CPUs they've paid for via things like maxcpus= in
> the real world.
>
> Ashok Raj and Tony Luck were evidently looking at this at some point,
> but it got tricky and decided it wasn't worth the trouble.
>
> It makes me think we should either scrap or recommend against "nosmt=force".
>
> Some relevant SDM language:
>
>> Because the logical processors within a physical package are tightly
>> coupled with respect to shared hardware resources, both logical
>> processors are notified of machine check errors that occur within a
>> given physical processor. If machine-check exceptions are enabled
>> when a fatal error is reported, all the logical processors within a
>> physical package are dispatched to the machine-check exception
>> handler. If machine-check exceptions are disabled, the logical
>> processors enter the shutdown state and assert the IERR# signal. When
>> enabling machine-check exceptions, the MCE flag in control register
>> CR4 should be set for each logical processor.

So what you're saying is that we need to boot all the threads, including
MCE setup etc, then leave them alone (mwait/deep C states?) so they
avoid causing a shutdown?

If so, I've got quite a lot of extra work to do in Xen...

~Andrew