On Wed, 22 Nov 2023, Mihai Carabas wrote:

> On 22.11.2023 22:51, Christoph Lameter wrote:
>>
>> On Mon, 20 Nov 2023, Mihai Carabas wrote:
>>
>>> cpu_relax on ARM64 does a simple "yield". Thus we replace it with
>>> smp_cond_load_relaxed which basically does a "wfe".
>>
>> Well it clears events first (which requires the first WFE) and then does a
>> WFE waiting for any events if no events were pending.
>>
>> WFE does not cause a VMEXIT? Or does the inner loop of
>> smp_cond_load_relaxed now do 2x VMEXITS?
>>
>> KVM ARM64 code seems to indicate that WFE causes a VMEXIT. See
>> kvm_handle_wfx().
>
> In KVM ARM64 the WFE trapping is dynamic: it is enabled only if there are more
> tasks waiting on the same core (e.g. on an oversubscribed system).
>
> In arch/arm64/kvm/arm.c:
>
>     if (single_task_running())
>         vcpu_clear_wfx_traps(vcpu);
>     else
>         vcpu_set_wfx_traps(vcpu);

Ahh. Cool, did not know about that. But still: Lots of VMEXITs once the load
has to be shared.

> This of course can be improved by having a knob where you can completely
> disable wfx trapping by your needs, but I left this as another subject to
> tackle.

kvm_arch_vcpu_load() looks strange. On the one hand we pass a cpu number into
it and then we use functions that only work if we are running on that cpu?

It would be better to use smp_processor_id() in the function and not pass the
cpu number to it.
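
To illustrate the last point, here is a minimal userspace sketch, not kernel
code. Every model_* name is hypothetical and only stands in for the real thing:
model_single_task_running() for single_task_running(), model_current_cpu for
smp_processor_id(), and the wfx_trapped flag for the WFx trap state. The real
callers may well always pass the current cpu; the sketch only shows why the
signature reads oddly when the trap decision never looks at the argument.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU count of runnable tasks (stand-in for scheduler state). */
static int model_nr_running[MODEL_NR_CPUS];

/* Hypothetical "CPU we are executing on" (stand-in for smp_processor_id()). */
static int model_current_cpu;

struct model_vcpu {
	bool wfx_trapped;	/* true: a guest WFE/WFI would exit to the host */
};

/* Like the helper quoted above, this can only answer for the current CPU. */
static bool model_single_task_running(void)
{
	return model_nr_running[model_current_cpu] == 1;
}

/*
 * Shape questioned in the mail: a cpu number is passed in, but the trap
 * decision is made from the current CPU's state and ignores the argument.
 */
static void model_vcpu_load(struct model_vcpu *vcpu, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	(void)cpu;	/* not consulted for the trap decision */
	vcpu->wfx_trapped = !model_single_task_running();
}

/* Suggested shape: no cpu argument, the current CPU is implied. */
static void model_vcpu_load_current(struct model_vcpu *vcpu)
{
	vcpu->wfx_trapped = !model_single_task_running();
}
```

With one task on CPU 0 and three on CPU 1, calling model_vcpu_load(&v, 1)
while executing on CPU 0 leaves traps cleared, even though the CPU named in
the argument is oversubscribed. Dropping the parameter makes that mismatch
impossible to express.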