On 6/14/21 3:26 AM, Borislav Petkov wrote: > On Fri, Jun 11, 2021 at 06:15:32PM +0200, Thomas Gleixner wrote: >> - if (src) { >> - u32 size, offset, ecx, edx; >> - cpuid_count(XSTATE_CPUID, xfeature_nr, >> - &size, &offset, &ecx, &edx); >> - if (xfeature_nr == XFEATURE_PKRU) >> - memcpy(dest + offset, &vcpu->arch.pkru, >> - sizeof(vcpu->arch.pkru)); >> - else >> - memcpy(dest + offset, src, size); >> + cpuid_count(XSTATE_CPUID, xfeature_nr, >> + &size, &offset, &ecx, &edx); >> >> + if (xfeature_nr == XFEATURE_PKRU) { >> + memcpy(dest + offset, &vcpu->arch.pkru, >> + sizeof(vcpu->arch.pkru)); >> + } else { >> + src = get_xsave_addr(xsave, xfeature_nr); >> + if (src) >> + memcpy(dest + offset, src, size); >> } >> >> valid -= xfeature_mask; > > How about pulling up that PKRU case even further (pasting the whole > changed loop instead of an unreadable diff) and keeping the CPUID access > and the xsave address handling separate? > > valid = xstate_bv & ~XFEATURE_MASK_FPSSE; > while (valid) { > u32 size, offset, ecx, edx; > u64 xfeature_mask = valid & -valid; > int xfeature_nr = fls64(xfeature_mask) - 1; > void *src; > > if (xfeature_nr == XFEATURE_PKRU) { > memcpy(dest + offset, &vcpu->arch.pkru, sizeof(vcpu->arch.pkru)); > continue; > } > > cpuid_count(XSTATE_CPUID, xfeature_nr, &size, &offset, &ecx, &edx); > > src = get_xsave_addr(xsave, xfeature_nr); > if (src) > memcpy(dest + offset, src, size); > > valid -= xfeature_mask; > } I gave that a shot. Two wrinkles: The PKRU memcpy() needs 'offset' from cpuid_count() and the PKRU case also needs the 'valid -=' manipulation. The result is attached, and while it makes the diff look better, I don't think the resulting code is an improvement. > Btw, I'm wondering if that CPUID read in a loop can be replaced with > adding accessors for xstate_{offsets,sizes,..} etc and providing them to > kvm... I *think* these are already stored in xfeature_uncompacted_offset[]. It would be a pretty simple matter to export it. I just assumed that this is a slow enough path that the KVM folks don't care. >> @@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu * >> */ >> valid = xstate_bv & ~XFEATURE_MASK_FPSSE; >> while (valid) { >> + u32 size, offset, ecx, edx; >> u64 xfeature_mask = valid & -valid; >> int xfeature_nr = fls64(xfeature_mask) - 1; >> - void *dest = get_xsave_addr(xsave, xfeature_nr); >> >> - if (dest) { >> - u32 size, offset, ecx, edx; >> - cpuid_count(XSTATE_CPUID, xfeature_nr, >> - &size, &offset, &ecx, &edx); >> - if (xfeature_nr == XFEATURE_PKRU) >> - memcpy(&vcpu->arch.pkru, src + offset, >> - sizeof(vcpu->arch.pkru)); >> - else >> + cpuid_count(XSTATE_CPUID, xfeature_nr, >> + &size, &offset, &ecx, &edx); >> + >> + if (xfeature_nr == XFEATURE_PKRU) { >> + memcpy(&vcpu->arch.pkru, src + offset, >> + sizeof(vcpu->arch.pkru)); >> + } else { >> + void *dest = get_xsave_addr(xsave, xfeature_nr); >> + > > > ^ Superfluous newline. I'm happy to change it, but I usually like to separate declarations from pure code. Although, I guess that's a bit inconsistent in that file.