* [RFC PATCH] kvm/x86: Keep root hpa in prev_roots as much as possible
@ 2021-05-25 21:39 Lai Jiangshan
  2021-07-29 18:05 ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2021-05-25 21:39 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Paolo Bonzini, Lai Jiangshan, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson

From: Lai Jiangshan <laijs@linux.alibaba.com>

Pagetable roots in prev_roots[] are likely to be reused soon, and
there is not much overhead to keep them once a new need_sync field
is introduced.

With the help of the new need_sync field, pagetable roots are kept
as much as possible, and they will be re-synced before being reused
instead of being dropped.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---

This patch is just for RFC.
Is the idea Ok?

If the idea is Ok, we need to reuse one bit from pgd or hpa as
need_sync to save memory. Which one is better?

 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/mmu/mmu.c          |  6 ++++++
 arch/x86/kvm/vmx/nested.c       | 12 ++++--------
 arch/x86/kvm/x86.c              |  9 +++++----
 4 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55efbacfc244..19a337cf7aa6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -354,10 +354,11 @@ struct rsvd_bits_validate {
 struct kvm_mmu_root_info {
 	gpa_t pgd;
 	hpa_t hpa;
+	bool need_sync;
 };

 #define KVM_MMU_ROOT_INFO_INVALID \
-	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE })
+	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE, .need_sync = true})

 #define KVM_MMU_NUM_PREV_ROOTS 3

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5e60b00e8e50..147827135549 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3878,6 +3878,7 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,

 	root.pgd = mmu->root_pgd;
 	root.hpa = mmu->root_hpa;
+	root.need_sync = false;

 	if (is_root_usable(&root, new_pgd, new_role))
 		return true;
@@ -3892,6 +3893,11 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	mmu->root_hpa = root.hpa;
 	mmu->root_pgd = root.pgd;

+	if (i < KVM_MMU_NUM_PREV_ROOTS && root.need_sync) {
+		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
+		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
+	}
+
 	return i < KVM_MMU_NUM_PREV_ROOTS;
 }

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6058a65a6ede..ab7069ac6dc5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5312,7 +5312,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 vmx_instruction_info, types;
-	unsigned long type, roots_to_free;
+	unsigned long type;
 	struct kvm_mmu *mmu;
 	gva_t gva;
 	struct x86_exception e;
@@ -5361,29 +5361,25 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 			return nested_vmx_fail(vcpu,
 				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);

-		roots_to_free = 0;
 		if (nested_ept_root_matches(mmu->root_hpa, mmu->root_pgd,
 					    operand.eptp))
-			roots_to_free |= KVM_MMU_ROOT_CURRENT;
+			kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT);

 		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
 			if (nested_ept_root_matches(mmu->prev_roots[i].hpa,
 						    mmu->prev_roots[i].pgd,
 						    operand.eptp))
-				roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
+				mmu->prev_roots[i].need_sync = true;
 		}
 		break;
 	case VMX_EPT_EXTENT_GLOBAL:
-		roots_to_free = KVM_MMU_ROOTS_ALL;
+		kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOTS_ALL);
 		break;
 	default:
 		BUG();
 		break;
 	}

-	if (roots_to_free)
-		kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
-
 	return nested_vmx_succeed(vcpu);
 }

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bbc4e04e67ad..1f5617ec6b34 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11680,7 +11680,6 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 	bool pcid_enabled;
 	struct x86_exception e;
 	unsigned i;
-	unsigned long roots_to_free = 0;
 	struct {
 		u64 pcid;
 		u64 gla;
@@ -11722,9 +11721,8 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 			if (kvm_get_pcid(vcpu, vcpu->arch.mmu->prev_roots[i].pgd)
 			    == operand.pcid)
-				roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
+				vcpu->arch.mmu->prev_roots[i].need_sync = true;

-		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, roots_to_free);
 		/*
 		 * If neither the current cr3 nor any of the prev_roots use the
 		 * given PCID, then nothing needs to be done here because a
@@ -11743,7 +11741,10 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 		fallthrough;

 	case INVPCID_TYPE_ALL_INCL_GLOBAL:
-		kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+		kvm_mmu_sync_roots(vcpu);
+		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
+		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
+			vcpu->arch.mmu->prev_roots[i].need_sync = true;

 		return kvm_skip_emulated_instruction(vcpu);

 	default:
--
2.19.1.6.gb485710b
* Re: [RFC PATCH] kvm/x86: Keep root hpa in prev_roots as much as possible
  2021-05-25 21:39 [RFC PATCH] kvm/x86: Keep root hpa in prev_roots as much as possible Lai Jiangshan
@ 2021-07-29 18:05 ` Sean Christopherson
  2021-08-03  1:19   ` Lai Jiangshan
  0 siblings, 1 reply; 4+ messages in thread
From: Sean Christopherson @ 2021-07-29 18:05 UTC (permalink / raw)
To: Lai Jiangshan
Cc: linux-kernel, kvm, Paolo Bonzini, Lai Jiangshan, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson

On Wed, May 26, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
>
> Pagetable roots in prev_roots[] are likely to be reused soon, and
> there is not much overhead to keep them once a new need_sync field
> is introduced.
>
> With the help of the new need_sync field, pagetable roots are kept
> as much as possible, and they will be re-synced before being reused
> instead of being dropped.
>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>
> This patch is just for RFC.
> Is the idea Ok?

Yes, the idea is definitely a good one.

> If the idea is Ok, we need to reuse one bit from pgd or hpa as
> need_sync to save memory. Which one is better?

Ha, we can do this without increasing the memory footprint and without
co-opting a bit from pgd or hpa. Because of compiler alignment/padding,
the u8s and bools between mmu_role and prev_roots already occupy 8 bytes,
even though their actual size is 4 bytes. In total, we need room for 4
roots (3 previous + current), i.e. 4 bytes. If a separate array is used,
no additional memory is consumed and no masking is needed when
reading/writing e.g. pgd.

The cost is an extra swap() when updating the prev_roots LRU, but that's
peanuts and would likely be offset by the avoided masking anyways.
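[Editorial aside: the padding argument above can be checked with a small userspace sketch. The types below are simplified stand-ins, not the real kernel definitions; in particular, `union kvm_mmu_role` is modeled as a plain u64 and all fields between it and `prev_roots` other than those shown in the diff context are omitted.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM types referenced in the thread. */
typedef uint64_t hpa_t;
typedef uint64_t gpa_t;
typedef uint64_t mmu_role_t;    /* union kvm_mmu_role modeled as a u64 */

#define KVM_MMU_NUM_PREV_ROOTS 3

struct kvm_mmu_root_info { gpa_t pgd; hpa_t hpa; };

/*
 * Layout before the change: the three u8s and one bool after mmu_role
 * occupy 4 bytes, but the 8-byte alignment of prev_roots pads them out
 * to 8 bytes anyway.
 */
struct mmu_before {
	hpa_t root_hpa;
	gpa_t root_pgd;
	mmu_role_t mmu_role;
	uint8_t root_level;
	uint8_t shadow_root_level;
	uint8_t ept_ad;
	bool direct_map;
	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
};

/*
 * Layout after adding root_unsync plus unsync_roots[3]: the four extra
 * bools slot into the existing padding, so the size is unchanged.
 */
struct mmu_after {
	hpa_t root_hpa;
	gpa_t root_pgd;
	mmu_role_t mmu_role;
	bool root_unsync;
	uint8_t root_level;
	uint8_t shadow_root_level;
	uint8_t ept_ad;
	bool direct_map;
	bool unsync_roots[KVM_MMU_NUM_PREV_ROOTS];
	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
};
```

On a typical LP64 ABI both structs come out the same size, with `prev_roots` at the same offset, which is the "no additional memory" claim in compilable form.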
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 99f37781a6fc..13bb3c3a60b4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -424,10 +424,12 @@ struct kvm_mmu {
 	hpa_t root_hpa;
 	gpa_t root_pgd;
 	union kvm_mmu_role mmu_role;
+	bool root_unsync;
 	u8 root_level;
 	u8 shadow_root_level;
 	u8 ept_ad;
 	bool direct_map;
+	bool unsync_roots[KVM_MMU_NUM_PREV_ROOTS];
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];

 	/*

>  arch/x86/include/asm/kvm_host.h |  3 ++-
>  arch/x86/kvm/mmu/mmu.c          |  6 ++++++
>  arch/x86/kvm/vmx/nested.c       | 12 ++++--------
>  arch/x86/kvm/x86.c              |  9 +++++----
>  4 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 55efbacfc244..19a337cf7aa6 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -354,10 +354,11 @@ struct rsvd_bits_validate {
>  struct kvm_mmu_root_info {
>  	gpa_t pgd;
>  	hpa_t hpa;
> +	bool need_sync;

Hmm, use "unsync" instead of "need_sync", purely to match the existing
terminology in KVM's MMU for this sort of behavior.
>  };
>
>  #define KVM_MMU_ROOT_INFO_INVALID \
> -	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE })
> +	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE, .need_sync = true})
>
>  #define KVM_MMU_NUM_PREV_ROOTS 3
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 5e60b00e8e50..147827135549 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3878,6 +3878,7 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>
>  	root.pgd = mmu->root_pgd;
>  	root.hpa = mmu->root_hpa;
> +	root.need_sync = false;
>
>  	if (is_root_usable(&root, new_pgd, new_role))
>  		return true;
> @@ -3892,6 +3893,11 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>  	mmu->root_hpa = root.hpa;
>  	mmu->root_pgd = root.pgd;
>
> +	if (i < KVM_MMU_NUM_PREV_ROOTS && root.need_sync) {

Probably makes sense to write this as:

	if (i >= KVM_MMU_NUM_PREV_ROOTS)
		return false;

	if (root.need_sync) {
		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
	}
	return true;

The "i < KVM_MMU_NUM_PREV_ROOTS == success" logic is just confusing enough
that it'd be nice to write it only once.

And that would also play nicely with deferring a sync for the "current"
root (see below), e.g.

	...
	unsync = mmu->root_unsync;

	if (is_root_usable(&root, new_pgd, new_role))
		goto found_root;

	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
		swap(root, mmu->prev_roots[i]);
		swap(unsync, mmu->unsync_roots[i]);

		if (is_root_usable(&root, new_pgd, new_role))
			break;
	}

	if (i >= KVM_MMU_NUM_PREV_ROOTS)
		return false;

found_root:
	if (unsync) {
		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
	}
	return true;

> +		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> +		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> +	}
> +
>  	return i < KVM_MMU_NUM_PREV_ROOTS;
>  }
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 6058a65a6ede..ab7069ac6dc5 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -5312,7 +5312,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	u32 vmx_instruction_info, types;
> -	unsigned long type, roots_to_free;
> +	unsigned long type;
>  	struct kvm_mmu *mmu;
>  	gva_t gva;
>  	struct x86_exception e;
> @@ -5361,29 +5361,25 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  			return nested_vmx_fail(vcpu,
>  				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
>
> -		roots_to_free = 0;
>  		if (nested_ept_root_matches(mmu->root_hpa, mmu->root_pgd,
>  					    operand.eptp))
> -			roots_to_free |= KVM_MMU_ROOT_CURRENT;
> +			kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT);

For a non-RFC series, I think this should do two things:

  1. Separate INVEPT from INVPCID, i.e. do only INVPCID first.
  2. Enhance INVEPT to SYNC+FLUSH the current root instead of freeing it

As alluded to above, this can be done by deferring the sync+flush (which
can't be done right away because INVEPT runs in L1 context, whereas KVM
needs to sync+flush L2 EPT context).
>  		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
>  			if (nested_ept_root_matches(mmu->prev_roots[i].hpa,
>  						    mmu->prev_roots[i].pgd,
>  						    operand.eptp))
> -				roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
> +				mmu->prev_roots[i].need_sync = true;
>  		}
>  		break;
>  	case VMX_EPT_EXTENT_GLOBAL:
> -		roots_to_free = KVM_MMU_ROOTS_ALL;
> +		kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOTS_ALL);
>  		break;
>  	default:
>  		BUG();
>  		break;
>  	}
>
> -	if (roots_to_free)
> -		kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
> -
>  	return nested_vmx_succeed(vcpu);
>  }
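[Editorial aside: the swap-based lookup Sean sketches above can be exercised in userspace. The following is an illustrative mock, not the real KVM code: the role check is omitted, the two `kvm_make_request()` calls are replaced by an out-parameter, and the types are simplified stand-ins.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KVM_MMU_NUM_PREV_ROOTS 3
#define INVALID_PAGE           (~0ull)

typedef uint64_t gpa_t;
typedef uint64_t hpa_t;

struct root_info { gpa_t pgd; hpa_t hpa; };

struct mmu {
	hpa_t root_hpa;
	gpa_t root_pgd;
	bool root_unsync;
	bool unsync_roots[KVM_MMU_NUM_PREV_ROOTS];
	struct root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
};

/* GNU-C style swap(), as used in the kernel. */
#define swap(a, b) \
	do { __typeof__(a) __t = (a); (a) = (b); (b) = __t; } while (0)

static bool is_root_usable(const struct root_info *root, gpa_t new_pgd)
{
	/* The real code also checks the mmu role; omitted here. */
	return root->hpa != INVALID_PAGE && root->pgd == new_pgd;
}

/*
 * Walk the MRU list, swapping entries (and their unsync flags) toward
 * the front.  On a hit, *need_sync_flush stands in for the deferred
 * KVM_REQ_MMU_SYNC + KVM_REQ_TLB_FLUSH_CURRENT requests.
 */
static bool cached_root_available(struct mmu *mmu, gpa_t new_pgd,
				  bool *need_sync_flush)
{
	struct root_info root = { .pgd = mmu->root_pgd, .hpa = mmu->root_hpa };
	bool unsync = mmu->root_unsync;
	unsigned int i;

	if (!is_root_usable(&root, new_pgd)) {
		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
			swap(root, mmu->prev_roots[i]);
			swap(unsync, mmu->unsync_roots[i]);
			if (is_root_usable(&root, new_pgd))
				break;
		}
		if (i >= KVM_MMU_NUM_PREV_ROOTS)
			return false;
	}

	mmu->root_hpa = root.hpa;
	mmu->root_pgd = root.pgd;
	mmu->root_unsync = false;	/* the caller syncs it before use */
	*need_sync_flush = unsync;
	return true;
}
```

Tracing a hit on an unsync previous root shows the behavior the patch is after: the stale root is promoted to current, the displaced roots rotate into `prev_roots[]` with their flags, and the sync+flush is requested instead of the root being dropped.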
* Re: [RFC PATCH] kvm/x86: Keep root hpa in prev_roots as much as possible
  2021-07-29 18:05 ` Sean Christopherson
@ 2021-08-03  1:19   ` Lai Jiangshan
  2021-08-03 15:57     ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2021-08-03 1:19 UTC (permalink / raw)
To: Sean Christopherson
Cc: LKML, kvm, Paolo Bonzini, Lai Jiangshan, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson

On Fri, Jul 30, 2021 at 2:06 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, May 26, 2021, Lai Jiangshan wrote:
> > From: Lai Jiangshan <laijs@linux.alibaba.com>
> >
> > Pagetable roots in prev_roots[] are likely to be reused soon, and
> > there is not much overhead to keep them once a new need_sync field
> > is introduced.
> >
> > With the help of the new need_sync field, pagetable roots are kept
> > as much as possible, and they will be re-synced before being reused
> > instead of being dropped.
> >
> > Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> > ---
> >
> > This patch is just for RFC.
> > Is the idea Ok?
>
> Yes, the idea is definitely a good one.
>
> > If the idea is Ok, we need to reuse one bit from pgd or hpa as
> > need_sync to save memory. Which one is better?
>
> Ha, we can do this without increasing the memory footprint and without
> co-opting a bit from pgd or hpa. Because of compiler alignment/padding,
> the u8s and bools between mmu_role and prev_roots already occupy 8 bytes,
> even though their actual size is 4 bytes. In total, we need room for 4
> roots (3 previous + current), i.e. 4 bytes. If a separate array is used,
> no additional memory is consumed and no masking is needed when
> reading/writing e.g. pgd.
>
> The cost is an extra swap() when updating the prev_roots LRU, but that's
> peanuts and would likely be offset by the avoided masking anyways.
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 99f37781a6fc..13bb3c3a60b4 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -424,10 +424,12 @@ struct kvm_mmu {
>  	hpa_t root_hpa;
>  	gpa_t root_pgd;
>  	union kvm_mmu_role mmu_role;
> +	bool root_unsync;
>  	u8 root_level;
>  	u8 shadow_root_level;
>  	u8 ept_ad;
>  	bool direct_map;
> +	bool unsync_roots[KVM_MMU_NUM_PREV_ROOTS];
>  	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
>

Hello,

I think it is too complicated, and it is hard to accept putting
"unsync" outside of struct kvm_mmu_root_info when the two should be
bound to each other.

How about this:
- KVM_MMU_NUM_PREV_ROOTS
+ KVM_MMU_NUM_CACHED_ROOTS
- mmu->prev_roots[KVM_MMU_NUM_PREV_ROOTS]
+ mmu->cached_roots[KVM_MMU_NUM_CACHED_ROOTS]
- mmu->root_hpa
+ mmu->cached_roots[0].hpa
- mmu->root_pgd
+ mmu->cached_roots[0].pgd

And use bit 63 of @pgd as the flag that no sync has been requested
since the last sync.

Thanks,
Lai

>  	/*

> > arch/x86/include/asm/kvm_host.h |  3 ++-
> > arch/x86/kvm/mmu/mmu.c          |  6 ++++++
> > arch/x86/kvm/vmx/nested.c       | 12 ++++--------
> > arch/x86/kvm/x86.c              |  9 +++++----
> > 4 files changed, 17 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 55efbacfc244..19a337cf7aa6 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -354,10 +354,11 @@ struct rsvd_bits_validate {
> >  struct kvm_mmu_root_info {
> >  	gpa_t pgd;
> >  	hpa_t hpa;
> > +	bool need_sync;
>
> Hmm, use "unsync" instead of "need_sync", purely to match the existing
> terminology in KVM's MMU for this sort of behavior.
> >  };
> >
> >  #define KVM_MMU_ROOT_INFO_INVALID \
> > -	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE })
> > +	((struct kvm_mmu_root_info) { .pgd = INVALID_PAGE, .hpa = INVALID_PAGE, .need_sync = true})
> >
> >  #define KVM_MMU_NUM_PREV_ROOTS 3
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 5e60b00e8e50..147827135549 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3878,6 +3878,7 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
> >
> >  	root.pgd = mmu->root_pgd;
> >  	root.hpa = mmu->root_hpa;
> > +	root.need_sync = false;
> >
> >  	if (is_root_usable(&root, new_pgd, new_role))
> >  		return true;
> > @@ -3892,6 +3893,11 @@ static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
> >  	mmu->root_hpa = root.hpa;
> >  	mmu->root_pgd = root.pgd;
> >
> > +	if (i < KVM_MMU_NUM_PREV_ROOTS && root.need_sync) {
>
> Probably makes sense to write this as:
>
> 	if (i >= KVM_MMU_NUM_PREV_ROOTS)
> 		return false;
>
> 	if (root.need_sync) {
> 		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> 		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> 	}
> 	return true;
>
> The "i < KVM_MMU_NUM_PREV_ROOTS == success" logic is just confusing enough
> that it'd be nice to write it only once.
>
> And that would also play nicely with deferring a sync for the "current"
> root (see below), e.g.
>
> 	...
> 	unsync = mmu->root_unsync;
>
> 	if (is_root_usable(&root, new_pgd, new_role))
> 		goto found_root;
>
> 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
> 		swap(root, mmu->prev_roots[i]);
> 		swap(unsync, mmu->unsync_roots[i]);
>
> 		if (is_root_usable(&root, new_pgd, new_role))
> 			break;
> 	}
>
> 	if (i >= KVM_MMU_NUM_PREV_ROOTS)
> 		return false;
>
> found_root:
> 	if (unsync) {
> 		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> 		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> 	}
> 	return true;
>
> > +		kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> > +		kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > +	}
> > +
> >  	return i < KVM_MMU_NUM_PREV_ROOTS;
> >  }
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 6058a65a6ede..ab7069ac6dc5 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -5312,7 +5312,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
> >  {
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> >  	u32 vmx_instruction_info, types;
> > -	unsigned long type, roots_to_free;
> > +	unsigned long type;
> >  	struct kvm_mmu *mmu;
> >  	gva_t gva;
> >  	struct x86_exception e;
> > @@ -5361,29 +5361,25 @@ static int handle_invept(struct kvm_vcpu *vcpu)
> >  			return nested_vmx_fail(vcpu,
> >  				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
> >
> > -		roots_to_free = 0;
> >  		if (nested_ept_root_matches(mmu->root_hpa, mmu->root_pgd,
> >  					    operand.eptp))
> > -			roots_to_free |= KVM_MMU_ROOT_CURRENT;
> > +			kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOT_CURRENT);
>
> For a non-RFC series, I think this should do two things:
>
>   1. Separate INVEPT from INVPCID, i.e. do only INVPCID first.
>   2. Enhance INVEPT to SYNC+FLUSH the current root instead of freeing it
>
> As alluded to above, this can be done by deferring the sync+flush (which
> can't be done right away because INVEPT runs in L1 context, whereas KVM
> needs to sync+flush L2 EPT context).
> >  		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
> >  			if (nested_ept_root_matches(mmu->prev_roots[i].hpa,
> >  						    mmu->prev_roots[i].pgd,
> >  						    operand.eptp))
> > -				roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
> > +				mmu->prev_roots[i].need_sync = true;
> >  		}
> >  		break;
> >  	case VMX_EPT_EXTENT_GLOBAL:
> > -		roots_to_free = KVM_MMU_ROOTS_ALL;
> > +		kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOTS_ALL);
> >  		break;
> >  	default:
> >  		BUG();
> >  		break;
> >  	}
> >
> > -	if (roots_to_free)
> > -		kvm_mmu_free_roots(vcpu, mmu, roots_to_free);
> > -
> >  	return nested_vmx_succeed(vcpu);
> >  }
* Re: [RFC PATCH] kvm/x86: Keep root hpa in prev_roots as much as possible
  2021-08-03  1:19 ` Lai Jiangshan
@ 2021-08-03 15:57   ` Sean Christopherson
  0 siblings, 0 replies; 4+ messages in thread
From: Sean Christopherson @ 2021-08-03 15:57 UTC (permalink / raw)
To: Lai Jiangshan
Cc: LKML, kvm, Paolo Bonzini, Lai Jiangshan, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson

On Tue, Aug 03, 2021, Lai Jiangshan wrote:
> On Fri, Jul 30, 2021 at 2:06 AM Sean Christopherson <seanjc@google.com> wrote:
> > Ha, we can do this without increasing the memory footprint and without
> > co-opting a bit from pgd or hpa. Because of compiler alignment/padding,
> > the u8s and bools between mmu_role and prev_roots already occupy 8 bytes,
> > even though their actual size is 4 bytes. In total, we need room for 4
> > roots (3 previous + current), i.e. 4 bytes. If a separate array is used,
> > no additional memory is consumed and no masking is needed when
> > reading/writing e.g. pgd.
> >
> > The cost is an extra swap() when updating the prev_roots LRU, but that's
> > peanuts and would likely be offset by the avoided masking anyways.
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 99f37781a6fc..13bb3c3a60b4 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -424,10 +424,12 @@ struct kvm_mmu {
> >  	hpa_t root_hpa;
> >  	gpa_t root_pgd;
> >  	union kvm_mmu_role mmu_role;
> > +	bool root_unsync;
> >  	u8 root_level;
> >  	u8 shadow_root_level;
> >  	u8 ept_ad;
> >  	bool direct_map;
> > +	bool unsync_roots[KVM_MMU_NUM_PREV_ROOTS];
> >  	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
> >
>
> Hello,
>
> I think it is too complicated, and it is hard to accept putting
> "unsync" outside of struct kvm_mmu_root_info when the two should be
> bound to each other.

I agree it's a bit ugly to have the separate unsync_roots array, but I
don't see how it's any more complex. It's literally a single swap() call.
> How about this:
> - KVM_MMU_NUM_PREV_ROOTS
> + KVM_MMU_NUM_CACHED_ROOTS
> - mmu->prev_roots[KVM_MMU_NUM_PREV_ROOTS]
> + mmu->cached_roots[KVM_MMU_NUM_CACHED_ROOTS]

I don't have a strong preference on PREV vs. CACHED. CACHED is probably
more intuitive, but KVM isn't truly caching the root, it's just tracking
the HPA (and PGD for indirect MMUs), e.g. the root may no longer exist
if the backing shadow page was zapped. On the other hand, the main
helper is cached_root_available()...

> - mmu->root_hpa
> + mmu->cached_roots[0].hpa
> - mmu->root_pgd
> + mmu->cached_roots[0].pgd
>
> And use bit 63 of @pgd as the flag that no sync has been requested

FWIW, using bit 0 will likely generate more efficient code.

> since the last sync.

Again, I don't have a super strong preference. I don't hate or love
either one :-)

Vitaly, Paolo, any preferences on names and approaches for tracking
whether a "cached" root is unsync?
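[Editorial aside: the bit-stealing alternative discussed here can be sketched as below. The bit choice is illustrative only; the real `.pgd` holds CR3/EPTP-derived values whose low 12 bits carry PCID or EPTP attribute bits, so which bit is actually free depends on how KVM masks the stored value, and the helper names are made up for this sketch.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gpa_t;

/*
 * Steal one bit of the stored pgd as the "unsync" flag.  Bit 63 as in
 * Lai's proposal; Sean notes a low bit would likely codegen better.
 */
#define ROOT_UNSYNC	(1ull << 63)

static inline gpa_t root_pgd(gpa_t pgd)		/* pgd with flag masked off */
{
	return pgd & ~ROOT_UNSYNC;
}

static inline bool root_unsync(gpa_t pgd)	/* is a sync pending? */
{
	return pgd & ROOT_UNSYNC;
}

static inline gpa_t mark_unsync(gpa_t pgd)	/* request a sync */
{
	return pgd | ROOT_UNSYNC;
}

static inline gpa_t mark_synced(gpa_t pgd)	/* sync completed */
{
	return pgd & ~ROOT_UNSYNC;
}
```

Every reader of `.pgd` would then have to go through `root_pgd()`, which is exactly the masking cost the separate-array approach avoids.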