* tlb_flush stat on Intel/AMD
From: Jacob Xu @ 2020-07-20 19:08 UTC
  To: kvm

Hi all,

I wrote a KVM selftest that enters a 1-vCPU guest VM running a nop loop
in one thread and checks the VM's debugfs entry for the number of TLB
flushes while the guest is running. Installing the latest upstream
kernel and running this on an Intel host showed a tlb_flush count of
30, while running it on an AMD host showed a tlb_flush count of 0.
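
Roughly, the test looks like this (simplified sketch; vm_create_default()
and vcpu_run() are the usual tools/testing/selftests/kvm helpers, while
vm_get_fd() is a hypothetical stand-in for however the VM fd is obtained,
and the debugfs path format is from memory):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#include "kvm_util.h"

static void guest_code(void)
{
        for (;;)
                asm volatile("nop");
}

static void *vcpu_worker(void *arg)
{
        vcpu_run(arg, 0);       /* the guest spins forever */
        return NULL;
}

int main(void)
{
        struct kvm_vm *vm = vm_create_default(0, 0, guest_code);
        uint64_t flushes = 0;
        pthread_t thread;
        char path[256];
        FILE *f;

        pthread_create(&thread, NULL, vcpu_worker, vm);
        sleep(1);

        /* The per-VM debugfs dir is named "<pid>-<vm fd>", iirc;
         * vm_get_fd() is hypothetical. */
        snprintf(path, sizeof(path), "/sys/kernel/debug/kvm/%d-%d/tlb_flush",
                 getpid(), vm_get_fd(vm));
        f = fopen(path, "r");
        if (f && fscanf(f, "%lu", &flushes) == 1)
                printf("tlb_flush = %lu\n", flushes);
        return 0;
}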

Do we have an inconsistency between Intel and AMD in how VCPU_STAT is
incremented?

From browsing the code, we see that the stat only gets incremented in
the kvm_ wrappers around the x86_ops functions tlb_flush_all,
tlb_flush_current, and tlb_flush_guest. These wrappers are only called
via the KVM request API (referred to as deferred flushes in some other
threads), and there are other instances that call the x86_ops tlb_flush
methods directly (immediate flushes).
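
For reference, the wrappers in arch/x86/kvm/x86.c look roughly like this
(quoting from memory, so details may be off):

static void kvm_vcpu_flush_tlb_current(struct kvm_vcpu *vcpu)
{
        ++vcpu->stat.tlb_flush;
        kvm_x86_ops.tlb_flush_current(vcpu);
}

So a deferred flush requested via kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT,
vcpu) gets counted when the request is serviced, while a direct
kvm_x86_ops.tlb_flush_current(vcpu) call bypasses the stat entirely.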

It looks like most of the TLB flush calls are deferred, but there are
a few instances using an immediate flush where the flush is not
counted (kvm_mmu_load, svm_set_cr4, vmx_set_apic_access_page,
nested_prepare_vmcb_control). Is there a guideline on when to use a
deferred vs. an immediate tlb_flush?

Could this be a cause of the lower tlb_flush stat seen on an AMD
host? Or perhaps there's another reason for the difference, due to the
(too) simple selftest?

In the case of svm_flush_tlb, it seems like the TLB flush is deferred
anyway, since setting a TLB flush control bit in the VMCB is not acted
upon until the guest is entered. So it seems we could count TLB
flushes on SVM more easily by checking the control bit before entering
the guest and incrementing the counter there. Though perhaps there's
another case we'd like to count as a TLB flush: when the guest
switches ASIDs (where would we track this?).
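
Concretely, I'm imagining something like this (untested sketch;
TLB_CONTROL_FLUSH_ASID is the value svm_flush_tlb() writes to
vmcb->control.tlb_ctl when FLUSHBYASID is supported):

/* e.g. just before VMRUN: */
if (svm->vmcb->control.tlb_ctl == TLB_CONTROL_FLUSH_ASID)
        ++vcpu->stat.tlb_flush;

This would miss the !FLUSHBYASID path, where the flush instead shows up
as a decremented asid_generation and a subsequent new_asid(), hence the
ASID question above.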

Would switching to this alternative for incrementing the tlb_flush
stat on SVM be much different from what we do right now?

Thanks,

Jacob


* Re: tlb_flush stat on Intel/AMD
From: Sean Christopherson @ 2020-07-21 20:16 UTC
  To: Jacob Xu; +Cc: kvm

On Mon, Jul 20, 2020 at 12:08:05PM -0700, Jacob Xu wrote:
> Hi all,
> 
> I wrote a KVM selftest that enters a 1-vCPU guest VM running a nop loop
> in one thread and checks the VM's debugfs entry for the number of TLB
> flushes while the guest is running. Installing the latest upstream
> kernel and running this on an Intel host showed a tlb_flush count of
> 30, while running it on an AMD host showed a tlb_flush count of 0.
> 
> Do we have an inconsistency between Intel and AMD in how VCPU_STAT is
> incremented?

Yes, you can even drop "between Intel and AMD" and just state "we have
inconsistencies in how VCPU_STAT is incremented".

> From browsing the code, we see that the stat only gets incremented in
> the kvm_ wrappers around the x86_ops functions tlb_flush_all,
> tlb_flush_current, and tlb_flush_guest. These wrappers are only called
> via the KVM request API (referred to as deferred flushes in some other
> threads), and there are other instances that call the x86_ops tlb_flush
> methods directly (immediate flushes).
> 
> It looks like most of the TLB flush calls are deferred, but there are
> a few instances using an immediate flush where the flush is not
> counted (kvm_mmu_load, svm_set_cr4, vmx_set_apic_access_page,
> nested_prepare_vmcb_control). Is there a guideline on when to use a
> deferred vs. an immediate tlb_flush?

The rule of thumb is to defer the flush whenever possible so that KVM
doesn't flush multiple times in a single VM-Exit.  However, of the above,
svm_set_cr4() is the only one that can be deferred; kvm_mmu_load() and
vmx_set_apic_access_page() can't be, because they are reachable after
KVM_REQ_TLB_FLUSH_CURRENT has already been processed in vcpu_enter_guest().
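
For reference, the relevant ordering in vcpu_enter_guest() is roughly
(heavily simplified, from memory):

if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
        kvm_vcpu_flush_tlb_all(vcpu);             /* counted */
if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
        kvm_vcpu_flush_tlb_current(vcpu);         /* counted */
...
if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
        kvm_vcpu_reload_apic_access_page(vcpu);   /* flushes directly, not counted */
...
r = kvm_mmu_reload(vcpu);                         /* kvm_mmu_load() can flush directly, not counted */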

The VMX APIC flush could be deferred by hoisting KVM_REQ_APIC_PAGE_RELOAD
up.  I think that's safe?  But it's a very infrequent operation so I'm not
exactly chomping at the bit to get it fixed.

> Could this be a cause of the lower tlb_flush stat seen on an AMD
> host? Or perhaps there's another reason for the difference, due to the
> (too) simple selftest?

Yes, but it's likely not due to any of the paths listed above.  Obviously
accounting the flush in kvm_mmu_load() will elevate the count, but it will
affect VMX and SVM identically.

Given that you see 0 on SVM and a low number on VMX, my money is on the
difference being that VMX accounts the TLB flush that occurs on vCPU
migration.  vmx_vcpu_load_vmcs() makes a KVM_REQ_TLB_FLUSH request, whereas
svm_vcpu_load() resets asid_generation but doesn't increment the stats.
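
i.e., roughly (again from memory):

/* vmx_vcpu_load_vmcs(), on migration to a new pCPU: */
kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);      /* serviced later, counted */

/* svm_vcpu_load(), same situation: */
if (unlikely(cpu != vcpu->cpu)) {
        svm->asid_generation = 0;               /* forces new_asid(), not counted */
        ...
}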
 
> In the case of svm_flush_tlb, it seems like the TLB flush is deferred
> anyway, since setting a TLB flush control bit in the VMCB is not acted
> upon until the guest is entered. So it seems we could count TLB
> flushes on SVM more easily by checking the control bit before entering
> the guest and incrementing the counter there. Though perhaps there's
> another case we'd like to count as a TLB flush: when the guest
> switches ASIDs (where would we track this?).
>
> Would switching to this alternative for incrementing the tlb_flush
> stat on SVM be much different from what we do right now?

I think a better option is to keep the current accounting, defer flushes
when possible to naturally fix the accounting, and then fix the remaining
one-off cases, e.g. kvm_mmu_load() and svm_vcpu_load().

I can prep a small series unless you want the honors?


* Re: tlb_flush stat on Intel/AMD
From: Jacob Xu @ 2020-07-28 18:07 UTC
  To: Sean Christopherson; +Cc: kvm

> The VMX APIC flush could be deferred by hoisting KVM_REQ_APIC_PAGE_RELOAD
> up.  I think that's safe?  But it's a very infrequent operation so I'm not
> exactly chomping at the bit to get it fixed.

Could moving KVM_REQ_TLB_FLUSH and _CURRENT down also work?
It seems that none of the check_request() calls below depend on them.

> Given that you see 0 on SVM and a low number on VMX, my money is on the
> difference being that VMX accounts the TLB flush that occurs on vCPU
> migration.  vmx_vcpu_load_vmcs() makes a KVM_REQ_TLB_FLUSH request, whereas
> svm_vcpu_load() resets asid_generation but doesn't increment the stats.

I'll try adding a one-off stat increment here, and check the results.
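
i.e. something like this in svm_vcpu_load() (untested):

if (unlikely(cpu != vcpu->cpu)) {
        svm->asid_generation = 0;
        ++vcpu->stat.tlb_flush;   /* count the implicit flush on migration */
        ...
}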

> I think a better option is to keep the current accounting, defer flushes
> when possible to naturally fix the accounting, and then fix the remaining
> one-off cases, e.g. kvm_mmu_load() and svm_vcpu_load().
>
> I can prep a small series unless you want the honors?

I'll try making a small series:
kvm: selftests: add get_debugfs_stat to kvm_util
kvm: x86: increment tlb_flush stat on svm_vcpu_load(), kvm_mmu_load()
# omit kvm_mmu_load() if feasible to move check_request(KVM_REQ_TLB_FLUSH) down
kvm: x86: re-order check_request() to prefer use of deferred tlb_flush
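
For the kvm_util helper I'm picturing something along these lines
(sketch only; the vm->fd field access and the "<pid>-<vm fd>" debugfs
directory format are from memory):

/* tools/testing/selftests/kvm/lib/kvm_util.c */
uint64_t get_debugfs_stat(struct kvm_vm *vm, const char *stat_name)
{
        uint64_t val = 0;
        char path[256];
        FILE *f;

        snprintf(path, sizeof(path), "/sys/kernel/debug/kvm/%d-%d/%s",
                 getpid(), vm->fd, stat_name);
        f = fopen(path, "r");
        TEST_ASSERT(f, "failed to open %s", path);
        TEST_ASSERT(fscanf(f, "%lu", &val) == 1, "failed to parse %s", path);
        fclose(f);
        return val;
}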

Thanks for the explanations! Sorry for my late reply; I've updated my
email filter accordingly.

