[PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU

* [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
@ 2020-03-12 10:05 Wanpeng Li
  2020-03-12 10:36 ` Vitaly Kuznetsov
  0 siblings, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2020-03-12 10:05 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

From: Wanpeng Li <wanpengli@tencent.com>

Most cloud providers do not expose the PMU to guests because of the poor
performance of PMU emulation and security concerns. However,
atomic_switch_perf_msrs() still calls perf_guest_get_msrs() and
clear_atomic_switch_msr() unconditionally before each vmentry, even if the
PMU is not exposed to the guest.

A ~1.28% reduction in vmexit time can be observed with
kvm-unit-tests/vmexit.flat on my SKX server.

Before patch:
vmcall 1559

After patch:
vmcall 1539

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/vmx/vmx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e61..fd526c8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	int i, nr_msrs;
 	struct perf_guest_switch_msr *msrs;
 
+	if (!vcpu_to_pmu(&vmx->vcpu)->version)
+		return;
+
 	msrs = perf_guest_get_msrs(&nr_msrs);
 
 	if (!msrs)
-- 
2.7.4



* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-12 10:05 [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU Wanpeng Li
@ 2020-03-12 10:36 ` Vitaly Kuznetsov
  2020-03-12 11:05   ` Wanpeng Li
  2020-03-12 16:21   ` Jim Mattson
  0 siblings, 2 replies; 8+ messages in thread
From: Vitaly Kuznetsov @ 2020-03-12 10:36 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Joerg Roedel

Wanpeng Li <kernellwp@gmail.com> writes:

> From: Wanpeng Li <wanpengli@tencent.com>
>
> PMU is not exposed to guest by most of cloud providers since the bad performance 
> of PMU emulation and security concern. However, it calls perf_guest_switch_get_msrs()
> and clear_atomic_switch_msr() unconditionally even if PMU is not exposed to the 
> guest before each vmentry. 
>
> ~1.28% vmexit time reduced can be observed by kvm-unit-tests/vmexit.flat on my 
> SKX server.
>
> Before patch:
> vmcall 1559
>
> After patch:
> vmcall 1539
>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 40b1e61..fd526c8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
>  	int i, nr_msrs;
>  	struct perf_guest_switch_msr *msrs;
>  
> +	if (!vcpu_to_pmu(&vmx->vcpu)->version)
> +		return;
> +
>  	msrs = perf_guest_get_msrs(&nr_msrs);
>  
>  	if (!msrs)

Personally, I'd prefer this to be expressed as

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e6138cd5..ace92076c90f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
        pt_guest_enter(vmx);
 
-       atomic_switch_perf_msrs(vmx);
+       if (vcpu_to_pmu(&vmx->vcpu)->version)
+               atomic_switch_perf_msrs(vmx);
+
        atomic_switch_umwait_control_msr(vmx);
 
        if (enable_preemption_timer)

(which will likely produce the same code as atomic_switch_perf_msrs() is
likely inlined).
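
To illustrate the two placements being compared, here is a minimal userspace
sketch, not kernel code. The names mock_pmu, mock_vcpu,
switch_msrs_guard_inside, switch_msrs_unguarded, run_guard_inside and
run_guard_at_call_site are hypothetical stand-ins; only the 'version' field
and the shape of the check come from the patches above. Both variants skip
the work when 'version' is 0; they differ only in whether the test sits in
the callee or in the caller.

```c
/* Hypothetical stand-ins for the KVM structures; only 'version' mirrors the thread. */
struct mock_pmu { unsigned int version; };
struct mock_vcpu { struct mock_pmu pmu; unsigned long msr_switches; };

/* Variant 1 (original patch): early return inside the callee. */
static void switch_msrs_guard_inside(struct mock_vcpu *vcpu)
{
	if (!vcpu->pmu.version)
		return;
	vcpu->msr_switches++;	/* stands in for the MSR switch work */
}

/* Variant 2 (suggested form): the callee is unconditional... */
static void switch_msrs_unguarded(struct mock_vcpu *vcpu)
{
	vcpu->msr_switches++;	/* stands in for the MSR switch work */
}

static void run_guard_inside(struct mock_vcpu *vcpu)
{
	switch_msrs_guard_inside(vcpu);
}

/* ...and the caller tests 'version' before calling it. */
static void run_guard_at_call_site(struct mock_vcpu *vcpu)
{
	if (vcpu->pmu.version)
		switch_msrs_unguarded(vcpu);
}
```

If the callee is inlined, a compiler is free to emit the same code for both
variants; if it is not, the second variant also avoids the call itself.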

Also, (not knowing much about PMU), is the
"vcpu_to_pmu(&vmx->vcpu)->version" check correct?

E.g. is it, as in intel_is_valid_msr(), correct for the Intel PMU only, or
is it stated somewhere that it is a generic rule?

Also, speaking about cloud providers and the 'micro' nature of this
optimization, would it make more sense to introduce a static branch
(the policy to disable vPMU is likely to be host-wide, right)?

-- 
Vitaly



* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-12 10:36 ` Vitaly Kuznetsov
@ 2020-03-12 11:05   ` Wanpeng Li
  2020-03-13  3:23     ` Xu, Like
  2020-03-12 16:21   ` Jim Mattson
  1 sibling, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2020-03-12 11:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: LKML, kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li,
	Jim Mattson, Joerg Roedel

On Thu, 12 Mar 2020 at 18:36, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Wanpeng Li <kernellwp@gmail.com> writes:
>
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > PMU is not exposed to guest by most of cloud providers since the bad performance
> > of PMU emulation and security concern. However, it calls perf_guest_switch_get_msrs()
> > and clear_atomic_switch_msr() unconditionally even if PMU is not exposed to the
> > guest before each vmentry.
> >
> > ~1.28% vmexit time reduced can be observed by kvm-unit-tests/vmexit.flat on my
> > SKX server.
> >
> > Before patch:
> > vmcall 1559
> >
> > After patch:
> > vmcall 1539
> >
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 40b1e61..fd526c8 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >       int i, nr_msrs;
> >       struct perf_guest_switch_msr *msrs;
> >
> > +     if (!vcpu_to_pmu(&vmx->vcpu)->version)
> > +             return;
> > +
> >       msrs = perf_guest_get_msrs(&nr_msrs);
> >
> >       if (!msrs)
>
> Personally, I'd prefer this to be expressed as
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 40b1e6138cd5..ace92076c90f 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>         pt_guest_enter(vmx);
>
> -       atomic_switch_perf_msrs(vmx);
> +       if (vcpu_to_pmu(&vmx->vcpu)->version)
> +               atomic_switch_perf_msrs(vmx);
> +

I was just hoping to keep the code as clean as before. I tested this version
before sending out the patch: ~30 cycles can be saved, which is ~2% of
vmexit time. I will update this in the next version. Let's wait for Paolo's
opinion on the other points below.
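
As a quick check of the figures quoted in this thread, the sketch below
computes the relative saving from two cycle counts. The helper name
percent_saved is hypothetical. With the posted numbers (1559 and 1539
cycles) the saving is 20 cycles, about 1.28%; a 30-cycle saving against the
same 1559-cycle baseline is about 1.92%, which is consistent with "~2%".

```c
/* Hypothetical helper: relative saving, in percent, between two cycle counts. */
static double percent_saved(double before, double after)
{
	return (before - after) * 100.0 / before;
}
```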

    Wanpeng

>
> Also, (not knowing much about PMU), is
> "vcpu_to_pmu(&vmx->vcpu)->version" check correct?
>
> E.g. in intel_is_valid_msr() correct for Intel PMU or is it stated
> somewhere that it is generic rule?
>
> Also, speaking about cloud providers and the 'micro' nature of this
> optimization, would it rather make sense to introduce a static branch
> (the policy to disable vPMU is likely to be host wide, right)?
>
> --
> Vitaly
>


* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-12 10:36 ` Vitaly Kuznetsov
  2020-03-12 11:05   ` Wanpeng Li
@ 2020-03-12 16:21   ` Jim Mattson
  2020-03-13  9:08     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 8+ messages in thread
From: Jim Mattson @ 2020-03-12 16:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wanpeng Li, LKML, kvm list, Paolo Bonzini, Sean Christopherson,
	Wanpeng Li, Joerg Roedel

On Thu, Mar 12, 2020 at 3:36 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:

> Also, speaking about cloud providers and the 'micro' nature of this
> optimization, would it rather make sense to introduce a static branch
> (the policy to disable vPMU is likely to be host wide, right)?

Speaking for a cloud provider, no, the policy is not likely to be host-wide.


* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-12 11:05   ` Wanpeng Li
@ 2020-03-13  3:23     ` Xu, Like
  2020-03-13  3:39       ` Wanpeng Li
  0 siblings, 1 reply; 8+ messages in thread
From: Xu, Like @ 2020-03-13  3:23 UTC (permalink / raw)
  To: Wanpeng Li, Vitaly Kuznetsov
  Cc: LKML, kvm, Paolo Bonzini, Sean Christopherson, Wanpeng Li,
	Jim Mattson, Joerg Roedel

Hi Wanpeng,

On 2020/3/12 19:05, Wanpeng Li wrote:
> On Thu, 12 Mar 2020 at 18:36, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> Wanpeng Li <kernellwp@gmail.com> writes:
>>
>>> From: Wanpeng Li <wanpengli@tencent.com>
>>>
>>> PMU is not exposed to guest by most of cloud providers since the bad performance
>>> of PMU emulation and security concern. However, it calls perf_guest_switch_get_msrs()
>>> and clear_atomic_switch_msr() unconditionally even if PMU is not exposed to the
>>> guest before each vmentry.
>>>
>>> ~1.28% vmexit time reduced can be observed by kvm-unit-tests/vmexit.flat on my
>>> SKX server.
>>>
>>> Before patch:
>>> vmcall 1559
>>>
>>> After patch:
>>> vmcall 1539
>>>
>>> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
>>> ---
>>>   arch/x86/kvm/vmx/vmx.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 40b1e61..fd526c8 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
>>>        int i, nr_msrs;
>>>        struct perf_guest_switch_msr *msrs;
>>>
>>> +     if (!vcpu_to_pmu(&vmx->vcpu)->version)
>>> +             return;
>>> +
>>>        msrs = perf_guest_get_msrs(&nr_msrs);
>>>
>>>        if (!msrs)
>> Personally, I'd prefer this to be expressed as
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 40b1e6138cd5..ace92076c90f 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>
>>          pt_guest_enter(vmx);
>>
>> -       atomic_switch_perf_msrs(vmx);
>> +       if (vcpu_to_pmu(&vmx->vcpu)->version)
We may use 'vmx->vcpu.arch.pmu.version'.

I would vote in favor of adding the "unlikely(vmx->vcpu.arch.pmu.version)"
check to atomic_switch_perf_msrs(), which follows pt_guest_enter(vmx).
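
For readers unfamiliar with the annotation: unlikely() in the kernel is a
branch-prediction hint built on the compiler's __builtin_expect. The sketch
below is a userspace approximation that assumes GCC or Clang; the macro is
redefined locally and mock_pmu_version_set is a hypothetical name. The hint
does not change the result of the test, only how the compiler may lay out
the two branches.

```c
/* Userspace approximation of the kernel's unlikely(); assumes GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: returns 1 when a PMU version is set, 0 otherwise. */
static int mock_pmu_version_set(unsigned int version)
{
	if (unlikely(version))
		return 1;	/* branch the compiler is told to expect rarely */
	return 0;
}
```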

>> +               atomic_switch_perf_msrs(vmx);
>> +
> I just hope the beautiful codes before, I testing this version before
> sending out the patch, ~30 cycles can be saved which means that ~2%
> vmexit time, will update in next version. Let's wait Paolo for other
> opinions below.

You may factor the cost of the "pmu->version" check itself (~10 cycles)
into your overall micro-optimization gain.

Thanks,
Like Xu
>
>      Wanpeng
>
>> Also, (not knowing much about PMU), is
>> "vcpu_to_pmu(&vmx->vcpu)->version" check correct?
>>
>> E.g. in intel_is_valid_msr() correct for Intel PMU or is it stated
>> somewhere that it is generic rule?
>>
>> Also, speaking about cloud providers and the 'micro' nature of this
>> optimization, would it rather make sense to introduce a static branch
>> (the policy to disable vPMU is likely to be host wide, right)?
>>
>> --
>> Vitaly
>>



* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-13  3:23     ` Xu, Like
@ 2020-03-13  3:39       ` Wanpeng Li
  2020-03-13  4:57         ` Like Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2020-03-13  3:39 UTC (permalink / raw)
  To: like.xu
  Cc: Vitaly Kuznetsov, LKML, kvm, Paolo Bonzini, Sean Christopherson,
	Wanpeng Li, Jim Mattson, Joerg Roedel

On Fri, 13 Mar 2020 at 11:23, Xu, Like <like.xu@intel.com> wrote:
>
> Hi Wanpeng,
>
> On 2020/3/12 19:05, Wanpeng Li wrote:
> > On Thu, 12 Mar 2020 at 18:36, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >> Wanpeng Li <kernellwp@gmail.com> writes:
> >>
> >>> From: Wanpeng Li <wanpengli@tencent.com>
> >>>
> >>> PMU is not exposed to guest by most of cloud providers since the bad performance
> >>> of PMU emulation and security concern. However, it calls perf_guest_switch_get_msrs()
> >>> and clear_atomic_switch_msr() unconditionally even if PMU is not exposed to the
> >>> guest before each vmentry.
> >>>
> >>> ~1.28% vmexit time reduced can be observed by kvm-unit-tests/vmexit.flat on my
> >>> SKX server.
> >>>
> >>> Before patch:
> >>> vmcall 1559
> >>>
> >>> After patch:
> >>> vmcall 1539
> >>>
> >>> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> >>> ---
> >>>   arch/x86/kvm/vmx/vmx.c | 3 +++
> >>>   1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >>> index 40b1e61..fd526c8 100644
> >>> --- a/arch/x86/kvm/vmx/vmx.c
> >>> +++ b/arch/x86/kvm/vmx/vmx.c
> >>> @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >>>        int i, nr_msrs;
> >>>        struct perf_guest_switch_msr *msrs;
> >>>
> >>> +     if (!vcpu_to_pmu(&vmx->vcpu)->version)
> >>> +             return;
> >>> +
> >>>        msrs = perf_guest_get_msrs(&nr_msrs);
> >>>
> >>>        if (!msrs)
> >> Personally, I'd prefer this to be expressed as
> >>
> >> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >> index 40b1e6138cd5..ace92076c90f 100644
> >> --- a/arch/x86/kvm/vmx/vmx.c
> >> +++ b/arch/x86/kvm/vmx/vmx.c
> >> @@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >>
> >>          pt_guest_enter(vmx);
> >>
> >> -       atomic_switch_perf_msrs(vmx);
> >> +       if (vcpu_to_pmu(&vmx->vcpu)->version)
> We may use 'vmx->vcpu.arch.pmu.version'.

Thanks for confirming this. Maybe this is better:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e61..b20423c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6567,7 +6567,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)

        pt_guest_enter(vmx);

-       atomic_switch_perf_msrs(vmx);
+       if (vcpu_to_pmu(vcpu)->version)
+               atomic_switch_perf_msrs(vmx);
        atomic_switch_umwait_control_msr(vmx);

        if (enable_preemption_timer)

>
> I would vote in favor of adding the "unlikely (vmx->vcpu.arch.pmu.version)"
> check to the atomic_switch_perf_msrs(), which follows pt_guest_enter(vmx).

This is a hot path, so let's save the cost of the function call.

    Wanpeng

>
> >> +               atomic_switch_perf_msrs(vmx);
> >> +
> > I just hope the beautiful codes before, I testing this version before
> > sending out the patch, ~30 cycles can be saved which means that ~2%
> > vmexit time, will update in next version. Let's wait Paolo for other
> > opinions below.
>
> You may factor the cost of the "pmu-> version check' itself (~10 cycles)
> into your overall 'micro-optimize' revenue.
>
> Thanks,
> Like Xu
> >
> >      Wanpeng
> >
> >> Also, (not knowing much about PMU), is
> >> "vcpu_to_pmu(&vmx->vcpu)->version" check correct?
> >>
> >> E.g. in intel_is_valid_msr() correct for Intel PMU or is it stated
> >> somewhere that it is generic rule?
> >>
> >> Also, speaking about cloud providers and the 'micro' nature of this
> >> optimization, would it rather make sense to introduce a static branch
> >> (the policy to disable vPMU is likely to be host wide, right)?
> >>
> >> --
> >> Vitaly
> >>
>


* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-13  3:39       ` Wanpeng Li
@ 2020-03-13  4:57         ` Like Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Like Xu @ 2020-03-13  4:57 UTC (permalink / raw)
  To: Wanpeng Li, like.xu
  Cc: Vitaly Kuznetsov, LKML, kvm, Paolo Bonzini, Sean Christopherson,
	Wanpeng Li, Jim Mattson, Joerg Roedel

On 2020/3/13 11:39, Wanpeng Li wrote:
> On Fri, 13 Mar 2020 at 11:23, Xu, Like <like.xu@intel.com> wrote:
>>
>> Hi Wanpeng,
>>
>> On 2020/3/12 19:05, Wanpeng Li wrote:
>>> On Thu, 12 Mar 2020 at 18:36, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>>> Wanpeng Li <kernellwp@gmail.com> writes:
>>>>
>>>>> From: Wanpeng Li <wanpengli@tencent.com>
>>>>>
>>>>> PMU is not exposed to guest by most of cloud providers since the bad performance
>>>>> of PMU emulation and security concern. However, it calls perf_guest_switch_get_msrs()
>>>>> and clear_atomic_switch_msr() unconditionally even if PMU is not exposed to the
>>>>> guest before each vmentry.
>>>>>
>>>>> ~1.28% vmexit time reduced can be observed by kvm-unit-tests/vmexit.flat on my
>>>>> SKX server.
>>>>>
>>>>> Before patch:
>>>>> vmcall 1559
>>>>>
>>>>> After patch:
>>>>> vmcall 1539
>>>>>
>>>>> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
>>>>> ---
>>>>>    arch/x86/kvm/vmx/vmx.c | 3 +++
>>>>>    1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>>>> index 40b1e61..fd526c8 100644
>>>>> --- a/arch/x86/kvm/vmx/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>>>> @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
>>>>>         int i, nr_msrs;
>>>>>         struct perf_guest_switch_msr *msrs;
>>>>>
>>>>> +     if (!vcpu_to_pmu(&vmx->vcpu)->version)
>>>>> +             return;
>>>>> +
>>>>>         msrs = perf_guest_get_msrs(&nr_msrs);
>>>>>
>>>>>         if (!msrs)
>>>> Personally, I'd prefer this to be expressed as
>>>>
>>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>>> index 40b1e6138cd5..ace92076c90f 100644
>>>> --- a/arch/x86/kvm/vmx/vmx.c
>>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>>> @@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>>
>>>>           pt_guest_enter(vmx);
>>>>
>>>> -       atomic_switch_perf_msrs(vmx);
>>>> +       if (vcpu_to_pmu(&vmx->vcpu)->version)
>> We may use 'vmx->vcpu.arch.pmu.version'.
> 
> Thanks for confirm this. Maybe this is better:
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 40b1e61..b20423c 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6567,7 +6567,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> 
>          pt_guest_enter(vmx);
> 
> -       atomic_switch_perf_msrs(vmx);
> +       if (vcpu_to_pmu(vcpu)->version)
> +               atomic_switch_perf_msrs(vmx);

>          atomic_switch_umwait_control_msr(vmx);
> 
>          if (enable_preemption_timer)
> 
>>
>> I would vote in favor of adding the "unlikely (vmx->vcpu.arch.pmu.version)"
>> check to the atomic_switch_perf_msrs(), which follows pt_guest_enter(vmx).
> 
> This is hotpath, let's save the cost of function call.

You're right, I measured both.
We may fix pt_guest_enter() with static_branch_unlikely
for a little more micro-optimization as well.

Thanks,
Like Xu

> 
>      Wanpeng
> 
>>
>>>> +               atomic_switch_perf_msrs(vmx);
>>>> +
>>> I just hope the beautiful codes before, I testing this version before
>>> sending out the patch, ~30 cycles can be saved which means that ~2%
>>> vmexit time, will update in next version. Let's wait Paolo for other
>>> opinions below.
>>
>> You may factor the cost of the "pmu-> version check' itself (~10 cycles)
>> into your overall 'micro-optimize' revenue.
>>
>> Thanks,
>> Like Xu
>>>
>>>       Wanpeng
>>>
>>>> Also, (not knowing much about PMU), is
>>>> "vcpu_to_pmu(&vmx->vcpu)->version" check correct?
>>>>
>>>> E.g. in intel_is_valid_msr() correct for Intel PMU or is it stated
>>>> somewhere that it is generic rule?
>>>>
>>>> Also, speaking about cloud providers and the 'micro' nature of this
>>>> optimization, would it rather make sense to introduce a static branch
>>>> (the policy to disable vPMU is likely to be host wide, right)?
>>>>
>>>> --
>>>> Vitaly
>>>>
>>



* Re: [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU
  2020-03-12 16:21   ` Jim Mattson
@ 2020-03-13  9:08     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 8+ messages in thread
From: Vitaly Kuznetsov @ 2020-03-13  9:08 UTC (permalink / raw)
  To: Jim Mattson, Wanpeng Li
  Cc: LKML, kvm list, Paolo Bonzini, Sean Christopherson, Wanpeng Li,
	Joerg Roedel

Jim Mattson <jmattson@google.com> writes:

> On Thu, Mar 12, 2020 at 3:36 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> Also, speaking about cloud providers and the 'micro' nature of this
>> optimization, would it rather make sense to introduce a static branch
>> (the policy to disable vPMU is likely to be host wide, right)?
>
> Speaking for a cloud provider, no, the policy is not likely to be host-wide.

Ah, then it's just my flawed picture of the world where hosts only run
instances of the same type/family because it's much easier to partition
them this way.

Scratch the static branch idea then.
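
The reasoning can be sketched in userspace C. A static branch is a single
global switch patched into the code, so it can only be turned off when no
vCPU on the host has a vPMU. The sketch below models only that global
property; it does not use the kernel's static key API, and the names
mock_guest_vcpu and host_wide_switch_can_be_off are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: one entry per vCPU running on the host. */
struct mock_guest_vcpu { unsigned int pmu_version; };

/* A host-wide switch may stay off only if every vCPU has no vPMU. */
static bool host_wide_switch_can_be_off(const struct mock_guest_vcpu *vcpus,
					size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (vcpus[i].pmu_version)
			return false;
	return true;
}
```

On a host that mixes guests with and without a vPMU the switch has to stay
on, so a per-vCPU check is still needed there.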

-- 
Vitaly



Thread overview: 8+ messages
2020-03-12 10:05 [PATCH] KVM: VMX: Micro-optimize vmexit time when not exposing PMU Wanpeng Li
2020-03-12 10:36 ` Vitaly Kuznetsov
2020-03-12 11:05   ` Wanpeng Li
2020-03-13  3:23     ` Xu, Like
2020-03-13  3:39       ` Wanpeng Li
2020-03-13  4:57         ` Like Xu
2020-03-12 16:21   ` Jim Mattson
2020-03-13  9:08     ` Vitaly Kuznetsov