From: Like Xu <like.xu.linux@gmail.com>
To: Liuxiangdong <liuxiangdong5@huawei.com>,
Zhu Lingshan <lingshan.zhu@intel.com>
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
jmattson@google.com, joro@8bytes.org, kan.liang@linux.intel.com,
ak@linux.intel.com, wei.w.wang@intel.com, eranian@google.com,
linux-kernel@vger.kernel.org, x86@kernel.org,
kvm@vger.kernel.org, boris.ostrvsky@oracle.com,
Yao Yuan <yuan.yao@intel.com>,
Venkatesh Srinivas <venkateshs@chromium.org>,
"Fangyi (Eric)" <eric.fangyi@huawei.com>,
Xiexiangyou <xiexiangyou@huawei.com>
Subject: Re: [PATCH V10 05/18] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
Date: Mon, 8 Nov 2021 16:44:01 +0800
Message-ID: <97bc3da2-202b-f1f0-a269-4e28c848c7e9@gmail.com>
In-Reply-To: <6188DF79.7010405@huawei.com>
On 8/11/2021 4:27 pm, Liuxiangdong wrote:
>
>
> On 2021/11/8 12:11, Like Xu wrote:
>> On 8/11/2021 12:07 pm, Liuxiangdong wrote:
>>>
>>>
>>> On 2021/11/8 11:06, Like Xu wrote:
>>>> On 7/11/2021 6:14 pm, Liuxiangdong wrote:
>>>>> Hi, Like and Lingshan.
>>>>>
>>>>> As said, the IA32_MISC_ENABLE[7] bit depends on whether the PMU is enabled
>>>>> for the guest, so a software write operation to this bit will be ignored.
>>>>>
>>>>> But, in this patch, every operation that writes msr_ia32_misc_enable in the
>>>>> guest can clear this bit to 0.
>>>>>
>>>>> Suppose:
>>>>> When we start vm with "enable_pmu", vcpu->arch.ia32_misc_enable_msr may be
>>>>> 0x80 first.
>>>>> And next, guest writes msr_ia32_misc_enable value 0x1.
>>>>> What we want could be 0x81, but unfortunately, it will be 0x1 because of
>>>>> "data &= ~MSR_IA32_MISC_ENABLE_EMON;"
>>>>> And even if guest writes msr_ia32_misc_enable value 0x81, it will be 0x1 also.
>>>>>
>>>>
>>>> Yes and thank you. The fix has been committed on my private tree for a long
>>>> time.
>>>>
>>>>>
>>>>> What we want is that a write operation does not change this bit. So, how about this?
>>>>>
>>>>> --- a/arch/x86/kvm/x86.c
>>>>> +++ b/arch/x86/kvm/x86.c
>>>>> @@ -3321,6 +3321,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct
>>>>> msr_data *msr_info)
>>>>> }
>>>>> break;
>>>>> case MSR_IA32_MISC_ENABLE:
>>>>> + data &= ~MSR_IA32_MISC_ENABLE_EMON;
>>>>> + data |= (vcpu->arch.ia32_misc_enable_msr &
>>>>> MSR_IA32_MISC_ENABLE_EMON);
>>>>> if (!kvm_check_has_quirk(vcpu->kvm,
>>>>> KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
>>>>> ((vcpu->arch.ia32_misc_enable_msr ^ data) &
>>>>> MSR_IA32_MISC_ENABLE_MWAIT)) {
>>>>> if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
>>>>>
>>>>>
>>>>
>>>> How about this for the final state considering PEBS enabling:
>>>>
>>>> case MSR_IA32_MISC_ENABLE: {
>>>> u64 old_val = vcpu->arch.ia32_misc_enable_msr;
>>>> u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON |
>>>> MSR_IA32_MISC_ENABLE_EMON;
>>>>
>>> u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON |
>>> MSR_IA32_MISC_ENABLE_EMON;
>>>
>>> Repetitive "MSR_IA32_MISC_ENABLE_EMON" ?
>>
>> Oops,
>>
>> u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON |
>> MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
>>
>
> Yes. bit[12] is also read-only, so we can keep this bit unchanged also.
>
> And, because write operation will not change this bit by "pmu_mask", do we still
> need this if statement?
>
> /* RO bits */
> if (!msr_info->host_initiated &&
> ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
> return 1;
>
> "(old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL" means some operation
> tries to change this bit,
> so we cannot allow it.
> But even without this check, "pmu_mask" will still keep bit[12]
> unchanged.
>
> The only difference is that, with this check, a write where
> "old_val[12] != data[12]" is rejected, so it cannot change any other bit
> (besides bit 12 and bit 7) either; without the check, it can.
>
> Since both MSR_IA32_MISC_ENABLE_EMON and MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL are
> read-only, maybe we can keep their behavior consistent: either check both,
> or neither.
One more difference per the Intel SDM, I assume:
For Bit 7, Performance Monitoring Available (R),
(R) means that attempts to change this bit are silently ignored;
For Bit 12, Processor Event Based Sampling (PEBS) Unavailable (RO),
(RO) means that attempts to change this bit raise #GP.
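
To make the two behaviors concrete, here is a minimal user-space sketch (not
the KVM code). `struct fake_vcpu` and `set_misc_enable()` are hypothetical
stand-ins for the vcpu state and the MSR_IA32_MISC_ENABLE case in
kvm_set_msr_common(), and the MWAIT quirk handling is left out. It assumes a
guest attempt to flip bit 12 is rejected, while both vPMU bits otherwise keep
their old value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as discussed in this thread (bit 7 and bit 12). */
#define MISC_ENABLE_EMON         (1ULL << 7)   /* Performance Monitoring Available (R) */
#define MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)  /* PEBS Unavailable (RO) */

/* Hypothetical stand-in for vcpu->arch.ia32_misc_enable_msr. */
struct fake_vcpu {
	uint64_t misc_enable;
};

/*
 * Returns 1 if the write should fault (#GP in the guest), 0 on success.
 * Only the two vPMU bits are modelled here.
 */
static int set_misc_enable(struct fake_vcpu *v, uint64_t data,
			   bool host_initiated)
{
	uint64_t old_val = v->misc_enable;
	uint64_t pmu_mask = MISC_ENABLE_EMON | MISC_ENABLE_PEBS_UNAVAIL;

	/* RO bit 12: a guest attempt to change it is rejected. */
	if (!host_initiated &&
	    ((old_val ^ data) & MISC_ENABLE_PEBS_UNAVAIL))
		return 1;

	/* Both vPMU bits keep their old value; for bit 7 this is the silent ignore. */
	data &= ~pmu_mask;
	data |= old_val & pmu_mask;

	v->misc_enable = data;
	return 0;
}

/* Small self-check of the case raised earlier in the thread. */
static void demo(void)
{
	struct fake_vcpu v = { .misc_enable = 0x80 };

	/* Guest writes 0x1: bit 7 survives, so the stored value is 0x81. */
	assert(set_misc_enable(&v, 0x1, false) == 0);
	assert(v.misc_enable == 0x81);
}
```

With the stored value at 0x80, a guest write of 0x1 ends up as 0x81, while a
guest write of 0x1001 (bit 12 flipped) is rejected and the stored value does
not change.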
>
> Do you think so?
>
>
>> I'll send the fix after sync with Lingshan.
>>
>>>
>>>> /* RO bits */
>>>> if (!msr_info->host_initiated &&
>>>> ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
>>>> return 1;
>>>>
>>>> /*
>>>> * For a dummy user space, the order of setting vPMU capabilities and
>>>> * initialising MSR_IA32_MISC_ENABLE is not strictly guaranteed, so to
>>>> * avoid inconsistent functionality we keep the vPMU bits unchanged
>>>> here.
>>>> */
>>> Yes. It's a little clearer with comments.
>>
>> Thanks for your feedback! Enjoy the feature.
>>
>>>> data &= ~pmu_mask;
>>>> data |= old_val & pmu_mask;
>>>> if (!kvm_check_has_quirk(vcpu->kvm,
>>>> KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
>>>> ((old_val ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
>>>> if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
>>>> return 1;
>>>> vcpu->arch.ia32_misc_enable_msr = data;
>>>> kvm_update_cpuid_runtime(vcpu);
>>>> } else {
>>>> vcpu->arch.ia32_misc_enable_msr = data;
>>>> }
>>>> break;
>>>> }
>>>>
>>>>> Or is there anything in your design intention I don't understand?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Xiangdong Liu
>>>>>
>>>>>
>>>>> On 2021/8/6 21:37, Zhu Lingshan wrote:
>>>>>> From: Like Xu <like.xu@linux.intel.com>
>>>>>>
>>>>>> On Intel platforms, the software can use the IA32_MISC_ENABLE[7] bit to
>>>>>> detect whether the processor supports performance monitoring facility.
>>>>>>
>>>>>> It depends on whether the PMU is enabled for the guest, and a software write
>>>>>> operation to this available bit will be ignored. The proposal to ignore
>>>>>> the toggle in KVM is the way to go and that behavior matches bare metal.
>>>>>>
>>>>>> Cc: Yao Yuan <yuan.yao@intel.com>
>>>>>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
>>>>>> Reviewed-by: Venkatesh Srinivas <venkateshs@chromium.org>
>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>>>>> ---
>>>>>> arch/x86/kvm/vmx/pmu_intel.c | 1 +
>>>>>> arch/x86/kvm/x86.c | 1 +
>>>>>> 2 files changed, 2 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>>>> index 9efc1a6b8693..d9dbebe03cae 100644
>>>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>>>> @@ -488,6 +488,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
>>>>>> if (!pmu->version)
>>>>>> return;
>>>>>> + vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
>>>>>> perf_get_x86_pmu_capability(&x86_pmu);
>>>>>> pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
>>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>>> index efd11702465c..f6b6984e26ef 100644
>>>>>> --- a/arch/x86/kvm/x86.c
>>>>>> +++ b/arch/x86/kvm/x86.c
>>>>>> @@ -3321,6 +3321,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct
>>>>>> msr_data *msr_info)
>>>>>> }
>>>>>> break;
>>>>>> case MSR_IA32_MISC_ENABLE:
>>>>>> + data &= ~MSR_IA32_MISC_ENABLE_EMON;
>>>>>> if (!kvm_check_has_quirk(vcpu->kvm,
>>>>>> KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
>>>>>> ((vcpu->arch.ia32_misc_enable_msr ^ data) &
>>>>>> MSR_IA32_MISC_ENABLE_MWAIT)) {
>>>>>> if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
>>>>>
>>>
>