* [PATCH] KVM: x86/pmu: Isolate TSX specific perf_event_attr.attr logic for AMD
@ 2022-03-07  6:38 Like Xu
  2022-03-07 21:39 ` Jim Mattson
  0 siblings, 1 reply; 4+ messages in thread
From: Like Xu @ 2022-03-07  6:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Ravi Bangoria, Jim Mattson, Andi Kleen, kvm, Sean Christopherson,
	Wanpeng Li, Vitaly Kuznetsov, Joerg Roedel, linux-kernel

From: Like Xu <likexu@tencent.com>

The HSW_IN_TX* bits are used in generic code but are not supported on
AMD. Worse, these bits overlap with AMD's EventSelect[11:8], so using
the HSW_IN_TX* bits unconditionally in generic code results in
unintended PMU behavior on AMD. For example, if EventSelect[11:8]
is 0x2, pmc_reprogram_counter() wrongly assumes that
HSW_IN_TX_CHECKPOINTED is set and thus forces the sampling period to 0.

Opportunistically remove the two TSX-specific parameters from
pmc_reprogram_counter().

Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
Note: this patch is based on [1], which is a necessary prerequisite.
[1] https://lore.kernel.org/kvm/20220302111334.12689-1-likexu@tencent.com/

 arch/x86/kvm/pmu.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 17c61c990282..d0f9515c37dd 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -99,8 +99,7 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 
 static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 				  u64 config, bool exclude_user,
-				  bool exclude_kernel, bool intr,
-				  bool in_tx, bool in_tx_cp)
+				  bool exclude_kernel, bool intr)
 {
 	struct perf_event *event;
 	struct perf_event_attr attr = {
@@ -116,16 +115,18 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 
 	attr.sample_period = get_sample_period(pmc, pmc->counter);
 
-	if (in_tx)
-		attr.config |= HSW_IN_TX;
-	if (in_tx_cp) {
-		/*
-		 * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
-		 * period. Just clear the sample period so at least
-		 * allocating the counter doesn't fail.
-		 */
-		attr.sample_period = 0;
-		attr.config |= HSW_IN_TX_CHECKPOINTED;
+	if (guest_cpuid_is_intel(pmc->vcpu)) {
+		if (pmc->eventsel & HSW_IN_TX)
+			attr.config |= HSW_IN_TX;
+		if (pmc->eventsel & HSW_IN_TX_CHECKPOINTED) {
+			/*
+			 * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
+			 * period. Just clear the sample period so at least
+			 * allocating the counter doesn't fail.
+			 */
+			attr.sample_period = 0;
+			attr.config |= HSW_IN_TX_CHECKPOINTED;
+		}
 	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
@@ -268,9 +269,7 @@ void reprogram_counter(struct kvm_pmc *pmc)
 			(eventsel & AMD64_RAW_EVENT_MASK),
 			!(eventsel & ARCH_PERFMON_EVENTSEL_USR),
 			!(eventsel & ARCH_PERFMON_EVENTSEL_OS),
-			eventsel & ARCH_PERFMON_EVENTSEL_INT,
-			(eventsel & HSW_IN_TX),
-			(eventsel & HSW_IN_TX_CHECKPOINTED));
+			eventsel & ARCH_PERFMON_EVENTSEL_INT);
 }
 EXPORT_SYMBOL_GPL(reprogram_counter);
 
-- 
2.35.1



* Re: [PATCH] KVM: x86/pmu: Isolate TSX specific perf_event_attr.attr logic for AMD
  2022-03-07  6:38 [PATCH] KVM: x86/pmu: Isolate TSX specific perf_event_attr.attr logic for AMD Like Xu
@ 2022-03-07 21:39 ` Jim Mattson
  2022-03-08 11:59   ` Like Xu
  0 siblings, 1 reply; 4+ messages in thread
From: Jim Mattson @ 2022-03-07 21:39 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Ravi Bangoria, Andi Kleen, kvm,
	Sean Christopherson, Wanpeng Li, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel

On Sun, Mar 6, 2022 at 10:38 PM Like Xu <like.xu.linux@gmail.com> wrote:
>
> From: Like Xu <likexu@tencent.com>
>
> The HSW_IN_TX* bits are used in generic code but are not supported on
> AMD. Worse, these bits overlap with AMD's EventSelect[11:8], so using
> the HSW_IN_TX* bits unconditionally in generic code results in
> unintended PMU behavior on AMD. For example, if EventSelect[11:8]
> is 0x2, pmc_reprogram_counter() wrongly assumes that
> HSW_IN_TX_CHECKPOINTED is set and thus forces the sampling period to 0.
>
> Opportunistically remove the two TSX-specific parameters from
> pmc_reprogram_counter().
>
> Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
> Note: this patch is based on [1], which is a necessary prerequisite.
> [1] https://lore.kernel.org/kvm/20220302111334.12689-1-likexu@tencent.com/
>
>  arch/x86/kvm/pmu.c | 29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 17c61c990282..d0f9515c37dd 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -99,8 +99,7 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
>
>  static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
>                                   u64 config, bool exclude_user,
> -                                 bool exclude_kernel, bool intr,
> -                                 bool in_tx, bool in_tx_cp)
> +                                 bool exclude_kernel, bool intr)
>  {
>         struct perf_event *event;
>         struct perf_event_attr attr = {
> @@ -116,16 +115,18 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
>
>         attr.sample_period = get_sample_period(pmc, pmc->counter);
>
> -       if (in_tx)
> -               attr.config |= HSW_IN_TX;
> -       if (in_tx_cp) {
> -               /*
> -                * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
> -                * period. Just clear the sample period so at least
> -                * allocating the counter doesn't fail.
> -                */
> -               attr.sample_period = 0;
> -               attr.config |= HSW_IN_TX_CHECKPOINTED;
> +       if (guest_cpuid_is_intel(pmc->vcpu)) {

This is not the right condition to check. Per the SDM, both bits 32
and 33 "may only be set if the processor supports HLE or RTM." On
other Intel processors, these bits are reserved, and any attempt to
set them results in a #GP.

> +               if (pmc->eventsel & HSW_IN_TX)
> +                       attr.config |= HSW_IN_TX;

This statement does nothing. If HSW_IN_TX is set in pmc->eventsel, it
is set in attr.config already.

> +               if (pmc->eventsel & HSW_IN_TX_CHECKPOINTED) {
> +                       /*
> +                        * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
> +                        * period. Just clear the sample period so at least
> +                        * allocating the counter doesn't fail.
> +                        */
> +                       attr.sample_period = 0;
> +                       attr.config |= HSW_IN_TX_CHECKPOINTED;

As above, this statement does nothing. We should just set
attr.sample_period to 0. Note, however, that the SDM documents an
additional constraint which is ignored here: "This bit may only be set
for IA32_PERFEVTSEL2." I have confirmed that a #GP is raised for an
attempt to set bit 33 in any PerfEvtSeln other than PerfEvtSel2 on a
Broadwell Xeon E5.

> +               }
>         }
>
>         event = perf_event_create_kernel_counter(&attr, -1, current,
> @@ -268,9 +269,7 @@ void reprogram_counter(struct kvm_pmc *pmc)
>                         (eventsel & AMD64_RAW_EVENT_MASK),
>                         !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
>                         !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
> -                       eventsel & ARCH_PERFMON_EVENTSEL_INT,
> -                       (eventsel & HSW_IN_TX),
> -                       (eventsel & HSW_IN_TX_CHECKPOINTED));
> +                       eventsel & ARCH_PERFMON_EVENTSEL_INT);
>  }
>  EXPORT_SYMBOL_GPL(reprogram_counter);
>
> --
> 2.35.1
>


* Re: [PATCH] KVM: x86/pmu: Isolate TSX specific perf_event_attr.attr logic for AMD
  2022-03-07 21:39 ` Jim Mattson
@ 2022-03-08 11:59   ` Like Xu
  2022-03-08 16:09     ` Jim Mattson
  0 siblings, 1 reply; 4+ messages in thread
From: Like Xu @ 2022-03-08 11:59 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Ravi Bangoria, Andi Kleen, kvm,
	Sean Christopherson, Wanpeng Li, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel, Kan Liang

On 8/3/2022 5:39 am, Jim Mattson wrote:
> On Sun, Mar 6, 2022 at 10:38 PM Like Xu <like.xu.linux@gmail.com> wrote:
>>
>> From: Like Xu <likexu@tencent.com>
>>
>> The HSW_IN_TX* bits are used in generic code but are not supported on
>> AMD. Worse, these bits overlap with AMD's EventSelect[11:8], so using
>> the HSW_IN_TX* bits unconditionally in generic code results in
>> unintended PMU behavior on AMD. For example, if EventSelect[11:8]
>> is 0x2, pmc_reprogram_counter() wrongly assumes that
>> HSW_IN_TX_CHECKPOINTED is set and thus forces the sampling period to 0.
>>
>> Opportunistically remove the two TSX-specific parameters from
>> pmc_reprogram_counter().
>>
>> Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
>> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>> Note: this patch is based on [1], which is a necessary prerequisite.
>> [1] https://lore.kernel.org/kvm/20220302111334.12689-1-likexu@tencent.com/
>>
>>   arch/x86/kvm/pmu.c | 29 ++++++++++++++---------------
>>   1 file changed, 14 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 17c61c990282..d0f9515c37dd 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -99,8 +99,7 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
>>
>>   static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
>>                                    u64 config, bool exclude_user,
>> -                                 bool exclude_kernel, bool intr,
>> -                                 bool in_tx, bool in_tx_cp)
>> +                                 bool exclude_kernel, bool intr)
>>   {
>>          struct perf_event *event;
>>          struct perf_event_attr attr = {
>> @@ -116,16 +115,18 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
>>
>>          attr.sample_period = get_sample_period(pmc, pmc->counter);
>>
>> -       if (in_tx)
>> -               attr.config |= HSW_IN_TX;
>> -       if (in_tx_cp) {
>> -               /*
>> -                * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
>> -                * period. Just clear the sample period so at least
>> -                * allocating the counter doesn't fail.
>> -                */
>> -               attr.sample_period = 0;
>> -               attr.config |= HSW_IN_TX_CHECKPOINTED;
>> +       if (guest_cpuid_is_intel(pmc->vcpu)) {
> 
> This is not the right condition to check. Per the SDM, both bits 32
> and 33 "may only be set if the processor supports HLE or RTM." On
> other Intel processors, these bits are reserved, and any attempt to
> set them results in a #GP.

We already have this part of the code:

	entry = kvm_find_cpuid_entry(vcpu, 7, 0);
	if (entry &&
	    (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
	    (entry->ebx & (X86_FEATURE_HLE|X86_FEATURE_RTM)))
		pmu->reserved_bits ^= HSW_IN_TX|HSW_IN_TX_CHECKPOINTED;

> 
>> +               if (pmc->eventsel & HSW_IN_TX)
>> +                       attr.config |= HSW_IN_TX;
> 
> This statement does nothing. If HSW_IN_TX is set in pmc->eventsel, it
> is set in attr.config already.

Agreed on the redundancy, since attr.config is already "(eventsel & AMD64_RAW_EVENT_MASK)".

> 
>> +               if (pmc->eventsel & HSW_IN_TX_CHECKPOINTED) {
>> +                       /*
>> +                        * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
>> +                        * period. Just clear the sample period so at least
>> +                        * allocating the counter doesn't fail.
>> +                        */
>> +                       attr.sample_period = 0;
>> +                       attr.config |= HSW_IN_TX_CHECKPOINTED;
> 
> As above, this statement does nothing. We should just set
> attr.sample_period to 0. Note, however, that the SDM documents an

Thanks and applied.

> additional constraint which is ignored here: "This bit may only be set
> for IA32_PERFEVTSEL2." I have confirmed that a #GP is raised for an
> attempt to set bit 33 in any PerfEvtSeln other than PerfEvtSel2 on a
> Broadwell Xeon E5.

Yes, "19.3.6.5 Performance Monitoring and Intel® TSX".

I'm not sure whether the host perf scheduler enforces this restriction.

cc Kan.

> 
>> +               }
>>          }
>>
>>          event = perf_event_create_kernel_counter(&attr, -1, current,
>> @@ -268,9 +269,7 @@ void reprogram_counter(struct kvm_pmc *pmc)
>>                          (eventsel & AMD64_RAW_EVENT_MASK),
>>                          !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
>>                          !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
>> -                       eventsel & ARCH_PERFMON_EVENTSEL_INT,
>> -                       (eventsel & HSW_IN_TX),
>> -                       (eventsel & HSW_IN_TX_CHECKPOINTED));
>> +                       eventsel & ARCH_PERFMON_EVENTSEL_INT);
>>   }
>>   EXPORT_SYMBOL_GPL(reprogram_counter);
>>
>> --
>> 2.35.1
>>


* Re: [PATCH] KVM: x86/pmu: Isolate TSX specific perf_event_attr.attr logic for AMD
  2022-03-08 11:59   ` Like Xu
@ 2022-03-08 16:09     ` Jim Mattson
  0 siblings, 0 replies; 4+ messages in thread
From: Jim Mattson @ 2022-03-08 16:09 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Ravi Bangoria, Andi Kleen, kvm,
	Sean Christopherson, Wanpeng Li, Vitaly Kuznetsov, Joerg Roedel,
	linux-kernel, Kan Liang

On Tue, Mar 8, 2022 at 3:59 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 8/3/2022 5:39 am, Jim Mattson wrote:
> > On Sun, Mar 6, 2022 at 10:38 PM Like Xu <like.xu.linux@gmail.com> wrote:
> >>
> >> From: Like Xu <likexu@tencent.com>
> >>
> >> The HSW_IN_TX* bits are used in generic code but are not supported on
> >> AMD. Worse, these bits overlap with AMD's EventSelect[11:8], so using
> >> the HSW_IN_TX* bits unconditionally in generic code results in
> >> unintended PMU behavior on AMD. For example, if EventSelect[11:8]
> >> is 0x2, pmc_reprogram_counter() wrongly assumes that
> >> HSW_IN_TX_CHECKPOINTED is set and thus forces the sampling period to 0.
> >>
> >> Opportunistically remove the two TSX-specific parameters from
> >> pmc_reprogram_counter().
> >>
> >> Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
> >> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
> >> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> >> Signed-off-by: Like Xu <likexu@tencent.com>
> >> ---
> >> Note: this patch is based on [1], which is a necessary prerequisite.
> >> [1] https://lore.kernel.org/kvm/20220302111334.12689-1-likexu@tencent.com/
> >>
> >>   arch/x86/kvm/pmu.c | 29 ++++++++++++++---------------
> >>   1 file changed, 14 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> >> index 17c61c990282..d0f9515c37dd 100644
> >> --- a/arch/x86/kvm/pmu.c
> >> +++ b/arch/x86/kvm/pmu.c
> >> @@ -99,8 +99,7 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
> >>
> >>   static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
> >>                                    u64 config, bool exclude_user,
> >> -                                 bool exclude_kernel, bool intr,
> >> -                                 bool in_tx, bool in_tx_cp)
> >> +                                 bool exclude_kernel, bool intr)
> >>   {
> >>          struct perf_event *event;
> >>          struct perf_event_attr attr = {
> >> @@ -116,16 +115,18 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
> >>
> >>          attr.sample_period = get_sample_period(pmc, pmc->counter);
> >>
> >> -       if (in_tx)
> >> -               attr.config |= HSW_IN_TX;
> >> -       if (in_tx_cp) {
> >> -               /*
> >> -                * HSW_IN_TX_CHECKPOINTED is not supported with nonzero
> >> -                * period. Just clear the sample period so at least
> >> -                * allocating the counter doesn't fail.
> >> -                */
> >> -               attr.sample_period = 0;
> >> -               attr.config |= HSW_IN_TX_CHECKPOINTED;
> >> +       if (guest_cpuid_is_intel(pmc->vcpu)) {
> >
> > This is not the right condition to check. Per the SDM, both bits 32
> > and 33 "may only be set if the processor supports HLE or RTM." On
> > other Intel processors, these bits are reserved, and any attempt to
> > set them results in a #GP.
>
> We already have this part of the code:
>
>         entry = kvm_find_cpuid_entry(vcpu, 7, 0);
>         if (entry &&
>             (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
>             (entry->ebx & (X86_FEATURE_HLE|X86_FEATURE_RTM)))
>                 pmu->reserved_bits ^= HSW_IN_TX|HSW_IN_TX_CHECKPOINTED;

I stand corrected.

> > additional constraint which is ignored here: "This bit may only be set
> > for IA32_PERFEVTSEL2." I have confirmed that a #GP is raised for an
> > attempt to set bit 33 in any PerfEvtSeln other than PerfEvtSel2 on a
> > Broadwell Xeon E5.
>
> Yes, "19.3.6.5 Performance Monitoring and Intel® TSX".
>
> I'm not sure whether the host perf scheduler enforces this restriction.

Whether it does or it doesn't, it's KVM's responsibility to synthesize
the #GP if IN_TXCP is set for any PerfEvtSeln other than PerfEvtSel2.

