kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
@ 2021-11-17  8:03 Like Xu
  2021-11-18 13:25 ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Like Xu @ 2021-11-17  8:03 UTC (permalink / raw)
  To: Paolo Bonzini, Maxim Levitsky
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
For AMD, all hw implementations support the base set of four
performance counters, with current mainstream hardware indicating
the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.

In the virtualized world, the AMD guest driver may detect
the presence of at least one counter MSR. Most hypervisor
vendors would introduce a module param (like lbrv for svm)
to disable PMU for all guests.

Another control proposal per-VM is to pass PMU disable information
via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
Both of methods require some guest-side changes, so a module
parameter may not be sufficiently granular, but practical enough.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/cpuid.c   |  2 +-
 arch/x86/kvm/svm/pmu.c |  4 ++++
 arch/x86/kvm/svm/svm.c | 11 +++++++++++
 arch/x86/kvm/svm/svm.h |  1 +
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2d70edb0f323..647af2a184ad 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -487,7 +487,7 @@ void kvm_set_cpu_caps(void)
 		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
 		F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
 		0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM) |
-		F(TOPOEXT) | F(PERFCTR_CORE)
+		F(TOPOEXT) | 0 /* PERFCTR_CORE */
 	);
 
 	kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..a0bcf0144664 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -16,6 +16,7 @@
 #include "cpuid.h"
 #include "lapic.h"
 #include "pmu.h"
+#include "svm.h"
 
 enum pmu_type {
 	PMU_TYPE_COUNTER = 0,
@@ -100,6 +101,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 {
 	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
 
+	if (!pmuv)
+		return NULL;
+
 	switch (msr) {
 	case MSR_F15H_PERF_CTL0:
 	case MSR_F15H_PERF_CTL1:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21bb81710e0f..062e48c191ee 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -190,6 +190,10 @@ module_param(vgif, int, 0444);
 static int lbrv = true;
 module_param(lbrv, int, 0444);
 
+/* enable/disable PMU virtualization */
+bool pmuv = true;
+module_param(pmuv, bool, 0444);
+
 static int tsc_scaling = true;
 module_param(tsc_scaling, int, 0444);
 
@@ -952,6 +956,10 @@ static __init void svm_set_cpu_caps(void)
 	    boot_cpu_has(X86_FEATURE_AMD_SSBD))
 		kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
 
+	/* AMD PMU PERFCTR_CORE CPUID */
+	if (pmuv && boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
+		kvm_cpu_cap_set(X86_FEATURE_PERFCTR_CORE);
+
 	/* CPUID 0x8000001F (SME/SEV features) */
 	sev_set_cpu_caps();
 }
@@ -1085,6 +1093,9 @@ static __init int svm_hardware_setup(void)
 			pr_info("LBR virtualization supported\n");
 	}
 
+	if (!pmuv)
+		pr_info("PMU virtualization is disabled\n");
+
 	svm_set_cpu_caps();
 
 	/*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 0d7bbe548ac3..08e1c19ffbdf 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -32,6 +32,7 @@
 extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
 extern bool npt_enabled;
 extern bool intercept_smi;
+extern bool pmuv;
 
 /*
  * Clean bits in VMCB.
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-11-17  8:03 [PATCH] KVM: x86/svm: Add module param to control PMU virtualization Like Xu
@ 2021-11-18 13:25 ` Paolo Bonzini
  2021-12-10 19:25   ` Jim Mattson
  2022-01-15  1:26   ` Jim Mattson
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2021-11-18 13:25 UTC (permalink / raw)
  To: Like Xu, Maxim Levitsky
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On 11/17/21 09:03, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
> For AMD, all hw implementations support the base set of four
> performance counters, with current mainstream hardware indicating
> the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.
> 
> In the virtualized world, the AMD guest driver may detect
> the presence of at least one counter MSR. Most hypervisor
> vendors would introduce a module param (like lbrv for svm)
> to disable PMU for all guests.
> 
> Another control proposal per-VM is to pass PMU disable information
> via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
> Both of methods require some guest-side changes, so a module
> parameter may not be sufficiently granular, but practical enough.
> 
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>   arch/x86/kvm/cpuid.c   |  2 +-
>   arch/x86/kvm/svm/pmu.c |  4 ++++
>   arch/x86/kvm/svm/svm.c | 11 +++++++++++
>   arch/x86/kvm/svm/svm.h |  1 +
>   4 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 2d70edb0f323..647af2a184ad 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -487,7 +487,7 @@ void kvm_set_cpu_caps(void)
>   		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
>   		F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
>   		0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM) |
> -		F(TOPOEXT) | F(PERFCTR_CORE)
> +		F(TOPOEXT) | 0 /* PERFCTR_CORE */
>   	);
>   
>   	kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index fdf587f19c5f..a0bcf0144664 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -16,6 +16,7 @@
>   #include "cpuid.h"
>   #include "lapic.h"
>   #include "pmu.h"
> +#include "svm.h"
>   
>   enum pmu_type {
>   	PMU_TYPE_COUNTER = 0,
> @@ -100,6 +101,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>   {
>   	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
>   
> +	if (!pmuv)
> +		return NULL;
> +
>   	switch (msr) {
>   	case MSR_F15H_PERF_CTL0:
>   	case MSR_F15H_PERF_CTL1:
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 21bb81710e0f..062e48c191ee 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -190,6 +190,10 @@ module_param(vgif, int, 0444);
>   static int lbrv = true;
>   module_param(lbrv, int, 0444);
>   
> +/* enable/disable PMU virtualization */
> +bool pmuv = true;
> +module_param(pmuv, bool, 0444);
> +
>   static int tsc_scaling = true;
>   module_param(tsc_scaling, int, 0444);
>   
> @@ -952,6 +956,10 @@ static __init void svm_set_cpu_caps(void)
>   	    boot_cpu_has(X86_FEATURE_AMD_SSBD))
>   		kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
>   
> +	/* AMD PMU PERFCTR_CORE CPUID */
> +	if (pmuv && boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
> +		kvm_cpu_cap_set(X86_FEATURE_PERFCTR_CORE);
> +
>   	/* CPUID 0x8000001F (SME/SEV features) */
>   	sev_set_cpu_caps();
>   }
> @@ -1085,6 +1093,9 @@ static __init int svm_hardware_setup(void)
>   			pr_info("LBR virtualization supported\n");
>   	}
>   
> +	if (!pmuv)
> +		pr_info("PMU virtualization is disabled\n");
> +
>   	svm_set_cpu_caps();
>   
>   	/*
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 0d7bbe548ac3..08e1c19ffbdf 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -32,6 +32,7 @@
>   extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
>   extern bool npt_enabled;
>   extern bool intercept_smi;
> +extern bool pmuv;
>   
>   /*
>    * Clean bits in VMCB.
> 

Queued, thanks - just changed the parameter name to "pmu".

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-11-18 13:25 ` Paolo Bonzini
@ 2021-12-10 19:25   ` Jim Mattson
  2021-12-11  2:15     ` Paolo Bonzini
  2022-01-15  1:26   ` Jim Mattson
  1 sibling, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2021-12-10 19:25 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Like Xu, Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On Thu, Nov 18, 2021 at 5:25 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 11/17/21 09:03, Like Xu wrote:
> > From: Like Xu <likexu@tencent.com>
> >
> > For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
> > For AMD, all hw implementations support the base set of four
> > performance counters, with current mainstream hardware indicating
> > the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.
> >
> > In the virtualized world, the AMD guest driver may detect
> > the presence of at least one counter MSR. Most hypervisor
> > vendors would introduce a module param (like lbrv for svm)
> > to disable PMU for all guests.
> >
> > Another control proposal per-VM is to pass PMU disable information
> > via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
> > Both of methods require some guest-side changes, so a module
> > parameter may not be sufficiently granular, but practical enough.
> >
> > Signed-off-by: Like Xu <likexu@tencent.com>
> > ---
Thanks for this patch. It saves us from upstreaming our equivalent patch.

In the long run, I'd like to be able to override this system-wide
setting on a per-VM basis, for VMs that I trust. (Of course, this
implies that I trust the userspace process as well.)

How would you feel if we were to add a kvm ioctl to override this
setting, for a particular VM, guarded by an appropriate permissions
check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-12-10 19:25   ` Jim Mattson
@ 2021-12-11  2:15     ` Paolo Bonzini
  2021-12-11  3:48       ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2021-12-11  2:15 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Like Xu, Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On 12/10/21 20:25, Jim Mattson wrote:
> In the long run, I'd like to be able to override this system-wide
> setting on a per-VM basis, for VMs that I trust. (Of course, this
> implies that I trust the userspace process as well.)
> 
> How would you feel if we were to add a kvm ioctl to override this
> setting, for a particular VM, guarded by an appropriate permissions
> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?

What's the rationale for guarding this with a capability check?  IIRC 
you don't have such checks for perf_event_open (apart for getting kernel 
addresses, which is not a problem for virtualization).

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-12-11  2:15     ` Paolo Bonzini
@ 2021-12-11  3:48       ` Jim Mattson
  2022-01-09  1:23         ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2021-12-11  3:48 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Like Xu, Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 12/10/21 20:25, Jim Mattson wrote:
> > In the long run, I'd like to be able to override this system-wide
> > setting on a per-VM basis, for VMs that I trust. (Of course, this
> > implies that I trust the userspace process as well.)
> >
> > How would you feel if we were to add a kvm ioctl to override this
> > setting, for a particular VM, guarded by an appropriate permissions
> > check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
>
> What's the rationale for guarding this with a capability check?  IIRC
> you don't have such checks for perf_event_open (apart for getting kernel
> addresses, which is not a problem for virtualization).

My reasoning was simply that for userspace to override a mode 0444
kernel module parameter, it should have the rights to reload the
module with the parameter override. I wasn't thinking specifically
about PMU capabilities.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-12-11  3:48       ` Jim Mattson
@ 2022-01-09  1:23         ` Jim Mattson
  2022-01-10  6:23           ` Like Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2022-01-09  1:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Like Xu, Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel, David Dunn

On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 12/10/21 20:25, Jim Mattson wrote:
> > > In the long run, I'd like to be able to override this system-wide
> > > setting on a per-VM basis, for VMs that I trust. (Of course, this
> > > implies that I trust the userspace process as well.)
> > >
> > > How would you feel if we were to add a kvm ioctl to override this
> > > setting, for a particular VM, guarded by an appropriate permissions
> > > check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
> >
> > What's the rationale for guarding this with a capability check?  IIRC
> > you don't have such checks for perf_event_open (apart for getting kernel
> > addresses, which is not a problem for virtualization).
>
> My reasoning was simply that for userspace to override a mode 0444
> kernel module parameter, it should have the rights to reload the
> module with the parameter override. I wasn't thinking specifically
> about PMU capabilities.

Assuming that we trust userspace to decide whether or not to expose a
virtual PMU to a guest (as we do on the Intel side), perhaps we could
make use of the existing PMU_EVENT_FILTER to give us per-VM control,
rather than adding a new module parameter for per-host control. If
userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
just disable the virtual PMU for that VM.

Today, the semantics of an empty allow list are quite different from
the proposed pmuv module parameter being false. However, it should be
an easy conversion. Would anyone be concerned about changing the
current semantics of an empty allow list? Is there a need for
disabling PMU virtualization for legacy userspace implementations that
can't be modified to ask for an empty allow list?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-09  1:23         ` Jim Mattson
@ 2022-01-10  6:23           ` Like Xu
  2022-01-10 18:13             ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Like Xu @ 2022-01-10  6:23 UTC (permalink / raw)
  To: Jim Mattson, Paolo Bonzini
  Cc: Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel, David Dunn

On 9/1/2022 9:23 am, Jim Mattson wrote:
> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
>>
>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>> On 12/10/21 20:25, Jim Mattson wrote:
>>>> In the long run, I'd like to be able to override this system-wide
>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
>>>> implies that I trust the userspace process as well.)
>>>>
>>>> How would you feel if we were to add a kvm ioctl to override this
>>>> setting, for a particular VM, guarded by an appropriate permissions
>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
>>>
>>> What's the rationale for guarding this with a capability check?  IIRC
>>> you don't have such checks for perf_event_open (apart for getting kernel
>>> addresses, which is not a problem for virtualization).
>>
>> My reasoning was simply that for userspace to override a mode 0444
>> kernel module parameter, it should have the rights to reload the
>> module with the parameter override. I wasn't thinking specifically
>> about PMU capabilities.

Do we have a precedent on any module parameter rewriting for privileger ?

A further requirement is whether we can dynamically change this part of
the behaviour when the guest is already booted up.

> 
> Assuming that we trust userspace to decide whether or not to expose a
> virtual PMU to a guest (as we do on the Intel side), perhaps we could
> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
> rather than adding a new module parameter for per-host control. If

Various granularities of control are required to support vPMU production
scenarios, including per-host, per-VM, and dynamic-guest-alive control.

> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
> just disable the virtual PMU for that VM.

AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".

> 
> Today, the semantics of an empty allow list are quite different from
> the proposed pmuv module parameter being false. However, it should be
> an easy conversion. Would anyone be concerned about changing the
> current semantics of an empty allow list? Is there a need for
> disabling PMU virtualization for legacy userspace implementations that
> can't be modified to ask for an empty allow list?
> 

AFAI, at least one user-space agent has integrated with it plus additional 
"action"s.

Once the API that the kernel presents to user space has been defined,
it's best not to change it and instead fall into remorse.

"But I am not a decision maker. " :D

Thanks,
Like Xu


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-10  6:23           ` Like Xu
@ 2022-01-10 18:13             ` Jim Mattson
  2022-01-11  2:11               ` Like Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2022-01-10 18:13 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Maxim Levitsky, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm, linux-kernel,
	David Dunn

On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 9/1/2022 9:23 am, Jim Mattson wrote:
> > On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
> >>
> >> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>
> >>> On 12/10/21 20:25, Jim Mattson wrote:
> >>>> In the long run, I'd like to be able to override this system-wide
> >>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
> >>>> implies that I trust the userspace process as well.)
> >>>>
> >>>> How would you feel if we were to add a kvm ioctl to override this
> >>>> setting, for a particular VM, guarded by an appropriate permissions
> >>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
> >>>
> >>> What's the rationale for guarding this with a capability check?  IIRC
> >>> you don't have such checks for perf_event_open (apart for getting kernel
> >>> addresses, which is not a problem for virtualization).
> >>
> >> My reasoning was simply that for userspace to override a mode 0444
> >> kernel module parameter, it should have the rights to reload the
> >> module with the parameter override. I wasn't thinking specifically
> >> about PMU capabilities.
>
> Do we have a precedent on any module parameter rewriting for privileger ?
>
> A further requirement is whether we can dynamically change this part of
> the behaviour when the guest is already booted up.
>
> >
> > Assuming that we trust userspace to decide whether or not to expose a
> > virtual PMU to a guest (as we do on the Intel side), perhaps we could
> > make use of the existing PMU_EVENT_FILTER to give us per-VM control,
> > rather than adding a new module parameter for per-host control. If
>
> Various granularities of control are required to support vPMU production
> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
>
> > userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
> > KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
> > just disable the virtual PMU for that VM.
>
> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".

Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
volume 3, only goes as high as CPUID Fn8000_0021.

> >
> > Today, the semantics of an empty allow list are quite different from
> > the proposed pmuv module parameter being false. However, it should be
> > an easy conversion. Would anyone be concerned about changing the
> > current semantics of an empty allow list? Is there a need for
> > disabling PMU virtualization for legacy userspace implementations that
> > can't be modified to ask for an empty allow list?
> >
>
> AFAI, at least one user-space agent has integrated with it plus additional
> "action"s.
>
> Once the API that the kernel presents to user space has been defined,
> it's best not to change it and instead fall into remorse.

Okay.

I propose the following:
1) The new module parameter should apply to Intel as well as AMD, for
situations where userspace is not trusted.
2) If the module parameter allows PMU virtualization, there should be
a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
(Since you require a dynamic toggle, and there is a move afoot to
refuse guest CPUID changes once a guest is running, this new KVM_CAP
is needed on Intel as well as AMD).
3) If the module parameter does not allow PMU virtualization, there
should be no userspace override, since we have no precedent for
authorizing that kind of override.

> "But I am not a decision maker. " :D
>
> Thanks,
> Like Xu
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-10 18:13             ` Jim Mattson
@ 2022-01-11  2:11               ` Like Xu
  2022-01-11  3:24                 ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Like Xu @ 2022-01-11  2:11 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Maxim Levitsky, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm, linux-kernel,
	David Dunn

On 11/1/2022 2:13 am, Jim Mattson wrote:
> On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@gmail.com> wrote:
>>
>> On 9/1/2022 9:23 am, Jim Mattson wrote:
>>> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
>>>>
>>>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>>
>>>>> On 12/10/21 20:25, Jim Mattson wrote:
>>>>>> In the long run, I'd like to be able to override this system-wide
>>>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
>>>>>> implies that I trust the userspace process as well.)
>>>>>>
>>>>>> How would you feel if we were to add a kvm ioctl to override this
>>>>>> setting, for a particular VM, guarded by an appropriate permissions
>>>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
>>>>>
>>>>> What's the rationale for guarding this with a capability check?  IIRC
>>>>> you don't have such checks for perf_event_open (apart for getting kernel
>>>>> addresses, which is not a problem for virtualization).
>>>>
>>>> My reasoning was simply that for userspace to override a mode 0444
>>>> kernel module parameter, it should have the rights to reload the
>>>> module with the parameter override. I wasn't thinking specifically
>>>> about PMU capabilities.
>>
>> Do we have a precedent on any module parameter rewriting for privileger ?
>>
>> A further requirement is whether we can dynamically change this part of
>> the behaviour when the guest is already booted up.
>>
>>>
>>> Assuming that we trust userspace to decide whether or not to expose a
>>> virtual PMU to a guest (as we do on the Intel side), perhaps we could
>>> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
>>> rather than adding a new module parameter for per-host control. If
>>
>> Various granularities of control are required to support vPMU production
>> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
>>
>>> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
>>> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
>>> just disable the virtual PMU for that VM.
>>
>> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
> 
> Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
> volume 3, only goes as high as CPUID Fn8000_0021.

Try APM Revision: 4.04 (November 2021),  page 1849/3273,
"CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".

Given the current ambiguity in this revision, the AMD folks will reveal more
details bout this field in the next revision.

> 
>>>
>>> Today, the semantics of an empty allow list are quite different from
>>> the proposed pmuv module parameter being false. However, it should be
>>> an easy conversion. Would anyone be concerned about changing the
>>> current semantics of an empty allow list? Is there a need for
>>> disabling PMU virtualization for legacy userspace implementations that
>>> can't be modified to ask for an empty allow list?
>>>
>>
>> AFAI, at least one user-space agent has integrated with it plus additional
>> "action"s.
>>
>> Once the API that the kernel presents to user space has been defined,
>> it's best not to change it and instead fall into remorse.
> 
> Okay.
> 
> I propose the following:
> 1) The new module parameter should apply to Intel as well as AMD, for
> situations where userspace is not trusted.
> 2) If the module parameter allows PMU virtualization, there should be
> a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
> (Since you require a dynamic toggle, and there is a move afoot to
> refuse guest CPUID changes once a guest is running, this new KVM_CAP
> is needed on Intel as well as AMD).

Both hands in favour. Do you need me as a labourer, or you have a ready-made one ?

> 3) If the module parameter does not allow PMU virtualization, there
> should be no userspace override, since we have no precedent for
> authorizing that kind of override.

Uh, I thought you (Google) had a lot of these (interesting) use cases internally.

> 
>> "But I am not a decision maker. " :D
>>
>> Thanks,
>> Like Xu
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-11  2:11               ` Like Xu
@ 2022-01-11  3:24                 ` Jim Mattson
  2022-01-11  6:18                   ` Like Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2022-01-11  3:24 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Maxim Levitsky, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm, linux-kernel,
	David Dunn

On Mon, Jan 10, 2022 at 6:11 PM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 11/1/2022 2:13 am, Jim Mattson wrote:
> > On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@gmail.com> wrote:
> >>
> >> On 9/1/2022 9:23 am, Jim Mattson wrote:
> >>> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
> >>>>
> >>>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>>>
> >>>>> On 12/10/21 20:25, Jim Mattson wrote:
> >>>>>> In the long run, I'd like to be able to override this system-wide
> >>>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
> >>>>>> implies that I trust the userspace process as well.)
> >>>>>>
> >>>>>> How would you feel if we were to add a kvm ioctl to override this
> >>>>>> setting, for a particular VM, guarded by an appropriate permissions
> >>>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
> >>>>>
> >>>>> What's the rationale for guarding this with a capability check?  IIRC
> >>>>> you don't have such checks for perf_event_open (apart for getting kernel
> >>>>> addresses, which is not a problem for virtualization).
> >>>>
> >>>> My reasoning was simply that for userspace to override a mode 0444
> >>>> kernel module parameter, it should have the rights to reload the
> >>>> module with the parameter override. I wasn't thinking specifically
> >>>> about PMU capabilities.
> >>
> >> Do we have a precedent on any module parameter rewriting for privileger ?
> >>
> >> A further requirement is whether we can dynamically change this part of
> >> the behaviour when the guest is already booted up.
> >>
> >>>
> >>> Assuming that we trust userspace to decide whether or not to expose a
> >>> virtual PMU to a guest (as we do on the Intel side), perhaps we could
> >>> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
> >>> rather than adding a new module parameter for per-host control. If
> >>
> >> Various granularities of control are required to support vPMU production
> >> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
> >>
> >>> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
> >>> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
> >>> just disable the virtual PMU for that VM.
> >>
> >> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
> >
> > Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
> > volume 3, only goes as high as CPUID Fn8000_0021.
>
> Try APM Revision: 4.04 (November 2021),  page 1849/3273,
> "CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".

Is this a public document?

> Given the current ambiguity in this revision, the AMD folks will reveal more
> details bout this field in the next revision.
>
> >
> >>>
> >>> Today, the semantics of an empty allow list are quite different from
> >>> the proposed pmuv module parameter being false. However, it should be
> >>> an easy conversion. Would anyone be concerned about changing the
> >>> current semantics of an empty allow list? Is there a need for
> >>> disabling PMU virtualization for legacy userspace implementations that
> >>> can't be modified to ask for an empty allow list?
> >>>
> >>
> >> AFAI, at least one user-space agent has integrated with it plus additional
> >> "action"s.
> >>
> >> Once the API that the kernel presents to user space has been defined,
> >> it's best not to change it and instead fall into remorse.
> >
> > Okay.
> >
> > I propose the following:
> > 1) The new module parameter should apply to Intel as well as AMD, for
> > situations where userspace is not trusted.
> > 2) If the module parameter allows PMU virtualization, there should be
> > a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
> > (Since you require a dynamic toggle, and there is a move afoot to
> > refuse guest CPUID changes once a guest is running, this new KVM_CAP
> > is needed on Intel as well as AMD).
>
> Both hands in favour. Do you need me as a labourer, or you have a ready-made one ?

We could split the work. Since (1) is a modification of the change you
proposed in this thread, perhaps you could apply it to both AMD and
Intel in v2? We can find someone for (2).

> > 3) If the module parameter does not allow PMU virtualization, there
> > should be no userspace override, since we have no precedent for
> > authorizing that kind of override.
>
> Uh, I thought you (Google) had a lot of these (interesting) use cases internally.

We have modified some module parameters so that they can be changed at
runtime, but we don't have any concept of a privileged userspace
overriding a module parameter restriction.

> >
> >> "But I am not a decision maker. " :D
> >>
> >> Thanks,
> >> Like Xu
> >>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-11  3:24                 ` Jim Mattson
@ 2022-01-11  6:18                   ` Like Xu
  2022-01-11  7:25                     ` Jim Mattson
  0 siblings, 1 reply; 15+ messages in thread
From: Like Xu @ 2022-01-11  6:18 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Maxim Levitsky, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm, linux-kernel,
	David Dunn

On 11/1/2022 11:24 am, Jim Mattson wrote:
> On Mon, Jan 10, 2022 at 6:11 PM Like Xu <like.xu.linux@gmail.com> wrote:
>>
>> On 11/1/2022 2:13 am, Jim Mattson wrote:
>>> On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@gmail.com> wrote:
>>>>
>>>> On 9/1/2022 9:23 am, Jim Mattson wrote:
>>>>> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
>>>>>>
>>>>>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>>>>
>>>>>>> On 12/10/21 20:25, Jim Mattson wrote:
>>>>>>>> In the long run, I'd like to be able to override this system-wide
>>>>>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
>>>>>>>> implies that I trust the userspace process as well.)
>>>>>>>>
>>>>>>>> How would you feel if we were to add a kvm ioctl to override this
>>>>>>>> setting, for a particular VM, guarded by an appropriate permissions
>>>>>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
>>>>>>>
>>>>>>> What's the rationale for guarding this with a capability check?  IIRC
>>>>>>> you don't have such checks for perf_event_open (apart for getting kernel
>>>>>>> addresses, which is not a problem for virtualization).
>>>>>>
>>>>>> My reasoning was simply that for userspace to override a mode 0444
>>>>>> kernel module parameter, it should have the rights to reload the
>>>>>> module with the parameter override. I wasn't thinking specifically
>>>>>> about PMU capabilities.
>>>>
>>>> Do we have a precedent on any module parameter rewriting for privileger ?
>>>>
>>>> A further requirement is whether we can dynamically change this part of
>>>> the behaviour when the guest is already booted up.
>>>>
>>>>>
>>>>> Assuming that we trust userspace to decide whether or not to expose a
>>>>> virtual PMU to a guest (as we do on the Intel side), perhaps we could
>>>>> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
>>>>> rather than adding a new module parameter for per-host control. If
>>>>
>>>> Various granularities of control are required to support vPMU production
>>>> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
>>>>
>>>>> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
>>>>> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
>>>>> just disable the virtual PMU for that VM.
>>>>
>>>> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
>>>
>>> Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
>>> volume 3, only goes as high as CPUID Fn8000_0021.
>>
>> Try APM Revision: 4.04 (November 2021),  page 1849/3273,
>> "CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".
> 
> Is this a public document?

The latest version of APM (40332) is revision v4.04, released on 12/1/2021.

> 
>> Given the current ambiguity in this revision, the AMD folks will reveal more
>> details bout this field in the next revision.
>>
>>>
>>>>>
>>>>> Today, the semantics of an empty allow list are quite different from
>>>>> the proposed pmuv module parameter being false. However, it should be
>>>>> an easy conversion. Would anyone be concerned about changing the
>>>>> current semantics of an empty allow list? Is there a need for
>>>>> disabling PMU virtualization for legacy userspace implementations that
>>>>> can't be modified to ask for an empty allow list?
>>>>>
>>>>
>>>> AFAI, at least one user-space agent has integrated with it plus additional
>>>> "action"s.
>>>>
>>>> Once the API that the kernel presents to user space has been defined,
>>>> it's best not to change it and instead fall into remorse.
>>>
>>> Okay.
>>>
>>> I propose the following:
>>> 1) The new module parameter should apply to Intel as well as AMD, for
>>> situations where userspace is not trusted.
>>> 2) If the module parameter allows PMU virtualization, there should be
>>> a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
>>> (Since you require a dynamic toggle, and there is a move afoot to
>>> refuse guest CPUID changes once a guest is running, this new KVM_CAP
>>> is needed on Intel as well as AMD).
>>
>> Both hands in favour. Do you need me as a labourer, or you have a ready-made one ?
> 
> We could split the work. Since (1) is a modification of the change you
> proposed in this thread, perhaps you could apply it to both AMD and

We obviously need extra code to make the module parameters suitable for Intel 
since it
affects other features (such as LBR and PEBS), we may not rush to draw this line 
clearly.

> Intel in v2? We can find someone for (2).

The ioctl_set_pmu_event_filter() interface is already practical for dynamic toggle,
as not being able to program any events is the same as having none vPMU,
w/o considering performance impact of traversing the list.

I am not sure if the maintainer will buy in another KVM_CAP for only per-VM, 
considering
"CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]" is a feature that will be available soon.

> 
>>> 3) If the module parameter does not allow PMU virtualization, there
>>> should be no userspace override, since we have no precedent for
>>> authorizing that kind of override.
>>
>> Uh, I thought you (Google) had a lot of these (interesting) use cases internally.
> 
> We have modified some module parameters so that they can be changed at
> runtime, but we don't have any concept of a privileged userspace
> overriding a module parameter restriction.

Considering that the semantics of the different module parameters are different,
allowing one of them to be overridden does not mean that such a generic framework
is promising, but it makes sense for the community to see another case for it.

> 
>>>
>>>> "But I am not a decision maker. " :D
>>>>
>>>> Thanks,
>>>> Like Xu
>>>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-11  6:18                   ` Like Xu
@ 2022-01-11  7:25                     ` Jim Mattson
  0 siblings, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2022-01-11  7:25 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Maxim Levitsky, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm, linux-kernel,
	David Dunn

On Mon, Jan 10, 2022 at 10:18 PM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 11/1/2022 11:24 am, Jim Mattson wrote:
> > On Mon, Jan 10, 2022 at 6:11 PM Like Xu <like.xu.linux@gmail.com> wrote:
> >>
> >> On 11/1/2022 2:13 am, Jim Mattson wrote:
> >>> On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@gmail.com> wrote:
> >>>>
> >>>> On 9/1/2022 9:23 am, Jim Mattson wrote:
> >>>>> On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@google.com> wrote:
> >>>>>>
> >>>>>> On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>>>>>
> >>>>>>> On 12/10/21 20:25, Jim Mattson wrote:
> >>>>>>>> In the long run, I'd like to be able to override this system-wide
> >>>>>>>> setting on a per-VM basis, for VMs that I trust. (Of course, this
> >>>>>>>> implies that I trust the userspace process as well.)
> >>>>>>>>
> >>>>>>>> How would you feel if we were to add a kvm ioctl to override this
> >>>>>>>> setting, for a particular VM, guarded by an appropriate permissions
> >>>>>>>> check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
> >>>>>>>
> >>>>>>> What's the rationale for guarding this with a capability check?  IIRC
> >>>>>>> you don't have such checks for perf_event_open (apart for getting kernel
> >>>>>>> addresses, which is not a problem for virtualization).
> >>>>>>
> >>>>>> My reasoning was simply that for userspace to override a mode 0444
> >>>>>> kernel module parameter, it should have the rights to reload the
> >>>>>> module with the parameter override. I wasn't thinking specifically
> >>>>>> about PMU capabilities.
> >>>>
> >>>> Do we have a precedent on any module parameter rewriting for privileger ?
> >>>>
> >>>> A further requirement is whether we can dynamically change this part of
> >>>> the behaviour when the guest is already booted up.
> >>>>
> >>>>>
> >>>>> Assuming that we trust userspace to decide whether or not to expose a
> >>>>> virtual PMU to a guest (as we do on the Intel side), perhaps we could
> >>>>> make use of the existing PMU_EVENT_FILTER to give us per-VM control,
> >>>>> rather than adding a new module parameter for per-host control. If
> >>>>
> >>>> Various granularities of control are required to support vPMU production
> >>>> scenarios, including per-host, per-VM, and dynamic-guest-alive control.
> >>>>
> >>>>> userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
> >>>>> KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
> >>>>> just disable the virtual PMU for that VM.
> >>>>
> >>>> AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
> >>>
> >>> Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
> >>> volume 3, only goes as high as CPUID Fn8000_0021.
> >>
> >> Try APM Revision: 4.04 (November 2021),  page 1849/3273,
> >> "CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".
> >
> > Is this a public document?
>
> The latest version of APM (40332) is revision v4.04, released on 12/1/2021.

LOL. I was misled by the table of contents for Appendix E.4, which
stops at E.4.19 Function 8000_0021—Extended Feature Identification 2.

> >
> >> Given the current ambiguity in this revision, the AMD folks will reveal more
> >> details bout this field in the next revision.
> >>
> >>>
> >>>>>
> >>>>> Today, the semantics of an empty allow list are quite different from
> >>>>> the proposed pmuv module parameter being false. However, it should be
> >>>>> an easy conversion. Would anyone be concerned about changing the
> >>>>> current semantics of an empty allow list? Is there a need for
> >>>>> disabling PMU virtualization for legacy userspace implementations that
> >>>>> can't be modified to ask for an empty allow list?
> >>>>>
> >>>>
> >>>> AFAI, at least one user-space agent has integrated with it plus additional
> >>>> "action"s.
> >>>>
> >>>> Once the API that the kernel presents to user space has been defined,
> >>>> it's best not to change it and instead fall into remorse.
> >>>
> >>> Okay.
> >>>
> >>> I propose the following:
> >>> 1) The new module parameter should apply to Intel as well as AMD, for
> >>> situations where userspace is not trusted.
> >>> 2) If the module parameter allows PMU virtualization, there should be
> >>> a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
> >>> (Since you require a dynamic toggle, and there is a move afoot to
> >>> refuse guest CPUID changes once a guest is running, this new KVM_CAP
> >>> is needed on Intel as well as AMD).
> >>
> >> Both hands in favour. Do you need me as a labourer, or you have a ready-made one ?
> >
> > We could split the work. Since (1) is a modification of the change you
> > proposed in this thread, perhaps you could apply it to both AMD and
>
> We obviously need extra code to make the module parameters suitable for Intel
> since it
> affects other features (such as LBR and PEBS), we may not rush to draw this line
> clearly.
>
> > Intel in v2? We can find someone for (2).
>
> The ioctl_set_pmu_event_filter() interface is already practical for dynamic toggle,
> as not being able to program any events is the same as having none vPMU,
> w/o considering performance impact of traversing the list.

Not being able to program any events is actually not the same as
setting the pmuv module parameter to false. In the former case,
prohibited events simply don't advance the counters. In the latter
case, all accesses to the PMU MSRs raise #GP. That was what I meant
earlier, when I said that we would have to change the semantics of an
empty allow list if we want to match the behavior of pmuv=0.

> I am not sure if the maintainer will buy in another KVM_CAP for only per-VM,
> considering
> "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]" is a feature that will be available soon.

For existing guests, the future availability of that feature is
irrelevant. We're going to need another solution that doesn't involve
recompiling every guest kernel.

> >
> >>> 3) If the module parameter does not allow PMU virtualization, there
> >>> should be no userspace override, since we have no precedent for
> >>> authorizing that kind of override.
> >>
> >> Uh, I thought you (Google) had a lot of these (interesting) use cases internally.
> >
> > We have modified some module parameters so that they can be changed at
> > runtime, but we don't have any concept of a privileged userspace
> > overriding a module parameter restriction.
>
> Considering that the semantics of the different module parameters are different,
> allowing one of them to be overridden does not mean that such a generic framework
> is promising, but it makes sense for the community to see another case for it.
>
> >
> >>>
> >>>> "But I am not a decision maker. " :D
> >>>>
> >>>> Thanks,
> >>>> Like Xu
> >>>>
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2021-11-18 13:25 ` Paolo Bonzini
  2021-12-10 19:25   ` Jim Mattson
@ 2022-01-15  1:26   ` Jim Mattson
  2022-01-17  2:33     ` Like Xu
  1 sibling, 1 reply; 15+ messages in thread
From: Jim Mattson @ 2022-01-15  1:26 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Like Xu, Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On Thu, Nov 18, 2021 at 5:25 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 11/17/21 09:03, Like Xu wrote:
> > From: Like Xu <likexu@tencent.com>
> >
> > For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
> > For AMD, all hw implementations support the base set of four
> > performance counters, with current mainstream hardware indicating
> > the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.
> >
> > In the virtualized world, the AMD guest driver may detect
> > the presence of at least one counter MSR. Most hypervisor
> > vendors would introduce a module param (like lbrv for svm)
> > to disable PMU for all guests.
> >
> > Another control proposal per-VM is to pass PMU disable information
> > via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
> > Both of methods require some guest-side changes, so a module
> > parameter may not be sufficiently granular, but practical enough.
> >
> > Signed-off-by: Like Xu <likexu@tencent.com>
> > ---
> >   arch/x86/kvm/cpuid.c   |  2 +-
> >   arch/x86/kvm/svm/pmu.c |  4 ++++
> >   arch/x86/kvm/svm/svm.c | 11 +++++++++++
> >   arch/x86/kvm/svm/svm.h |  1 +
> >   4 files changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 2d70edb0f323..647af2a184ad 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -487,7 +487,7 @@ void kvm_set_cpu_caps(void)
> >               F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
> >               F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
> >               0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM) |
> > -             F(TOPOEXT) | F(PERFCTR_CORE)
> > +             F(TOPOEXT) | 0 /* PERFCTR_CORE */
> >       );
> >
> >       kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
> > diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> > index fdf587f19c5f..a0bcf0144664 100644
> > --- a/arch/x86/kvm/svm/pmu.c
> > +++ b/arch/x86/kvm/svm/pmu.c
> > @@ -16,6 +16,7 @@
> >   #include "cpuid.h"
> >   #include "lapic.h"
> >   #include "pmu.h"
> > +#include "svm.h"
> >
> >   enum pmu_type {
> >       PMU_TYPE_COUNTER = 0,
> > @@ -100,6 +101,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
> >   {
> >       struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
> >
> > +     if (!pmuv)
> > +             return NULL;
> > +
> >       switch (msr) {
> >       case MSR_F15H_PERF_CTL0:
> >       case MSR_F15H_PERF_CTL1:
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 21bb81710e0f..062e48c191ee 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -190,6 +190,10 @@ module_param(vgif, int, 0444);
> >   static int lbrv = true;
> >   module_param(lbrv, int, 0444);
> >
> > +/* enable/disable PMU virtualization */
> > +bool pmuv = true;
> > +module_param(pmuv, bool, 0444);
> > +
> >   static int tsc_scaling = true;
> >   module_param(tsc_scaling, int, 0444);
> >
> > @@ -952,6 +956,10 @@ static __init void svm_set_cpu_caps(void)
> >           boot_cpu_has(X86_FEATURE_AMD_SSBD))
> >               kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
> >
> > +     /* AMD PMU PERFCTR_CORE CPUID */
> > +     if (pmuv && boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
> > +             kvm_cpu_cap_set(X86_FEATURE_PERFCTR_CORE);
> > +
> >       /* CPUID 0x8000001F (SME/SEV features) */
> >       sev_set_cpu_caps();
> >   }
> > @@ -1085,6 +1093,9 @@ static __init int svm_hardware_setup(void)
> >                       pr_info("LBR virtualization supported\n");
> >       }
> >
> > +     if (!pmuv)
> > +             pr_info("PMU virtualization is disabled\n");
> > +
> >       svm_set_cpu_caps();
> >
> >       /*
> > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > index 0d7bbe548ac3..08e1c19ffbdf 100644
> > --- a/arch/x86/kvm/svm/svm.h
> > +++ b/arch/x86/kvm/svm/svm.h
> > @@ -32,6 +32,7 @@
> >   extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
> >   extern bool npt_enabled;
> >   extern bool intercept_smi;
> > +extern bool pmuv;
> >
> >   /*
> >    * Clean bits in VMCB.
> >
>
> Queued, thanks - just changed the parameter name to "pmu".

Whoops! The global 'pmu' is hidden by a local 'pmu' in get_gp_pmc_amd().

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-15  1:26   ` Jim Mattson
@ 2022-01-17  2:33     ` Like Xu
  2022-01-17  8:36       ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Like Xu @ 2022-01-17  2:33 UTC (permalink / raw)
  To: Jim Mattson, Paolo Bonzini
  Cc: Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On 15/1/2022 9:26 am, Jim Mattson wrote:
> On Thu, Nov 18, 2021 at 5:25 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 11/17/21 09:03, Like Xu wrote:
>>> From: Like Xu <likexu@tencent.com>
>>>
>>> For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
>>> For AMD, all hw implementations support the base set of four
>>> performance counters, with current mainstream hardware indicating
>>> the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.
>>>
>>> In the virtualized world, the AMD guest driver may detect
>>> the presence of at least one counter MSR. Most hypervisor
>>> vendors would introduce a module param (like lbrv for svm)
>>> to disable PMU for all guests.
>>>
>>> Another control proposal per-VM is to pass PMU disable information
>>> via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
>>> Both of methods require some guest-side changes, so a module
>>> parameter may not be sufficiently granular, but practical enough.
>>>
>>> Signed-off-by: Like Xu <likexu@tencent.com>
>>> ---
>>>    arch/x86/kvm/cpuid.c   |  2 +-
>>>    arch/x86/kvm/svm/pmu.c |  4 ++++
>>>    arch/x86/kvm/svm/svm.c | 11 +++++++++++
>>>    arch/x86/kvm/svm/svm.h |  1 +
>>>    4 files changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 2d70edb0f323..647af2a184ad 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -487,7 +487,7 @@ void kvm_set_cpu_caps(void)
>>>                F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
>>>                F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
>>>                0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM) |
>>> -             F(TOPOEXT) | F(PERFCTR_CORE)
>>> +             F(TOPOEXT) | 0 /* PERFCTR_CORE */
>>>        );
>>>
>>>        kvm_cpu_cap_mask(CPUID_8000_0001_EDX,
>>> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
>>> index fdf587f19c5f..a0bcf0144664 100644
>>> --- a/arch/x86/kvm/svm/pmu.c
>>> +++ b/arch/x86/kvm/svm/pmu.c
>>> @@ -16,6 +16,7 @@
>>>    #include "cpuid.h"
>>>    #include "lapic.h"
>>>    #include "pmu.h"
>>> +#include "svm.h"
>>>
>>>    enum pmu_type {
>>>        PMU_TYPE_COUNTER = 0,
>>> @@ -100,6 +101,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>>>    {
>>>        struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
>>>
>>> +     if (!pmuv)
>>> +             return NULL;
>>> +
>>>        switch (msr) {
>>>        case MSR_F15H_PERF_CTL0:
>>>        case MSR_F15H_PERF_CTL1:
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index 21bb81710e0f..062e48c191ee 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -190,6 +190,10 @@ module_param(vgif, int, 0444);
>>>    static int lbrv = true;
>>>    module_param(lbrv, int, 0444);
>>>
>>> +/* enable/disable PMU virtualization */
>>> +bool pmuv = true;
>>> +module_param(pmuv, bool, 0444);
>>> +
>>>    static int tsc_scaling = true;
>>>    module_param(tsc_scaling, int, 0444);
>>>
>>> @@ -952,6 +956,10 @@ static __init void svm_set_cpu_caps(void)
>>>            boot_cpu_has(X86_FEATURE_AMD_SSBD))
>>>                kvm_cpu_cap_set(X86_FEATURE_VIRT_SSBD);
>>>
>>> +     /* AMD PMU PERFCTR_CORE CPUID */
>>> +     if (pmuv && boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
>>> +             kvm_cpu_cap_set(X86_FEATURE_PERFCTR_CORE);
>>> +
>>>        /* CPUID 0x8000001F (SME/SEV features) */
>>>        sev_set_cpu_caps();
>>>    }
>>> @@ -1085,6 +1093,9 @@ static __init int svm_hardware_setup(void)
>>>                        pr_info("LBR virtualization supported\n");
>>>        }
>>>
>>> +     if (!pmuv)
>>> +             pr_info("PMU virtualization is disabled\n");
>>> +
>>>        svm_set_cpu_caps();
>>>
>>>        /*
>>> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
>>> index 0d7bbe548ac3..08e1c19ffbdf 100644
>>> --- a/arch/x86/kvm/svm/svm.h
>>> +++ b/arch/x86/kvm/svm/svm.h
>>> @@ -32,6 +32,7 @@
>>>    extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
>>>    extern bool npt_enabled;
>>>    extern bool intercept_smi;
>>> +extern bool pmuv;
>>>
>>>    /*
>>>     * Clean bits in VMCB.
>>>
>>
>> Queued, thanks - just changed the parameter name to "pmu".
> 
> Whoops! The global 'pmu' is hidden by a local 'pmu' in get_gp_pmc_amd().

Indeed, I wonder if Poalo would like to take a look at this fix:
https://lore.kernel.org/kvm/20220113035324.59572-1-likexu@tencent.com/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86/svm: Add module param to control PMU virtualization
  2022-01-17  2:33     ` Like Xu
@ 2022-01-17  8:36       ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2022-01-17  8:36 UTC (permalink / raw)
  To: Like Xu, Jim Mattson
  Cc: Maxim Levitsky, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, kvm, linux-kernel

On 1/17/22 03:33, Like Xu wrote:
>>>
>>
>> Whoops! The global 'pmu' is hidden by a local 'pmu' in get_gp_pmc_amd().
> 
> Indeed, I wonder if Poalo would like to take a look at this fix:
> https://lore.kernel.org/kvm/20220113035324.59572-1-likexu@tencent.com/

Yes, my mistake.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-01-17  8:36 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-17  8:03 [PATCH] KVM: x86/svm: Add module param to control PMU virtualization Like Xu
2021-11-18 13:25 ` Paolo Bonzini
2021-12-10 19:25   ` Jim Mattson
2021-12-11  2:15     ` Paolo Bonzini
2021-12-11  3:48       ` Jim Mattson
2022-01-09  1:23         ` Jim Mattson
2022-01-10  6:23           ` Like Xu
2022-01-10 18:13             ` Jim Mattson
2022-01-11  2:11               ` Like Xu
2022-01-11  3:24                 ` Jim Mattson
2022-01-11  6:18                   ` Like Xu
2022-01-11  7:25                     ` Jim Mattson
2022-01-15  1:26   ` Jim Mattson
2022-01-17  2:33     ` Like Xu
2022-01-17  8:36       ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).