linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT
@ 2022-09-21 16:45 Xiaoyao Li
  2022-09-21 16:45 ` [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local() Xiaoyao Li
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Xiaoyao Li @ 2022-09-21 16:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Sean Christopherson, Paolo Bonzini, wei.w.wang, kan.liang
  Cc: xiaoyao.li, linux-perf-users, linux-kernel, kvm

There is one bug in KVM that can hit vm-entry failure 100% on platform
supporting PT_MODE_HOST_GUEST mode following below steps:

  1. #modprobe -r kvm_intel
  2. #modprobe kvm_intel pt_mode=1
  3. start a VM with QEMU
  4. on host: #perf record -e intel_pt//

The vm-entry failure happens because it violates the requirement stated
in Intel SDM 26.2.1.1 VM-Execution Control Fields

  If the logical processor is operating with Intel PT enabled (if
  IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
  IA32_RTIT_CTL" VM-entry control must be 0.

On PT_MODE_HOST_GUEST node, vm-entry load RTIT is always set. Thus KVM
needs to ensure IA32_RTIT_CTL.TraceEn is 0 before VM-entry. Currently KVM
manually WRMSR(IA32_RTIT_CTL) to clear TraceEn bit. However, it doesn't
work everytime since there is a posibility that IA32_RTIT_CTL.TraceEn is
re-enabled in PT PMI handler before vm-entry.

This series tries to fix the issue by exposing and calling perf driver
API to stop host PT event (if any) before vm-entry and resume PT event
after vm-exit. Perf API can prevent PT PMI handler from re-enabling PT.

By the way, drop the save/restore of PT MSRs of host because the resume
of PT event after vm-exit doesn't rely on the previous value of PT MSRs.

Changes in v1:
 - Export perf_event_{en,dis}able_local() and pt_get_curr_event() for KVM to
   stop/resume PT event; (Suggested-by Wang, Wei W <wei.w.wang@intel.com>)
 - Drop the save/restore of host PT MSRs.

v1: https://lore.kernel.org/all/20220825085625.867763-1-xiaoyao.li@intel.com/

Xiaoyao Li (3):
  perf/core: Expose perf_event_{en,dis}able_local()
  perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  KVM: VMX: Stop/resume host PT before/after VMX transition when
    PT_MODE_HOST_GUEST

 arch/x86/events/intel/pt.c        |  8 ++++++++
 arch/x86/include/asm/perf_event.h |  2 ++
 arch/x86/kvm/vmx/vmx.c            | 31 +++++++++++++------------------
 arch/x86/kvm/vmx/vmx.h            |  2 +-
 include/linux/perf_event.h        |  1 +
 kernel/events/core.c              |  7 +++++++
 6 files changed, 32 insertions(+), 19 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local()
  2022-09-21 16:45 [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Xiaoyao Li
@ 2022-09-21 16:45 ` Xiaoyao Li
  2022-09-22 12:16   ` Wang, Wei W
  2022-09-21 16:45 ` [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event() Xiaoyao Li
  2022-09-21 16:45 ` [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST Xiaoyao Li
  2 siblings, 1 reply; 19+ messages in thread
From: Xiaoyao Li @ 2022-09-21 16:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Sean Christopherson, Paolo Bonzini, wei.w.wang, kan.liang
  Cc: xiaoyao.li, linux-perf-users, linux-kernel, kvm

KVM needs them to disable/enable an Intel PT perf event before
vm-entry/after vm-exit.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 include/linux/perf_event.h | 1 +
 kernel/events/core.c       | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ee8b9ecdc03b..fc5f3952d6a2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1472,6 +1472,7 @@ extern int perf_swevent_get_recursion_context(void);
 extern void perf_swevent_put_recursion_context(int rctx);
 extern u64 perf_swevent_set_period(struct perf_event *event);
 extern void perf_event_enable(struct perf_event *event);
+extern void perf_event_enable_local(struct perf_event *event);
 extern void perf_event_disable(struct perf_event *event);
 extern void perf_event_disable_local(struct perf_event *event);
 extern void perf_event_disable_inatomic(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2621fd24ad26..8324bb99c6bf 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2446,6 +2446,7 @@ void perf_event_disable_local(struct perf_event *event)
 {
 	event_function_local(event, __perf_event_disable, NULL);
 }
+EXPORT_SYMBOL_GPL(perf_event_disable_local);
 
 /*
  * Strictly speaking kernel users cannot create groups and therefore this
@@ -2984,6 +2985,12 @@ static void _perf_event_enable(struct perf_event *event)
 	event_function_call(event, __perf_event_enable, NULL);
 }
 
+void perf_event_enable_local(struct perf_event *event)
+{
+	event_function_local(event, __perf_event_enable, NULL);
+}
+EXPORT_SYMBOL_GPL(perf_event_enable_local);
+
 /*
  * See perf_event_disable();
  */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-21 16:45 [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Xiaoyao Li
  2022-09-21 16:45 ` [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local() Xiaoyao Li
@ 2022-09-21 16:45 ` Xiaoyao Li
  2022-09-22  5:14   ` Xiaoyao Li
  2022-09-22 12:33   ` Liang, Kan
  2022-09-21 16:45 ` [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST Xiaoyao Li
  2 siblings, 2 replies; 19+ messages in thread
From: Xiaoyao Li @ 2022-09-21 16:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Sean Christopherson, Paolo Bonzini, wei.w.wang, kan.liang
  Cc: xiaoyao.li, linux-perf-users, linux-kernel, kvm

KVM supports PT_MODE_HOST_GUEST mode for Intel PT where host and guest
have separate Intel PT configurations and they work independently.

Currently, in this mode, when both host and guest enable PT, KVM manually
clears MSR_IA32_RTIT_CTL.TRACEEN to disable host PT so that it can
context switch the other PT MSRs. However, PT PMI can be delivered after
this point and before the VM-entry. As a result, the re-enabling of PT
leads to VM-entry failure of guest.

To solve the problem, introduce and export pt_get_curr_event() for KVM
to get current pt event. Along with perf_event_{dis, en}able_local(),
With them, KVM can avoid PT re-enabling in PT PMI handler.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/events/intel/pt.c        | 8 ++++++++
 arch/x86/include/asm/perf_event.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 82ef87e9a897..62bfc45c11c9 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1624,6 +1624,14 @@ static void pt_event_stop(struct perf_event *event, int mode)
 	}
 }
 
+struct perf_event *pt_get_curr_event(void)
+{
+	struct pt *pt = this_cpu_ptr(&pt_ctx);
+
+	return pt->handle.event;
+}
+EXPORT_SYMBOL_GPL(pt_get_curr_event);
+
 static long pt_event_snapshot_aux(struct perf_event *event,
 				  struct perf_output_handle *handle,
 				  unsigned long size)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index f6fc8dd51ef4..7c3533392cf5 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -553,11 +553,13 @@ static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 
 #ifdef CONFIG_CPU_SUP_INTEL
  extern void intel_pt_handle_vmx(int on);
+ extern struct perf_event *pt_get_curr_event(void);
 #else
 static inline void intel_pt_handle_vmx(int on)
 {
 
 }
+struct perf_event *pt_get_curr_event(void) { return NULL; }
 #endif
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST
  2022-09-21 16:45 [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Xiaoyao Li
  2022-09-21 16:45 ` [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local() Xiaoyao Li
  2022-09-21 16:45 ` [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event() Xiaoyao Li
@ 2022-09-21 16:45 ` Xiaoyao Li
  2022-09-22 12:34   ` Wang, Wei W
  2 siblings, 1 reply; 19+ messages in thread
From: Xiaoyao Li @ 2022-09-21 16:45 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Sean Christopherson, Paolo Bonzini, wei.w.wang, kan.liang
  Cc: xiaoyao.li, linux-perf-users, linux-kernel, kvm

Current implementation in pt_guest_enter() has two issues when pt mode
is PT_MODE_HOST_GUEST.

1. It relies on VM_ENTRY_LOAD_IA32_RTIT_CTL to disable host's Intel PT
   for the case that host enables PT while guest not.

   However, it causes VM entry failure due to violating the requirement
   stated in SDM "VM-Execution Control Fields"

     If the logical processor is operating with Intel PT enabled (if
     IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
     IA32_RTIT_CTL" VM-entry control must be 0.

2. In the case both host and guest enable Intel PT, it disables host's
   Intel PT by manually clearing MSR_IA32_RTIT_CTL for the purpose to
   context switch host and guest's PT configurations.

   However, PT PMI can be delivered later and before VM entry. In the PT
   PMI handler, it will a) update the host PT MSRs which leads to what KVM
   stores in vmx->pt_desc.host becomes stale, and b) re-enable Intel PT
   which leads to VM entry failure as #1.

To fix the above two issues, 1) grab and store host PT perf event and
disable/enable host PT before vm-enter/ after vm-exit. 2) drop host
pt_ctx and the logic to save/restore host PT MSRs since host PT driver
doesn't rely on the previous value of PT MSR, i.e., the re-enabling of PT
event after VM-exit re-initializes all the PT MSRs that it cares.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 31 +++++++++++++------------------
 arch/x86/kvm/vmx/vmx.h |  2 +-
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c9b49a09e6b5..df1a16264bb6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1124,37 +1124,32 @@ static inline void pt_save_msr(struct pt_ctx *ctx, u32 addr_range)
 
 static void pt_guest_enter(struct vcpu_vmx *vmx)
 {
+	struct perf_event *event;
+
 	if (vmx_pt_mode_is_system())
 		return;
 
-	/*
-	 * GUEST_IA32_RTIT_CTL is already set in the VMCS.
-	 * Save host state before VM entry.
-	 */
-	rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
-	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
-		wrmsrl(MSR_IA32_RTIT_CTL, 0);
-		pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.num_address_ranges);
+	event = pt_get_curr_event();
+	if (event)
+		perf_event_disable_local(event);
+	vmx->pt_desc.host_event = event;
+
+	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN)
 		pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.num_address_ranges);
-	}
 }
 
 static void pt_guest_exit(struct vcpu_vmx *vmx)
 {
+	struct perf_event *event = vmx->pt_desc.host_event;
+
 	if (vmx_pt_mode_is_system())
 		return;
 
-	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN)
 		pt_save_msr(&vmx->pt_desc.guest, vmx->pt_desc.num_address_ranges);
-		pt_load_msr(&vmx->pt_desc.host, vmx->pt_desc.num_address_ranges);
-	}
 
-	/*
-	 * KVM requires VM_EXIT_CLEAR_IA32_RTIT_CTL to expose PT to the guest,
-	 * i.e. RTIT_CTL is always cleared on VM-Exit.  Restore it if necessary.
-	 */
-	if (vmx->pt_desc.host.ctl)
-		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+	if (event)
+		perf_event_enable_local(event);
 }
 
 void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 24d58c2ffaa3..4c20bdabc85b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -66,7 +66,7 @@ struct pt_desc {
 	u64 ctl_bitmask;
 	u32 num_address_ranges;
 	u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
-	struct pt_ctx host;
+	struct perf_event *host_event;
 	struct pt_ctx guest;
 };
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-21 16:45 ` [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event() Xiaoyao Li
@ 2022-09-22  5:14   ` Xiaoyao Li
  2022-09-22 12:33   ` Liang, Kan
  1 sibling, 0 replies; 19+ messages in thread
From: Xiaoyao Li @ 2022-09-22  5:14 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Sean Christopherson, Paolo Bonzini, wei.w.wang, kan.liang
  Cc: linux-perf-users, linux-kernel, kvm

On 9/22/2022 12:45 AM, Xiaoyao Li wrote:
> KVM supports PT_MODE_HOST_GUEST mode for Intel PT where host and guest
> have separate Intel PT configurations and they work independently.
> 
> Currently, in this mode, when both host and guest enable PT, KVM manually
> clears MSR_IA32_RTIT_CTL.TRACEEN to disable host PT so that it can
> context switch the other PT MSRs. However, PT PMI can be delivered after
> this point and before the VM-entry. As a result, the re-enabling of PT
> leads to VM-entry failure of guest.
> 
> To solve the problem, introduce and export pt_get_curr_event() for KVM
> to get current pt event. Along with perf_event_{dis, en}able_local(),
> With them, KVM can avoid PT re-enabling in PT PMI handler.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   arch/x86/events/intel/pt.c        | 8 ++++++++
>   arch/x86/include/asm/perf_event.h | 2 ++
>   2 files changed, 10 insertions(+)
> 
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index 82ef87e9a897..62bfc45c11c9 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1624,6 +1624,14 @@ static void pt_event_stop(struct perf_event *event, int mode)
>   	}
>   }
>   
> +struct perf_event *pt_get_curr_event(void)
> +{
> +	struct pt *pt = this_cpu_ptr(&pt_ctx);
> +
> +	return pt->handle.event;
> +}
> +EXPORT_SYMBOL_GPL(pt_get_curr_event);
> +
>   static long pt_event_snapshot_aux(struct perf_event *event,
>   				  struct perf_output_handle *handle,
>   				  unsigned long size)
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index f6fc8dd51ef4..7c3533392cf5 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -553,11 +553,13 @@ static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
>   
>   #ifdef CONFIG_CPU_SUP_INTEL
>    extern void intel_pt_handle_vmx(int on);
> + extern struct perf_event *pt_get_curr_event(void);
>   #else
>   static inline void intel_pt_handle_vmx(int on)
>   {
>   
>   }
> +struct perf_event *pt_get_curr_event(void) { return NULL; }

My fault, this needs to be

   static inline ...

>   #endif
>   
>   #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local()
  2022-09-21 16:45 ` [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local() Xiaoyao Li
@ 2022-09-22 12:16   ` Wang, Wei W
  0 siblings, 0 replies; 19+ messages in thread
From: Wang, Wei W @ 2022-09-22 12:16 UTC (permalink / raw)
  To: Li, Xiaoyao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, kan.liang
  Cc: linux-perf-users, linux-kernel, kvm

On Thursday, September 22, 2022 12:45 AM, Li, Xiaoyao wrote:
> KVM needs them to disable/enable an Intel PT perf event before vm-entry/after
> vm-exit.

I would explain more in the commit here:

Export perf_event_disable_local and perf_event_enable_local for perf users to disable
and enable perf events on the current local CPU. One usage is in PT virtualization
by KVM:
- before VMEnter to guest, KVM calls perf_event_disable_local to disable the host PT event
running on the current CPU, and this reuses the PT driver to save the related h/w states. 
- after VMExit to host, KVM calls perf_event_enable_local to resume the host PT event on
the current CPU by having the PT driver load the previously saved states into h/w.

> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  include/linux/perf_event.h | 1 +
>  kernel/events/core.c       | 7 +++++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index
> ee8b9ecdc03b..fc5f3952d6a2 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1472,6 +1472,7 @@ extern int
> perf_swevent_get_recursion_context(void);
>  extern void perf_swevent_put_recursion_context(int rctx);  extern u64
> perf_swevent_set_period(struct perf_event *event);  extern void
> perf_event_enable(struct perf_event *event);
> +extern void perf_event_enable_local(struct perf_event *event);
>  extern void perf_event_disable(struct perf_event *event);  extern void
> perf_event_disable_local(struct perf_event *event);  extern void
> perf_event_disable_inatomic(struct perf_event *event); diff --git
> a/kernel/events/core.c b/kernel/events/core.c index
> 2621fd24ad26..8324bb99c6bf 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2446,6 +2446,7 @@ void perf_event_disable_local(struct perf_event
> *event)  {
>  	event_function_local(event, __perf_event_disable, NULL);  }
> +EXPORT_SYMBOL_GPL(perf_event_disable_local);
> 
>  /*
>   * Strictly speaking kernel users cannot create groups and therefore this @@
> -2984,6 +2985,12 @@ static void _perf_event_enable(struct perf_event *event)
>  	event_function_call(event, __perf_event_enable, NULL);  }
> 
> +void perf_event_enable_local(struct perf_event *event) {
> +	event_function_local(event, __perf_event_enable, NULL); }
> +EXPORT_SYMBOL_GPL(perf_event_enable_local);
> +
>  /*
>   * See perf_event_disable();
>   */
> --
> 2.27.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-21 16:45 ` [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event() Xiaoyao Li
  2022-09-22  5:14   ` Xiaoyao Li
@ 2022-09-22 12:33   ` Liang, Kan
  2022-09-22 12:58     ` Wang, Wei W
  1 sibling, 1 reply; 19+ messages in thread
From: Liang, Kan @ 2022-09-22 12:33 UTC (permalink / raw)
  To: Xiaoyao Li, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Sean Christopherson, Paolo Bonzini,
	wei.w.wang
  Cc: linux-perf-users, linux-kernel, kvm



On 2022-09-21 12:45 p.m., Xiaoyao Li wrote:
> KVM supports PT_MODE_HOST_GUEST mode for Intel PT where host and guest
> have separate Intel PT configurations and they work independently.
> 
> Currently, in this mode, when both host and guest enable PT, KVM manually
> clears MSR_IA32_RTIT_CTL.TRACEEN to disable host PT so that it can
> context switch the other PT MSRs. However, PT PMI can be delivered after
> this point and before the VM-entry. As a result, the re-enabling of PT
> leads to VM-entry failure of guest.
> 
> To solve the problem, introduce and export pt_get_curr_event() for KVM
> to get current pt event. 

I don't think the current pt event is created by KVM. IIUC, the patch
basically expose the event created by the other user to KVM. That
doesn't sounds correct.

I think it should be perf's responsibility to decide which events should
be disabled, and which MSRs should be switched. Because only perf can
see all the events. The users should only see the events they created.

Thanks,
Kan

> Along with perf_event_{dis, en}able_local(),
> With them, KVM can avoid PT re-enabling in PT PMI handler.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  arch/x86/events/intel/pt.c        | 8 ++++++++
>  arch/x86/include/asm/perf_event.h | 2 ++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index 82ef87e9a897..62bfc45c11c9 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1624,6 +1624,14 @@ static void pt_event_stop(struct perf_event *event, int mode)
>  	}
>  }
>  
> +struct perf_event *pt_get_curr_event(void)
> +{
> +	struct pt *pt = this_cpu_ptr(&pt_ctx);
> +
> +	return pt->handle.event;
> +}
> +EXPORT_SYMBOL_GPL(pt_get_curr_event);
> +
>  static long pt_event_snapshot_aux(struct perf_event *event,
>  				  struct perf_output_handle *handle,
>  				  unsigned long size)
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index f6fc8dd51ef4..7c3533392cf5 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -553,11 +553,13 @@ static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
>  
>  #ifdef CONFIG_CPU_SUP_INTEL
>   extern void intel_pt_handle_vmx(int on);
> + extern struct perf_event *pt_get_curr_event(void);
>  #else
>  static inline void intel_pt_handle_vmx(int on)
>  {
>  
>  }
> +struct perf_event *pt_get_curr_event(void) { return NULL; }
>  #endif
>  
>  #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST
  2022-09-21 16:45 ` [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST Xiaoyao Li
@ 2022-09-22 12:34   ` Wang, Wei W
  0 siblings, 0 replies; 19+ messages in thread
From: Wang, Wei W @ 2022-09-22 12:34 UTC (permalink / raw)
  To: Li, Xiaoyao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, kan.liang
  Cc: linux-perf-users, linux-kernel, kvm

On Thursday, September 22, 2022 12:45 AM, Li, Xiaoyao wrote:
> Current implementation in pt_guest_enter() has two issues when pt mode is
> PT_MODE_HOST_GUEST.
> 
> 1. It relies on VM_ENTRY_LOAD_IA32_RTIT_CTL to disable host's Intel PT
>    for the case that host enables PT while guest not.
> 
>    However, it causes VM entry failure due to violating the requirement
>    stated in SDM "VM-Execution Control Fields"
> 
>      If the logical processor is operating with Intel PT enabled (if
>      IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
>      IA32_RTIT_CTL" VM-entry control must be 0.
> 
> 2. In the case both host and guest enable Intel PT, it disables host's
>    Intel PT by manually clearing MSR_IA32_RTIT_CTL for the purpose to
>    context switch host and guest's PT configurations.
> 
>    However, PT PMI can be delivered later and before VM entry. In the PT
>    PMI handler, it will a) update the host PT MSRs which leads to what KVM
>    stores in vmx->pt_desc.host becomes stale, and b) re-enable Intel PT
>    which leads to VM entry failure as #1.
> 
> To fix the above two issues, 1) grab and store host PT perf event and
> disable/enable host PT before vm-enter/ after vm-exit. 2) drop host pt_ctx and
> the logic to save/restore host PT MSRs since host PT driver doesn't rely on the
> previous value of PT MSR, i.e., the re-enabling of PT event after VM-exit
> re-initializes all the PT MSRs that it cares.

I would also re-write the commit:

Current KVM implementation directly modifies the hardware states when it
wants to stop or start a host PT event on VMX transitions. This is not proper
because:
1) host perf event is well managed by the perf subsystem. Getting it stopped/started
needs to go through the perf subsystem to update the related metadata (e.g. event
state).
2) Simply modifying the MSR_IA32_RTIT_CTL isn't a complete way to disable the
host PT event, as there may be special cases, for example, a dangling PT bit in the
interrupt status register after PT has been stopped may cause the PT interrupt handler
to enable PT while KVM assumes PT has been disabled after it clears MSR_IA32_RTIT_CTL
(for more details, please check SDM "VM-Execution Control Fields" section for PT related
requirements). Not properly handling such cases can result in VMEntry failures. Those have
already been properly handled by the PT driver (i.e. pt_event_stop) called from the perf core.
3) stop/start a host PT event needs to save/restore the related MSR states for an
event switching. This has also been properly supported by the PT driver. It is an extra
burden for KVM to maintain another version of function to do such save/restore.

For those reasons, change KVM to get the running host PT event from the PT driver,
and call perf_event_disable/enable_local to disable/enable the host PT event running on the CPU.
This will reuse perf and PT driver to switch in/out the host PT event, with proper management of the
perf event.

> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 31 +++++++++++++------------------
> arch/x86/kvm/vmx/vmx.h |  2 +-
>  2 files changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
> c9b49a09e6b5..df1a16264bb6 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1124,37 +1124,32 @@ static inline void pt_save_msr(struct pt_ctx *ctx,
> u32 addr_range)
> 
>  static void pt_guest_enter(struct vcpu_vmx *vmx)  {
> +	struct perf_event *event;
> +
>  	if (vmx_pt_mode_is_system())
>  		return;
> 
> -	/*
> -	 * GUEST_IA32_RTIT_CTL is already set in the VMCS.
> -	 * Save host state before VM entry.
> -	 */
> -	rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> -	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
> -		wrmsrl(MSR_IA32_RTIT_CTL, 0);
> -		pt_save_msr(&vmx->pt_desc.host,
> vmx->pt_desc.num_address_ranges);
> +	event = pt_get_curr_event();
> +	if (event)
> +		perf_event_disable_local(event);
> +	vmx->pt_desc.host_event = event;
> +
> +	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN)
>  		pt_load_msr(&vmx->pt_desc.guest,
> vmx->pt_desc.num_address_ranges);
> -	}
>  }
> 
>  static void pt_guest_exit(struct vcpu_vmx *vmx)  {
> +	struct perf_event *event = vmx->pt_desc.host_event;
> +
>  	if (vmx_pt_mode_is_system())
>  		return;
> 
> -	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
> +	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN)
>  		pt_save_msr(&vmx->pt_desc.guest,
> vmx->pt_desc.num_address_ranges);

As they are only used for guest msrs now, probably we can
rename them to pt_save_guest_msrs. Also for the load side.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 12:33   ` Liang, Kan
@ 2022-09-22 12:58     ` Wang, Wei W
  2022-09-22 13:34       ` Peter Zijlstra
  2022-09-26 15:32       ` Liang, Kan
  0 siblings, 2 replies; 19+ messages in thread
From: Wang, Wei W @ 2022-09-22 12:58 UTC (permalink / raw)
  To: Liang, Kan, Li, Xiaoyao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini
  Cc: linux-perf-users, linux-kernel, kvm

On Thursday, September 22, 2022 8:34 PM, Liang, Kan wrote:
> > To solve the problem, introduce and export pt_get_curr_event() for KVM
> > to get current pt event.
> 
> I don't think the current pt event is created by KVM. IIUC, the patch basically
> expose the event created by the other user to KVM. That doesn't sounds
> correct.

Yes, that's the host PT event running on the current CPU. Not created by KVM.

> 
> I think it should be perf's responsibility to decide which events should be
> disabled, and which MSRs should be switched. Because only perf can see all the
> events. The users should only see the events they created.

For other pmu cases, yes. For PT, its management is simpler than other pmu
resources and PT PMU is much simpler. It doesn't have a scheduler to manage
events.

I think the usage here is similar to the CPU thread scheduling case. When we have
CPUs isolated from the CPU scheduler, i.e. no scheduler for those CPUs, basically
we rely on users to ping tasks to those CPUs (e.g. taskset).

For the commit log, probably we don't need those KVM details here. Simplify a bit:

Add a function to expose the current running PT event to users. One usage is in KVM,
it needs to get and disable the running host PT event before VMEnter to the guest and
resumes the event after VMexit to host.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 12:58     ` Wang, Wei W
@ 2022-09-22 13:34       ` Peter Zijlstra
  2022-09-22 13:59         ` Wang, Wei W
  2022-09-26 15:32       ` Liang, Kan
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2022-09-22 13:34 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: Liang, Kan, Li, Xiaoyao, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:

> Add a function to expose the current running PT event to users. One usage is in KVM,
> it needs to get and disable the running host PT event before VMEnter to the guest and
> resumes the event after VMexit to host.

You cannot just kill a host event like that. If there is a host event,
the guest looses out.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 13:34       ` Peter Zijlstra
@ 2022-09-22 13:59         ` Wang, Wei W
  2022-09-22 14:09           ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Wang, Wei W @ 2022-09-22 13:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Liang, Kan, Li, Xiaoyao, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
> On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
> 
> > Add a function to expose the current running PT event to users. One
> > usage is in KVM, it needs to get and disable the running host PT event
> > before VMEnter to the guest and resumes the event after VMexit to host.
> 
> You cannot just kill a host event like that. If there is a host event, the guest
> looses out.

OK. The intention was to pause the event (that only profiles host info) when switching to guest,
and resume when switching back to host.

I don't think it is proper to directly disable it via clear MSR_IA32_RTIT_CTL, e.g. there
may be dangling PT bit in the interrupt status register, this invokes the PT interrupt handler
to re-enable it (for more details about the issue please see patch 3 commit).
Would you have any suggestions to pause it? 

Thanks,
Wei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 13:59         ` Wang, Wei W
@ 2022-09-22 14:09           ` Peter Zijlstra
  2022-09-22 14:42             ` Wang, Wei W
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2022-09-22 14:09 UTC (permalink / raw)
  To: Wang, Wei W
  Cc: Liang, Kan, Li, Xiaoyao, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Thu, Sep 22, 2022 at 01:59:53PM +0000, Wang, Wei W wrote:
> On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
> > On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
> > 
> > > Add a function to expose the current running PT event to users. One
> > > usage is in KVM, it needs to get and disable the running host PT event
> > > before VMEnter to the guest and resumes the event after VMexit to host.
> > 
> > You cannot just kill a host event like that. If there is a host event, the guest
> > looses out.
> 
> OK. The intention was to pause the event (that only profiles host info) when switching to guest,
> and resume when switching back to host.

If the even doesn't profile guest context, then yes. If it does profile
guest context, you can't.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 14:09           ` Peter Zijlstra
@ 2022-09-22 14:42             ` Wang, Wei W
  2022-09-26 15:48               ` Liang, Kan
  0 siblings, 1 reply; 19+ messages in thread
From: Wang, Wei W @ 2022-09-22 14:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Liang, Kan, Li, Xiaoyao, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Thursday, September 22, 2022 10:10 PM, Peter Zijlstra wrote:
> On Thu, Sep 22, 2022 at 01:59:53PM +0000, Wang, Wei W wrote:
> > On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
> > > On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
> > >
> > > > Add a function to expose the current running PT event to users.
> > > > One usage is in KVM, it needs to get and disable the running host
> > > > PT event before VMEnter to the guest and resumes the event after
> VMexit to host.
> > >
> > > You cannot just kill a host event like that. If there is a host
> > > event, the guest looses out.
> >
> > OK. The intention was to pause the event (that only profiles host
> > info) when switching to guest, and resume when switching back to host.
> 
> If the even doesn't profile guest context, then yes. If it does profile guest
> context, you can't.

Seems better to add this one:

+int perf_event_disable_local_exclude_guest(struct perf_event *event)
+{
+       struct perf_event_attr *attr = &event->attr;
+
+       if (!attr->exclude_guest)
+               return -EPERM;
+
+       event_function_local(event, __perf_event_disable, NULL);
+
+       return 0;
+}
+EXPORT_SYMBOL_GPL(perf_event_disable_local_exclude_guest);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 12:58     ` Wang, Wei W
  2022-09-22 13:34       ` Peter Zijlstra
@ 2022-09-26 15:32       ` Liang, Kan
  1 sibling, 0 replies; 19+ messages in thread
From: Liang, Kan @ 2022-09-26 15:32 UTC (permalink / raw)
  To: Wang, Wei W, Li, Xiaoyao, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini
  Cc: linux-perf-users, linux-kernel, kvm



On 2022-09-22 8:58 a.m., Wang, Wei W wrote:
> On Thursday, September 22, 2022 8:34 PM, Liang, Kan wrote:
>>> To solve the problem, introduce and export pt_get_curr_event() for KVM
>>> to get current pt event.
>>
>> I don't think the current pt event is created by KVM. IIUC, the patch basically
>> expose the event created by the other user to KVM. That doesn't sounds
>> correct.
> 
> Yes, that's the host PT event running on the current CPU. Not created by KVM.
> 
>>
>> I think it should be perf's responsibility to decide which events should be
>> disabled, and which MSRs should be switched. Because only perf can see all the
>> events. The users should only see the events they created.
> 
> For other pmu cases, yes. For PT, its management is simpler than other pmu
> resources and PT PMU is much simpler. It doesn't have a scheduler to manage
> events.
>

Right, but I think we'd better to create a simpler scheduler (just for
two events) in the PT driver, since you have two co-existing PT events
now, one is from the host and the other is from the guest. I don't think
it's KVM's responsibility to schedule events.

So I think the process should be something as below.

- Let KVM create a PT event if the guest request one.
- In VM-entry, just invoke the perf_event_enable_local(guest).
  The PT driver should schedule out the host event or whatever events
and schedule in the guest event.
- In VM-exit, just invoke the perf_event_disable_local(guest).
  The PT driver should schedule in the host event or whatever events and
schedule out the guest event.

I still don't think we want to expose the host event.

Thanks,
Kan

> I think the usage here is similar to the CPU thread scheduling case. When we have
> CPUs isolated from the CPU scheduler, i.e. no scheduler for those CPUs, basically
> we rely on users to ping tasks to those CPUs (e.g. taskset).
> 
> For the commit log, probably we don't need those KVM details here. Simplify a bit:
> 
> Add a function to expose the current running PT event to users. One usage is in KVM,
> it needs to get and disable the running host PT event before VMEnter to the guest and
> resumes the event after VMexit to host.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-22 14:42             ` Wang, Wei W
@ 2022-09-26 15:48               ` Liang, Kan
  2022-09-26 17:24                 ` Jim Mattson
  0 siblings, 1 reply; 19+ messages in thread
From: Liang, Kan @ 2022-09-26 15:48 UTC (permalink / raw)
  To: Wang, Wei W, Peter Zijlstra
  Cc: Li, Xiaoyao, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm



On 2022-09-22 10:42 a.m., Wang, Wei W wrote:
> On Thursday, September 22, 2022 10:10 PM, Peter Zijlstra wrote:
>> On Thu, Sep 22, 2022 at 01:59:53PM +0000, Wang, Wei W wrote:
>>> On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
>>>> On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
>>>>
>>>>> Add a function to expose the current running PT event to users.
>>>>> One usage is in KVM, it needs to get and disable the running host
>>>>> PT event before VMEnter to the guest and resumes the event after
>> VMexit to host.
>>>>
>>>> You cannot just kill a host event like that. If there is a host
>>>> event, the guest looses out.
>>>
>>> OK. The intention was to pause the event (that only profiles host
>>> info) when switching to guest, and resume when switching back to host.
>>
>> If the even doesn't profile guest context, then yes. If it does profile guest
>> context, you can't.
> 
> Seems better to add this one:

If the guest host mode is enabled, I think the PT driver should not
allow the perf tool to create a host event with !exclude_guest.

Thanks,
Kan
> 
> +int perf_event_disable_local_exclude_guest(struct perf_event *event)
> +{
> +       struct perf_event_attr *attr = &event->attr;
> +
> +       if (!attr->exclude_guest)
> +               return -EPERM;
> +
> +       event_function_local(event, __perf_event_disable, NULL);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(perf_event_disable_local_exclude_guest);
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-26 15:48               ` Liang, Kan
@ 2022-09-26 17:24                 ` Jim Mattson
  2022-09-26 18:08                   ` Liang, Kan
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2022-09-26 17:24 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Wang, Wei W, Peter Zijlstra, Li, Xiaoyao, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Mon, Sep 26, 2022 at 9:55 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 2022-09-22 10:42 a.m., Wang, Wei W wrote:
> > On Thursday, September 22, 2022 10:10 PM, Peter Zijlstra wrote:
> >> On Thu, Sep 22, 2022 at 01:59:53PM +0000, Wang, Wei W wrote:
> >>> On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
> >>>> On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
> >>>>
> >>>>> Add a function to expose the current running PT event to users.
> >>>>> One usage is in KVM, it needs to get and disable the running host
> >>>>> PT event before VMEnter to the guest and resumes the event after
> >> VMexit to host.
> >>>>
> >>>> You cannot just kill a host event like that. If there is a host
> >>>> event, the guest looses out.
> >>>
> >>> OK. The intention was to pause the event (that only profiles host
> >>> info) when switching to guest, and resume when switching back to host.
> >>
> >> If the even doesn't profile guest context, then yes. If it does profile guest
> >> context, you can't.
> >
> > Seems better to add this one:
>
> If the guest host mode is enabled, I think the PT driver should not
> allow the perf tool to create a host event with !exclude_guest.

While I agree that guest events should generally have priority over
host events, this is not consistent with the way "normal" PMU events
are handled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-26 17:24                 ` Jim Mattson
@ 2022-09-26 18:08                   ` Liang, Kan
  2022-09-27 14:27                     ` Wang, Wei W
  0 siblings, 1 reply; 19+ messages in thread
From: Liang, Kan @ 2022-09-26 18:08 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Wang, Wei W, Peter Zijlstra, Li, Xiaoyao, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm



On 2022-09-26 1:24 p.m., Jim Mattson wrote:
> On Mon, Sep 26, 2022 at 9:55 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2022-09-22 10:42 a.m., Wang, Wei W wrote:
>>> On Thursday, September 22, 2022 10:10 PM, Peter Zijlstra wrote:
>>>> On Thu, Sep 22, 2022 at 01:59:53PM +0000, Wang, Wei W wrote:
>>>>> On Thursday, September 22, 2022 9:35 PM, Peter Zijlstra
>>>>>> On Thu, Sep 22, 2022 at 12:58:49PM +0000, Wang, Wei W wrote:
>>>>>>
>>>>>>> Add a function to expose the current running PT event to users.
>>>>>>> One usage is in KVM, it needs to get and disable the running host
>>>>>>> PT event before VMEnter to the guest and resumes the event after
>>>> VMexit to host.
>>>>>>
>>>>>> You cannot just kill a host event like that. If there is a host
>>>>>> event, the guest looses out.
>>>>>
>>>>> OK. The intention was to pause the event (that only profiles host
>>>>> info) when switching to guest, and resume when switching back to host.
>>>>
>>>> If the even doesn't profile guest context, then yes. If it does profile guest
>>>> context, you can't.
>>>
>>> Seems better to add this one:
>>
>> If the guest host mode is enabled, I think the PT driver should not
>> allow the perf tool to create a host event with !exclude_guest.
> 
> While I agree that guest events should generally have priority over
> host events, this is not consistent with the way "normal" PMU events
> are handled.
Only when the two events try to access a resource at the same time, we
have to decide whether priority an event or share between events. But I
don't think this is the case here.

From my understanding of the host-guest mode, the host PT event never
traces the guest no matter whether the guest enables PT. So when
VM-entry, there is only a guest PT event or no event. If so, I think the
perf tool should warn the user if they try to create a host event with
!exclude_guest, since the host event never traces a guest.


Thanks,
Kan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-26 18:08                   ` Liang, Kan
@ 2022-09-27 14:27                     ` Wang, Wei W
  2022-09-27 16:52                       ` Liang, Kan
  0 siblings, 1 reply; 19+ messages in thread
From: Wang, Wei W @ 2022-09-27 14:27 UTC (permalink / raw)
  To: Liang, Kan, Jim Mattson
  Cc: Peter Zijlstra, Li, Xiaoyao, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm

On Tuesday, September 27, 2022 2:09 AM, Liang, Kan wrote:
> From my understanding of the host-guest mode, the host PT event never
> traces the guest no matter whether the guest enables PT. So when VM-entry,
> there is only a guest PT event or no event. If so, I think the perf tool should
> warn the user if they try to create a host event with !exclude_guest, since the
> host event never traces a guest.

Probably not from the perf side. It's actually an issue of the KVM's pt_mode:
How can KVM prevent host from profiling the guest when in host-guest mode?
Just a warning from the perf side wouldn't be enough. I think that's a wrong
assumption from KVM side. I had a plan to fix this one. Exposing the host event
can tell KVM if the host is profiling the guest or not (i.e. host-guest is allowed).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  2022-09-27 14:27                     ` Wang, Wei W
@ 2022-09-27 16:52                       ` Liang, Kan
  0 siblings, 0 replies; 19+ messages in thread
From: Liang, Kan @ 2022-09-27 16:52 UTC (permalink / raw)
  To: Wang, Wei W, Jim Mattson
  Cc: Peter Zijlstra, Li, Xiaoyao, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Christopherson,,
	Sean, Paolo Bonzini, linux-perf-users, linux-kernel, kvm



On 2022-09-27 10:27 a.m., Wang, Wei W wrote:
> On Tuesday, September 27, 2022 2:09 AM, Liang, Kan wrote:
>> From my understanding of the host-guest mode, the host PT event never
>> traces the guest no matter whether the guest enables PT. So when VM-entry,
>> there is only a guest PT event or no event. If so, I think the perf tool should
>> warn the user if they try to create a host event with !exclude_guest, since the
>> host event never traces a guest.
> 
> Probably not from the perf side. It's actually an issue of the KVM's pt_mode:
> How can KVM prevent host from profiling the guest when in host-guest mode?

I don't think it's KVM's job to prevent host from profiling the guest.
However, the current KVM implementation implicitly disables the guest
profiling from the host. If I understand correct, this patch series just
change it to an explicit way (Just for kernel. Perf tool still doesn't
know about it.). There is no substantial change.

I think it should be PT driver who decide which event should be
scheduled. So the first step is to let the PT driver understand the
host-guest mode. Then if a perf user in the host tries to profile with
!exclude_guest with the host-guest mode, the pt driver can deny the
request. The perf tool than throw a warning, e.g., "The VMM is running
with host-guest mode. Perf cannot profile guest. Please remove the guest
profiling setting or switch to the host-only mode.".

> Just a warning from the perf side wouldn't be enough. I think that's a wrong
> assumption from KVM side. I had a plan to fix this one. Exposing the host event
> can tell KVM if the host is profiling the guest or not (i.e. host-guest is allowed).

I don't think KVM should know such information.

Thanks,
Kan

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-09-27 16:54 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-21 16:45 [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Xiaoyao Li
2022-09-21 16:45 ` [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local() Xiaoyao Li
2022-09-22 12:16   ` Wang, Wei W
2022-09-21 16:45 ` [RFC PATCH v2 2/3] perf/x86/intel/pt: Introduce and export pt_get_curr_event() Xiaoyao Li
2022-09-22  5:14   ` Xiaoyao Li
2022-09-22 12:33   ` Liang, Kan
2022-09-22 12:58     ` Wang, Wei W
2022-09-22 13:34       ` Peter Zijlstra
2022-09-22 13:59         ` Wang, Wei W
2022-09-22 14:09           ` Peter Zijlstra
2022-09-22 14:42             ` Wang, Wei W
2022-09-26 15:48               ` Liang, Kan
2022-09-26 17:24                 ` Jim Mattson
2022-09-26 18:08                   ` Liang, Kan
2022-09-27 14:27                     ` Wang, Wei W
2022-09-27 16:52                       ` Liang, Kan
2022-09-26 15:32       ` Liang, Kan
2022-09-21 16:45 ` [RFC PATCH v2 3/3] KVM: VMX: Stop/resume host PT before/after VMX transition when PT_MODE_HOST_GUEST Xiaoyao Li
2022-09-22 12:34   ` Wang, Wei W

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).