* [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
@ 2022-04-11 10:19 Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 01/17] perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server Like Xu
                   ` (18 more replies)
  0 siblings, 19 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

Hi,

Because it gives more accurate profiling results, this feature still has
loyal followers, so here is another rebased version. PeterZ acked the V9
patchset [0] and Paolo asked for a new version [1], so please check the
changelog and feel free to review, test and comment.

[0] https://lore.kernel.org/kvm/YQF7lwM6qzYso0Gg@hirez.programming.kicks-ass.net/
[1] https://lore.kernel.org/kvm/95bf3dca-c6d1-02c8-40b6-8bb29a3a7a36@redhat.com/

---

The guest Precise Event Based Sampling (PEBS) feature provides the
architectural state of the instruction executed after the guest instruction
that exactly caused the event. It needs a new hardware facility that is only
available on Intel Ice Lake Server platforms. This patch set enables the basic
PEBS feature for KVM guests on ICX.

We can use the PEBS feature in a Linux guest just as on bare metal:

   # echo 0 > /proc/sys/kernel/watchdog (on the host)
   # perf record -e instructions:ppp ./br_instr a
   # perf record -c 100000 -e instructions:pp ./br_instr a

To emulate guest PEBS facility for the above perf usages,
we need to implement 2 code paths:

1) Fast path

This is when the host-assigned physical PMC has the same index as the
virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
This path is used in the most common use cases.

2) Slow path

This is when the host-assigned physical PMC has a different index from the
virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0). In this case,
KVM needs to rewrite the PEBS records so that the applicable counter indexes,
which would otherwise contain the physical counter index written by the PEBS
facility, become the virtual PMC indexes, and switch the counter reset values
to the offset corresponding to the physical counter indexes in the DS data
structure.
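
As a rough illustration of the distinction between the two paths, the sketch
below compares the index of a virtual PMC with the index of the physical PMC
that the host assigned to it. The struct and function names are hypothetical
and are not taken from this series; they only model the index comparison
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one virtual-to-physical counter assignment. */
struct pmc_mapping {
	int virt_idx;	/* counter index the guest programmed */
	int phys_idx;	/* counter index the host perf core assigned */
};

/* Fast path: the indexes match, so PEBS records need no rewriting. */
static bool pebs_fast_path(const struct pmc_mapping *m)
{
	return m->virt_idx == m->phys_idx;
}

/*
 * Return a bitmask of virtual counters that are cross-mapped, i.e. that
 * would need the slow path, which this series leaves disabled.
 */
static uint64_t cross_mapped_mask(const struct pmc_mapping *maps, int nr)
{
	uint64_t mask = 0;
	int i;

	for (i = 0; i < nr; i++) {
		assert(maps[i].virt_idx >= 0 && maps[i].virt_idx < 64);
		if (!pebs_fast_path(&maps[i]))
			mask |= 1ULL << maps[i].virt_idx;
	}
	return mask;
}
```

A non-zero mask here corresponds to the cross-mapped situation in which
guest PEBS cannot be served by the fast path alone.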

The previous version [3] enabled both the fast path and the slow path, which
seems too complex for a first step. In this patchset, we want to start with
the fast path to get basic guest PEBS enabled while keeping the slow path
disabled. A more focused discussion of the slow path [4] is planned for
another patchset in the next step.

Compared to later versions in subsequent steps, this patch set lacks support
for enabling host and guest PEBS at the same time and for emulating guest PEBS
when the counter is cross-mapped (neither of these is a typical scenario).

With the basic support, the guest can retrieve the correct PEBS information from
its own PEBS records on Ice Lake servers. We expect it to keep working after
migration to another Ice Lake host, and we expect no regression in host perf.

Here are the results of the PEBS test on the guest and the host for the same workload:

perf report on guest:
# Samples: 2K of event 'instructions:ppp'
# Event count (approx.): 1473377250
# Overhead  Command   Shared Object      Symbol
   57.74%  br_instr  br_instr           [.] lfsr_cond
   41.40%  br_instr  br_instr           [.] cmp_end
    0.21%  br_instr  [kernel.kallsyms]  [k] __lock_acquire

perf report on host:
# Samples: 2K of event 'instructions:ppp'
# Event count (approx.): 1462721386
# Overhead  Command   Shared Object     Symbol
   57.90%  br_instr  br_instr          [.] lfsr_cond
   41.95%  br_instr  br_instr          [.] cmp_end
    0.05%  br_instr  [kernel.vmlinux]  [k] lock_acquire

Conclusion: the profiling results on the guest are similar to those on the host.

The minimum guest kernel version may be v5.4, or an older kernel with a
backport that supports Ice Lake Server PEBS.

Please check more details in each commit and feel free to comment.

Previous:
https://lore.kernel.org/kvm/20220304090427.90888-1-likexu@tencent.com/

[3]
https://lore.kernel.org/kvm/20210104131542.495413-1-like.xu@linux.intel.com/
[4]
https://lore.kernel.org/kvm/20210115191113.nktlnmivc3edstiv@two.firstfloor.org/

V12->V12 RESEND:
- Rebased on the top of kvm/queue; (Paolo)
- Add PEBS msrs to the msrs_to_save_all[];
- Stay up to date on https://github.com/lkml-likexu/linux/tree/kvm-queue-pebs;

V11->V12:
- Apply the new perf interface from tip/perf/core and fix the merge conflict;
- Rename "x86_pmu.pebs_ept" to "x86_pmu.pebs_ept"; (Sean)
- Rebased on the top of kvm/queue (b13a3befc815); (Paolo)

V10->V11:
- Merge perf_guest_info_callbacks static_call to the tip/perf/core;
- Keep using perf_guest_cbs in the kvm/queue context before merge window;
- Fix MSR_IA32_MISC_ENABLE_EMON bit (Liu XiangDong);
- Rebase "Reprogram PEBS event to emulate guest PEBS counter" patch;

V9->V10:
- improve readability in core.c(Peter Z)
- reuse guest_pebs_idxs(Liu XiangDong)

V8 -> V9 Changelog:
- fix a bracket error in xen_guest_state()

V7 -> V8 Changelog:
- fix coding style, add {} for single statement of multiple lines(Peter Z)
- fix coding style in xen_guest_state() (Boris Ostrovsky)
- s/pmu/kvm_pmu/ in intel_guest_get_msrs() (Peter Z)
- put lower cost branch in the first place for x86_pmu_handle_guest_pebs() (Peter Z)

V6 -> V7 Changelog:
- Fix conditions order and call x86_pmu_handle_guest_pebs() unconditionally; (PeterZ)
- Add a new patch to make all that perf_guest_cbs stuff suck less; (PeterZ)
- Document that the IA32_MISC_ENABLE[7] behavior matches bare metal; (Sean & Venkatesh)
- Update commit message for fixed counter mask refactoring;(PeterZ)
- Clarifying comments about {.host and .guest} for intel_guest_get_msrs(); (PeterZ)
- Add pebs_capable to store valid PEBS_COUNTER_MASK value; (PeterZ)
- Add more comments for perf's precise_ip field; (Andi & PeterZ)
- Refactor perf_overflow_handler_t and make it more legible; (PeterZ)
- Use "(unsigned long)cpuc->ds" instead of __this_cpu_read(cpu_hw_events.ds); (PeterZ)
- Keep using "(struct kvm_pmu *)data" to follow K&R; (Andi)

Like Xu (16):
  perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
  perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
  perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
  KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
  KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
  KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
  KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
  KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
  KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
  KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
  KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
  KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
  KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
  KVM: x86/cpuid: Refactor host/guest CPU model consistency check
  KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64

Peter Zijlstra (Intel) (1):
  x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value

 arch/x86/events/core.c            |   5 +-
 arch/x86/events/intel/core.c      | 157 +++++++++++++++++++++++++-----
 arch/x86/events/perf_event.h      |   6 +-
 arch/x86/include/asm/kvm_host.h   |  16 +++
 arch/x86/include/asm/msr-index.h  |   6 ++
 arch/x86/include/asm/perf_event.h |   5 +-
 arch/x86/kvm/cpuid.c              |  27 ++---
 arch/x86/kvm/cpuid.h              |   5 +
 arch/x86/kvm/pmu.c                |  52 +++++++---
 arch/x86/kvm/pmu.h                |  38 ++++++++
 arch/x86/kvm/vmx/capabilities.h   |  28 +++---
 arch/x86/kvm/vmx/pmu_intel.c      | 118 ++++++++++++++++++----
 arch/x86/kvm/vmx/vmx.c            |  24 ++++-
 arch/x86/kvm/vmx/vmx.h            |   2 +-
 arch/x86/kvm/x86.c                |  31 ++++--
 15 files changed, 412 insertions(+), 108 deletions(-)

-- 
2.35.1



* [PATCH RESEND v12 01/17] perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 02/17] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest Like Xu
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

Add support for EPT-Friendly PEBS, a new CPU feature that enlightens PEBS
to translate guest linear addresses through EPT, and facilitates handling
VM-Exits that occur when accessing PEBS records.  More information can
be found in the December 2021 release of Intel's SDM, Volume 3,
18.9.5 "EPT-Friendly PEBS". This new hardware facility, which is available
on Intel Ice Lake Server platforms (and later), makes sure the guest PEBS
records will not be lost.

KVM will check this field through perf_get_x86_pmu_capability() instead
of hard coding the CPU models in the KVM code. If it is supported, the
guest PEBS capability will be exposed to the guest. Guest PEBS can be
enabled when and only when "EPT-Friendly PEBS" is supported and
EPT is enabled.

Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/events/core.c            | 1 +
 arch/x86/events/intel/core.c      | 1 +
 arch/x86/events/perf_event.h      | 3 ++-
 arch/x86/include/asm/perf_event.h | 1 +
 4 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index eef816fc216d..adb6d9d3cd4d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2994,5 +2994,6 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	cap->bit_width_fixed	= x86_pmu.cntval_bits;
 	cap->events_mask	= (unsigned int)x86_pmu.events_maskl;
 	cap->events_mask_len	= x86_pmu.events_mask_len;
+	cap->pebs_ept		= x86_pmu.pebs_ept;
 }
 EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e88791b420ee..0988ff3e18fb 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6134,6 +6134,7 @@ __init int intel_pmu_init(void)
 
 	case INTEL_FAM6_ICELAKE_X:
 	case INTEL_FAM6_ICELAKE_D:
+		x86_pmu.pebs_ept = 1;
 		pmem = true;
 		fallthrough;
 	case INTEL_FAM6_ICELAKE_L:
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 150261d929b9..0998742760c8 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -815,7 +815,8 @@ struct x86_pmu {
 			pebs_prec_dist		:1,
 			pebs_no_tlb		:1,
 			pebs_no_isolation	:1,
-			pebs_block		:1;
+			pebs_block		:1,
+			pebs_ept		:1;
 	int		pebs_record_size;
 	int		pebs_buffer_size;
 	int		max_pebs_events;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 58d9e4b1fa0a..44c9a4c20c06 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -192,6 +192,7 @@ struct x86_pmu_capability {
 	int		bit_width_fixed;
 	unsigned int	events_mask;
 	int		events_mask_len;
+	unsigned int	pebs_ept	:1;
 };
 
 /*
-- 
2.35.1



* [PATCH RESEND v12 02/17] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 01/17] perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 03/17] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values Like Xu
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

With PEBS virtualization, the guest PEBS records get delivered to the
guest DS, and the host pmi handler uses perf_guest_cbs->is_in_guest()
to distinguish whether the PMI comes from the guest code like Intel PT.

No matter how many guest PEBS counters have overflowed, triggering a single
fake event is enough. The fake event causes the KVM PMI callback to
be called, thereby injecting the PEBS overflow PMI into the guest.

KVM may inject the PMI with BUFFER_OVF set even if the guest DS is
empty. That should be harmless, because the guest PEBS handler
retrieves the correct information from its own PEBS records buffer.
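
The selection of the single fake event can be modelled outside the kernel
roughly as follows. This is only a sketch: precise_mask is a hypothetical
stand-in for checking event->attr.precise_ip on each counter, and nr_bits
stands in for the loop bound used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Counters with PEBS enabled that are not masked as host-only are treated
 * as guest PEBS counters, mirroring the guest_pebs_idxs computation.
 */
static uint64_t guest_pebs_idxs(uint64_t pebs_enabled, uint64_t host_mask)
{
	return pebs_enabled & ~host_mask;
}

/*
 * Return the index of the one counter used for the fake event, or -1 if
 * no guest PEBS counter qualifies. The scan stops at the first match
 * because a single fake event is enough to get the PMI injected.
 */
static int pick_fake_event(uint64_t idxs, uint64_t precise_mask, int nr_bits)
{
	int bit;

	assert(nr_bits >= 0 && nr_bits <= 64);
	for (bit = 0; bit < nr_bits; bit++) {
		if (!(idxs & (1ULL << bit)))
			continue;
		if (!(precise_mask & (1ULL << bit)))
			continue;
		return bit;
	}
	return -1;
}
```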

Cc: linux-perf-users@vger.kernel.org
Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/events/intel/core.c | 42 ++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 0988ff3e18fb..510fc2de4cd2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2852,6 +2852,47 @@ static void intel_pmu_reset(void)
 	local_irq_restore(flags);
 }
 
+/*
+ * We may be running with guest PEBS events created by KVM, and the
+ * PEBS records are logged into the guest's DS and invisible to host.
+ *
+ * In the case of guest PEBS overflow, we only trigger a fake event
+ * to emulate the PEBS overflow PMI for guest PEBS counters in KVM.
+ * The guest will then vm-entry and check the guest DS area to read
+ * the guest PEBS records.
+ *
+ * The contents and other behavior of the guest event do not matter.
+ */
+static void x86_pmu_handle_guest_pebs(struct pt_regs *regs,
+				      struct perf_sample_data *data)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	u64 guest_pebs_idxs = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+	struct perf_event *event = NULL;
+	int bit;
+
+	if (!unlikely(perf_guest_state()))
+		return;
+
+	if (!x86_pmu.pebs_ept || !x86_pmu.pebs_active ||
+	    !guest_pebs_idxs)
+		return;
+
+	for_each_set_bit(bit, (unsigned long *)&guest_pebs_idxs,
+			 INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+		event = cpuc->events[bit];
+		if (!event->attr.precise_ip)
+			continue;
+
+		perf_sample_data_init(data, 0, event->hw.last_period);
+		if (perf_event_overflow(event, data, regs))
+			x86_pmu_stop(event, 0);
+
+		/* Inject one fake event is enough. */
+		break;
+	}
+}
+
 static int handle_pmi_common(struct pt_regs *regs, u64 status)
 {
 	struct perf_sample_data data;
@@ -2903,6 +2944,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		u64 pebs_enabled = cpuc->pebs_enabled;
 
 		handled++;
+		x86_pmu_handle_guest_pebs(regs, &data);
 		x86_pmu.drain_pebs(regs, &data);
 		status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
 
-- 
2.35.1



* [PATCH RESEND v12 03/17] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 01/17] perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 02/17] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 04/17] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled Like Xu
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

Splitting the logic for determining the guest values is unnecessarily
confusing, and potentially fragile. Perf should have full knowledge and
control of what values are loaded for the guest.

If we change .guest_get_msrs() to take a struct kvm_pmu pointer, then it
can generate the full set of guest values by grabbing guest ds_area and
pebs_data_cfg. Alternatively, .guest_get_msrs() could take the desired
guest MSR values directly (ds_area and pebs_data_cfg), but kvm_pmu is
vendor agnostic, so we don't see any reason to not just pass the pointer.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/core.c            | 4 ++--
 arch/x86/events/intel/core.c      | 4 ++--
 arch/x86/events/perf_event.h      | 2 +-
 arch/x86/include/asm/perf_event.h | 4 ++--
 arch/x86/kvm/vmx/vmx.c            | 3 ++-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index adb6d9d3cd4d..7f1d10dbabc0 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -693,9 +693,9 @@ void x86_pmu_disable_all(void)
 	}
 }
 
-struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
+struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data)
 {
-	return static_call(x86_pmu_guest_get_msrs)(nr);
+	return static_call(x86_pmu_guest_get_msrs)(nr, data);
 }
 EXPORT_SYMBOL_GPL(perf_guest_get_msrs);
 
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 510fc2de4cd2..039e4e7cbc2f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3972,7 +3972,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	return 0;
 }
 
-static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
+static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
@@ -4005,7 +4005,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 	return arr;
 }
 
-static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr)
+static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr, void *data)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 0998742760c8..bf23cbe4f6cf 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -900,7 +900,7 @@ struct x86_pmu {
 	/*
 	 * Intel host/guest support (KVM)
 	 */
-	struct perf_guest_switch_msr *(*guest_get_msrs)(int *nr);
+	struct perf_guest_switch_msr *(*guest_get_msrs)(int *nr, void *data);
 
 	/*
 	 * Check period value for PERF_EVENT_IOC_PERIOD ioctl.
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 44c9a4c20c06..dc295b8c8def 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -492,10 +492,10 @@ static inline void perf_check_microcode(void) { }
 #endif
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
-extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
 extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
 #else
-struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
 static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 {
 	return -1;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cf8581978bce..ff28a3992427 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6696,9 +6696,10 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 {
 	int i, nr_msrs;
 	struct perf_guest_switch_msr *msrs;
+	struct kvm_pmu *pmu = vcpu_to_pmu(&vmx->vcpu);
 
 	/* Note, nr_msrs may be garbage if perf_guest_get_msrs() returns NULL. */
-	msrs = perf_guest_get_msrs(&nr_msrs);
+	msrs = perf_guest_get_msrs(&nr_msrs, (void *)pmu);
 	if (!msrs)
 		return;
 
-- 
2.35.1



* [PATCH RESEND v12 04/17] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (2 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 03/17] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 05/17] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter Like Xu
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

On Intel platforms, software can use the IA32_MISC_ENABLE[7] bit to
detect whether the processor supports the performance monitoring facility.

Whether the bit is set depends on whether the PMU is enabled for the guest,
and a software write to this bit is ignored. Ignoring the toggle in KVM is
the way to go, and that behavior matches bare metal.
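
The "write is ignored" behavior amounts to carrying the old value of the
PMU-owned bit across the write. A minimal user-space sketch of that masking
is shown below; MISC_ENABLE_EMON and misc_enable_write() are illustrative
names, with bit 7 taken from the IA32_MISC_ENABLE[7] reference above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 7 of IA32_MISC_ENABLE: performance monitoring available. */
#define MISC_ENABLE_EMON	(1ULL << 7)

/* Return the value that would be stored after a write of 'data'. */
static uint64_t misc_enable_write(uint64_t old_val, uint64_t data)
{
	const uint64_t pmu_mask = MISC_ENABLE_EMON;

	data &= ~pmu_mask;		/* drop whatever the writer asked for */
	data |= old_val & pmu_mask;	/* keep the previously stored bit */

	/* The PMU-owned bit must be unchanged by the write. */
	assert((data & pmu_mask) == (old_val & pmu_mask));
	return data;
}
```

All other bits pass through untouched, so only the PMU bit is insensitive
to the order in which user space sets up the vPMU and this MSR.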

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/vmx/pmu_intel.c |  1 +
 arch/x86/kvm/x86.c           | 15 +++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 9db662399487..e101406dafa3 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -502,6 +502,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	if (!pmu->version)
 		return;
 
+	vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
 	perf_get_x86_pmu_capability(&x86_pmu);
 
 	pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab336f7c82e4..4b64b3ff5b67 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3550,9 +3550,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			vcpu->arch.ia32_tsc_adjust_msr = data;
 		}
 		break;
-	case MSR_IA32_MISC_ENABLE:
+	case MSR_IA32_MISC_ENABLE: {
+		u64 old_val = vcpu->arch.ia32_misc_enable_msr;
+		u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON;
+
+		/*
+		 * For a dummy user space, the order of setting vPMU capabilities and
+		 * initialising MSR_IA32_MISC_ENABLE is not strictly guaranteed, so to
+		 * avoid inconsistent functionality we keep the vPMU bits unchanged here.
+		 */
+		data &= ~pmu_mask;
+		data |= old_val & pmu_mask;
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
-		    ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
+		    ((old_val ^ data)  & MSR_IA32_MISC_ENABLE_MWAIT)) {
 			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
 				return 1;
 			vcpu->arch.ia32_misc_enable_msr = data;
@@ -3561,6 +3571,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			vcpu->arch.ia32_misc_enable_msr = data;
 		}
 		break;
+	}
 	case MSR_IA32_SMBASE:
 		if (!msr_info->host_initiated)
 			return 1;
-- 
2.35.1



* [PATCH RESEND v12 05/17] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (3 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 04/17] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 06/17] x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value Like Xu
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

The mask value of the fixed counter control register should be dynamically
adjusted to the number of fixed counters. This patch introduces a
variable that holds the reserved bits of the fixed counter control
register. This is a generic code refactoring.
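
As a sanity check of the refactoring, the mask can be computed in user space
with the same loop. The sketch below assumes one 4-bit control field per
fixed counter, with 0xb covering the bits a guest may set in each field; the
function name is chosen here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the reserved-bit mask for a given number of fixed counters. */
static uint64_t fixed_ctr_ctrl_mask(unsigned int nr_fixed)
{
	uint64_t mask = ~0ULL;	/* start with every bit reserved */
	unsigned int i;

	assert(nr_fixed <= 16);	/* 16 fields of 4 bits fill 64 bits */
	for (i = 0; i < nr_fixed; i++)
		mask &= ~(0xbULL << (i * 4));
	return mask;
}
```

With three fixed counters this yields 0xfffffffffffff444, the constant that
was previously hard coded in intel_pmu_set_msr().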

Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c20f715f009..6e3eeadfe8e3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -502,6 +502,7 @@ struct kvm_pmu {
 	unsigned nr_arch_fixed_counters;
 	unsigned available_event_types;
 	u64 fixed_ctr_ctrl;
+	u64 fixed_ctr_ctrl_mask;
 	u64 global_ctrl;
 	u64 global_status;
 	u64 counter_bitmask[2];
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e101406dafa3..2aeabb067bad 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -395,7 +395,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_FIXED_CTR_CTRL:
 		if (pmu->fixed_ctr_ctrl == data)
 			return 0;
-		if (!(data & 0xfffffffffffff444ull)) {
+		if (!(data & pmu->fixed_ctr_ctrl_mask)) {
 			reprogram_fixed_counters(pmu, data);
 			return 0;
 		}
@@ -483,6 +483,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	struct kvm_cpuid_entry2 *entry;
 	union cpuid10_eax eax;
 	union cpuid10_edx edx;
+	int i;
 
 	pmu->nr_arch_gp_counters = 0;
 	pmu->nr_arch_fixed_counters = 0;
@@ -491,6 +492,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->version = 0;
 	pmu->reserved_bits = 0xffffffff00200000ull;
 	pmu->raw_event_mask = X86_RAW_EVENT_MASK;
+	pmu->fixed_ctr_ctrl_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry || !vcpu->kvm->arch.enable_pmu)
@@ -527,6 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		setup_fixed_pmc_eventsel(pmu);
 	}
 
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+		pmu->fixed_ctr_ctrl_mask &= ~(0xbull << (i * 4));
 	pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
 		(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
 	pmu->global_ctrl_mask = ~pmu->global_ctrl;
-- 
2.35.1



* [PATCH RESEND v12 06/17] x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (4 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 05/17] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 07/17] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS Like Xu
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: "Peter Zijlstra (Intel)" <peterz@infradead.org>

The valid PEBS counter mask is needed repeatedly, for example in
intel_guest_get_msrs(). Store it once in x86_pmu.pebs_capable
instead of endlessly mucking about with branches.
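
The equivalence between the branchy form and the stored mask can be checked
with a small user-space sketch. SKETCH_PEBS_COUNTER_MASK is an assumed value
(the low eight counter bits) used only for illustration, and the helper
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: PEBS is limited to the low 8 counters. */
#define SKETCH_PEBS_COUNTER_MASK	0xffULL

/* Old form: branch on whether all counters are PEBS capable. */
static uint64_t clear_pebs_old(uint64_t status, uint64_t pebs_enabled,
			       bool pebs_all)
{
	if (pebs_all)
		status &= ~pebs_enabled;
	else
		status &= ~(pebs_enabled & SKETCH_PEBS_COUNTER_MASK);
	return status;
}

/* New form: a single mask chosen once at init time. */
static uint64_t clear_pebs_new(uint64_t status, uint64_t pebs_enabled,
			       uint64_t pebs_capable)
{
	return status & ~(pebs_enabled & pebs_capable);
}

/* The mask that init code would store for each kind of PMU. */
static uint64_t pebs_capable_for(bool pebs_all)
{
	return pebs_all ? ~0ULL : SKETCH_PEBS_COUNTER_MASK;
}

/* Both forms must agree for any status and pebs_enabled value. */
static void check_equivalent(uint64_t status, uint64_t pebs_enabled,
			     bool pebs_all)
{
	assert(clear_pebs_old(status, pebs_enabled, pebs_all) ==
	       clear_pebs_new(status, pebs_enabled,
			      pebs_capable_for(pebs_all)));
}
```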

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/core.c | 14 ++++++--------
 arch/x86/events/perf_event.h |  1 +
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 039e4e7cbc2f..4a3619ed66d1 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2932,10 +2932,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 	 * counters from the GLOBAL_STATUS mask and we always process PEBS
 	 * events via drain_pebs().
 	 */
-	if (x86_pmu.flags & PMU_FL_PEBS_ALL)
-		status &= ~cpuc->pebs_enabled;
-	else
-		status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
+	status &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
 
 	/*
 	 * PEBS overflow sets bit 62 in the global status register
@@ -3981,10 +3978,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
 	arr[0].host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
 	arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
-	if (x86_pmu.flags & PMU_FL_PEBS_ALL)
-		arr[0].guest &= ~cpuc->pebs_enabled;
-	else
-		arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
+	arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
 	*nr = 1;
 
 	if (x86_pmu.pebs && x86_pmu.pebs_no_isolation) {
@@ -5688,6 +5682,7 @@ __init int intel_pmu_init(void)
 	x86_pmu.events_mask_len		= eax.split.mask_length;
 
 	x86_pmu.max_pebs_events		= min_t(unsigned, MAX_PEBS_EVENTS, x86_pmu.num_counters);
+	x86_pmu.pebs_capable		= PEBS_COUNTER_MASK;
 
 	/*
 	 * Quirk: v2 perfmon does not report fixed-purpose events, so
@@ -5872,6 +5867,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.pebs_aliases = NULL;
 		x86_pmu.pebs_prec_dist = true;
 		x86_pmu.lbr_pt_coexist = true;
+		x86_pmu.pebs_capable = ~0ULL;
 		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
 		x86_pmu.flags |= PMU_FL_PEBS_ALL;
 		x86_pmu.get_event_constraints = glp_get_event_constraints;
@@ -6229,6 +6225,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.pebs_aliases = NULL;
 		x86_pmu.pebs_prec_dist = true;
 		x86_pmu.pebs_block = true;
+		x86_pmu.pebs_capable = ~0ULL;
 		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
 		x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
 		x86_pmu.flags |= PMU_FL_PEBS_ALL;
@@ -6271,6 +6268,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.pebs_aliases = NULL;
 		x86_pmu.pebs_prec_dist = true;
 		x86_pmu.pebs_block = true;
+		x86_pmu.pebs_capable = ~0ULL;
 		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
 		x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
 		x86_pmu.flags |= PMU_FL_PEBS_ALL;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index bf23cbe4f6cf..9e1bef9c2b0c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -825,6 +825,7 @@ struct x86_pmu {
 	void		(*pebs_aliases)(struct perf_event *event);
 	unsigned long	large_pebs_flags;
 	u64		rtm_abort_event;
+	u64		pebs_capable;
 
 	/*
 	 * Intel LBR
-- 
2.35.1



* [PATCH RESEND v12 07/17] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (5 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 06/17] x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 08/17] KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter Like Xu
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the
IA32_PEBS_ENABLE MSR exists and all architecturally enumerated fixed
and general-purpose counters have corresponding bits in IA32_PEBS_ENABLE
that enable generation of PEBS records. The general-purpose counter bits
start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at
bit IA32_PEBS_ENABLE[32].
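
As an illustration of the bit layout described above, here is a minimal
standalone sketch. The helper names and the PEBS_ENABLE_FIXED_BASE macro are
hypothetical and are not part of this patch; only the bit positions (0 for
general-purpose counters, 32 for fixed counters) come from the text.

```c
#include <stdint.h>

/* Assumed layout: GP counter bits start at 0, fixed counter bits at 32. */
#define PEBS_ENABLE_FIXED_BASE 32

/* Hypothetical helper: IA32_PEBS_ENABLE bit for general-purpose counter n. */
static inline uint64_t pebs_enable_gp_bit(unsigned int n)
{
	return 1ULL << n;
}

/* Hypothetical helper: IA32_PEBS_ENABLE bit for fixed counter n. */
static inline uint64_t pebs_enable_fixed_bit(unsigned int n)
{
	return 1ULL << (PEBS_ENABLE_FIXED_BASE + n);
}

/*
 * Hypothetical helper: all enable bits that exist for nr_gp general-purpose
 * and nr_fixed fixed counters (both assumed to be at most 32).
 */
static inline uint64_t pebs_enable_valid_bits(unsigned int nr_gp,
					      unsigned int nr_fixed)
{
	uint64_t gp = (1ULL << nr_gp) - 1;
	uint64_t fixed = ((1ULL << nr_fixed) - 1) << PEBS_ENABLE_FIXED_BASE;

	return gp | fixed;
}
```

A reserved-bits mask for the emulated MSR would then be the complement of
pebs_enable_valid_bits().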

When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR is added to the
perf_guest_switch_msr array and switched atomically during VMX
transitions, just like the CORE_PERF_GLOBAL_CTRL MSR.

This patch also refactors how MSRs are added to arr[] in
intel_guest_get_msrs(), keyed on whether the platform supports
x86_pmu.pebs_ept, so that more MSRs can be appended later.
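
The append pattern used by the refactoring can be sketched outside the
kernel as follows. This is a simplified model under stated assumptions:
struct switch_msr stands in for struct perf_guest_switch_msr, and
fill_switch_msrs() with its ctrl/pebs/has_pebs parameters is hypothetical.
The MSR addresses are the ones listed in the Intel SDM.

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's struct perf_guest_switch_msr. */
struct switch_msr {
	uint32_t msr;
	uint64_t host;
	uint64_t guest;
};

#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f
#define MSR_IA32_PEBS_ENABLE      0x3f1

/*
 * Hypothetical sketch: each entry claims its slot with (*nr)++ and keeps
 * the returned index, so later fix-ups never depend on the entry order.
 */
static struct switch_msr *fill_switch_msrs(struct switch_msr *arr, int *nr,
					   uint64_t ctrl, uint64_t pebs,
					   int has_pebs)
{
	int global_ctrl, pebs_enable;

	*nr = 0;
	global_ctrl = (*nr)++;
	arr[global_ctrl] = (struct switch_msr){
		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
		.host = ctrl,
		.guest = ctrl & ~pebs,	/* PEBS counters off by default */
	};

	if (!has_pebs)
		return arr;

	pebs_enable = (*nr)++;
	arr[pebs_enable] = (struct switch_msr){
		.msr = MSR_IA32_PEBS_ENABLE,
		.host = 0,
		.guest = pebs,
	};

	/* Re-enable the counters that the guest uses for PEBS. */
	arr[global_ctrl].guest |= arr[pebs_enable].guest;

	return arr;
}
```

Indexing through the saved global_ctrl and pebs_enable values, rather than
through *nr (which already points one past the last entry), keeps the final
fix-up correct when more entries are appended in between.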

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/core.c     | 75 ++++++++++++++++++++++++--------
 arch/x86/include/asm/kvm_host.h  |  3 ++
 arch/x86/include/asm/msr-index.h |  6 +++
 arch/x86/kvm/vmx/pmu_intel.c     | 31 +++++++++++++
 arch/x86/kvm/x86.c               |  1 +
 5 files changed, 98 insertions(+), 18 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4a3619ed66d1..270356df4add 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3969,33 +3969,72 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	return 0;
 }
 
+/*
+ * Currently, the only caller of this function is atomic_switch_perf_msrs().
+ * The host perf context helps to prepare the values of the real hardware for
+ * a set of msrs that need to be switched atomically in a vmx transition.
+ *
+ * For example, the pseudocode needed to add a new msr should look like:
+ *
+ * arr[(*nr)++] = (struct perf_guest_switch_msr){
+ *	.msr = the hardware msr address,
+ *	.host = the value the hardware has when it doesn't run a guest,
+ *	.guest = the value the hardware has when it runs a guest,
+ * };
+ *
+ * These values have nothing to do with the emulated values the guest sees
+ * when it uses {RD,WR}MSR, which should be handled by the KVM context,
+ * specifically in the intel_pmu_{get,set}_msr().
+ */
 static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
 	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
+	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
+	int global_ctrl, pebs_enable;
 
-	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
-	arr[0].host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
-	arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
-	arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
-	*nr = 1;
+	*nr = 0;
+	global_ctrl = (*nr)++;
+	arr[global_ctrl] = (struct perf_guest_switch_msr){
+		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
+		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
+		.guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
+	};
 
-	if (x86_pmu.pebs && x86_pmu.pebs_no_isolation) {
-		/*
-		 * If PMU counter has PEBS enabled it is not enough to
-		 * disable counter on a guest entry since PEBS memory
-		 * write can overshoot guest entry and corrupt guest
-		 * memory. Disabling PEBS solves the problem.
-		 *
-		 * Don't do this if the CPU already enforces it.
-		 */
-		arr[1].msr = MSR_IA32_PEBS_ENABLE;
-		arr[1].host = cpuc->pebs_enabled;
-		arr[1].guest = 0;
-		*nr = 2;
+	if (!x86_pmu.pebs)
+		return arr;
+
+	/*
+	 * If PMU counter has PEBS enabled it is not enough to
+	 * disable counter on a guest entry since PEBS memory
+	 * write can overshoot guest entry and corrupt guest
+	 * memory. Disabling PEBS solves the problem.
+	 *
+	 * Don't do this if the CPU already enforces it.
+	 */
+	if (x86_pmu.pebs_no_isolation) {
+		arr[(*nr)++] = (struct perf_guest_switch_msr){
+			.msr = MSR_IA32_PEBS_ENABLE,
+			.host = cpuc->pebs_enabled,
+			.guest = 0,
+		};
+		return arr;
 	}
 
+	if (!x86_pmu.pebs_ept)
+		return arr;
+	pebs_enable = (*nr)++;
+
+	arr[pebs_enable] = (struct perf_guest_switch_msr){
+		.msr = MSR_IA32_PEBS_ENABLE,
+		.host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
+		.guest = pebs_mask & ~cpuc->intel_ctrl_host_mask,
+	};
+
+	/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
+	arr[global_ctrl].guest |= arr[pebs_enable].guest;
+
 	return arr;
 }
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6e3eeadfe8e3..be65c6527a8b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -518,6 +518,9 @@ struct kvm_pmu {
 	DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
+	u64 pebs_enable;
+	u64 pebs_enable_mask;
+
 	/*
 	 * The gate to release perf_events not marked in
 	 * pmc_in_use only once in a vcpu time slice.
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 0eb90d21049e..c72770942413 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -189,6 +189,12 @@
 #define PERF_CAP_PT_IDX			16
 
 #define MSR_PEBS_LD_LAT_THRESHOLD	0x000003f6
+#define PERF_CAP_PEBS_TRAP             BIT_ULL(6)
+#define PERF_CAP_ARCH_REG              BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT           0xf00
+#define PERF_CAP_PEBS_BASELINE         BIT_ULL(14)
+#define PERF_CAP_PEBS_MASK	(PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+				 PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
 
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 2aeabb067bad..c7de5bc985c2 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -214,6 +214,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		ret = pmu->version > 1;
 		break;
+	case MSR_IA32_PEBS_ENABLE:
+		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -361,6 +364,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		msr_info->data = 0;
 		return 0;
+	case MSR_IA32_PEBS_ENABLE:
+		msr_info->data = pmu->pebs_enable;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -421,6 +427,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_IA32_PEBS_ENABLE:
+		if (pmu->pebs_enable == data)
+			return 0;
+		if (!(data & pmu->pebs_enable_mask)) {
+			pmu->pebs_enable = data;
+			return 0;
+		}
+		break;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -493,6 +507,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->reserved_bits = 0xffffffff00200000ull;
 	pmu->raw_event_mask = X86_RAW_EVENT_MASK;
 	pmu->fixed_ctr_ctrl_mask = ~0ull;
+	pmu->pebs_enable_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry || !vcpu->kvm->arch.enable_pmu)
@@ -564,6 +579,22 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 
 	if (lbr_desc->records.nr)
 		bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_IDX_FIXED_VLBR, 1);
+
+	if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) {
+		if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE) {
+			pmu->pebs_enable_mask = ~pmu->global_ctrl;
+			pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
+			for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+				pmu->fixed_ctr_ctrl_mask &=
+					~(1ULL << (INTEL_PMC_IDX_FIXED + i * 4));
+			}
+		} else {
+			pmu->pebs_enable_mask =
+				~((1ull << pmu->nr_arch_gp_counters) - 1);
+		}
+	} else {
+		vcpu->arch.perf_capabilities &= ~PERF_CAP_PEBS_MASK;
+	}
 }
 
 static void intel_pmu_init(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4b64b3ff5b67..14902288cbb8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1443,6 +1443,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+	MSR_IA32_PEBS_ENABLE,
 
 	MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
 	MSR_K7_PERFCTR0, MSR_K7_PERFCTR1, MSR_K7_PERFCTR2, MSR_K7_PERFCTR3,
-- 
2.35.1



* [PATCH RESEND v12 08/17] KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (6 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 07/17] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

When a guest counter is configured as a PEBS counter through
IA32_PEBS_ENABLE, a guest PEBS event will be reprogrammed by
configuring a non-zero precision level in the perf_event_attr.

The guest PEBS overflow PMI bit is set in the guest GLOBAL_STATUS MSR
when the PEBS facility generates a PEBS overflow PMI based on the
guest IA32_DS_AREA MSR.

Even with the same counter index and the same event code and
mask, guest PEBS events will not be reused for non-PEBS events.
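
The overflow bookkeeping described above can be modelled in a few lines.
This is a sketch, not the kernel code: record_overflow() and its parameters
are hypothetical, and it assumes that bit 62 of GLOBAL_STATUS is the DS
buffer overflow bit, as documented in the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: bit 62 of GLOBAL_STATUS reports a DS/PEBS buffer overflow. */
#define GLOBAL_STATUS_BUFFER_OVF_BIT 62

/*
 * Hypothetical helper: record an overflow in *global_status and return
 * true when a PMI should be delivered to the guest.
 *
 * A PEBS counter sets the shared buffer-overflow bit instead of its own
 * index bit. If that bit was already pending, no second PMI is raised.
 */
static bool record_overflow(uint64_t *global_status, unsigned int idx,
			    bool is_pebs, bool intr)
{
	bool skip_pmi = false;

	if (is_pebs) {
		uint64_t ovf = 1ULL << GLOBAL_STATUS_BUFFER_OVF_BIT;

		skip_pmi = (*global_status & ovf) != 0;
		*global_status |= ovf;
	} else {
		*global_status |= 1ULL << idx;
	}

	return intr && !skip_pmi;
}
```

The skip_pmi check mirrors the test-and-set in the patch: several PEBS
counters can overflow before the guest handles the first PMI, and one
pending buffer-overflow bit covers all of them.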

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/pmu.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 618f529f1c4d..36487088f72c 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -86,15 +86,22 @@ static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+	bool skip_pmi = false;
 
 	/* Ignore counters that have been reprogrammed already. */
 	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
 		return;
 
-	__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
+		/* Indicate PEBS overflow PMI to guest. */
+		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+					      (unsigned long *)&pmu->global_status);
+	} else {
+		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+	}
 	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 
-	if (!pmc->intr)
+	if (!pmc->intr || skip_pmi)
 		return;
 
 	/*
@@ -124,6 +131,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 				  u64 config, bool exclude_user,
 				  bool exclude_kernel, bool intr)
 {
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	struct perf_event *event;
 	struct perf_event_attr attr = {
 		.type = type,
@@ -135,6 +143,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		.exclude_kernel = exclude_kernel,
 		.config = config,
 	};
+	bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
 
 	if (type == PERF_TYPE_HARDWARE && config >= PERF_COUNT_HW_MAX)
 		return;
@@ -150,6 +159,23 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		 */
 		attr.sample_period = 0;
 	}
+	if (pebs) {
+		/*
+		 * A non-zero precision level turns an ordinary guest event into
+		 * a guest PEBS event and lets the host PEBS PMI handler
+		 * determine whether the PEBS overflow PMI comes from the host
+		 * counters or the guest.
+		 *
+		 * For most PEBS hardware events, the difference in the software
+		 * precision levels of guest and host PEBS events will not affect
+		 * the accuracy of the PEBS profiling result, because the "event IP"
+		 * in the PEBS record is calibrated on the guest side.
+		 *
+		 * On Icelake everything is fine. Other hardware (GLC+, TNT+) that
+		 * could possibly care here is unsupported and needs changes.
+		 */
+		attr.precise_ip = 1;
+	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
 						 kvm_perf_overflow, pmc);
@@ -163,7 +189,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 	pmc_to_pmu(pmc)->event_count++;
 	clear_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi);
 	pmc->is_paused = false;
-	pmc->intr = intr;
+	pmc->intr = intr || pebs;
 }
 
 static void pmc_pause_counter(struct kvm_pmc *pmc)
@@ -189,6 +215,10 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 			      get_sample_period(pmc, pmc->counter)))
 		return false;
 
+	if (!test_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->pebs_enable) &&
+	    pmc->perf_event->attr.precise_ip)
+		return false;
+
 	/* reuse perf_event to serve as pmc_reprogram_counter() does*/
 	perf_event_enable(pmc->perf_event);
 	pmc->is_paused = false;
-- 
2.35.1



* [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (7 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 08/17] KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-05-13  8:57   ` Like Xu
  2022-05-13  9:26   ` [PATCH v13 " Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 10/17] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS Like Xu
                   ` (9 subsequent siblings)
  18 siblings, 2 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

The PEBS-PDIR facility on Ice Lake server is supported on IA32_FIXED_CTR0
only. If the guest configures counter 32 (the index of fixed counter 0) and
PEBS is enabled, the PEBS-PDIR facility is supposed to be used, in which case
KVM adjusts attr.precise_ip to 3 and requests host perf to assign exactly the
requested counter or fail.

The CPU model check is also required since some platforms may place the
PEBS-PDIR facility in another counter index.
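
The selection rule can be summarised by the following sketch. The function
guest_precise_ip() and the is_icl_server flag are hypothetical; in the patch
the model check is done with x86_match_cpu() against a CPU ID table.

```c
#include <stdbool.h>

/* Index of fixed counter 0 in the KVM PMC index space. */
#define FIXED_CTR0_IDX 32

/*
 * Hypothetical helper: choose the perf precise_ip level for a guest counter.
 *   0 - not a PEBS event
 *   1 - ordinary guest PEBS event
 *   3 - PDIR event, only on fixed counter 0 of Ice Lake server parts
 */
static int guest_precise_ip(bool pebs, bool is_icl_server, unsigned int idx)
{
	if (!pebs)
		return 0;
	if (is_icl_server && idx == FIXED_CTR0_IDX)
		return 3;
	return 1;
}
```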

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/core.c | 2 +-
 arch/x86/kvm/pmu.c           | 2 ++
 arch/x86/kvm/pmu.h           | 8 ++++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 270356df4add..45f3bab5d423 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4024,8 +4024,8 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 
 	if (!x86_pmu.pebs_ept)
 		return arr;
-	pebs_enable = (*nr)++;
 
+	pebs_enable = (*nr)++;
 	arr[pebs_enable] = (struct perf_guest_switch_msr){
 		.msr = MSR_IA32_PEBS_ENABLE,
 		.host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask,
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 36487088f72c..c1312cd32237 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -175,6 +175,8 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		 * could possibly care here is unsupported and needs changes.
 		 */
 		attr.precise_ip = 1;
+		if (x86_match_cpu(vmx_icl_pebs_cpu) && pmc->idx == 32)
+			attr.precise_ip = 3;
 	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 2a53b6c9495c..06e750824da1 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -4,6 +4,8 @@
 
 #include <linux/nospec.h>
 
+#include <asm/cpu_device_id.h>
+
 #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
 #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
 #define pmc_to_pmu(pmc)   (&(pmc)->vcpu->arch.pmu)
@@ -15,6 +17,12 @@
 #define VMWARE_BACKDOOR_PMC_REAL_TIME		0x10001
 #define VMWARE_BACKDOOR_PMC_APPARENT_TIME	0x10002
 
+static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+	{}
+};
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
-- 
2.35.1



* [PATCH RESEND v12 10/17] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (8 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 11/17] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS Like Xu
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

When CPUID.01H:EDX.DS[21] is set, the IA32_DS_AREA MSR exists and points
to the linear address of the first byte of the DS buffer management area,
which is used to manage the PEBS records.

When guest PEBS is enabled, the IA32_DS_AREA MSR is added to the
perf_guest_switch_msr array and switched during VMX transitions, just like
the CORE_PERF_GLOBAL_CTRL MSR. A WRMSR to IA32_DS_AREA raises #GP(0)
if the source register contains a non-canonical address.
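
For readers unfamiliar with the canonical-address rule, here is a minimal
sketch of the check. The helpers is_canonical() and set_ds_area() are
hypothetical; KVM itself uses is_noncanonical_address(), which derives the
virtual address width from the vCPU state.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical when bits 63..(vaddr_bits - 1) are all equal,
 * i.e. the upper bits are a sign extension of the top implemented bit.
 * vaddr_bits is assumed to be between 1 and 64 (48 or 57 on current CPUs).
 */
static bool is_canonical(uint64_t addr, unsigned int vaddr_bits)
{
	uint64_t upper = addr >> (vaddr_bits - 1);
	uint64_t all_ones = ~0ULL >> (vaddr_bits - 1);

	return upper == 0 || upper == all_ones;
}

/*
 * Hypothetical model of the emulated WRMSR: returns 1 when the caller
 * should inject #GP(0), and 0 after storing the new value.
 */
static int set_ds_area(uint64_t *ds_area, uint64_t data,
		       unsigned int vaddr_bits)
{
	if (!is_canonical(data, vaddr_bits))
		return 1;

	*ds_area = data;
	return 0;
}
```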

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/core.c    | 10 +++++++++-
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 11 +++++++++++
 arch/x86/kvm/x86.c              |  2 +-
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 45f3bab5d423..07df5e7f444c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -14,6 +14,7 @@
 #include <linux/slab.h>
 #include <linux/export.h>
 #include <linux/nmi.h>
+#include <linux/kvm_host.h>
 
 #include <asm/cpufeature.h>
 #include <asm/hardirq.h>
@@ -3990,6 +3991,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
+	struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
 	u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
 	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
 	int global_ctrl, pebs_enable;
@@ -4022,9 +4024,15 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 		return arr;
 	}
 
-	if (!x86_pmu.pebs_ept)
+	if (!kvm_pmu || !x86_pmu.pebs_ept)
 		return arr;
 
+	arr[(*nr)++] = (struct perf_guest_switch_msr){
+		.msr = MSR_IA32_DS_AREA,
+		.host = (unsigned long)cpuc->ds,
+		.guest = kvm_pmu->ds_area,
+	};
+
 	pebs_enable = (*nr)++;
 	arr[pebs_enable] = (struct perf_guest_switch_msr){
 		.msr = MSR_IA32_PEBS_ENABLE,
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be65c6527a8b..f4152e85eca8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -518,6 +518,7 @@ struct kvm_pmu {
 	DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
+	u64 ds_area;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index c7de5bc985c2..54379fcbf803 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -217,6 +217,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_IA32_PEBS_ENABLE:
 		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
 		break;
+	case MSR_IA32_DS_AREA:
+		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -367,6 +370,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_PEBS_ENABLE:
 		msr_info->data = pmu->pebs_enable;
 		return 0;
+	case MSR_IA32_DS_AREA:
+		msr_info->data = pmu->ds_area;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -435,6 +441,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_IA32_DS_AREA:
+		if (is_noncanonical_address(data, vcpu))
+			return 1;
+		pmu->ds_area = data;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14902288cbb8..1f2e402d05bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1443,7 +1443,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
-	MSR_IA32_PEBS_ENABLE,
+	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA,
 
 	MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
 	MSR_K7_PERFCTR0, MSR_K7_PERFCTR1, MSR_K7_PERFCTR2, MSR_K7_PERFCTR3,
-- 
2.35.1



* [PATCH RESEND v12 11/17] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (9 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 10/17] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 12/17] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled Like Xu
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, adaptive PEBS
is supported. The PEBS_DATA_CFG MSR and the adaptive record enable
bits (IA32_PERFEVTSELx.Adaptive_Record and IA32_FIXED_CTR_CTRL.
FCx_Adaptive_Record) are also supported.

Adaptive PEBS provides software the capability to configure the PEBS
records to capture only the data of interest, keeping the record size
compact. An overflow of PMCx results in generation of an adaptive PEBS
record with state information based on the selections specified in
MSR_PEBS_DATA_CFG. By default, the record only contains the Basic group.

When guest adaptive PEBS is enabled, the MSR_PEBS_DATA_CFG MSR is
added to the perf_guest_switch_msr array and switched during VMX
transitions, just like the CORE_PERF_GLOBAL_CTRL MSR.

According to the Intel SDM, software is recommended to use PEBS Baseline
when the following is true: IA32_PERF_CAPABILITIES.PEBS_BASELINE[14]
&& IA32_PERF_CAPABILITIES.PEBS_FMT[11:8] ≥ 4.
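
The condition above can be written out as a small standalone sketch. The
helper names and the *_MASK / *_SHIFT macros are hypothetical; the bit
positions (PEBS_FMT in bits 11:8, PEBS_BASELINE in bit 14) are taken from
the text. Note that the format field has to be shifted down before it is
compared with 4.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERF_CAP_PEBS_FORMAT_MASK   0xf00ULL
#define PERF_CAP_PEBS_FORMAT_SHIFT  8
#define PERF_CAP_PEBS_BASELINE_BIT  (1ULL << 14)

/* Hypothetical helper: extract PEBS_FMT[11:8] as a plain number. */
static unsigned int pebs_format(uint64_t perf_cap)
{
	return (perf_cap & PERF_CAP_PEBS_FORMAT_MASK) >>
		PERF_CAP_PEBS_FORMAT_SHIFT;
}

/* Hypothetical helper: true when PEBS Baseline may be used. */
static bool pebs_baseline_usable(uint64_t perf_cap)
{
	return (perf_cap & PERF_CAP_PEBS_BASELINE_BIT) &&
	       pebs_format(perf_cap) >= 4;
}
```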

Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/events/intel/core.c    |  8 ++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx/pmu_intel.c    | 20 +++++++++++++++++++-
 arch/x86/kvm/x86.c              |  2 +-
 4 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 07df5e7f444c..f723a24eb29b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4033,6 +4033,14 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 		.guest = kvm_pmu->ds_area,
 	};
 
+	if (x86_pmu.intel_cap.pebs_baseline) {
+		arr[(*nr)++] = (struct perf_guest_switch_msr){
+			.msr = MSR_PEBS_DATA_CFG,
+			.host = cpuc->pebs_data_cfg,
+			.guest = kvm_pmu->pebs_data_cfg,
+		};
+	}
+
 	pebs_enable = (*nr)++;
 	arr[pebs_enable] = (struct perf_guest_switch_msr){
 		.msr = MSR_IA32_PEBS_ENABLE,
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f4152e85eca8..66057622164d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -521,6 +521,8 @@ struct kvm_pmu {
 	u64 ds_area;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
+	u64 pebs_data_cfg;
+	u64 pebs_data_cfg_mask;
 
 	/*
 	 * The gate to release perf_events not marked in
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 54379fcbf803..df661b5bbbf1 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -205,6 +205,7 @@ static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u64 perf_capabilities = vcpu->arch.perf_capabilities;
 	int ret;
 
 	switch (msr) {
@@ -215,11 +216,15 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = pmu->version > 1;
 		break;
 	case MSR_IA32_PEBS_ENABLE:
-		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
+		ret = perf_capabilities & PERF_CAP_PEBS_FORMAT;
 		break;
 	case MSR_IA32_DS_AREA:
 		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
 		break;
+	case MSR_PEBS_DATA_CFG:
+		ret = (perf_capabilities & PERF_CAP_PEBS_BASELINE) &&
+			(((perf_capabilities & PERF_CAP_PEBS_FORMAT) >> 8) > 3);
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -373,6 +378,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_DS_AREA:
 		msr_info->data = pmu->ds_area;
 		return 0;
+	case MSR_PEBS_DATA_CFG:
+		msr_info->data = pmu->pebs_data_cfg;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -446,6 +454,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		pmu->ds_area = data;
 		return 0;
+	case MSR_PEBS_DATA_CFG:
+		if (pmu->pebs_data_cfg == data)
+			return 0;
+		if (!(data & pmu->pebs_data_cfg_mask)) {
+			pmu->pebs_data_cfg = data;
+			return 0;
+		}
+		break;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -519,6 +535,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->raw_event_mask = X86_RAW_EVENT_MASK;
 	pmu->fixed_ctr_ctrl_mask = ~0ull;
 	pmu->pebs_enable_mask = ~0ull;
+	pmu->pebs_data_cfg_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry || !vcpu->kvm->arch.enable_pmu)
@@ -599,6 +616,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 				pmu->fixed_ctr_ctrl_mask &=
 					~(1ULL << (INTEL_PMC_IDX_FIXED + i * 4));
 			}
+			pmu->pebs_data_cfg_mask = ~0xff00000full;
 		} else {
 			pmu->pebs_enable_mask =
 				~((1ull << pmu->nr_arch_gp_counters) - 1);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1f2e402d05bd..02142fa244f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1443,7 +1443,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
-	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA,
+	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,
 
 	MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
 	MSR_K7_PERFCTR0, MSR_K7_PERFCTR1, MSR_K7_PERFCTR2, MSR_K7_PERFCTR3,
-- 
2.35.1



* [PATCH RESEND v12 12/17] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (10 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 11/17] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 13/17] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h Like Xu
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

Bit 12 of IA32_MISC_ENABLE represents "Processor Event Based Sampling
Unavailable (RO)":
	1 = PEBS is not supported.
	0 = PEBS is supported.

A guest write that changes this read-only PEBS_UNAVAIL bit raises #GP(0).
Some PEBS drivers in the guest may care about this bit.
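
The read-only check amounts to comparing the old and new values under the
bit mask. The sketch below is a simplified model: the function
check_misc_enable_write() is hypothetical, and only the bit position (12)
and the host_initiated exemption come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)

/*
 * Hypothetical helper: returns 1 when the write must be rejected (the
 * caller would inject #GP(0)), 0 when it may proceed.
 *
 * XOR yields the bits that differ between the old and new values, so the
 * write is rejected only if the guest tries to flip the read-only bit.
 * Writes initiated by host userspace are exempt so that the VMM can
 * restore saved state.
 */
static int check_misc_enable_write(uint64_t old_val, uint64_t data,
				   bool host_initiated)
{
	if (!host_initiated &&
	    ((old_val ^ data) & MISC_ENABLE_PEBS_UNAVAIL))
		return 1;

	return 0;
}
```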

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 2 ++
 arch/x86/kvm/x86.c           | 8 +++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index df661b5bbbf1..389f2585f20a 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -609,6 +609,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_IDX_FIXED_VLBR, 1);
 
 	if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) {
+		vcpu->arch.ia32_misc_enable_msr &= ~MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
 		if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE) {
 			pmu->pebs_enable_mask = ~pmu->global_ctrl;
 			pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
@@ -622,6 +623,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 				~((1ull << pmu->nr_arch_gp_counters) - 1);
 		}
 	} else {
+		vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
 		vcpu->arch.perf_capabilities &= ~PERF_CAP_PEBS_MASK;
 	}
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 02142fa244f3..1887b7146da6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3553,7 +3553,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_MISC_ENABLE: {
 		u64 old_val = vcpu->arch.ia32_misc_enable_msr;
-		u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON;
+		u64 pmu_mask = MSR_IA32_MISC_ENABLE_EMON |
+			MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
+
+		/* RO bits */
+		if (!msr_info->host_initiated &&
+		    ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
+			return 1;
 
 		/*
 		 * For a dummy user space, the order of setting vPMU capabilities and
-- 
2.35.1



* [PATCH RESEND v12 13/17] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (11 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 12/17] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 14/17] KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations Like Xu
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

It allows this inline function to be reused by more callers in
more files, such as pmu_intel.c.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/pmu.c | 11 -----------
 arch/x86/kvm/pmu.h | 11 +++++++++++
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c1312cd32237..122e4bb4fa47 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -500,17 +500,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	kvm_pmu_refresh(vcpu);
 }
 
-static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
-{
-	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-
-	if (pmc_is_fixed(pmc))
-		return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
-			pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
-
-	return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
-}
-
 /* Release perf_events for vPMCs that have been unused for a full time slice.  */
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 06e750824da1..b51f804737bd 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -143,6 +143,17 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
 	return sample_period;
 }
 
+static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
+{
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+	if (pmc_is_fixed(pmc))
+		return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
+					pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
+
+	return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
-- 
2.35.1



* [PATCH RESEND v12 14/17] KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (12 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 13/17] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 15/17] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability Like Xu
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

Guest PEBS will be disabled when users try to profile KVM and its
user space through the same PEBS facility, or when host perf doesn't
schedule the guest PEBS counter in a one-to-one mapping manner
(neither of these is a typical scenario).

The PEBS records in the guest DS buffer are still accurate and the
above two restrictions will be checked before each vm-entry only if
guest PEBS is deemed to be enabled.
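
The two restrictions can be modelled outside the kernel roughly as
follows. This is only a sketch under simplifying assumptions: struct
vpmc, cross_mapped_mask() and guest_pebs_enable() are hypothetical
names, and each counter is reduced to a pair of indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one guest counter. */
struct vpmc {
	int guest_idx;	/* index the guest programmed */
	int host_idx;	/* index host perf assigned, or -1 if none */
};

/* Collect host indexes that back a guest counter with a different index. */
uint64_t cross_mapped_mask(const struct vpmc *pmcs, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		if (pmcs[i].host_idx < 0)
			continue;	/* no host event behind this counter */
		assert(pmcs[i].host_idx < 64);
		if (pmcs[i].guest_idx != pmcs[i].host_idx)
			mask |= 1ULL << pmcs[i].host_idx;
	}
	return mask;
}

/* PEBS_ENABLE value to load for the guest at vm-entry. */
uint64_t guest_pebs_enable(uint64_t host_pebs, uint64_t guest_pebs,
			   uint64_t cross_mask)
{
	if (host_pebs)
		return 0;	/* host PEBS in use: guest PEBS is off */
	return guest_pebs & ~cross_mask;	/* drop cross-mapped counters */
}
```

In this model the mask is recomputed from scratch for every entry, so a
counter regains PEBS as soon as host perf maps it one-to-one again.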

Suggested-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/events/intel/core.c    | 11 +++++++++--
 arch/x86/include/asm/kvm_host.h |  9 +++++++++
 arch/x86/kvm/vmx/pmu_intel.c    | 20 ++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c          |  4 ++++
 arch/x86/kvm/vmx/vmx.h          |  1 +
 5 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f723a24eb29b..f136be17c1e2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4048,8 +4048,15 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 		.guest = pebs_mask & ~cpuc->intel_ctrl_host_mask,
 	};
 
-	/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
-	arr[0].guest |= arr[*nr].guest;
+	if (arr[pebs_enable].host) {
+		/* Disable guest PEBS if host PEBS is enabled. */
+		arr[pebs_enable].guest = 0;
+	} else {
+		/* Disable guest PEBS for cross-mapped PEBS counters. */
+		arr[pebs_enable].guest &= ~kvm_pmu->host_cross_mapped_mask;
+		/* Set hw GLOBAL_CTRL bits for PEBS counter when it runs for guest */
+		arr[global_ctrl].guest |= arr[pebs_enable].guest;
+	}
 
 	return arr;
 }
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 66057622164d..92b64fef75c1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -524,6 +524,15 @@ struct kvm_pmu {
 	u64 pebs_data_cfg;
 	u64 pebs_data_cfg_mask;
 
+	/*
+	 * If a guest counter is cross-mapped to host counter with different
+	 * index, its PEBS capability will be temporarily disabled.
+	 *
+	 * The user should make sure that this mask is updated
+	 * after disabling interrupts and before perf_guest_get_msrs();
+	 */
+	u64 host_cross_mapped_mask;
+
 	/*
 	 * The gate to release perf_events not marked in
 	 * pmc_in_use only once in a vcpu time slice.
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 389f2585f20a..fc3b837448a3 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -790,6 +790,26 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 		intel_pmu_release_guest_lbr_event(vcpu);
 }
 
+void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
+{
+	struct kvm_pmc *pmc = NULL;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&pmu->global_ctrl,
+			 X86_PMC_IDX_MAX) {
+		pmc = intel_pmc_idx_to_pmc(pmu, bit);
+
+		if (!pmc || !pmc_speculative_in_use(pmc) ||
+		    !intel_pmc_is_enabled(pmc))
+			continue;
+
+		if (pmc->perf_event && pmc->idx != pmc->perf_event->hw.idx) {
+			pmu->host_cross_mapped_mask |=
+				BIT_ULL(pmc->perf_event->hw.idx);
+		}
+	}
+}
+
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.pmc_perf_hw_id = intel_pmc_perf_hw_id,
 	.pmc_is_enabled = intel_pmc_is_enabled,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ff28a3992427..c8d768592c8c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6698,6 +6698,10 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	struct perf_guest_switch_msr *msrs;
 	struct kvm_pmu *pmu = vcpu_to_pmu(&vmx->vcpu);
 
+	pmu->host_cross_mapped_mask = 0;
+	if (pmu->pebs_enable & pmu->global_ctrl)
+		intel_pmu_cross_mapped_check(pmu);
+
 	/* Note, nr_msrs may be garbage if perf_guest_get_msrs() returns NULL. */
 	msrs = perf_guest_get_msrs(&nr_msrs, (void *)pmu);
 	if (!msrs)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9c6bfcd84008..9d890e600d27 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -94,6 +94,7 @@ union vmx_exit_reason {
 #define vcpu_to_lbr_desc(vcpu) (&to_vmx(vcpu)->lbr_desc)
 #define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
 
+void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
 bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu);
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 
-- 
2.35.1



* [PATCH RESEND v12 15/17] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (13 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 14/17] KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 16/17] KVM: x86/cpuid: Refactor host/guest CPU model consistency check Like Xu
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

The information obtained from the interface perf_get_x86_pmu_capability()
doesn't change at runtime, so an exported "struct x86_pmu_capability" is
introduced for all KVM guests, and it's initialized before hardware_setup().
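
The "query once, clamp once, read everywhere" pattern can be sketched in
plain C as below. struct pmu_cap, cached_cap and init_pmu_cap() are
hypothetical stand-ins with only three fields; the clamp values (version
2, three fixed counters) follow the diff in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIXED 3	/* matches the KVM_PMC_MAX_FIXED != 3 build check */

/* Hypothetical, reduced version of the capability structure. */
struct pmu_cap {
	int version;
	int num_counters_gp;
	int num_counters_fixed;
};

/* Filled once at setup time, then only read. */
struct pmu_cap cached_cap;

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

void init_pmu_cap(const struct pmu_cap *host)
{
	assert(host != NULL);
	cached_cap = *host;

	/* No architectural PMU on the host: expose nothing at all. */
	if (!cached_cap.version)
		memset(&cached_cap, 0, sizeof(cached_cap));

	/* Clamp to what the emulation supports. */
	cached_cap.version = min_int(cached_cap.version, 2);
	cached_cap.num_counters_fixed =
		min_int(cached_cap.num_counters_fixed, MAX_FIXED);
}
```

Callers then read cached_cap directly instead of repeating the query and
the clamping at each use site.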

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/cpuid.c         | 27 ++++++++-------------------
 arch/x86/kvm/pmu.c           |  3 +++
 arch/x86/kvm/pmu.h           | 19 +++++++++++++++++++
 arch/x86/kvm/vmx/pmu_intel.c | 17 ++++++++---------
 arch/x86/kvm/x86.c           |  9 ++++-----
 5 files changed, 42 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b24ca7f4ed7c..8fbedae87f0e 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -883,34 +883,23 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 	case 9:
 		break;
 	case 0xa: { /* Architectural Performance Monitoring */
-		struct x86_pmu_capability cap;
 		union cpuid10_eax eax;
 		union cpuid10_edx edx;
 
-		perf_get_x86_pmu_capability(&cap);
+		eax.split.version_id = kvm_pmu_cap.version;
+		eax.split.num_counters = kvm_pmu_cap.num_counters_gp;
+		eax.split.bit_width = kvm_pmu_cap.bit_width_gp;
+		eax.split.mask_length = kvm_pmu_cap.events_mask_len;
+		edx.split.num_counters_fixed = kvm_pmu_cap.num_counters_fixed;
+		edx.split.bit_width_fixed = kvm_pmu_cap.bit_width_fixed;
 
-		/*
-		 * The guest architecture pmu is only supported if the architecture
-		 * pmu exists on the host and the module parameters allow it.
-		 */
-		if (!cap.version || !enable_pmu)
-			memset(&cap, 0, sizeof(cap));
-
-		eax.split.version_id = min(cap.version, 2);
-		eax.split.num_counters = cap.num_counters_gp;
-		eax.split.bit_width = cap.bit_width_gp;
-		eax.split.mask_length = cap.events_mask_len;
-
-		edx.split.num_counters_fixed =
-			min(cap.num_counters_fixed, KVM_PMC_MAX_FIXED);
-		edx.split.bit_width_fixed = cap.bit_width_fixed;
-		if (cap.version)
+		if (kvm_pmu_cap.version)
 			edx.split.anythread_deprecated = 1;
 		edx.split.reserved1 = 0;
 		edx.split.reserved2 = 0;
 
 		entry->eax = eax.full;
-		entry->ebx = cap.events_mask;
+		entry->ebx = kvm_pmu_cap.events_mask;
 		entry->ecx = 0;
 		entry->edx = edx.full;
 		break;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 122e4bb4fa47..b5d0c36b869b 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -24,6 +24,9 @@
 /* This is enough to filter the vast majority of currently defined events. */
 #define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300
 
+struct x86_pmu_capability __read_mostly kvm_pmu_cap;
+EXPORT_SYMBOL_GPL(kvm_pmu_cap);
+
 /* NOTE:
  * - Each perf counter is defined as "struct kvm_pmc";
  * - There are two types of perf counters: general purpose (gp) and fixed.
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b51f804737bd..dbf4c83519a4 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -154,6 +154,24 @@ static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
 	return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
 }
 
+extern struct x86_pmu_capability kvm_pmu_cap;
+
+static inline void kvm_init_pmu_capability(void)
+{
+	perf_get_x86_pmu_capability(&kvm_pmu_cap);
+
+	/*
+	 * Only support guest architectural pmu on
+	 * a host with architectural pmu.
+	 */
+	if (!kvm_pmu_cap.version)
+		memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
+
+	kvm_pmu_cap.version = min(kvm_pmu_cap.version, 2);
+	kvm_pmu_cap.num_counters_fixed = min(kvm_pmu_cap.num_counters_fixed,
+					     KVM_PMC_MAX_FIXED);
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
@@ -172,6 +190,7 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu);
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
 void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id);
+void kvm_init_pmu_capability(void);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index fc3b837448a3..f2c94e9dfa4b 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -519,8 +519,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
-
-	struct x86_pmu_capability x86_pmu;
 	struct kvm_cpuid_entry2 *entry;
 	union cpuid10_eax eax;
 	union cpuid10_edx edx;
@@ -548,13 +546,14 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		return;
 
 	vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
-	perf_get_x86_pmu_capability(&x86_pmu);
 
 	pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
-					 x86_pmu.num_counters_gp);
-	eax.split.bit_width = min_t(int, eax.split.bit_width, x86_pmu.bit_width_gp);
+					 kvm_pmu_cap.num_counters_gp);
+	eax.split.bit_width = min_t(int, eax.split.bit_width,
+				    kvm_pmu_cap.bit_width_gp);
 	pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << eax.split.bit_width) - 1;
-	eax.split.mask_length = min_t(int, eax.split.mask_length, x86_pmu.events_mask_len);
+	eax.split.mask_length = min_t(int, eax.split.mask_length,
+				      kvm_pmu_cap.events_mask_len);
 	pmu->available_event_types = ~entry->ebx &
 					((1ull << eax.split.mask_length) - 1);
 
@@ -564,9 +563,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		pmu->nr_arch_fixed_counters =
 			min3(ARRAY_SIZE(fixed_pmc_events),
 			     (size_t) edx.split.num_counters_fixed,
-			     (size_t) x86_pmu.num_counters_fixed);
-		edx.split.bit_width_fixed = min_t(int,
-			edx.split.bit_width_fixed, x86_pmu.bit_width_fixed);
+			     (size_t)kvm_pmu_cap.num_counters_fixed);
+		edx.split.bit_width_fixed = min_t(int, edx.split.bit_width_fixed,
+						  kvm_pmu_cap.bit_width_fixed);
 		pmu->counter_bitmask[KVM_PMC_FIXED] =
 			((u64)1 << edx.split.bit_width_fixed) - 1;
 		setup_fixed_pmc_eventsel(pmu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1887b7146da6..8562debc14ed 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6645,15 +6645,12 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 static void kvm_init_msr_list(void)
 {
-	struct x86_pmu_capability x86_pmu;
 	u32 dummy[2];
 	unsigned i;
 
 	BUILD_BUG_ON_MSG(KVM_PMC_MAX_FIXED != 3,
 			 "Please update the fixed PMCs in msrs_to_saved_all[]");
 
-	perf_get_x86_pmu_capability(&x86_pmu);
-
 	num_msrs_to_save = 0;
 	num_emulated_msrs = 0;
 	num_msr_based_features = 0;
@@ -6705,12 +6702,12 @@ static void kvm_init_msr_list(void)
 			break;
 		case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17:
 			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
-			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
+			    min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
 				continue;
 			break;
 		case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17:
 			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
-			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
+			    min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
 				continue;
 			break;
 		case MSR_IA32_XFD:
@@ -11676,6 +11673,8 @@ int kvm_arch_hardware_setup(void *opaque)
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
 		rdmsrl(MSR_IA32_XSS, host_xss);
 
+	kvm_init_pmu_capability();
+
 	r = ops->hardware_setup();
 	if (r != 0)
 		return r;
-- 
2.35.1



* [PATCH RESEND v12 16/17] KVM: x86/cpuid: Refactor host/guest CPU model consistency check
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (14 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 15/17] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-04-11 10:19 ` [PATCH RESEND v12 17/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Like Xu
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <like.xu@linux.intel.com>

For the same purpose, the legacy intel_pmu_lbr_is_compatible() can be
renamed for reuse by more callers, and the comment about the LBR
use case can be deleted along the way.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/cpuid.h         |  5 +++++
 arch/x86/kvm/vmx/pmu_intel.c | 12 +-----------
 arch/x86/kvm/vmx/vmx.c       |  2 +-
 arch/x86/kvm/vmx/vmx.h       |  1 -
 4 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 8a770b481d9d..ac72aabba981 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -145,6 +145,11 @@ static inline int guest_cpuid_model(struct kvm_vcpu *vcpu)
 	return x86_model(best->eax);
 }
 
+static inline bool cpuid_model_is_consistent(struct kvm_vcpu *vcpu)
+{
+	return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
+}
+
 static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f2c94e9dfa4b..84b326c4dce9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -167,16 +167,6 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 	return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
 }
 
-bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu)
-{
-	/*
-	 * As a first step, a guest could only enable LBR feature if its
-	 * cpu model is the same as the host because the LBR registers
-	 * would be pass-through to the guest and they're model specific.
-	 */
-	return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
-}
-
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
 {
 	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
@@ -599,7 +589,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	nested_vmx_pmu_refresh(vcpu,
 			       intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
 
-	if (intel_pmu_lbr_is_compatible(vcpu))
+	if (cpuid_model_is_consistent(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
 	else
 		lbr_desc->records.nr = 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c8d768592c8c..945d169eb07f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2237,7 +2237,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if ((data & PMU_CAP_LBR_FMT) !=
 			    (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT))
 				return 1;
-			if (!intel_pmu_lbr_is_compatible(vcpu))
+			if (!cpuid_model_is_consistent(vcpu))
 				return 1;
 		}
 		ret = kvm_set_msr_common(vcpu, msr_info);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9d890e600d27..ef2a2aeccb19 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -95,7 +95,6 @@ union vmx_exit_reason {
 #define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
 
 void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
-bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu);
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
-- 
2.35.1



* [PATCH RESEND v12 17/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (15 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 16/17] KVM: x86/cpuid: Refactor host/guest CPU model consistency check Like Xu
@ 2022-04-11 10:19 ` Like Xu
  2022-05-10 16:55 ` [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Paolo Bonzini
  2022-05-19 12:14 ` Vitaly Kuznetsov
  18 siblings, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-04-11 10:19 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Joerg Roedel, linux-kernel, kvm

From: Like Xu <likexu@tencent.com>

The CPUID features PDCM, DS and DTES64 are required for the PEBS feature.
KVM exposes the CPUID features PDCM, DS and DTES64 to the guest when PEBS
is supported by KVM on Ice Lake server platforms.
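
The dependency between the PEBS capability bits and the guest CPUID
features can be pictured with the standalone sketch below. It is only a
model of the PEBS part of the check: perf_cap_write_ok(), struct
guest_features and the CAP_PEBS_FORMAT value are illustrative
assumptions, and the PEBS mask is passed in by the caller rather than
taken from a kernel header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value: PEBS record format field in bits 11:8. */
#define CAP_PEBS_FORMAT 0xf00ULL

/* Hypothetical summary of what the guest CPUID advertises. */
struct guest_features {
	bool ds;
	bool dtes64;
	bool model_matches_host;
};

/* Returns true if a write of 'data' passes the PEBS-related checks. */
bool perf_cap_write_ok(uint64_t data, uint64_t supported,
		       uint64_t pebs_mask, const struct guest_features *g)
{
	assert(g != NULL);

	/* No PEBS format requested: nothing PEBS-related to validate. */
	if (!(data & CAP_PEBS_FORMAT))
		return true;

	/* The PEBS bits must match what the host side supports exactly. */
	if ((data & pebs_mask) != (supported & pebs_mask))
		return false;

	/* PEBS needs DS, DTES64 and a guest model equal to the host's. */
	return g->ds && g->dtes64 && g->model_matches_host;
}
```

A write that requests a PEBS format without DS or DTES64 in the guest
CPUID is refused in this model, which is why those bits are exposed
together.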

Originally-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/vmx/capabilities.h | 28 +++++++++++++++++-----------
 arch/x86/kvm/vmx/vmx.c          | 15 +++++++++++++++
 2 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 3f430e218375..0e3929ddf9c8 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -6,6 +6,7 @@
 
 #include "lapic.h"
 #include "x86.h"
+#include "pmu.h"
 
 extern bool __read_mostly enable_vpid;
 extern bool __read_mostly flexpriority_enabled;
@@ -385,23 +386,28 @@ static inline bool vmx_pt_mode_is_host_guest(void)
 	return pt_mode == PT_MODE_HOST_GUEST;
 }
 
+static inline bool vmx_pebs_supported(void)
+{
+	return boot_cpu_has(X86_FEATURE_PEBS) && kvm_pmu_cap.pebs_ept;
+}
+
 static inline u64 vmx_get_perf_capabilities(void)
 {
-	u64 perf_cap = 0;
-
-	if (!enable_pmu)
-		return perf_cap;
+	u64 perf_cap = PMU_CAP_FW_WRITES;
+	u64 host_perf_cap = 0;
 
 	if (boot_cpu_has(X86_FEATURE_PDCM))
-		rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
 
-	perf_cap &= PMU_CAP_LBR_FMT;
+	perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
 
-	/*
-	 * Since counters are virtualized, KVM would support full
-	 * width counting unconditionally, even if the host lacks it.
-	 */
-	return PMU_CAP_FW_WRITES | perf_cap;
+	if (vmx_pebs_supported()) {
+		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
+		if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
+			perf_cap &= ~PERF_CAP_PEBS_BASELINE;
+	}
+
+	return perf_cap;
 }
 
 static inline u64 vmx_supported_debugctl(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 945d169eb07f..a6d6bb6ec9f0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2240,6 +2240,17 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			if (!cpuid_model_is_consistent(vcpu))
 				return 1;
 		}
+		if (data & PERF_CAP_PEBS_FORMAT) {
+			if ((data & PERF_CAP_PEBS_MASK) !=
+			    (vmx_get_perf_capabilities() & PERF_CAP_PEBS_MASK))
+				return 1;
+			if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
+				return 1;
+			if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
+				return 1;
+			if (!cpuid_model_is_consistent(vcpu))
+				return 1;
+		}
 		ret = kvm_set_msr_common(vcpu, msr_info);
 		break;
 
@@ -7415,6 +7426,10 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_INVPCID);
 	if (vmx_pt_mode_is_host_guest())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+	if (vmx_pebs_supported()) {
+		kvm_cpu_cap_check_and_set(X86_FEATURE_DS);
+		kvm_cpu_cap_check_and_set(X86_FEATURE_DTES64);
+	}
 
 	if (!enable_sgx) {
 		kvm_cpu_cap_clear(X86_FEATURE_SGX);
-- 
2.35.1



* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (16 preceding siblings ...)
  2022-04-11 10:19 ` [PATCH RESEND v12 17/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Like Xu
@ 2022-05-10 16:55 ` Paolo Bonzini
  2022-05-19 12:14 ` Vitaly Kuznetsov
  18 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2022-05-10 16:55 UTC (permalink / raw)
  To: Like Xu
  Cc: Jim Mattson, Peter Zijlstra, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-kernel, kvm

Queued, thanks, but only because I have not done my job very well
in handling this patch series (and LBR too) and I feel bad about
it.  Sending such a large patch series with no kvm-unit-tests should
not happen, and I'd be grateful if you wrote testcases after the fact.

Paolo




* Re: [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  2022-04-11 10:19 ` [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
@ 2022-05-13  8:57   ` Like Xu
  2022-05-13  9:26   ` [PATCH v13 " Like Xu
  1 sibling, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-05-13  8:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	linux-kernel, kvm, Jim Mattson

On 11/4/2022 6:19 pm, Like Xu wrote:

> +++ b/arch/x86/kvm/pmu.h
> @@ -4,6 +4,8 @@
>   
>   #include <linux/nospec.h>
>   
> +#include <asm/cpu_device_id.h>
> +
>   #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
>   #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
>   #define pmc_to_pmu(pmc)   (&(pmc)->vcpu->arch.pmu)
> @@ -15,6 +17,12 @@
>   #define VMWARE_BACKDOOR_PMC_REAL_TIME		0x10001
>   #define VMWARE_BACKDOOR_PMC_APPARENT_TIME	0x10002
>   
> +static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
> +	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
> +	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
> +	{}
> +};
With [-Wunused-const-variable], gcc would complain unless
vmx_icl_pebs_cpu[] is moved from pmu.h to pmu.c:

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b5d0c36b869b..17c9bfc2527d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -27,6 +27,12 @@
  struct x86_pmu_capability __read_mostly kvm_pmu_cap;
  EXPORT_SYMBOL_GPL(kvm_pmu_cap);

+static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
+    X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+    X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+    {}
+};
+
  /* NOTE:
   * - Each perf counter is defined as "struct kvm_pmc";
   * - There are two types of perf counters: general purpose (gp) and fixed.
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index dbf4c83519a4..8064d074f3be 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -17,12 +17,6 @@
  #define VMWARE_BACKDOOR_PMC_REAL_TIME        0x10001
  #define VMWARE_BACKDOOR_PMC_APPARENT_TIME    0x10002

-static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
-    X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
-    X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
-    {}
-};
-
  struct kvm_event_hw_type_mapping {
      u8 eventsel;
      u8 unit_mask;





* [PATCH v13 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  2022-04-11 10:19 ` [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
  2022-05-13  8:57   ` Like Xu
@ 2022-05-13  9:26   ` Like Xu
  1 sibling, 0 replies; 31+ messages in thread
From: Like Xu @ 2022-05-13  9:26 UTC (permalink / raw)
  To: pbonzini; +Cc: jmattson, joro, kvm, linux-kernel, seanjc, vkuznets, wanpengli

From: Like Xu <likexu@tencent.com>

The PEBS-PDIR facility on Ice Lake server is supported on IA32_FIXED0 only.
If the guest configures counter 32 and PEBS is enabled, the PEBS-PDIR
facility is supposed to be used, in which case KVM adjusts attr.precise_ip
to 3 and requests host perf to assign the exactly requested counter or fail.

The CPU model check is also required since some platforms may place the
PEBS-PDIR facility in another counter index.
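
Reduced to its decision logic, the adjustment looks roughly like the
sketch below. pick_precise_ip() and its boolean parameter are
hypothetical; in the patch the model check is done with x86_match_cpu().

```c
#include <assert.h>
#include <stdbool.h>

/* Counter index 32 is the one the patch treats as the PDIR counter. */
#define PDIR_COUNTER_IDX 32

/* Hypothetical helper: precise_ip level for a guest PEBS counter. */
int pick_precise_ip(bool host_is_icelake_server, int pmc_idx)
{
	assert(pmc_idx >= 0);

	/* PDIR counter on a matching host model: request level 3. */
	if (host_is_icelake_server && pmc_idx == PDIR_COUNTER_IDX)
		return 3;

	/* Every other guest PEBS counter keeps the default level. */
	return 1;
}
```

On a host model outside the match table the function returns 1 even for
index 32, which mirrors the reason given for the CPU model check.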

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
v12 -> v13 Changelog:
- Move vmx_icl_pebs_cpu[] from pmu.h to pmu.c;
- Drop unrelated change about arch/x86/events/intel/core.c;

 arch/x86/kvm/pmu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 36487088f72c..0b8fc86839ba 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -16,6 +16,7 @@
 #include <linux/bsearch.h>
 #include <linux/sort.h>
 #include <asm/perf_event.h>
+#include <asm/cpu_device_id.h>
 #include "x86.h"
 #include "cpuid.h"
 #include "lapic.h"
@@ -24,6 +25,12 @@
 /* This is enough to filter the vast majority of currently defined events. */
 #define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300
 
+static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+	{}
+};
+
 /* NOTE:
  * - Each perf counter is defined as "struct kvm_pmc";
  * - There are two types of perf counters: general purpose (gp) and fixed.
@@ -175,6 +182,8 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		 * could possibly care here is unsupported and needs changes.
 		 */
 		attr.precise_ip = 1;
+		if (x86_match_cpu(vmx_icl_pebs_cpu) && pmc->idx == 32)
+			attr.precise_ip = 3;
 	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
-- 
2.36.1



* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
                   ` (17 preceding siblings ...)
  2022-05-10 16:55 ` [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Paolo Bonzini
@ 2022-05-19 12:14 ` Vitaly Kuznetsov
  2022-05-19 13:31   ` Like Xu
  18 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2022-05-19 12:14 UTC (permalink / raw)
  To: Like Xu
  Cc: Peter Zijlstra, Sean Christopherson, Wanpeng Li, Joerg Roedel,
	linux-kernel, kvm, Paolo Bonzini, Jim Mattson

Like Xu <like.xu.linux@gmail.com> writes:

...

Hi, the following commit

>   KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS

(currently in kvm/queue) breaks a number of selftests, e.g.:

# ./tools/testing/selftests/kvm/x86_64/state_test 
==== Test Assertion Failure ====
  lib/x86_64/processor.c:1207: r == nmsrs
  pid=6702 tid=6702 errno=7 - Argument list too long
     1	0x000000000040da11: vcpu_save_state at processor.c:1207 (discriminator 4)
     2	0x00000000004024e5: main at state_test.c:209 (discriminator 6)
     3	0x00007f9f48c2d55f: ?? ??:0
     4	0x00007f9f48c2d60b: ?? ??:0
     5	0x00000000004026d4: _start at ??:?
  Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)

I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-19 12:14 ` Vitaly Kuznetsov
@ 2022-05-19 13:31   ` Like Xu
  2022-05-19 13:50     ` Like Xu
  0 siblings, 1 reply; 31+ messages in thread
From: Like Xu @ 2022-05-19 13:31 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Peter Zijlstra, Sean Christopherson, Wanpeng Li, Joerg Roedel,
	linux-kernel, kvm, Paolo Bonzini, Jim Mattson

On 19/5/2022 8:14 pm, Vitaly Kuznetsov wrote:
> Like Xu <like.xu.linux@gmail.com> writes:
> 
> ...
> 
> Hi, the following commit
> 
>>    KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
> 
> (currently in kvm/queue) breaks a number of selftests, e.g.:

Indeed, e.g.:

x86_64/hyperv_clock
x86_64/max_vcpuid_cap_test
x86_64/mmu_role_test

> 
> # ./tools/testing/selftests/kvm/x86_64/state_test

This test continues to be silent after the top commit a3808d884612 ("KVM: x86/pmu:
Expose CPUIDs feature bits PDCM, DS, DTES64"), which implies a root cause.

Anyway, thanks for this git-bisect report.

> ==== Test Assertion Failure ====
>    lib/x86_64/processor.c:1207: r == nmsrs
>    pid=6702 tid=6702 errno=7 - Argument list too long
>       1	0x000000000040da11: vcpu_save_state at processor.c:1207 (discriminator 4)
>       2	0x00000000004024e5: main at state_test.c:209 (discriminator 6)
>       3	0x00007f9f48c2d55f: ?? ??:0
>       4	0x00007f9f48c2d60b: ?? ??:0
>       5	0x00000000004026d4: _start at ??:?
>    Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)
> 
> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-19 13:31   ` Like Xu
@ 2022-05-19 13:50     ` Like Xu
  2022-05-19 14:46       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 31+ messages in thread
From: Like Xu @ 2022-05-19 13:50 UTC (permalink / raw)
  To: Paolo Bonzini, Vitaly Kuznetsov
  Cc: Peter Zijlstra, Sean Christopherson, Wanpeng Li, Joerg Roedel,
	linux-kernel, kvm, Jim Mattson

On 19/5/2022 9:31 pm, Like Xu wrote:
> ==== Test Assertion Failure ====
>     lib/x86_64/processor.c:1207: r == nmsrs
>     pid=6702 tid=6702 errno=7 - Argument list too long
>        1    0x000000000040da11: vcpu_save_state at processor.c:1207 
> (discriminator 4)
>        2    0x00000000004024e5: main at state_test.c:209 (discriminator 6)
>        3    0x00007f9f48c2d55f: ?? ??:0
>        4    0x00007f9f48c2d60b: ?? ??:0
>        5    0x00000000004026d4: _start at ??:?
>     Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)
> 
> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.

One of the lessons I learned here is that the members of msrs_to_save_all[]
are part of the KVM ABI. We shouldn't add feature-related MSRs until the last
step of exposing the feature in KVM (in this case, adding MSR_IA32_PEBS_ENABLE,
MSR_IA32_DS_AREA and MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take
effect along with exposing the CPUID bits).
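
To illustrate the point above, here is a minimal userspace sketch, not the
kernel's actual list-building code. build_msr_list() and the pebs_exposed
flag are hypothetical names; only the three MSR indices come from the kernel
headers. The idea is that the PEBS MSRs are reported only once the feature
itself is exposed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_PEBS_ENABLE 0x3f1
#define MSR_PEBS_DATA_CFG    0x3f2
#define MSR_IA32_DS_AREA     0x600

/* True for the three MSRs discussed in this thread. */
static bool is_pebs_related(uint32_t msr)
{
	return msr == MSR_IA32_PEBS_ENABLE ||
	       msr == MSR_PEBS_DATA_CFG ||
	       msr == MSR_IA32_DS_AREA;
}

/*
 * Hypothetical filter: copy candidates[] into out[], dropping the PEBS
 * MSRs unless the feature is exposed. Returns the number of entries
 * written; out[] must have room for n entries.
 */
static size_t build_msr_list(const uint32_t *candidates, size_t n,
			     bool pebs_exposed, uint32_t *out)
{
	size_t i, count = 0;

	assert(candidates != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		if (is_pebs_related(candidates[i]) && !pebs_exposed)
			continue;
		out[count++] = candidates[i];
	}
	return count;
}
```

With pebs_exposed set to false, userspace would never be told about an MSR
that it cannot subsequently read.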

Awaiting a ruling from the core guardian on this part of the git-bisect deficiency.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-19 13:50     ` Like Xu
@ 2022-05-19 14:46       ` Vitaly Kuznetsov
  2022-05-25  7:56         ` Like Xu
  0 siblings, 1 reply; 31+ messages in thread
From: Vitaly Kuznetsov @ 2022-05-19 14:46 UTC (permalink / raw)
  To: Like Xu
  Cc: Peter Zijlstra, Sean Christopherson, Wanpeng Li, Joerg Roedel,
	linux-kernel, kvm, Jim Mattson, Paolo Bonzini

Like Xu <like.xu.linux@gmail.com> writes:

> On 19/5/2022 9:31 pm, Like Xu wrote:
>> ==== Test Assertion Failure ====
>>     lib/x86_64/processor.c:1207: r == nmsrs
>>     pid=6702 tid=6702 errno=7 - Argument list too long
>>        1    0x000000000040da11: vcpu_save_state at processor.c:1207 
>> (discriminator 4)
>>        2    0x00000000004024e5: main at state_test.c:209 (discriminator 6)
>>        3    0x00007f9f48c2d55f: ?? ??:0
>>        4    0x00007f9f48c2d60b: ?? ??:0
>>        5    0x00000000004026d4: _start at ??:?
>>     Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)
>> 
>> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
>> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.
>
> One of the lessons I learned here is that the members of msrs_to_save_all[]
> are part of the KVM ABI. We don't add feature-related MSRs until the last
> step of the KVM exposure feature (in this case, adding MSR_IA32_PEBS_ENABLE,
> MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take
> effect along with exposing the CPUID bits).

AFAIR the basic rule here is that whatever gets returned with
KVM_GET_MSR_INDEX_LIST can be passed to KVM_GET_MSRS and read
successfully by the host (not necessarily by the guest) so my guess is
that MSR_IA32_PEBS_ENABLE is now returned in KVM_GET_MSR_INDEX_LIST but
can't be read with KVM_GET_MSRS. Later, the expectation is that what was
returned by KVM_GET_MSRS can be set successfully with KVM_SET_MSRS.
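
The rule above can be modelled in a few lines of plain C. This is only a
sketch of the return convention, assuming that KVM_GET_MSRS processes the
entries in order and returns the number of MSRs it read successfully;
model_get_msrs() and the predicate are hypothetical names, not KVM code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "can the host read this MSR on this vCPU?" */
typedef int (*msr_readable_fn)(uint32_t index);

/*
 * Walk the index list in order, stop at the first MSR that cannot be
 * read, and return how many entries were read before that point.
 */
static size_t model_get_msrs(const uint32_t *indices, size_t nmsrs,
			     msr_readable_fn readable)
{
	size_t r;

	assert(indices != NULL && readable != NULL);

	for (r = 0; r < nmsrs; r++) {
		if (!readable(indices[r]))
			break;
	}
	return r;
}

/* Example predicate: IA32_PEBS_ENABLE (0x3f1) is listed but unreadable. */
static int pebs_enable_unreadable(uint32_t index)
{
	return index != 0x3f1;
}
```

When the returned count r is smaller than nmsrs, indices[r] is the MSR that
failed, which is how a message like "r: 29 (failed MSR was 0x3f1)" can be
derived.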

-- 
Vitaly


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-19 14:46       ` Vitaly Kuznetsov
@ 2022-05-25  7:56         ` Like Xu
  2022-05-25  8:14           ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Like Xu @ 2022-05-25  7:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson

On 19/5/2022 10:46 pm, Vitaly Kuznetsov wrote:
> Like Xu <like.xu.linux@gmail.com> writes:
> 
>> On 19/5/2022 9:31 pm, Like Xu wrote:
>>> ==== Test Assertion Failure ====
>>>      lib/x86_64/processor.c:1207: r == nmsrs
>>>      pid=6702 tid=6702 errno=7 - Argument list too long
>>>         1    0x000000000040da11: vcpu_save_state at processor.c:1207
>>> (discriminator 4)
>>>         2    0x00000000004024e5: main at state_test.c:209 (discriminator 6)
>>>         3    0x00007f9f48c2d55f: ?? ??:0
>>>         4    0x00007f9f48c2d60b: ?? ??:0
>>>         5    0x00000000004026d4: _start at ??:?
>>>      Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)
>>>
>>> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
>>> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.
>>
>> One of the lessons I learned here is that the members of msrs_to_save_all[]
>> are part of the KVM ABI. We don't add feature-related MSRs until the last
>> step of the KVM exposure feature (in this case, adding MSR_IA32_PEBS_ENABLE,
>> MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take
>> effect along with exposing the CPUID bits).
> 
> AFAIR the basic rule here is that whatever gets returned with
> KVM_GET_MSR_INDEX_LIST can be passed to KVM_GET_MSRS and read
> successfully by the host (not necessarily by the guest) so my guess is
> that MSR_IA32_PEBS_ENABLE is now returned in KVM_GET_MSR_INDEX_LIST but
> can't be read with KVM_GET_MSRS. Later, the expectation is that what was
> returned by KVM_GET_MSRS can be set successfully with KVM_SET_MSRS.
> 

Thanks for the clarification.

Some KVM x86 selftests have been failing due to this issue even after the last
commit.

I blame myself for not passing msr_info->host_initiated to
intel_is_valid_msr(). Meanwhile, I have been wondering whether we should check
only the MSR address range in kvm_pmu_is_valid_msr() and apply this kind of
sanity check in pmu_set/get_msr().
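
One possible reading of the second option, as a userspace sketch only: the
"is valid" helper looks at the MSR address alone, and the access handler
decides based on who is asking. All sketch_* names and the guest_has_pebs
flag are hypothetical and do not correspond to the actual KVM functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_PEBS_ENABLE 0x3f1
#define MSR_PEBS_DATA_CFG    0x3f2
#define MSR_IA32_DS_AREA     0x600

/* Address check only: no dependency on the guest's configuration. */
static bool sketch_is_pebs_msr(uint32_t msr)
{
	return msr == MSR_IA32_PEBS_ENABLE ||
	       msr == MSR_PEBS_DATA_CFG ||
	       msr == MSR_IA32_DS_AREA;
}

/* Returns 0 on success, 1 on failure (a guest access would see #GP). */
static int sketch_get_pebs_msr(uint32_t msr, bool host_initiated,
			       bool guest_has_pebs, uint64_t *val)
{
	assert(val != NULL);

	if (!sketch_is_pebs_msr(msr))
		return 1;

	/* Host-initiated reads succeed even if the guest lacks the feature. */
	if (!host_initiated && !guest_has_pebs)
		return 1;

	*val = 0;	/* placeholder; a real handler would return stored state */
	return 0;
}
```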

Vitaly && Paolo, any preference on how to move forward?

Thanks,
Like Xu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-25  7:56         ` Like Xu
@ 2022-05-25  8:14           ` Paolo Bonzini
  2022-05-25  8:32             ` Like Xu
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-05-25  8:14 UTC (permalink / raw)
  To: Like Xu, Vitaly Kuznetsov
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson

On 5/25/22 09:56, Like Xu wrote:
> Thanks for the clarification.
> 
> Some kvm x86 selftests have been failing due to this issue even after 
> the last commit.
> 
> I blame myself for not passing the msr_info->host_initiated to the 
> intel_is_valid_msr(),
> meanwhile I pondered further whether we should check only the MSR addrs 
> range in
> the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the 
> pmu_set/get_msr().
> 
> Vitaly && Paolo, any preference to move forward ?

I'm not sure what I did wrong to not see the failure, so I'll fix it myself.

But from now on, I'll have a hard rule of no new processor features 
enabled without KVM unit tests or selftests.  In fact, it would be nice 
if you wrote some for PEBS.

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-25  8:14           ` Paolo Bonzini
@ 2022-05-25  8:32             ` Like Xu
  2022-05-25 14:12               ` Maxim Levitsky
  0 siblings, 1 reply; 31+ messages in thread
From: Like Xu @ 2022-05-25  8:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson, Vitaly Kuznetsov

On 25/5/2022 4:14 pm, Paolo Bonzini wrote:
> On 5/25/22 09:56, Like Xu wrote:
>> Thanks for the clarification.
>>
>> Some kvm x86 selftests have been failing due to this issue even after the last 
>> commit.
>>
>> I blame myself for not passing the msr_info->host_initiated to the 
>> intel_is_valid_msr(),
>> meanwhile I pondered further whether we should check only the MSR addrs range in
>> the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the 
>> pmu_set/get_msr().
>>
>> Vitaly && Paolo, any preference to move forward ?
> 
> I'm not sure what I did wrong to not see the failure, so I'll fix it myself.

More info: some Skylake hosts fail tests such as x86_64/state_test due to this
issue.

> 
> But from now on, I'll have a hard rule of no new processor features enabled 
> without KVM unit tests or selftests.  In fact, it would be nice if you wrote 
> some for PEBS.

Great, my team (or at least I) is committed to contributing more tests for vPMU
features.

We may also update the process documentation in
Documentation/virt/kvm/review-checklist.rst.

> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-25  8:32             ` Like Xu
@ 2022-05-25 14:12               ` Maxim Levitsky
  2022-05-25 14:13                 ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Maxim Levitsky @ 2022-05-25 14:12 UTC (permalink / raw)
  To: Like Xu, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson, Vitaly Kuznetsov

On Wed, 2022-05-25 at 16:32 +0800, Like Xu wrote:
> On 25/5/2022 4:14 pm, Paolo Bonzini wrote:
> > On 5/25/22 09:56, Like Xu wrote:
> > > Thanks for the clarification.
> > > 
> > > Some kvm x86 selftests have been failing due to this issue even after the last 
> > > commit.
> > > 
> > > I blame myself for not passing the msr_info->host_initiated to the 
> > > intel_is_valid_msr(),
> > > meanwhile I pondered further whether we should check only the MSR addrs range in
> > > the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the 
> > > pmu_set/get_msr().
> > > 
> > > Vitaly && Paolo, any preference to move forward ?
> > 
> > I'm not sure what I did wrong to not see the failure, so I'll fix it myself.
> 
> More info, some Skylake hosts fail the tests like x86_64/state_test due to this 
> issue.
> 
> > But from now on, I'll have a hard rule of no new processor features enabled 
> > without KVM unit tests or selftests.  In fact, it would be nice if you wrote 
> > some for PEBS.
> 
> Great, my team (or at least me) is committed to contributing more tests on vPMU 
> features.
> 
> We may update the process document to the 
> Documentation/virt/kvm/review-checklist.rst.
> 
> > Paolo
> > 

FYI, this patch series also breaks the 'msr' test in kvm-unit-tests
(kvm/queue of today, and master of the kvm-unit-tests repo).

The test tries to set MSR_IA32_MISC_ENABLE to 0x400c51889 and gets a #GP.


Commenting this out gets rid of the #GP, but the test still fails with an unexpected result:

		if (!msr_info->host_initiated &&
		    ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
			return 1;
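
For readers unfamiliar with the idiom, the XOR in the check above isolates
the bits that differ between the old and the new value, so the write is
rejected only when a guest tries to flip the PEBS_UNAVAIL bit. Below is a
standalone sketch; the bit position (12) is assumed to match the kernel's
MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL definition, and check_misc_enable_write()
is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's definition: bit 12 of IA32_MISC_ENABLE. */
#define MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)

/*
 * Returns 1 (reject, i.e. #GP for a guest write) when a non-host write
 * would toggle PEBS_UNAVAIL, and 0 (accept) otherwise.
 */
static int check_misc_enable_write(uint64_t old_val, uint64_t data,
				   bool host_initiated)
{
	if (!host_initiated &&
	    ((old_val ^ data) & MISC_ENABLE_PEBS_UNAVAIL))
		return 1;
	return 0;
}
```

The value 0x400c51889 has bit 12 set, so a guest write of it is rejected in
this model whenever the current value has that bit clear.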




It is very possible that the test is broken, I'll check this later.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-25 14:12               ` Maxim Levitsky
@ 2022-05-25 14:13                 ` Paolo Bonzini
  2022-05-25 14:14                   ` Maxim Levitsky
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-05-25 14:13 UTC (permalink / raw)
  To: Maxim Levitsky, Like Xu
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson, Vitaly Kuznetsov

On 5/25/22 16:12, Maxim Levitsky wrote:
> FYI, this patch series also break 'msr' test in kvm-unit tests.
> (kvm/queue of today, and master of the kvm-unit-tests repo)
> 
> The test tries to set the MSR_IA32_MISC_ENABLE to 0x400c51889 and gets #GP.
> 
> 
> Commenting this out, gets rid of #GP, but test still fails with unexpected result
> 
> 		if (!msr_info->host_initiated &&
> 		    ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
> 			return 1;
> 
> 
> 
> 
> It is very possible that the test is broken, I'll check this later.

Yes, for that I've sent a patch already:

https://lore.kernel.org/kvm/20220520183207.7952-1-pbonzini@redhat.com/

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS
  2022-05-25 14:13                 ` Paolo Bonzini
@ 2022-05-25 14:14                   ` Maxim Levitsky
  0 siblings, 0 replies; 31+ messages in thread
From: Maxim Levitsky @ 2022-05-25 14:14 UTC (permalink / raw)
  To: Paolo Bonzini, Like Xu
  Cc: Sean Christopherson, Wanpeng Li, Joerg Roedel, linux-kernel, kvm,
	Jim Mattson, Vitaly Kuznetsov

On Wed, 2022-05-25 at 16:13 +0200, Paolo Bonzini wrote:
> On 5/25/22 16:12, Maxim Levitsky wrote:
> > FYI, this patch series also break 'msr' test in kvm-unit tests.
> > (kvm/queue of today, and master of the kvm-unit-tests repo)
> > 
> > The test tries to set the MSR_IA32_MISC_ENABLE to 0x400c51889 and gets #GP.
> > 
> > 
> > Commenting this out, gets rid of #GP, but test still fails with unexpected result
> > 
> > 		if (!msr_info->host_initiated &&
> > 		    ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
> > 			return 1;
> > 
> > 
> > 
> > 
> > It is very possible that the test is broken, I'll check this later.
> 
> Yes, for that I've sent a patch already:
> 
> https://lore.kernel.org/kvm/20220520183207.7952-1-pbonzini@redhat.com/
> 
> Paolo
> 

Thank you very much!


Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-05-25 14:15 UTC | newest]

Thread overview: 31+ messages
2022-04-11 10:19 [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 01/17] perf/x86/intel: Add EPT-Friendly PEBS for Ice Lake Server Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 02/17] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 03/17] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 04/17] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 05/17] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 06/17] x86/perf/core: Add pebs_capable to store valid PEBS_COUNTER_MASK value Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 07/17] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 08/17] KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 09/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter Like Xu
2022-05-13  8:57   ` Like Xu
2022-05-13  9:26   ` [PATCH v13 " Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 10/17] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to support guest DS Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 11/17] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 12/17] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 13/17] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 14/17] KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 15/17] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 16/17] KVM: x86/cpuid: Refactor host/guest CPU model consistency check Like Xu
2022-04-11 10:19 ` [PATCH RESEND v12 17/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Like Xu
2022-05-10 16:55 ` [PATCH RESEND v12 00/17] KVM: x86/pmu: Add basic support to enable guest PEBS via DS Paolo Bonzini
2022-05-19 12:14 ` Vitaly Kuznetsov
2022-05-19 13:31   ` Like Xu
2022-05-19 13:50     ` Like Xu
2022-05-19 14:46       ` Vitaly Kuznetsov
2022-05-25  7:56         ` Like Xu
2022-05-25  8:14           ` Paolo Bonzini
2022-05-25  8:32             ` Like Xu
2022-05-25 14:12               ` Maxim Levitsky
2022-05-25 14:13                 ` Paolo Bonzini
2022-05-25 14:14                   ` Maxim Levitsky
