linux-kernel.vger.kernel.org archive mirror
* [PATCH v12 00/11] Guest Last Branch Recording Enabling
@ 2020-06-13  8:09 Like Xu
  2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
                   ` (14 more replies)
  0 siblings, 15 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

Hi all,

Please help review this new version for the kernel 5.9 release.

Now, you may apply the last two qemu-devel patches to the upstream
qemu and try the guest LBR feature with the '-cpu host' command line.
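
For example (the disk image path is a placeholder):

  qemu-system-x86_64 -enable-kvm -cpu host -smp 4 -m 4G /path/to/guest.img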

v11->v12 Changelog:
- apply "Signed-off-by" form PeterZ and his codes for the perf subsystem;
- add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
- refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
- reorder "perf_event_attr" fields according to how they're declared;
- replace event_is_oncpu() with "event->state" check;
- make LBR emualtion specific to vmx rather than x86 generic;
- move pass-through LBR code to vmx.c instead of pmu_intel.c;
- add vmx_lbr_en/disable_passthrough layer to make code readable;
- rewrite pmu availability check with vmx_passthrough_lbr_msrs();

You may check more details in each commit.

Previous:
https://lore.kernel.org/kvm/20200514083054.62538-1-like.xu@linux.intel.com/

---

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. This patch
series enables this feature for KVM guests.

Userspace can configure whether LBR is enabled for each guest via the
MSR_IA32_PERF_CAPABILITIES msr. As a first step, a guest can only
enable the LBR feature if its cpu model is the same as the host's,
since the LBR feature is still model specific.
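
For reference, a minimal userspace sketch (vcpu_fd, host_perf_cap and
handle_error() are assumed to exist; <linux/kvm.h> provides the structs):

	#define MSR_IA32_PERF_CAPABILITIES 0x345

	struct {
		struct kvm_msrs hdr;
		struct kvm_msr_entry entry;
	} msrs = {
		.hdr.nmsrs   = 1,
		.entry.index = MSR_IA32_PERF_CAPABILITIES,
		/* host value, including the LBR format bits [5:0] */
		.entry.data  = host_perf_cap,
	};

	if (ioctl(vcpu_fd, KVM_SET_MSRS, &msrs) != 1)
		handle_error();	/* e.g. rejected for LBR format mismatch */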

If it's enabled on the guest, the guest LBR driver accesses the LBR
msrs (including IA32_DEBUGCTLMSR and the records msrs) just as the host
does. The first guest access to the LBR-related msrs is always
intercepted. The KVM trap then creates a special LBR event (called the
guest LBR event) which enables the callstack mode and to which no
hardware counter is assigned. The host perf enables and schedules this
event as usual.

The guest's first access to an LBR register is trapped to KVM, which
creates a guest LBR perf event. It's a regular LBR perf event which gets
the LBR facility assigned from the perf subsystem. Once that succeeds,
the LBR stack msrs are passed through to the guest for efficient access.
However, if another host LBR event comes in and takes over the LBR
facility, the LBR msrs are made interceptible again, and following
guest accesses to the LBR msrs will be trapped and return meaningless
values.

Because saving/restoring tens of LBR msrs (e.g. 32 LBR stack entries)
on each VMX transition would add excessive overhead to the already
frequent transitions, the guest LBR event instead saves/restores the
LBR stack msrs during host context switches, with the help of the
native LBR event callstack mechanism, including the LBR_SELECT msr.

If the guest no longer accesses the LBR-related msrs within a scheduling
time slice and the LBR enable bit is unset, the vPMU releases its guest
LBR event as it would a normal event of an unused vPMC, and the
pass-through state of the LBR stack msrs is canceled.

---

LBR testcase:
echo 1 > /proc/sys/kernel/watchdog
echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
./perf record -b ./br_instr a

- Perf report on the host:
Samples: 72K of event 'cycles', Event count (approx.): 72512
Overhead  Command   Source Shared Object           Source Symbol                           Target Symbol                           Basic Block Cycles
  12.12%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           1
  11.05%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             5
   8.81%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             4
   5.04%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           20
   4.92%  br_instr  br_instr                       [.] lfsr_cond                           [.] cmp_end                             6
   4.88%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           6
   4.58%  br_instr  br_instr                       [.] cmp_end                             [.] lfsr_cond                           5

- Perf report on the guest:
Samples: 92K of event 'cycles', Event count (approx.): 92544
Overhead  Command   Source Shared Object  Source Symbol                                   Target Symbol                                   Basic Block Cycles
  12.03%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   1
  11.09%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     5
   8.57%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     4
   5.08%  br_instr  br_instr              [.] lfsr_cond                                   [.] cmp_end                                     6
   5.06%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   20
   4.87%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   6
   4.70%  br_instr  br_instr              [.] cmp_end                                     [.] lfsr_cond                                   5

Conclusion: the profiling results on the guest are similar to those on the host.

Like Xu (10):
  perf/x86/core: Refactor hw->idx checks and cleanup
  perf/x86/lbr: Add interface to get LBR information
  perf/x86: Add constraint to create guest LBR event without hw counter
  perf/x86: Keep LBR records unchanged in host context for guest usage
  KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
  KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
  KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
  KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
  KVM: vmx/pmu: Release guest LBR event via lazy release mechanism

Wei Wang (1):
  perf/x86: Fix variable types for LBR registers

Qemu-devel:
  target/i386: define a MSR based feature word - FEAT_PERF_CAPABILITIES
  target/i386: add -cpu,lbr=true support to enable guest LBR

 arch/x86/events/core.c            |  26 +--
 arch/x86/events/intel/core.c      | 109 ++++++++-----
 arch/x86/events/intel/lbr.c       |  51 +++++-
 arch/x86/events/perf_event.h      |   8 +-
 arch/x86/include/asm/perf_event.h |  34 +++-
 arch/x86/kvm/pmu.c                |  12 +-
 arch/x86/kvm/pmu.h                |   5 +
 arch/x86/kvm/vmx/capabilities.h   |  23 ++-
 arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
 arch/x86/kvm/vmx/vmx.h            |  17 ++
 arch/x86/kvm/x86.c                |  13 --
 12 files changed, 559 insertions(+), 78 deletions(-)

-- 
2.21.3


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Wei Wang
  2020-11-09  6:34   ` [PATCH v12 01/11] " Andi Kleen
  2020-06-13  8:09 ` [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup Like Xu
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm

From: Wei Wang <wei.w.wang@intel.com>

The MSR variable type can be 'unsigned int', which uses less memory than
the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr won't
be a negative number, so make it 'unsigned int' as well.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 arch/x86/events/perf_event.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index e17a3d8a47ed..eb37f6c43c96 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -673,8 +673,8 @@ struct x86_pmu {
 	/*
 	 * Intel LBR
 	 */
-	unsigned long	lbr_tos, lbr_from, lbr_to; /* MSR base regs       */
-	int		lbr_nr;			   /* hardware stack size */
+	unsigned int	lbr_tos, lbr_from, lbr_to,
+			lbr_nr;			   /* LBR base regs and size */
 	u64		lbr_sel_mask;		   /* LBR_SELECT valid bits */
 	const int	*lbr_sel_map;		   /* lbr_select mappings */
 	bool		lbr_double_abort;	   /* duplicated lbr aborts */
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
  2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Like Xu
  2020-06-13  8:09 ` [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information Like Xu
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

For intel_pmu_en/disable_event(), reorder the branch checks on hw->idx
and sort them by probability: gp, fixed, bts, others.

Clean up x86_assign_hw_event() by converting multiple if-else
statements to a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period(), it's
more generic to replace the "idx == INTEL_PMC_IDX_FIXED_BTS" check with
'!hwc->event_base', because event_base should be 0 for all non-gp/fixed
cases.

Wrap the related bit operations into intel_set/clear_masks() and make
the main path cleaner and more readable.

No functional changes.

Original-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/events/core.c       | 25 +++++++----
 arch/x86/events/intel/core.c | 85 +++++++++++++++++++-----------------
 2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 9e63ee50b19a..9a5056472b67 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -71,10 +71,9 @@ u64 x86_perf_event_update(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int shift = 64 - x86_pmu.cntval_bits;
 	u64 prev_raw_count, new_raw_count;
-	int idx = hwc->idx;
 	u64 delta;
 
-	if (idx == INTEL_PMC_IDX_FIXED_BTS)
+	if (unlikely(!hwc->event_base))
 		return 0;
 
 	/*
@@ -1097,22 +1096,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
 				struct cpu_hw_events *cpuc, int i)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	int idx;
 
-	hwc->idx = cpuc->assign[i];
+	idx = hwc->idx = cpuc->assign[i];
 	hwc->last_cpu = smp_processor_id();
 	hwc->last_tag = ++cpuc->tags[i];
 
-	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
+	switch (hwc->idx) {
+	case INTEL_PMC_IDX_FIXED_BTS:
 		hwc->config_base = 0;
 		hwc->event_base	= 0;
-	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
+		break;
+
+	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
 		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
-		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
-		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
-	} else {
+		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+				(idx - INTEL_PMC_IDX_FIXED);
+		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+		break;
+
+	default:
 		hwc->config_base = x86_pmu_config_addr(hwc->idx);
 		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
 		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
+		break;
 	}
 }
 
@@ -1233,7 +1240,7 @@ int x86_perf_event_set_period(struct perf_event *event)
 	s64 period = hwc->sample_period;
 	int ret = 0, idx = hwc->idx;
 
-	if (idx == INTEL_PMC_IDX_FIXED_BTS)
+	if (unlikely(!hwc->event_base))
 		return 0;
 
 	/*
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ca35c8b5ee10..8dac4c61bf76 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2136,8 +2136,35 @@ static inline void intel_pmu_ack_status(u64 ack)
 	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
 }
 
-static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+	return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
+static inline void intel_set_masks(struct perf_event *event, int idx)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	if (event->attr.exclude_host)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	if (event->attr.exclude_guest)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	if (event_is_checkpointed(event))
+		__set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static inline void intel_clear_masks(struct perf_event *event, int idx)
 {
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static void intel_pmu_disable_fixed(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
 	u64 ctrl_val, mask;
 
@@ -2148,31 +2175,22 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
 	wrmsrl(hwc->config_base, ctrl_val);
 }
 
-static inline bool event_is_checkpointed(struct perf_event *event)
-{
-	return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
-}
-
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	int idx = hwc->idx;
 
-	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		intel_clear_masks(event, idx);
+		x86_pmu_disable_event(event);
+	} else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+		intel_clear_masks(event, idx);
+		intel_pmu_disable_fixed(event);
+	} else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
 		intel_pmu_disable_bts();
 		intel_pmu_drain_bts_buffer();
-		return;
 	}
 
-	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
-
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
-		intel_pmu_disable_fixed(hwc);
-	else
-		x86_pmu_disable_event(event);
-
 	/*
 	 * Needs to be called after x86_pmu_disable_event,
 	 * so we don't trigger the event without PEBS bit set.
@@ -2238,33 +2256,22 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-
-	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
-		if (!__this_cpu_read(cpu_hw_events.enabled))
-			return;
-
-		intel_pmu_enable_bts(hwc->config);
-		return;
-	}
-
-	if (event->attr.exclude_host)
-		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
-	if (event->attr.exclude_guest)
-		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
-
-	if (unlikely(event_is_checkpointed(event)))
-		cpuc->intel_cp_status |= (1ull << hwc->idx);
+	int idx = hwc->idx;
 
 	if (unlikely(event->attr.precise_ip))
 		intel_pmu_pebs_enable(event);
 
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		intel_set_masks(event, idx);
+		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+	} else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+		intel_set_masks(event, idx);
 		intel_pmu_enable_fixed(event);
-		return;
+	} else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
+		if (!__this_cpu_read(cpu_hw_events.enabled))
+			return;
+		intel_pmu_enable_bts(hwc->config);
 	}
-
-	__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
 }
 
 static void intel_pmu_add_event(struct perf_event *event)
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
  2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
  2020-06-13  8:09 ` [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Like Xu
  2020-06-13  8:09 ` [PATCH v12 04/11] perf/x86: Add constraint to create guest LBR event without hw counter Like Xu
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

The LBR records msrs are model specific. The perf subsystem has already
obtained the base addresses of the LBR records based on the cpu model.

Therefore, add an interface that allows callers outside the perf
subsystem to obtain this LBR information. It's useful for hypervisors
that emulate the LBR feature for guests with less code.
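
For reference, an out-of-perf caller (e.g. a hypervisor module) might
use it like this minimal sketch (the pr_debug() consumer is
illustrative only):

	struct x86_pmu_lbr lbr;

	if (!x86_perf_get_lbr(&lbr))
		pr_debug("LBR: %u entries, from 0x%x, to 0x%x, info 0x%x\n",
			 lbr.nr, lbr.from, lbr.to, lbr.info);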

Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/events/intel/lbr.c       | 20 ++++++++++++++++++++
 arch/x86/include/asm/perf_event.h | 12 ++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 65113b16804a..2ed3f2a51bdf 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1343,3 +1343,23 @@ void intel_pmu_lbr_init_knl(void)
 	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
 		x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;
 }
+
+/**
+ * x86_perf_get_lbr - get the LBR records information
+ *
+ * @lbr: the caller's memory to store the LBR records information
+ *
+ * Returns: 0 indicates the LBR info has been successfully obtained
+ */
+int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
+{
+	int lbr_fmt = x86_pmu.intel_cap.lbr_format;
+
+	lbr->nr = x86_pmu.lbr_nr;
+	lbr->from = x86_pmu.lbr_from;
+	lbr->to = x86_pmu.lbr_to;
+	lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? MSR_LBR_INFO_0 : 0;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(x86_perf_get_lbr);
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index e855e9cf2c37..5d2c30f0df02 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -333,6 +333,13 @@ struct perf_guest_switch_msr {
 	u64 host, guest;
 };
 
+struct x86_pmu_lbr {
+	unsigned int	nr;
+	unsigned int	from;
+	unsigned int	to;
+	unsigned int	info;
+};
+
 extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
 extern void perf_check_microcode(void);
 extern int x86_perf_rdpmc_index(struct perf_event *event);
@@ -348,12 +355,17 @@ static inline void perf_check_microcode(void) { }
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
 #else
 static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {
 	*nr = 0;
 	return NULL;
 }
+static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
+{
+	return -1;
+}
 #endif
 
 #ifdef CONFIG_CPU_SUP_INTEL
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 04/11] perf/x86: Add constraint to create guest LBR event without hw counter
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (2 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [PATCH v12 05/11] perf/x86: Keep LBR records unchanged in host context for guest usage Like Xu
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

The hypervisor may request the perf subsystem to schedule a time window
to directly access the LBR records msrs for its own use. Normally, it
would create a guest LBR event with callstack mode enabled, which is
scheduled along with other ordinary LBR events on the host but in an
exclusive way.

To avoid wasting a counter for the guest LBR event, perf tracks its
hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it a fake VLBR counter
with the help of the new vlbr_constraint. As with the BTS event, there
is actually no hardware counter assigned to the guest LBR event.
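
For reference, the hypervisor creates such a counter-less event roughly
as below (condensed from patch 08 of this series, where the full
version lands):

	struct perf_event_attr attr = {
		.type = PERF_TYPE_RAW,
		.size = sizeof(attr),
		.config = INTEL_FIXED_VLBR_EVENT, /* matches vlbr_constraint */
		.sample_type = PERF_SAMPLE_BRANCH_STACK,
		.pinned = true,
		.exclude_host = true,
		.branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
				      PERF_SAMPLE_BRANCH_USER,
	};
	struct perf_event *event;

	event = perf_event_create_kernel_counter(&attr, -1, current,
						 NULL, NULL);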

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200514083054.62538-5-like.xu@linux.intel.com
---
 arch/x86/events/core.c            |  1 +
 arch/x86/events/intel/core.c      | 18 ++++++++++++++++++
 arch/x86/events/intel/lbr.c       |  4 ++++
 arch/x86/events/perf_event.h      |  1 +
 arch/x86/include/asm/perf_event.h | 22 +++++++++++++++++++++-
 5 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 9a5056472b67..1996f2ed7c83 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1104,6 +1104,7 @@ static inline void x86_assign_hw_event(struct perf_event *event,
 
 	switch (hwc->idx) {
 	case INTEL_PMC_IDX_FIXED_BTS:
+	case INTEL_PMC_IDX_FIXED_VLBR:
 		hwc->config_base = 0;
 		hwc->event_base	= 0;
 		break;
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 8dac4c61bf76..51e1fba7b1d1 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2621,6 +2621,20 @@ intel_bts_constraints(struct perf_event *event)
 	return NULL;
 }
 
+/*
+ * Note: matches a fake event, like Fixed2.
+ */
+static struct event_constraint *
+intel_vlbr_constraints(struct perf_event *event)
+{
+	struct event_constraint *c = &vlbr_constraint;
+
+	if (unlikely(constraint_match(c, event->hw.config)))
+		return c;
+
+	return NULL;
+}
+
 static int intel_alt_er(int idx, u64 config)
 {
 	int alt_idx = idx;
@@ -2811,6 +2825,10 @@ __intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 {
 	struct event_constraint *c;
 
+	c = intel_vlbr_constraints(event);
+	if (c)
+		return c;
+
 	c = intel_bts_constraints(event);
 	if (c)
 		return c;
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 2ed3f2a51bdf..d285d26c1578 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1363,3 +1363,7 @@ int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(x86_perf_get_lbr);
+
+struct event_constraint vlbr_constraint =
+	FIXED_EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT,
+			       (INTEL_PMC_IDX_FIXED_VLBR - INTEL_PMC_IDX_FIXED));
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index eb37f6c43c96..77a6dd66bd9a 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -990,6 +990,7 @@ void release_ds_buffers(void);
 void reserve_ds_buffers(void);
 
 extern struct event_constraint bts_constraint;
+extern struct event_constraint vlbr_constraint;
 
 void intel_pmu_enable_bts(u64 config);
 
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 5d2c30f0df02..2df707311d17 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -192,9 +192,29 @@ struct x86_pmu_capability {
 #define GLOBAL_STATUS_UNC_OVF				BIT_ULL(61)
 #define GLOBAL_STATUS_ASIF				BIT_ULL(60)
 #define GLOBAL_STATUS_COUNTERS_FROZEN			BIT_ULL(59)
-#define GLOBAL_STATUS_LBRS_FROZEN			BIT_ULL(58)
+#define GLOBAL_STATUS_LBRS_FROZEN_BIT			58
+#define GLOBAL_STATUS_LBRS_FROZEN			BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT)
 #define GLOBAL_STATUS_TRACE_TOPAPMI			BIT_ULL(55)
 
+/*
+ * We model guest LBR event tracing as another fixed-mode PMC like BTS.
+ *
+ * We choose bit 58 because it's used to indicate LBR stack frozen state
+ * for architectural perfmon v4, also we unconditionally mask that bit in
+ * the handle_pmi_common(), so it'll never be set in the overflow handling.
+ *
+ * With this fake counter assigned, the guest LBR event user (such as KVM)
+ * can program the LBR registers on its own, and we don't actually do anything
+ * with them in the host context.
+ */
+#define INTEL_PMC_IDX_FIXED_VLBR	(GLOBAL_STATUS_LBRS_FROZEN_BIT)
+
+/*
+ * Pseudo-encoding the guest LBR event as event=0x00,umask=0x1b,
+ * since it would claim bit 58 which is effectively Fixed26.
+ */
+#define INTEL_FIXED_VLBR_EVENT	0x1b00
+
 /*
  * Adaptive PEBS v4
  */
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 05/11] perf/x86: Keep LBR records unchanged in host context for guest usage
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (3 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 04/11] perf/x86: Add constraint to create guest LBR event without hw counter Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES Like Xu
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

When a guest wants to use the LBR registers, its hypervisor creates a guest
LBR event and lets the host perf schedule it. The LBR records msrs are
accessible to the guest when its guest LBR event is scheduled on
by the perf subsystem.

Before this event is scheduled out, we should avoid host changes to
IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise, some unexpected branch
operations may interfere with guest behavior, pollute the LBR records, and
even leak host branches to the guest. In addition, host-side reads of
these msrs are avoided as well.

To ensure that guest LBR records are not lost during a context switch,
the guest LBR event enables the callstack mode, which naturally
saves/restores unread guest LBR records with the help of
intel_pmu_lbr_sched_task().

However, the guest may change LBR_SELECT for its own use, and the host
LBR event doesn't save/restore it. To ensure that we don't lose the guest
LBR_SELECT value when the guest LBR event is running, the vlbr_constraint
is bound up with a new constraint flag PERF_X86_EVENT_LBR_SELECT.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200514083054.62538-6-like.xu@linux.intel.com
---
 arch/x86/events/intel/core.c |  6 ++++--
 arch/x86/events/intel/lbr.c  | 31 ++++++++++++++++++++++++++-----
 arch/x86/events/perf_event.h |  3 +++
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 51e1fba7b1d1..582ddff9a359 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2189,7 +2189,8 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	} else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
 		intel_pmu_disable_bts();
 		intel_pmu_drain_bts_buffer();
-	}
+	} else if (idx == INTEL_PMC_IDX_FIXED_VLBR)
+		intel_clear_masks(event, idx);
 
 	/*
 	 * Needs to be called after x86_pmu_disable_event,
@@ -2271,7 +2272,8 @@ static void intel_pmu_enable_event(struct perf_event *event)
 		if (!__this_cpu_read(cpu_hw_events.enabled))
 			return;
 		intel_pmu_enable_bts(hwc->config);
-	}
+	} else if (idx == INTEL_PMC_IDX_FIXED_VLBR)
+		intel_set_masks(event, idx);
 }
 
 static void intel_pmu_add_event(struct perf_event *event)
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index d285d26c1578..d03de7539957 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -383,6 +383,9 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 
 	wrmsrl(x86_pmu.lbr_tos, tos);
 	task_ctx->lbr_stack_state = LBR_NONE;
+
+	if (cpuc->lbr_select)
+		wrmsrl(MSR_LBR_SELECT, task_ctx->lbr_sel);
 }
 
 static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
@@ -415,6 +418,9 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 
 	cpuc->last_task_ctx = task_ctx;
 	cpuc->last_log_id = ++task_ctx->log_id;
+
+	if (cpuc->lbr_select)
+		rdmsrl(MSR_LBR_SELECT, task_ctx->lbr_sel);
 }
 
 void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev,
@@ -485,6 +491,9 @@ void intel_pmu_lbr_add(struct perf_event *event)
 	if (!x86_pmu.lbr_nr)
 		return;
 
+	if (event->hw.flags & PERF_X86_EVENT_LBR_SELECT)
+		cpuc->lbr_select = 1;
+
 	cpuc->br_sel = event->hw.branch_reg.reg;
 
 	if (branch_user_callstack(cpuc->br_sel) && event->ctx->task_ctx_data) {
@@ -532,6 +541,9 @@ void intel_pmu_lbr_del(struct perf_event *event)
 		task_ctx->lbr_callstack_users--;
 	}
 
+	if (event->hw.flags & PERF_X86_EVENT_LBR_SELECT)
+		cpuc->lbr_select = 0;
+
 	if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip > 0)
 		cpuc->lbr_pebs_users--;
 	cpuc->lbr_users--;
@@ -540,11 +552,19 @@ void intel_pmu_lbr_del(struct perf_event *event)
 	perf_sched_cb_dec(event->ctx->pmu);
 }
 
+static inline bool vlbr_exclude_host(void)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	return test_bit(INTEL_PMC_IDX_FIXED_VLBR,
+		(unsigned long *)&cpuc->intel_ctrl_guest_mask);
+}
+
 void intel_pmu_lbr_enable_all(bool pmi)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
-	if (cpuc->lbr_users)
+	if (cpuc->lbr_users && !vlbr_exclude_host())
 		__intel_pmu_lbr_enable(pmi);
 }
 
@@ -552,7 +572,7 @@ void intel_pmu_lbr_disable_all(void)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
-	if (cpuc->lbr_users)
+	if (cpuc->lbr_users && !vlbr_exclude_host())
 		__intel_pmu_lbr_disable();
 }
 
@@ -694,7 +714,8 @@ void intel_pmu_lbr_read(void)
 	 * This could be smarter and actually check the event,
 	 * but this simple approach seems to work for now.
 	 */
-	if (!cpuc->lbr_users || cpuc->lbr_users == cpuc->lbr_pebs_users)
+	if (!cpuc->lbr_users || vlbr_exclude_host() ||
+	    cpuc->lbr_users == cpuc->lbr_pebs_users)
 		return;
 
 	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_32)
@@ -1365,5 +1386,5 @@ int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 EXPORT_SYMBOL_GPL(x86_perf_get_lbr);
 
 struct event_constraint vlbr_constraint =
-	FIXED_EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT,
-			       (INTEL_PMC_IDX_FIXED_VLBR - INTEL_PMC_IDX_FIXED));
+	__EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT, (1ULL << INTEL_PMC_IDX_FIXED_VLBR),
+			  FIXED_EVENT_FLAGS, 1, 0, PERF_X86_EVENT_LBR_SELECT);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 77a6dd66bd9a..81475963df99 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -78,6 +78,7 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode)
 #define PERF_X86_EVENT_LARGE_PEBS	0x0400 /* use large PEBS */
 #define PERF_X86_EVENT_PEBS_VIA_PT	0x0800 /* use PT buffer for PEBS */
 #define PERF_X86_EVENT_PAIR		0x1000 /* Large Increment per Cycle */
+#define PERF_X86_EVENT_LBR_SELECT	0x2000 /* Save/Restore MSR_LBR_SELECT */
 
 struct amd_nb {
 	int nb_id;  /* NorthBridge id */
@@ -237,6 +238,7 @@ struct cpu_hw_events {
 	u64				br_sel;
 	struct x86_perf_task_context	*last_task_ctx;
 	int				last_log_id;
+	int				lbr_select;
 
 	/*
 	 * Intel host/guest exclude bits
@@ -722,6 +724,7 @@ struct x86_perf_task_context {
 	u64 lbr_from[MAX_LBR_ENTRIES];
 	u64 lbr_to[MAX_LBR_ENTRIES];
 	u64 lbr_info[MAX_LBR_ENTRIES];
+	u64 lbr_sel;
 	int tos;
 	int valid_lbrs;
 	int lbr_callstack_users;
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (4 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 05/11] perf/x86: Keep LBR records unchanged in host context for guest usage Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-07-08 13:36   ` Andi Kleen
  2020-06-13  8:09 ` [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation Like Xu
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

Bits [0, 5] of the read-only MSR_IA32_PERF_CAPABILITIES tell about the
format of the records stored in the LBR stack. Userspace can expose LBR
to the guest when the host supports LBR, the exact supported LBR format
value has been initialized into MSR_IA32_PERF_CAPABILITIES, and the
vcpu model is compatible.
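
Concretely, the format field occupies the low six bits of the
capability value; a minimal decode sketch (the helper name is
illustrative, the mask is the PMU_CAP_LBR_FMT added below):

	static inline u64 lbr_format(u64 perf_capabilities)
	{
		return perf_capabilities & PMU_CAP_LBR_FMT; /* bits [5:0] */
	}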

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/vmx/capabilities.h | 11 ++++++-
 arch/x86/kvm/vmx/pmu_intel.c    | 52 +++++++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/vmx.h          |  6 ++++
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 4bbd8b448d22..b633a90320ee 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -19,6 +19,7 @@ extern int __read_mostly pt_mode;
 #define PT_MODE_HOST_GUEST	1
 
 #define PMU_CAP_FW_WRITES	(1ULL << 13)
+#define PMU_CAP_LBR_FMT		0x3f
 
 struct nested_vmx_msrs {
 	/*
@@ -375,7 +376,15 @@ static inline u64 vmx_get_perf_capabilities(void)
 	 * Since counters are virtualized, KVM would support full
 	 * width counting unconditionally, even if the host lacks it.
 	 */
-	return PMU_CAP_FW_WRITES;
+	u64 perf_cap = 0;
+
+	if (boot_cpu_has(X86_FEATURE_PDCM))
+		rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+
+	/* KVM only exposes the LBR format bits from the host value. */
+	perf_cap &= PMU_CAP_LBR_FMT;
+
+	return PMU_CAP_FW_WRITES | perf_cap;
 }
 
 #endif /* __KVM_X86_VMX_CAPS_H */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index bdcce65c7a1d..a953c7d633f6 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -168,6 +168,13 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 	return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
 }
 
+static inline bool lbr_is_enabled(struct kvm_vcpu *vcpu)
+{
+	struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr;
+
+	return lbr->nr && (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT);
+}
+
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -251,6 +258,30 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return 1;
 }
 
+static inline bool lbr_fmt_is_matched(u64 data)
+{
+	return (data & PMU_CAP_LBR_FMT) ==
+		(vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT);
+}
+
+static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	if (pmu->version < 2)
+		return false;
+
+	/*
+	 * As a first step, a guest could only enable LBR feature if its cpu
+	 * model is the same as the host because the LBR registers would
+	 * be pass-through to the guest and they're model specific.
+	 */
+	if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
+		return false;
+
+	return true;
+}
+
 static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -295,6 +326,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
 			(data & ~vmx_get_perf_capabilities()) : data)
 			return 1;
+		if (data & PMU_CAP_LBR_FMT) {
+			if (!lbr_fmt_is_matched(data))
+				return 1;
+			if (!lbr_is_compatible(vcpu))
+				return 1;
+			if (x86_perf_get_lbr(&to_vmx(vcpu)->lbr_desc.lbr))
+				return 1;
+		}
 		vcpu->arch.perf_capabilities = data;
 		return 0;
 	default:
@@ -337,6 +376,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	struct kvm_cpuid_entry2 *entry;
 	union cpuid10_eax eax;
 	union cpuid10_edx edx;
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
 
 	pmu->nr_arch_gp_counters = 0;
 	pmu->nr_arch_fixed_counters = 0;
@@ -344,7 +384,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
 	pmu->version = 0;
 	pmu->reserved_bits = 0xffffffff00200000ull;
-	vcpu->arch.perf_capabilities = 0;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry)
@@ -357,8 +396,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		return;
 
 	perf_get_x86_pmu_capability(&x86_pmu);
-	if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
-		vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();
 
 	pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
 					 x86_pmu.num_counters_gp);
@@ -397,6 +434,10 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
+	if ((vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) &&
+	    x86_perf_get_lbr(&lbr_desc->lbr))
+		vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT;
+
 	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
 }
 
@@ -404,6 +445,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 {
 	int i;
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
 
 	for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) {
 		pmu->gp_counters[i].type = KVM_PMC_GP;
@@ -418,6 +460,10 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 		pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
 		pmu->fixed_counters[i].current_config = 0;
 	}
+
+	vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
+		vmx_get_perf_capabilities() : 0;
+	lbr_desc->lbr.nr = 0;
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 8a83b5edc820..ef24338b194d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -91,6 +91,11 @@ struct pt_desc {
 	struct pt_ctx guest;
 };
 
+struct lbr_desc {
+	/* Basic information about LBR records. */
+	struct x86_pmu_lbr lbr;
+};
+
 /*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -302,6 +307,7 @@ struct vcpu_vmx {
 	u64 ept_pointer;
 
 	struct pt_desc pt_desc;
+	struct lbr_desc lbr_desc;
 };
 
 enum ept_pointers_status {
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (5 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  9:14   ` Xiaoyao Li
  2020-06-13  8:09 ` [PATCH v12 08/11] KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled Like Xu
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

When the LBR feature is reported by vmx_get_perf_capabilities(), the
LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked.

The debugctl msr is handled separately in vmx/svm and the two are not
completely identical, hence remove the common msr handling code.
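
For reference, a guest enables branch recording roughly like this
sketch (kernel-style msr accessors assumed; the freeze-on-PMI bit is
emulated later in this series):

	u64 debugctl;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
	debugctl |= DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);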

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++
 arch/x86/kvm/vmx/pmu_intel.c    | 19 +++++++++++++++++++
 arch/x86/kvm/x86.c              | 13 -------------
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index b633a90320ee..f6fcfabb1026 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -21,6 +21,8 @@ extern int __read_mostly pt_mode;
 #define PMU_CAP_FW_WRITES	(1ULL << 13)
 #define PMU_CAP_LBR_FMT		0x3f
 
+#define DEBUGCTLMSR_LBR_MASK		(DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
+
 struct nested_vmx_msrs {
 	/*
 	 * We only store the "true" versions of the VMX capability MSRs. We
@@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void)
 	return PMU_CAP_FW_WRITES | perf_cap;
 }
 
+static inline u64 vmx_get_supported_debugctl(void)
+{
+	u64 val = 0;
+
+	if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)
+		val |= DEBUGCTLMSR_LBR_MASK;
+
+	return val;
+}
+
 #endif /* __KVM_X86_VMX_CAPS_H */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a953c7d633f6..d92e95b64c74 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		ret = pmu->version > 1;
 		break;
+	case MSR_IA32_DEBUGCTLMSR:
 	case MSR_IA32_PERF_CAPABILITIES:
 		ret = 1;
 		break;
@@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		msr_info->data = vcpu->arch.perf_capabilities;
 		return 0;
+	case MSR_IA32_DEBUGCTLMSR:
+		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu)
+{
+	u64 debugctlmsr = vmx_get_supported_debugctl();
+
+	if (!lbr_is_enabled(vcpu))
+		debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK;
+
+	return debugctlmsr;
+}
+
 static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		vcpu->arch.perf_capabilities = data;
 		return 0;
+	case MSR_IA32_DEBUGCTLMSR:
+		if (data & ~vcpu_get_supported_debugctl(vcpu))
+			return 1;
+		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 00c88c2f34e4..56f275eb4554 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		}
 		break;
-	case MSR_IA32_DEBUGCTLMSR:
-		if (!data) {
-			/* We support the non-activated case already */
-			break;
-		} else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {
-			/* Values other than LBR and BTF are vendor-specific,
-			   thus reserved and should throw a #GP */
-			return 1;
-		}
-		vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
-			    __func__, data);
-		break;
 	case 0x200 ... 0x2ff:
 		return kvm_mtrr_set_msr(vcpu, msr, data);
 	case MSR_IA32_APICBASE:
@@ -3120,7 +3108,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	switch (msr_info->index) {
 	case MSR_IA32_PLATFORM_ID:
 	case MSR_IA32_EBL_CR_POWERON:
-	case MSR_IA32_DEBUGCTLMSR:
 	case MSR_IA32_LASTBRANCHFROMIP:
 	case MSR_IA32_LASTBRANCHTOIP:
 	case MSR_IA32_LASTINTFROMIP:
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 08/11] KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (6 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [PATCH v12 09/11] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI Like Xu
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

The guest's first access to the LBR-related msrs (including DEBUGCTLMSR
and the records msrs) is always intercepted. The KVM handler then creates
a guest LBR event which enables the callstack mode and to which no
hardware counter is assigned. The host perf schedules and enables this
event as usual, but in an exclusive way.

If the guest LBR event is scheduled on with the corresponding vcpu
context, KVM passes through all LBR records msrs to the guest. The LBR
callstack mechanism implemented in the host helps save/restore the guest
LBR records during the event context switches, which saves a lot of
overhead compared to saving/restoring tens of LBR msrs (e.g. 32 LBR
records entries) on the much more frequent VMX transitions.

To avoid having LBR resources reclaimed by a higher priority host event,
KVM always checks the existence and state of the guest LBR event before
vm-entry, as late as possible. A negative result cancels the pass-through
state, which also prevents real register accesses and potential data
leakage. If the host reclaims the LBR between the two checks, the
interception state and the LBR records can be safely preserved thanks to
the native save/restore support of the guest LBR event.

KVM emits a pr_warn() when the LBR hardware is unavailable to the guest
LBR event. The administrator is supposed to remind users that the guest
results may be inaccurate if someone is using LBR to record the
hypervisor on the host side.

The guest LBR event will be released when the vPMU is reset, but soon
the lazy release mechanism will be applied to this event as it is to a
regular vPMC.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 138 ++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c       |  70 +++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h       |   8 ++
 3 files changed, 212 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index d92e95b64c74..a78c440ebff2 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -175,6 +175,24 @@ static inline bool lbr_is_enabled(struct kvm_vcpu *vcpu)
 	return lbr->nr && (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT);
 }
 
+static bool intel_is_valid_lbr_record_msr(struct kvm_vcpu *vcpu, u32 index)
+{
+	struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr;
+	bool ret = false;
+
+	if (!lbr_is_enabled(vcpu))
+		return ret;
+
+	ret =  (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS) ||
+		(index >= lbr->from && index < lbr->from + lbr->nr) ||
+		(index >= lbr->to && index < lbr->to + lbr->nr);
+
+	if (!ret && lbr->info)
+		ret = (index >= lbr->info && index < lbr->info + lbr->nr);
+
+	return ret;
+}
+
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -194,7 +212,8 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
-			get_fixed_pmc(pmu, msr) || get_fw_gp_pmc(pmu, msr);
+			get_fixed_pmc(pmu, msr) || get_fw_gp_pmc(pmu, msr) ||
+			intel_is_valid_lbr_record_msr(vcpu, msr);
 		break;
 	}
 
@@ -213,6 +232,113 @@ static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
 	return pmc;
 }
 
+static int intel_pmu_create_lbr_event(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+	struct perf_event *event;
+
+	/*
+	 * The perf_event_attr is constructed in the most efficient way:
+	 * - set 'pinned = true' to make it task pinned so that if another
+	 *   cpu pinned event reclaims LBR, the event->oncpu will be set to -1;
+	 * - set '.exclude_host = true' to record guest branch behavior;
+	 *
+	 * - set '.config = INTEL_FIXED_VLBR_EVENT' to indicate that host perf
+	 *   should schedule the event without a real HW counter but a fake one;
+	 *   check is_guest_lbr_event() and __intel_get_event_constraints();
+	 *
+	 * - set 'sample_type = PERF_SAMPLE_BRANCH_STACK' and
+	 *   'branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
+	 *   PERF_SAMPLE_BRANCH_USER' to configure it as an LBR callstack
+	 *   event, which helps KVM to save/restore guest LBR records
+	 *   during host context switches and reduces quite a lot of overhead;
+	 *   check branch_user_callstack() and intel_pmu_lbr_sched_task();
+	 */
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_RAW,
+		.size = sizeof(attr),
+		.config = INTEL_FIXED_VLBR_EVENT,
+		.sample_type = PERF_SAMPLE_BRANCH_STACK,
+		.pinned = true,
+		.exclude_host = true,
+		.branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
+					PERF_SAMPLE_BRANCH_USER,
+	};
+
+	if (unlikely(lbr_desc->event))
+		return 0;
+
+	event = perf_event_create_kernel_counter(&attr, -1,
+						current, NULL, NULL);
+	if (IS_ERR(event)) {
+		pr_debug_ratelimited("%s: failed %ld\n",
+					__func__, PTR_ERR(event));
+		return -ENOENT;
+	}
+	lbr_desc->event = event;
+	pmu->event_count++;
+	return 0;
+}
+
+static void intel_pmu_free_lbr_event(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+	struct perf_event *event = lbr_desc->event;
+
+	if (!event)
+		return;
+
+	perf_event_release_kernel(event);
+	lbr_desc->event = NULL;
+	pmu->event_count--;
+}
+
+/*
+ * It's safe to access LBR msrs from the guest when they have not
+ * been passed through, since the host would help restore or reset
+ * the LBR msrs records when the guest LBR event is scheduled in.
+ */
+static bool access_lbr_record_msr(struct kvm_vcpu *vcpu,
+				     struct msr_data *msr_info, bool read)
+{
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+	u32 index = msr_info->index;
+
+	if (!intel_is_valid_lbr_record_msr(vcpu, index))
+		return false;
+
+	if (msr_info->host_initiated)
+		goto dummy;
+
+	if (!lbr_desc->event && !intel_pmu_create_lbr_event(vcpu))
+		goto dummy;
+
+	/*
+	 * Disable irqs to ensure the LBR feature doesn't get reclaimed by the
+	 * host at the time the value is read from the msr; this also avoids
+	 * the host LBR value being leaked to the guest. If LBR has been
+	 * reclaimed, return 0 on guest reads.
+	 */
+	local_irq_disable();
+	if (lbr_desc->event->state == PERF_EVENT_STATE_ACTIVE) {
+		if (read)
+			rdmsrl(index, msr_info->data);
+		else
+			wrmsrl(index, msr_info->data);
+	} else if (read)
+		msr_info->data = 0;
+	local_irq_enable();
+
+	return true;
+
+dummy:
+	if (read)
+		msr_info->data = 0;
+	return true;
+}
+
 static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -256,7 +382,8 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		} else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) {
 			msr_info->data = pmc->eventsel;
 			return 0;
-		}
+		} else if (access_lbr_record_msr(vcpu, msr_info, true))
+			return 0;
 	}
 
 	return 1;
@@ -354,6 +481,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (data & ~vcpu_get_supported_debugctl(vcpu))
 			return 1;
 		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
+		if (!msr_info->host_initiated && !to_vmx(vcpu)->lbr_desc.event)
+			intel_pmu_create_lbr_event(vcpu);
 		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
@@ -382,7 +511,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 				reprogram_gp_counter(pmc, data);
 				return 0;
 			}
-		}
+		} else if (access_lbr_record_msr(vcpu, msr_info, false))
+			return 0;
 	}
 
 	return 1;
@@ -483,6 +613,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 	vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
 		vmx_get_perf_capabilities() : 0;
 	lbr_desc->lbr.nr = 0;
+	lbr_desc->event = NULL;
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
@@ -507,6 +638,7 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 
 	pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status =
 		pmu->global_ovf_ctrl = 0;
+	intel_pmu_free_lbr_event(vcpu);
 }
 
 struct kvm_pmu_ops intel_pmu_ops = {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 08e26a9518c2..58a8af433741 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3857,6 +3857,71 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
 	}
 }
 
+static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
+{
+	unsigned long *msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
+	struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr;
+	int i;
+
+	WARN_ON_ONCE(!lbr->nr);
+
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_SELECT, MSR_TYPE_RW, set);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_TOS, MSR_TYPE_RW, set);
+	for (i = 0; i < lbr->nr; i++) {
+		vmx_set_intercept_for_msr(msr_bitmap,
+			lbr->from + i, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(msr_bitmap,
+			lbr->to + i, MSR_TYPE_RW, set);
+		if (lbr->info)
+			vmx_set_intercept_for_msr(msr_bitmap,
+				lbr->info + i, MSR_TYPE_RW, set);
+	}
+}
+
+static inline void vmx_lbr_disable_passthrough(struct kvm_vcpu *vcpu)
+{
+	vmx_update_intercept_for_lbr_msrs(vcpu, true);
+}
+
+static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu)
+{
+	vmx_update_intercept_for_lbr_msrs(vcpu, false);
+}
+
+/*
+ * Higher priority host perf events (e.g. cpu pinned) could reclaim the
+ * pmu resources (e.g. LBR) that were assigned to the guest. This is
+ * usually done via ipi calls (more details in perf_install_in_context).
+ *
+ * Before entering the non-root mode (with irq disabled here), double
+ * confirm that the pmu features enabled to the guest are not reclaimed
+ * by higher priority host events. Otherwise, disallow vcpu's access to
+ * the reclaimed features.
+ */
+static void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu)
+{
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+
+	if (!lbr_desc->event) {
+		vmx_lbr_disable_passthrough(vcpu);
+		if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR)
+			goto warn;
+		return;
+	}
+
+	if (lbr_desc->event->state < PERF_EVENT_STATE_ACTIVE) {
+		vmx_lbr_disable_passthrough(vcpu);
+		goto warn;
+	} else
+		vmx_lbr_enable_passthrough(vcpu);
+
+	return;
+
+warn:
+	pr_warn_ratelimited("kvm: vcpu-%d: failed to pass through LBR.\n",
+		vcpu->vcpu_id);
+}
+
 static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -6728,8 +6793,11 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	pt_guest_enter(vmx);
 
-	if (vcpu_to_pmu(vcpu)->version)
+	if (vcpu_to_pmu(vcpu)->version) {
 		atomic_switch_perf_msrs(vmx);
+		if (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT)
+			vmx_passthrough_lbr_msrs(vcpu);
+	}
 	atomic_switch_umwait_control_msr(vmx);
 
 	if (enable_preemption_timer)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index ef24338b194d..c67ce758412e 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -94,6 +94,14 @@ struct pt_desc {
 struct lbr_desc {
 	/* Basic information about LBR records. */
 	struct x86_pmu_lbr lbr;
+
+	/*
+	 * Emulate LBR feature via passthrough LBR registers when the
+	 * per-vcpu guest LBR event is scheduled on the current pcpu.
+	 *
+	 * The records may be inaccurate if the host reclaims the LBR.
+	 */
+	struct perf_event *event;
 };
 
 /*
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 09/11] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (7 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 08/11] KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [PATCH v12 10/11] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation Like Xu
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

The current vPMU only supports Architectural Performance Monitoring
version 2. According to Intel SDM section 17.4.7 "Freezing LBR and
Performance Counters on PMI", if IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1,
the LBR stack is frozen on a virtual PMI and KVM emulates this by
clearing the LBR enable bit (bit 0) in IA32_DEBUGCTL. The guest then
needs to re-enable IA32_DEBUGCTL.LBR to resume recording branches.
PMU version 4 replaces this legacy behavior with the streamlined
freeze (IA32_PERF_GLOBAL_STATUS.LBR_FRZ), hence the "1 < version < 4"
check below.
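
For reference, a guest PMI handler that relies on this legacy freeze
behavior follows roughly the pattern below (an illustrative sketch, not
part of this patch; it assumes the usual <asm/msr.h> helpers):

static void guest_pmi_handler(void)
{
	u64 debugctl;

	/* The (virtual) PMI froze LBR recording: bit 0 is now clear. */
	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);

	/* ... drain the frozen LBR stack here ... */

	/* Re-enable branch recording before returning. */
	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl | DEBUGCTLMSR_LBR);
}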

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/pmu.c           |  5 ++++-
 arch/x86/kvm/pmu.h           |  1 +
 arch/x86/kvm/vmx/pmu_intel.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b86346903f2e..5053f4238218 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -378,8 +378,11 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 
 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 {
-	if (lapic_in_kernel(vcpu))
+	if (lapic_in_kernel(vcpu)) {
+		if (kvm_x86_ops.pmu_ops->deliver_pmi)
+			kvm_x86_ops.pmu_ops->deliver_pmi(vcpu);
 		kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
+	}
 }
 
 bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index ab85eed8a6cc..095b84392b89 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -37,6 +37,7 @@ struct kvm_pmu_ops {
 	void (*refresh)(struct kvm_vcpu *vcpu);
 	void (*init)(struct kvm_vcpu *vcpu);
 	void (*reset)(struct kvm_vcpu *vcpu);
+	void (*deliver_pmi)(struct kvm_vcpu *vcpu);
 };
 
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a78c440ebff2..85a675004cbb 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -641,6 +641,36 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 	intel_pmu_free_lbr_event(vcpu);
 }
 
+/*
+ * Emulate Freeze_LBR_On_PMI behavior for 1 < pmu.version < 4.
+ *
+ * If Freeze_LBR_On_PMI = 1, the LBR stack is frozen on PMI and KVM
+ * emulates this by clearing the LBR enable bit (bit 0) in IA32_DEBUGCTL.
+ *
+ * The guest needs to re-enable LBR to resume branch recording.
+ */
+static void intel_pmu_legacy_freezing_lbrs_on_pmi(struct kvm_vcpu *vcpu)
+{
+	u64 data;
+
+	data = vmcs_read64(GUEST_IA32_DEBUGCTL);
+	if (data & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) {
+		data &= ~DEBUGCTLMSR_LBR;
+		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
+	}
+}
+
+static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
+{
+	u8 version = vcpu_to_pmu(vcpu)->version;
+
+	if (!lbr_is_enabled(vcpu))
+		return;
+
+	if (version > 1 && version < 4)
+		intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu);
+}
+
 struct kvm_pmu_ops intel_pmu_ops = {
 	.find_arch_event = intel_find_arch_event,
 	.find_fixed_event = intel_find_fixed_event,
@@ -655,4 +685,5 @@ struct kvm_pmu_ops intel_pmu_ops = {
 	.refresh = intel_pmu_refresh,
 	.init = intel_pmu_init,
 	.reset = intel_pmu_reset,
+	.deliver_pmi = intel_pmu_deliver_pmi,
 };
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 10/11] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (8 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 09/11] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [PATCH v12 11/11] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism Like Xu
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

Since vmx_passthrough_lbr_msrs() runs on every VM-entry, once the LBR
record MSRs have been passed through there is no need to call
vmx_update_intercept_for_lbr_msrs() again and again, and vice versa for
cancellation. Track the current state in a per-vcpu flag and skip the
redundant MSR bitmap updates.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/vmx/pmu_intel.c |  1 +
 arch/x86/kvm/vmx/vmx.c       | 12 ++++++++++++
 arch/x86/kvm/vmx/vmx.h       |  3 +++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 85a675004cbb..75ba0444b4d1 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -614,6 +614,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 		vmx_get_perf_capabilities() : 0;
 	lbr_desc->lbr.nr = 0;
 	lbr_desc->event = NULL;
+	lbr_desc->already_passthrough = false;
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 58a8af433741..800a26e3b571 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3880,12 +3880,24 @@ static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
 
 static inline void vmx_lbr_disable_passthrough(struct kvm_vcpu *vcpu)
 {
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+
+	if (!lbr_desc->already_passthrough)
+		return;
+
 	vmx_update_intercept_for_lbr_msrs(vcpu, true);
+	lbr_desc->already_passthrough = false;
 }
 
 static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu)
 {
+	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
+
+	if (lbr_desc->already_passthrough)
+		return;
+
 	vmx_update_intercept_for_lbr_msrs(vcpu, false);
+	lbr_desc->already_passthrough = true;
 }
 
 /*
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c67ce758412e..c931463f75d9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -102,6 +102,9 @@ struct lbr_desc {
 	 * The records may be inaccurate if the host reclaims the LBR.
 	 */
 	struct perf_event *event;
+
+	/* A flag to reduce the overhead of LBR pass-through or cancellation. */
+	bool already_passthrough;
 };
 
 /*
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v12 11/11] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (9 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 10/11] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [Qemu-devel] [PATCH 1/2] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES Like Xu
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu

The vPMU uses INTEL_GUEST_LBR_INUSE (bit 58, INTEL_PMC_IDX_FIXED_VLBR)
in 'pmu->pmc_in_use' to indicate whether a guest LBR event is still
needed by the vcpu. If the vcpu no longer accesses the LBR-related
registers within a scheduling time slice, and the LBR enable bit has
been cleared, the vPMU treats the guest LBR event like any other stale
vPMC event and releases it as usual. The pass-through state of the LBR
record MSRs is cancelled at the same time. The mechanism is summarized
below.
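
In short, the mechanism has two halves (condensed from the hunks below;
this is a digest of the patch, not new code):

	/* 1) Any guest access to IA32_DEBUGCTL or an LBR record MSR
	 *    marks the pseudo index as in use for this time slice: */
	__set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);

	/* 2) When kvm_pmu_cleanup() later finds that bit among the PMCs
	 *    not touched in the last slice, the new ->cleanup() hook
	 *    frees the event unless the guest has re-enabled LBR: */
	static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
	{
		if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR))
			intel_pmu_free_lbr_event(vcpu);
	}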

Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 arch/x86/kvm/pmu.c           |  7 +++++++
 arch/x86/kvm/pmu.h           |  4 ++++
 arch/x86/kvm/vmx/pmu_intel.c | 14 +++++++++++++-
 arch/x86/kvm/vmx/vmx.c       |  4 ++++
 4 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 5053f4238218..e5b76f1c3ce8 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -458,6 +458,7 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 	struct kvm_pmc *pmc = NULL;
 	DECLARE_BITMAP(bitmask, X86_PMC_IDX_MAX);
 	int i;
+	bool arch_cleanup = false;
 
 	pmu->need_cleanup = false;
 
@@ -469,8 +470,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 
 		if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
 			pmc_stop_counter(pmc);
+
+		if (i == INTEL_GUEST_LBR_INUSE)
+			arch_cleanup = true;
 	}
 
+	if (arch_cleanup && kvm_x86_ops.pmu_ops->cleanup)
+		kvm_x86_ops.pmu_ops->cleanup(vcpu);
+
 	bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
 }
 
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 095b84392b89..d5023eacd8ed 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -15,6 +15,9 @@
 #define VMWARE_BACKDOOR_PMC_REAL_TIME		0x10001
 #define VMWARE_BACKDOOR_PMC_APPARENT_TIME	0x10002
 
+/* Indicates whether Intel LBR msrs were accessed during the last time slice. */
+#define INTEL_GUEST_LBR_INUSE INTEL_PMC_IDX_FIXED_VLBR
+
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
 	u8 unit_mask;
@@ -38,6 +41,7 @@ struct kvm_pmu_ops {
 	void (*init)(struct kvm_vcpu *vcpu);
 	void (*reset)(struct kvm_vcpu *vcpu);
 	void (*deliver_pmi)(struct kvm_vcpu *vcpu);
+	void (*cleanup)(struct kvm_vcpu *vcpu);
 };
 
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 75ba0444b4d1..c1c5058acc90 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -303,6 +303,7 @@ static void intel_pmu_free_lbr_event(struct kvm_vcpu *vcpu)
 static bool access_lbr_record_msr(struct kvm_vcpu *vcpu,
 				     struct msr_data *msr_info, bool read)
 {
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
 	u32 index = msr_info->index;
 
@@ -331,6 +332,7 @@ static bool access_lbr_record_msr(struct kvm_vcpu *vcpu,
 		msr_info->data = 0;
 	local_irq_enable();
 
+	__set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);
 	return true;
 
 dummy:
@@ -483,6 +485,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
 		if (!msr_info->host_initiated && !to_vmx(vcpu)->lbr_desc.event)
 			intel_pmu_create_lbr_event(vcpu);
+		__set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);
 		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
@@ -584,7 +587,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
 	if ((vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) &&
-	    x86_perf_get_lbr(&lbr_desc->lbr))
+	    !x86_perf_get_lbr(&lbr_desc->lbr))
+		bitmap_set(pmu->all_valid_pmc_idx, INTEL_GUEST_LBR_INUSE, 1);
+	else
 		vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT;
 
 	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
@@ -672,6 +677,12 @@ static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 		intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu);
 }
 
+static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
+{
+	if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR))
+		intel_pmu_free_lbr_event(vcpu);
+}
+
 struct kvm_pmu_ops intel_pmu_ops = {
 	.find_arch_event = intel_find_arch_event,
 	.find_fixed_event = intel_find_fixed_event,
@@ -687,4 +698,5 @@ struct kvm_pmu_ops intel_pmu_ops = {
 	.init = intel_pmu_init,
 	.reset = intel_pmu_reset,
 	.deliver_pmi = intel_pmu_deliver_pmi,
+	.cleanup = intel_pmu_cleanup,
 };
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 800a26e3b571..8521fc640b95 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3912,17 +3912,21 @@ static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu)
  */
 static void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu)
 {
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;
 
 	if (!lbr_desc->event) {
 		vmx_lbr_disable_passthrough(vcpu);
 		if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR)
 			goto warn;
+		if (test_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use))
+			goto warn;
 		return;
 	}
 
 	if (lbr_desc->event->state < PERF_EVENT_STATE_ACTIVE) {
 		vmx_lbr_disable_passthrough(vcpu);
+		__clear_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);
 		goto warn;
 	} else
 		vmx_lbr_enable_passthrough(vcpu);
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH 1/2] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (10 preceding siblings ...)
  2020-06-13  8:09 ` [PATCH v12 11/11] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-13  8:09 ` [Qemu-devel] [PATCH 2/2] target/i386: add -cpu,lbr=true support to enable guest LBR Like Xu
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu, Richard Henderson, Eduardo Habkost,
	Marcelo Tosatti, qemu-devel

The Perfmon and Debug Capability MSR, IA32_PERF_CAPABILITIES, is a
feature-enumerating MSR. For now, the new feature word only exposes the
full-width write feature (via bit 13), which indicates that the
processor supports the IA32_A_PMCx interface for updating bits 32 and
above of IA32_PMCx.

The existence of MSR IA32_PERF_CAPABILITIES itself is enumerated by
CPUID.1:ECX[15] (PDCM).
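
For reference, a guest can probe the feature like this (an illustrative
kernel-style sketch; the helper name is made up for this example):

static bool pmu_has_full_width_writes(void)
{
	u64 caps;

	/* CPUID.1:ECX[15] (PDCM) gates the existence of the MSR. */
	if (!boot_cpu_has(X86_FEATURE_PDCM))
		return false;

	rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
	return caps & BIT_ULL(13);	/* full-width write */
}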

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200529074347.124619-5-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 23 +++++++++++++++++++++++
 target/i386/cpu.h |  3 +++
 target/i386/kvm.c | 20 ++++++++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 02065e35d4..e47c9d1604 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1139,6 +1139,22 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             .index = MSR_IA32_CORE_CAPABILITY,
         },
     },
+    [FEAT_PERF_CAPABILITIES] = {
+        .type = MSR_FEATURE_WORD,
+        .feat_names = {
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, "full-width-write", NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+        },
+        .msr = {
+            .index = MSR_IA32_PERF_CAPABILITIES,
+        },
+    },
 
     [FEAT_VMX_PROCBASED_CTLS] = {
         .type = MSR_FEATURE_WORD,
@@ -1316,6 +1332,10 @@ static FeatureDep feature_dependencies[] = {
         .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_CORE_CAPABILITY },
         .to = { FEAT_CORE_CAPABILITY,       ~0ull },
     },
+    {
+        .from = { FEAT_1_ECX,             CPUID_EXT_PDCM },
+        .to = { FEAT_PERF_CAPABILITIES,       ~0ull },
+    },
     {
         .from = { FEAT_1_ECX,               CPUID_EXT_VMX },
         .to = { FEAT_VMX_PROCBASED_CTLS,    ~0ull },
@@ -5488,6 +5508,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
             *edx |= CPUID_HT;
         }
+        if (!cpu->enable_pmu) {
+            *ecx &= ~CPUID_EXT_PDCM;
+        }
         break;
     case 2:
         /* cache info: needed for Pentium Pro compatibility */
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 408392dbf6..fad2f874bd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -356,6 +356,8 @@ typedef enum X86Seg {
 #define MSR_IA32_ARCH_CAPABILITIES      0x10a
 #define ARCH_CAP_TSX_CTRL_MSR		(1<<7)
 
+#define MSR_IA32_PERF_CAPABILITIES      0x345
+
 #define MSR_IA32_TSX_CTRL		0x122
 #define MSR_IA32_TSCDEADLINE            0x6e0
 
@@ -529,6 +531,7 @@ typedef enum FeatureWord {
     FEAT_XSAVE_COMP_HI, /* CPUID[EAX=0xd,ECX=0].EDX */
     FEAT_ARCH_CAPABILITIES,
     FEAT_CORE_CAPABILITY,
+    FEAT_PERF_CAPABILITIES,
     FEAT_VMX_PROCBASED_CTLS,
     FEAT_VMX_SECONDARY_CTLS,
     FEAT_VMX_PINBASED_CTLS,
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 34f838728d..9be6f76b2c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -106,6 +106,7 @@ static bool has_msr_core_capabs;
 static bool has_msr_vmx_vmfunc;
 static bool has_msr_ucode_rev;
 static bool has_msr_vmx_procbased_ctls2;
+static bool has_msr_perf_capabs;
 
 static uint32_t has_architectural_pmu_version;
 static uint32_t num_architectural_pmu_gp_counters;
@@ -2027,6 +2028,9 @@ static int kvm_get_supported_msrs(KVMState *s)
             case MSR_IA32_CORE_CAPABILITY:
                 has_msr_core_capabs = true;
                 break;
+            case MSR_IA32_PERF_CAPABILITIES:
+                has_msr_perf_capabs = true;
+                break;
             case MSR_IA32_VMX_VMFUNC:
                 has_msr_vmx_vmfunc = true;
                 break;
@@ -2643,6 +2647,18 @@ static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
                       VMCS12_MAX_FIELD_INDEX << 1);
 }
 
+static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
+{
+    uint64_t kvm_perf_cap =
+        kvm_arch_get_supported_msr_feature(kvm_state,
+                                           MSR_IA32_PERF_CAPABILITIES);
+
+    if (kvm_perf_cap) {
+        kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
+                        kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+    }
+}
+
 static int kvm_buf_set_msrs(X86CPU *cpu)
 {
     int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, cpu->kvm_msr_buf);
@@ -2675,6 +2691,10 @@ static void kvm_init_msrs(X86CPU *cpu)
                           env->features[FEAT_CORE_CAPABILITY]);
     }
 
+    if (has_msr_perf_capabs && cpu->enable_pmu) {
+        kvm_msr_entry_add_perf(cpu, env->features);
+    }
+
     if (has_msr_ucode_rev) {
         kvm_msr_entry_add(cpu, MSR_IA32_UCODE_REV, cpu->ucode_rev);
     }
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH 2/2] target/i386: add -cpu,lbr=true support to enable guest LBR
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (11 preceding siblings ...)
  2020-06-13  8:09 ` [Qemu-devel] [PATCH 1/2] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES Like Xu
@ 2020-06-13  8:09 ` Like Xu
  2020-06-23 13:13 ` [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
  2020-07-02  7:40 ` Peter Zijlstra
  14 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-06-13  8:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Like Xu, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, Marcelo Tosatti,
	qemu-devel

The LBR feature is enabled on the guest if:
- KVM is enabled and the vPMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported, and
- the lbr_fmt value returned for this MSR is non-zero.

The LBR feature is disabled on the guest if:
- the MSR-based feature IA32_PERF_CAPABILITIES is unsupported, or
- QEMU sets the IA32_PERF_CAPABILITIES MSR feature without lbr_fmt bits, or
- the requested guest vcpu model doesn't support PDCM.
An example invocation is shown below.
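
With both patches applied, guest LBR can then be requested explicitly,
e.g. (assuming a host whose PMU reports a non-zero LBR format):

  qemu-system-x86_64 -enable-kvm -cpu host,pmu=on,lbr=on ...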

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu <like.xu@linux.intel.com>
---
 hw/i386/pc.c      |  1 +
 target/i386/cpu.c | 25 +++++++++++++++++++++++--
 target/i386/cpu.h |  2 ++
 target/i386/kvm.c |  7 ++++++-
 4 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2128f3d6fe..8d8d42a8ea 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -316,6 +316,7 @@ GlobalProperty pc_compat_1_5[] = {
     { "Nehalem-" TYPE_X86_CPU, "min-level", "2" },
     { "virtio-net-pci", "any_layout", "off" },
     { TYPE_X86_CPU, "pmu", "on" },
+    { TYPE_X86_CPU, "lbr", "on" },
     { "i440FX-pcihost", "short_root_bus", "0" },
     { "q35-pcihost", "short_root_bus", "0" },
 };
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e47c9d1604..262a2595fa 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1142,8 +1142,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
     [FEAT_PERF_CAPABILITIES] = {
         .type = MSR_FEATURE_WORD,
         .feat_names = {
-            NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
+            "lbr-fmt-bit-0", "lbr-fmt-bit-1", "lbr-fmt-bit-2", "lbr-fmt-bit-3",
+            "lbr-fmt-bit-4", "lbr-fmt-bit-5", NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, "full-width-write", NULL, NULL,
             NULL, NULL, NULL, NULL,
@@ -4187,6 +4187,13 @@ static bool lmce_supported(void)
     return !!(mce_cap & MCG_LMCE_P);
 }
 
+static inline bool lbr_supported(void)
+{
+    return kvm_enabled() && (PERF_CAP_LBR_FMT &
+        kvm_arch_get_supported_msr_feature(kvm_state,
+                                           MSR_IA32_PERF_CAPABILITIES));
+}
+
 #define CPUID_MODEL_ID_SZ 48
 
 /**
@@ -4290,6 +4297,9 @@ static void max_x86_cpu_initfn(Object *obj)
     }
 
     object_property_set_bool(OBJECT(cpu), true, "pmu", &error_abort);
+    if (lbr_supported()) {
+        object_property_set_bool(OBJECT(cpu), true, "lbr", &error_abort);
+    }
 }
 
 static const TypeInfo max_x86_cpu_type_info = {
@@ -5510,6 +5520,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         if (!cpu->enable_pmu) {
             *ecx &= ~CPUID_EXT_PDCM;
+            if (cpu->enable_lbr) {
+                warn_report("LBR is unsupported since guest PMU is disabled.");
+                exit(1);
+            }
         }
         break;
     case 2:
@@ -6528,6 +6542,12 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
+    if (!cpu->max_features && cpu->enable_lbr &&
+        !(env->features[FEAT_1_ECX] & CPUID_EXT_PDCM)) {
+        warn_report("requested vcpu model doesn't support PDCM for LBR.");
+        exit(1);
+    }
+
     if (cpu->ucode_rev == 0) {
         /* The default is the same as KVM's.  */
         if (IS_AMD_CPU(env)) {
@@ -7165,6 +7185,7 @@ static Property x86_cpu_properties[] = {
 #endif
     DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+    DEFINE_PROP_BOOL("lbr", X86CPU, enable_lbr, false),
 
     DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
                        HYPERV_SPINLOCK_NEVER_RETRY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fad2f874bd..e5f65e9b0c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -357,6 +357,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR		(1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES      0x345
+#define PERF_CAP_LBR_FMT      0x3f
 
 #define MSR_IA32_TSX_CTRL		0x122
 #define MSR_IA32_TSCDEADLINE            0x6e0
@@ -1686,6 +1687,7 @@ struct X86CPU {
      * capabilities) directly to the guest.
      */
     bool enable_pmu;
+    bool enable_lbr;
 
     /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
      * disabled by default to avoid breaking migration between QEMU with
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9be6f76b2c..524ae86b0c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2652,8 +2652,10 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
     uint64_t kvm_perf_cap =
         kvm_arch_get_supported_msr_feature(kvm_state,
                                            MSR_IA32_PERF_CAPABILITIES);
-
     if (kvm_perf_cap) {
+        if (!cpu->enable_lbr) {
+            kvm_perf_cap &= ~PERF_CAP_LBR_FMT;
+        }
         kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
                         kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
     }
@@ -2693,6 +2695,9 @@ static void kvm_init_msrs(X86CPU *cpu)
 
     if (has_msr_perf_capabs && cpu->enable_pmu) {
         kvm_msr_entry_add_perf(cpu, env->features);
+    } else if (!has_msr_perf_capabs && cpu->enable_lbr) {
+        warn_report("host doesn't support MSR_IA32_PERF_CAPABILITIES for LBR.");
+        exit(1);
     }
 
     if (has_msr_ucode_rev) {
-- 
2.21.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-06-13  8:09 ` [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation Like Xu
@ 2020-06-13  9:14   ` Xiaoyao Li
  2020-06-13  9:42     ` Xu, Like
  0 siblings, 1 reply; 34+ messages in thread
From: Xiaoyao Li @ 2020-06-13  9:14 UTC (permalink / raw)
  To: Like Xu, Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm

On 6/13/2020 4:09 PM, Like Xu wrote:
> When the LBR feature is reported by the vmx_get_perf_capabilities(),
> the LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked.
> 
> The debugctl msr is handled separately in vmx/svm and they're not
> completely identical, hence remove the common msr handling code.
> 
> Signed-off-by: Like Xu <like.xu@linux.intel.com>
> ---
>   arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++
>   arch/x86/kvm/vmx/pmu_intel.c    | 19 +++++++++++++++++++
>   arch/x86/kvm/x86.c              | 13 -------------
>   3 files changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index b633a90320ee..f6fcfabb1026 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -21,6 +21,8 @@ extern int __read_mostly pt_mode;
>   #define PMU_CAP_FW_WRITES	(1ULL << 13)
>   #define PMU_CAP_LBR_FMT		0x3f
>   
> +#define DEBUGCTLMSR_LBR_MASK		(DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
> +
>   struct nested_vmx_msrs {
>   	/*
>   	 * We only store the "true" versions of the VMX capability MSRs. We
> @@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void)
>   	return perf_cap;
>   }
>   
> +static inline u64 vmx_get_supported_debugctl(void)
> +{
> +	u64 val = 0;
> +
> +	if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)
> +		val |= DEBUGCTLMSR_LBR_MASK;
> +
> +	return val;
> +}
> +
>   #endif /* __KVM_X86_VMX_CAPS_H */
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index a953c7d633f6..d92e95b64c74 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
>   	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
>   		ret = pmu->version > 1;
>   		break;
> +	case MSR_IA32_DEBUGCTLMSR:
>   	case MSR_IA32_PERF_CAPABILITIES:
>   		ret = 1;
>   		break;
> @@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   			return 1;
>   		msr_info->data = vcpu->arch.perf_capabilities;
>   		return 0;
> +	case MSR_IA32_DEBUGCTLMSR:
> +		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);

Can we put the emulation of MSR_IA32_DEBUGCTLMSR in
vmx_{get,set}_msr()? AFAIK, MSR_IA32_DEBUGCTLMSR is not a purely
PMU-related MSR; e.g., bit 2 enables #DB for bus locks.

> +		return 0;
>   	default:
>   		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>   		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> @@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu)
>   	return true;
>   }
>   
> +static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu)
> +{
> +	u64 debugctlmsr = vmx_get_supported_debugctl();
> +
> +	if (!lbr_is_enabled(vcpu))
> +		debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK;
> +
> +	return debugctlmsr;
> +}
> +
>   static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   {
>   	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> @@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   		}
>   		vcpu->arch.perf_capabilities = data;
>   		return 0;
> +	case MSR_IA32_DEBUGCTLMSR:
> +		if (data & ~vcpu_get_supported_debugctl(vcpu))
> +			return 1;
> +		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
> +		return 0;
>   	default:
>   		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>   		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 00c88c2f34e4..56f275eb4554 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   			return 1;
>   		}
>   		break;
> -	case MSR_IA32_DEBUGCTLMSR:
> -		if (!data) {
> -			/* We support the non-activated case already */
> -			break;
> -		} else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {

So after this patch, a guest trying to set DEBUGCTLMSR_BTF will get a
#GP instead of the write being silently ignored with a log printed in
the kernel.

This code was introduced ~12 years ago in commit b5e2fec0ebc3 ("KVM:
Ignore DEBUGCTL MSRs with no effect"), just to make Netware happy. Maybe
I'm overthinking such an old thing.

> -			/* Values other than LBR and BTF are vendor-specific,
> -			   thus reserved and should throw a #GP */
> -			return 1;
> -		}
> -		vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
> -			    __func__, data);
> -		break;
>   	case 0x200 ... 0x2ff:
>   		return kvm_mtrr_set_msr(vcpu, msr, data);
>   	case MSR_IA32_APICBASE:
> @@ -3120,7 +3108,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	switch (msr_info->index) {
>   	case MSR_IA32_PLATFORM_ID:
>   	case MSR_IA32_EBL_CR_POWERON:
> -	case MSR_IA32_DEBUGCTLMSR:
>   	case MSR_IA32_LASTBRANCHFROMIP:
>   	case MSR_IA32_LASTBRANCHTOIP:
>   	case MSR_IA32_LASTINTFROMIP:
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-06-13  9:14   ` Xiaoyao Li
@ 2020-06-13  9:42     ` Xu, Like
  2020-07-07 20:21       ` Sean Christopherson
  0 siblings, 1 reply; 34+ messages in thread
From: Xu, Like @ 2020-06-13  9:42 UTC (permalink / raw)
  To: Xiaoyao Li, Like Xu, Paolo Bonzini
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm

On 2020/6/13 17:14, Xiaoyao Li wrote:
> On 6/13/2020 4:09 PM, Like Xu wrote:
>> When the LBR feature is reported by the vmx_get_perf_capabilities(),
>> the LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked.
>>
>> The debugctl msr is handled separately in vmx/svm and they're not
>> completely identical, hence remove the common msr handling code.
>>
>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
>> ---
>>   arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++
>>   arch/x86/kvm/vmx/pmu_intel.c    | 19 +++++++++++++++++++
>>   arch/x86/kvm/x86.c              | 13 -------------
>>   3 files changed, 31 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/capabilities.h 
>> b/arch/x86/kvm/vmx/capabilities.h
>> index b633a90320ee..f6fcfabb1026 100644
>> --- a/arch/x86/kvm/vmx/capabilities.h
>> +++ b/arch/x86/kvm/vmx/capabilities.h
>> @@ -21,6 +21,8 @@ extern int __read_mostly pt_mode;
>>   #define PMU_CAP_FW_WRITES    (1ULL << 13)
>>   #define PMU_CAP_LBR_FMT        0x3f
>>   +#define DEBUGCTLMSR_LBR_MASK        (DEBUGCTLMSR_LBR | 
>> DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
>> +
>>   struct nested_vmx_msrs {
>>       /*
>>        * We only store the "true" versions of the VMX capability MSRs. We
>> @@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void)
>>       return perf_cap;
>>   }
>>   +static inline u64 vmx_get_supported_debugctl(void)
>> +{
>> +    u64 val = 0;
>> +
>> +    if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)
>> +        val |= DEBUGCTLMSR_LBR_MASK;
>> +
>> +    return val;
>> +}
>> +
>>   #endif /* __KVM_X86_VMX_CAPS_H */
>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>> index a953c7d633f6..d92e95b64c74 100644
>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>> @@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu 
>> *vcpu, u32 msr)
>>       case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
>>           ret = pmu->version > 1;
>>           break;
>> +    case MSR_IA32_DEBUGCTLMSR:
>>       case MSR_IA32_PERF_CAPABILITIES:
>>           ret = 1;
>>           break;
>> @@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, 
>> struct msr_data *msr_info)
>>               return 1;
>>           msr_info->data = vcpu->arch.perf_capabilities;
>>           return 0;
>> +    case MSR_IA32_DEBUGCTLMSR:
>> +        msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
>
> Can we put the emulation of MSR_IA32_DEBUGCTLMSR in vmx_{get,set}_msr()?
> AFAIK, MSR_IA32_DEBUGCTLMSR is not a purely PMU-related MSR; e.g., bit 2
> enables #DB for bus locks.
We already have "case MSR_IA32_DEBUGCTLMSR" handler in the vmx_set_msr()
and you may apply you bus lock changes in that handler.
>> +        return 0;
>>       default:
>>           if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>               (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>> @@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct 
>> kvm_vcpu *vcpu)
>>       return true;
>>   }
>>   +static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu)
>> +{
>> +    u64 debugctlmsr = vmx_get_supported_debugctl();
>> +
>> +    if (!lbr_is_enabled(vcpu))
>> +        debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK;
>> +
>> +    return debugctlmsr;
>> +}
>> +
>>   static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data 
>> *msr_info)
>>   {
>>       struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>> @@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, 
>> struct msr_data *msr_info)
>>           }
>>           vcpu->arch.perf_capabilities = data;
>>           return 0;
>> +    case MSR_IA32_DEBUGCTLMSR:
>> +        if (data & ~vcpu_get_supported_debugctl(vcpu))
>> +            return 1;
>> +        vmcs_write64(GUEST_IA32_DEBUGCTL, data);
>> +        return 0;
>>       default:
>>           if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>               (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 00c88c2f34e4..56f275eb4554 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, 
>> struct msr_data *msr_info)
>>               return 1;
>>           }
>>           break;
>> -    case MSR_IA32_DEBUGCTLMSR:
>> -        if (!data) {
>> -            /* We support the non-activated case already */
>> -            break;
>> -        } else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {
>
> So after this patch, a guest trying to set DEBUGCTLMSR_BTF will get a
> #GP instead of the write being silently ignored with a log printed in
> the kernel.
>

Since BTF is not implemented in KVM at all,
I propose not to leave this kind of dummy handling in future KVM code.

Let's see if Netware or any other BTF user complains about this change.

> This code was introduced ~12 years ago in commit b5e2fec0ebc3 ("KVM:
> Ignore DEBUGCTL MSRs with no effect"), just to make Netware happy. Maybe
> I'm overthinking such an old thing.
>
>> -            /* Values other than LBR and BTF are vendor-specific,
>> -               thus reserved and should throw a #GP */
>> -            return 1;
>> -        }
>> -        vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
>> -                __func__, data);
>> -        break;
>>       case 0x200 ... 0x2ff:
>>           return kvm_mtrr_set_msr(vcpu, msr, data);
>>       case MSR_IA32_APICBASE:
>> @@ -3120,7 +3108,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, 
>> struct msr_data *msr_info)
>>       switch (msr_info->index) {
>>       case MSR_IA32_PLATFORM_ID:
>>       case MSR_IA32_EBL_CR_POWERON:
>> -    case MSR_IA32_DEBUGCTLMSR:
>>       case MSR_IA32_LASTBRANCHFROMIP:
>>       case MSR_IA32_LASTBRANCHTOIP:
>>       case MSR_IA32_LASTINTFROMIP:
>>
>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (12 preceding siblings ...)
  2020-06-13  8:09 ` [Qemu-devel] [PATCH 2/2] target/i386: add -cpu,lbr=true support to enable guest LBR Like Xu
@ 2020-06-23 13:13 ` Like Xu
  2020-07-01  2:38   ` Like Xu
  2020-07-02  7:40 ` Peter Zijlstra
  14 siblings, 1 reply; 34+ messages in thread
From: Like Xu @ 2020-06-23 13:13 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Zijlstra
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, ak, wei.w.wang, linux-kernel, kvm

On 2020/6/13 16:09, Like Xu wrote:
> Hi all,
> 
> Please help review this new version for the Kenrel 5.9 release.
> 
> Now, you may apply the last two qemu-devel patches to the upstream
> qemu and try the guest LBR feature with '-cpu host' command line.
> 
> v11->v12 Changelog:
> - apply "Signed-off-by" form PeterZ and his codes for the perf subsystem;
> - add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
> - refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
> - reorder "perf_event_attr" fields according to how they're declared;
> - replace event_is_oncpu() with "event->state" check;
> - make LBR emualtion specific to vmx rather than x86 generic;
> - move pass-through LBR code to vmx.c instead of pmu_intel.c;
> - add vmx_lbr_en/disable_passthrough layer to make code readable;
> - rewrite pmu availability check with vmx_passthrough_lbr_msrs();
> 
> You may check more details in each commit.
> 
> Previous:
> https://lore.kernel.org/kvm/20200514083054.62538-1-like.xu@linux.intel.com/
> 
> ---
...
> 
> Wei Wang (1):
>   perf/x86: Fix variable types for LBR registers > Like Xu (10):
>    perf/x86/core: Refactor hw->idx checks and cleanup
>    perf/x86/lbr: Add interface to get LBR information
>    perf/x86: Add constraint to create guest LBR event without hw counter
>    perf/x86: Keep LBR records unchanged in host context for guest usage

Hi Peter,
Would you like to add "Acked-by" to the first three perf patches ?

>    KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
>    KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
>    KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
>    KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>    KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>    KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
> 

Hi Paolo,
Would you like to take a moment to review the KVM part for this feature ?

Thanks,
Like Xu

> 
> Qemu-devel:
>    target/i386: add -cpu,lbr=true support to enable guest LBR
> 
>   arch/x86/events/core.c            |  26 +--
>   arch/x86/events/intel/core.c      | 109 ++++++++-----
>   arch/x86/events/intel/lbr.c       |  51 +++++-
>   arch/x86/events/perf_event.h      |   8 +-
>   arch/x86/include/asm/perf_event.h |  34 +++-
>   arch/x86/kvm/pmu.c                |  12 +-
>   arch/x86/kvm/pmu.h                |   5 +
>   arch/x86/kvm/vmx/capabilities.h   |  23 ++-
>   arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
>   arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
>   arch/x86/kvm/vmx/vmx.h            |  17 ++
>   arch/x86/kvm/x86.c                |  13 --
>   12 files changed, 559 insertions(+), 78 deletions(-)
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-06-23 13:13 ` [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
@ 2020-07-01  2:38   ` Like Xu
  0 siblings, 0 replies; 34+ messages in thread
From: Like Xu @ 2020-07-01  2:38 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Zijlstra
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, ak, wei.w.wang, linux-kernel, kvm

Friendly ping.

If there is room for improvement, please let me know.

On 2020/6/23 21:13, Like Xu wrote:
> On 2020/6/13 16:09, Like Xu wrote:
>> Hi all,
>>
>> Please help review this new version for the Kenrel 5.9 release.
>>
>> Now, you may apply the last two qemu-devel patches to the upstream
>> qemu and try the guest LBR feature with '-cpu host' command line.
>>
>> v11->v12 Changelog:
>> - apply "Signed-off-by" form PeterZ and his codes for the perf subsystem;
>> - add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
>> - refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
>> - reorder "perf_event_attr" fields according to how they're declared;
>> - replace event_is_oncpu() with "event->state" check;
>> - make LBR emualtion specific to vmx rather than x86 generic;
>> - move pass-through LBR code to vmx.c instead of pmu_intel.c;
>> - add vmx_lbr_en/disable_passthrough layer to make code readable;
>> - rewrite pmu availability check with vmx_passthrough_lbr_msrs();
>>
>> You may check more details in each commit.
>>
>> Previous:
>> https://lore.kernel.org/kvm/20200514083054.62538-1-like.xu@linux.intel.com/
>>
>> ---
> ...
>>
>> Wei Wang (1):
>>   perf/x86: Fix variable types for LBR registers > Like Xu (10):
>>    perf/x86/core: Refactor hw->idx checks and cleanup
>>    perf/x86/lbr: Add interface to get LBR information
>>    perf/x86: Add constraint to create guest LBR event without hw counter
>>    perf/x86: Keep LBR records unchanged in host context for guest usage
> 
> Hi Peter,
> Would you like to add "Acked-by" to the first three perf patches ?
> 
>>    KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
>>    KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
>>    KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
>>    KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>>    KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>>    KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
>>
> 
> Hi Paolo,
> Would you like to take a moment to review the KVM part for this feature ?
> 
> Thanks,
> Like Xu
> 
>>
>> Qemu-devel:
>>    target/i386: add -cpu,lbr=true support to enable guest LBR
>>
>>   arch/x86/events/core.c            |  26 +--
>>   arch/x86/events/intel/core.c      | 109 ++++++++-----
>>   arch/x86/events/intel/lbr.c       |  51 +++++-
>>   arch/x86/events/perf_event.h      |   8 +-
>>   arch/x86/include/asm/perf_event.h |  34 +++-
>>   arch/x86/kvm/pmu.c                |  12 +-
>>   arch/x86/kvm/pmu.h                |   5 +
>>   arch/x86/kvm/vmx/capabilities.h   |  23 ++-
>>   arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
>>   arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
>>   arch/x86/kvm/vmx/vmx.h            |  17 ++
>>   arch/x86/kvm/x86.c                |  13 --
>>   12 files changed, 559 insertions(+), 78 deletions(-)
>>
> 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
                   ` (13 preceding siblings ...)
  2020-06-23 13:13 ` [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
@ 2020-07-02  7:40 ` Peter Zijlstra
  2020-07-02 13:11   ` Liang, Kan
  14 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-07-02  7:40 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, ak, wei.w.wang, linux-kernel, kvm,
	Liang, Kan

On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> Like Xu (10):
>   perf/x86/core: Refactor hw->idx checks and cleanup
>   perf/x86/lbr: Add interface to get LBR information
>   perf/x86: Add constraint to create guest LBR event without hw counter
>   perf/x86: Keep LBR records unchanged in host context for guest usage

> Wei Wang (1):
>   perf/x86: Fix variable types for LBR registers

>  arch/x86/events/core.c            |  26 +--
>  arch/x86/events/intel/core.c      | 109 ++++++++-----
>  arch/x86/events/intel/lbr.c       |  51 +++++-
>  arch/x86/events/perf_event.h      |   8 +-
>  arch/x86/include/asm/perf_event.h |  34 +++-

These look good to me; but at the same time Kan is sending me
Architectural LBR patches.

Kan, if I take these perf patches and stick them in a tip/perf/vlbr
topic branch, can you rebase the arch lbr stuff on top, or is there
anything in the arch-lbr series that badly conflicts with this work?

Paolo, would that topic branch work for you too, to then stick these
patches in top?

>   KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
>   KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
>   KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
>   KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>   KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>   KVM: vmx/pmu: Release guest LBR event via lazy release mechanism

>  arch/x86/kvm/pmu.c                |  12 +-
>  arch/x86/kvm/pmu.h                |   5 +
>  arch/x86/kvm/vmx/capabilities.h   |  23 ++-
>  arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
>  arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
>  arch/x86/kvm/vmx/vmx.h            |  17 ++
>  arch/x86/kvm/x86.c                |  13 --

>  12 files changed, 559 insertions(+), 78 deletions(-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-07-02  7:40 ` Peter Zijlstra
@ 2020-07-02 13:11   ` Liang, Kan
  2020-07-02 13:58     ` Peter Zijlstra
  0 siblings, 1 reply; 34+ messages in thread
From: Liang, Kan @ 2020-07-02 13:11 UTC (permalink / raw)
  To: Peter Zijlstra, Like Xu
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, ak, wei.w.wang, linux-kernel, kvm,
	Liang, Kan



On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
>> Like Xu (10):
>>    perf/x86/core: Refactor hw->idx checks and cleanup
>>    perf/x86/lbr: Add interface to get LBR information
>>    perf/x86: Add constraint to create guest LBR event without hw counter
>>    perf/x86: Keep LBR records unchanged in host context for guest usage
> 
>> Wei Wang (1):
>>    perf/x86: Fix variable types for LBR registers
> 
>>   arch/x86/events/core.c            |  26 +--
>>   arch/x86/events/intel/core.c      | 109 ++++++++-----
>>   arch/x86/events/intel/lbr.c       |  51 +++++-
>>   arch/x86/events/perf_event.h      |   8 +-
>>   arch/x86/include/asm/perf_event.h |  34 +++-
> 
> These look good to me; but at the same time Kan is sending me
> Architectural LBR patches.
> 
> Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> topic branch, can you rebase the arch lbr stuff on top, or is there
> anything in the arch-lbr series that badly conflicts with this work?
> 

Yes, I can rebase the arch lbr patches on top of them.
Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.

Thanks,
Kan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-07-02 13:11   ` Liang, Kan
@ 2020-07-02 13:58     ` Peter Zijlstra
  2020-07-03  7:56       ` Peter Zijlstra
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-07-02 13:58 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Like Xu, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Liang, Kan

On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
> On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> > On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> > > Like Xu (10):
> > >    perf/x86/core: Refactor hw->idx checks and cleanup
> > >    perf/x86/lbr: Add interface to get LBR information
> > >    perf/x86: Add constraint to create guest LBR event without hw counter
> > >    perf/x86: Keep LBR records unchanged in host context for guest usage
> > 
> > > Wei Wang (1):
> > >    perf/x86: Fix variable types for LBR registers
> > 
> > >   arch/x86/events/core.c            |  26 +--
> > >   arch/x86/events/intel/core.c      | 109 ++++++++-----
> > >   arch/x86/events/intel/lbr.c       |  51 +++++-
> > >   arch/x86/events/perf_event.h      |   8 +-
> > >   arch/x86/include/asm/perf_event.h |  34 +++-
> > 
> > These look good to me; but at the same time Kan is sending me
> > Architectural LBR patches.
> > 
> > Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> > topic branch, can you rebase the arch lbr stuff on top, or is there
> > anything in the arch-lbr series that badly conflicts with this work?
> > 
> 
> Yes, I can rebase the arch lbr patches on top of them.
> Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.

For now I have:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr

Once the 0day robot comes back all-green, I'll push it out to
tip/perf/vlbr and merge it into tip/perf/core.

Thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-07-02 13:58     ` Peter Zijlstra
@ 2020-07-03  7:56       ` Peter Zijlstra
  2020-07-03  8:04         ` Xu, Like
  0 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-07-03  7:56 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Like Xu, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Liang, Kan

On Thu, Jul 02, 2020 at 03:58:42PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
> > On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> > > On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> > > > Like Xu (10):
> > > >    perf/x86/core: Refactor hw->idx checks and cleanup
> > > >    perf/x86/lbr: Add interface to get LBR information
> > > >    perf/x86: Add constraint to create guest LBR event without hw counter
> > > >    perf/x86: Keep LBR records unchanged in host context for guest usage
> > > 
> > > > Wei Wang (1):
> > > >    perf/x86: Fix variable types for LBR registers
> > > 
> > > >   arch/x86/events/core.c            |  26 +--
> > > >   arch/x86/events/intel/core.c      | 109 ++++++++-----
> > > >   arch/x86/events/intel/lbr.c       |  51 +++++-
> > > >   arch/x86/events/perf_event.h      |   8 +-
> > > >   arch/x86/include/asm/perf_event.h |  34 +++-
> > > 
> > > These look good to me; but at the same time Kan is sending me
> > > Architectural LBR patches.
> > > 
> > > Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> > > topic branch, can you rebase the arch lbr stuff on top, or is there
> > > anything in the arch-lbr series that badly conflicts with this work?
> > > 
> > 
> > Yes, I can rebase the arch lbr patches on top of them.
> > Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.
> 
> For now I have:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr
> 
> Once the 0day robot comes back all-green, I'll push it out to
> tip/perf/vlbr and merge it into tip/perf/core.

tip/perf/vlbr now exists, thanks!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [tip: perf/core] perf/x86/lbr: Add interface to get LBR information
  2020-06-13  8:09 ` [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information Like Xu
@ 2020-07-03  8:01   ` tip-bot2 for Like Xu
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Like Xu @ 2020-07-03  8:01 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Like Xu, Wei Wang, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     b2d6504761a50b9493eb4b20f6e188b673f20c32
Gitweb:        https://git.kernel.org/tip/b2d6504761a50b9493eb4b20f6e188b673f20c32
Author:        Like Xu <like.xu@linux.intel.com>
AuthorDate:    Sat, 13 Jun 2020 16:09:48 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 02 Jul 2020 15:51:46 +02:00

perf/x86/lbr: Add interface to get LBR information

The LBR record MSRs are model specific. The perf subsystem has already
obtained the base addresses of the LBR records based on the CPU model.

Therefore, add an interface that allows callers outside the perf
subsystem to obtain this LBR information. It's useful for hypervisors
that emulate the LBR feature for guests with less code.
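
A caller-side sketch (illustrative; it mirrors how the KVM side of this
series consumes the interface):

	struct x86_pmu_lbr lbr;

	if (!x86_perf_get_lbr(&lbr)) {
		/*
		 * lbr.nr is the number of LBR stack entries;
		 * lbr.from/.to/.info are the base MSR addresses
		 * (lbr.info is 0 when LBR_INFO is not supported).
		 */
	}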

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-4-like.xu@linux.intel.com
---
 arch/x86/events/intel/lbr.c       | 20 ++++++++++++++++++++
 arch/x86/include/asm/perf_event.h | 12 ++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 65113b1..2ed3f2a 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1343,3 +1343,23 @@ void intel_pmu_lbr_init_knl(void)
 	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
 		x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;
 }
+
+/**
+ * x86_perf_get_lbr - get the LBR records information
+ *
+ * @lbr: the caller's memory to store the LBR records information
+ *
+ * Returns: 0 indicates the LBR info has been successfully obtained
+ */
+int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
+{
+	int lbr_fmt = x86_pmu.intel_cap.lbr_format;
+
+	lbr->nr = x86_pmu.lbr_nr;
+	lbr->from = x86_pmu.lbr_from;
+	lbr->to = x86_pmu.lbr_to;
+	lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? MSR_LBR_INFO_0 : 0;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(x86_perf_get_lbr);
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index e855e9c..5d2c30f 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -333,6 +333,13 @@ struct perf_guest_switch_msr {
 	u64 host, guest;
 };
 
+struct x86_pmu_lbr {
+	unsigned int	nr;
+	unsigned int	from;
+	unsigned int	to;
+	unsigned int	info;
+};
+
 extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
 extern void perf_check_microcode(void);
 extern int x86_perf_rdpmc_index(struct perf_event *event);
@@ -348,12 +355,17 @@ static inline void perf_check_microcode(void) { }
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
 #else
 static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {
 	*nr = 0;
 	return NULL;
 }
+static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
+{
+	return -1;
+}
 #endif
 
 #ifdef CONFIG_CPU_SUP_INTEL

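For illustration, a minimal sketch of how a caller outside perf might
consume this interface; the vmx_lbr_caps variable and the
cache_host_lbr_info() helper are hypothetical examples, not part of
this commit:

	/* Hypothetical consumer: cache the host LBR layout once. */
	static struct x86_pmu_lbr vmx_lbr_caps;

	static void cache_host_lbr_info(void)
	{
		/* Returns non-zero only via the !CPU_SUP_INTEL stub. */
		if (x86_perf_get_lbr(&vmx_lbr_caps))
			return;
		/*
		 * vmx_lbr_caps.nr is the LBR stack depth; from/to/info
		 * hold the model-specific MSR base addresses (info is 0
		 * unless the format is LBR_FORMAT_INFO).
		 */
	}
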
^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: perf/core] perf/x86/core: Refactor hw->idx checks and cleanup
  2020-06-13  8:09 ` [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup Like Xu
@ 2020-07-03  8:01   ` tip-bot2 for Like Xu
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Like Xu @ 2020-07-03  8:01 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Like Xu, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     027440b5d426a51f33b515bbd236cc479d1e051f
Gitweb:        https://git.kernel.org/tip/027440b5d426a51f33b515bbd236cc479d1e051f
Author:        Like Xu <like.xu@linux.intel.com>
AuthorDate:    Sat, 13 Jun 2020 16:09:47 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 02 Jul 2020 15:51:46 +02:00

perf/x86/core: Refactor hw->idx checks and cleanup

For intel_pmu_en/disable_event(), reorder the branch checks for hw->idx
and sort them by probability: gp, fixed, bts, others.

Clean up x86_assign_hw_event() by converting the multiple if-else
statements into a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period(), it's
more generic to replace the "idx == INTEL_PMC_IDX_FIXED_BTS" check with
'!hwc->event_base', since event_base is 0 for all non-gp/fixed cases.

Wrap the related bit operations into intel_set/clear_masks() to make the
main path cleaner and more readable.

No functional changes.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Original-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-3-like.xu@linux.intel.com
---
 arch/x86/events/core.c       | 25 ++++++----
 arch/x86/events/intel/core.c | 85 ++++++++++++++++++-----------------
 2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4103665..15cb7af 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -71,10 +71,9 @@ u64 x86_perf_event_update(struct perf_event *event)
 	struct hw_perf_event *hwc = &event->hw;
 	int shift = 64 - x86_pmu.cntval_bits;
 	u64 prev_raw_count, new_raw_count;
-	int idx = hwc->idx;
 	u64 delta;
 
-	if (idx == INTEL_PMC_IDX_FIXED_BTS)
+	if (unlikely(!hwc->event_base))
 		return 0;
 
 	/*
@@ -1097,22 +1096,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
 				struct cpu_hw_events *cpuc, int i)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	int idx;
 
-	hwc->idx = cpuc->assign[i];
+	idx = hwc->idx = cpuc->assign[i];
 	hwc->last_cpu = smp_processor_id();
 	hwc->last_tag = ++cpuc->tags[i];
 
-	if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
+	switch (hwc->idx) {
+	case INTEL_PMC_IDX_FIXED_BTS:
 		hwc->config_base = 0;
 		hwc->event_base	= 0;
-	} else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
+		break;
+
+	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
 		hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
-		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
-		hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
-	} else {
+		hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+				(idx - INTEL_PMC_IDX_FIXED);
+		hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+		break;
+
+	default:
 		hwc->config_base = x86_pmu_config_addr(hwc->idx);
 		hwc->event_base  = x86_pmu_event_addr(hwc->idx);
 		hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
+		break;
 	}
 }
 
@@ -1233,7 +1240,7 @@ int x86_perf_event_set_period(struct perf_event *event)
 	s64 period = hwc->sample_period;
 	int ret = 0, idx = hwc->idx;
 
-	if (idx == INTEL_PMC_IDX_FIXED_BTS)
+	if (unlikely(!hwc->event_base))
 		return 0;
 
 	/*
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ca35c8b..8dac4c6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2136,8 +2136,35 @@ static inline void intel_pmu_ack_status(u64 ack)
 	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
 }
 
-static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+	return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
+static inline void intel_set_masks(struct perf_event *event, int idx)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	if (event->attr.exclude_host)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	if (event->attr.exclude_guest)
+		__set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	if (event_is_checkpointed(event))
+		__set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static inline void intel_clear_masks(struct perf_event *event, int idx)
 {
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+	__clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static void intel_pmu_disable_fixed(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
 	u64 ctrl_val, mask;
 
@@ -2148,31 +2175,22 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
 	wrmsrl(hwc->config_base, ctrl_val);
 }
 
-static inline bool event_is_checkpointed(struct perf_event *event)
-{
-	return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
-}
-
 static void intel_pmu_disable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	int idx = hwc->idx;
 
-	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		intel_clear_masks(event, idx);
+		x86_pmu_disable_event(event);
+	} else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+		intel_clear_masks(event, idx);
+		intel_pmu_disable_fixed(event);
+	} else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
 		intel_pmu_disable_bts();
 		intel_pmu_drain_bts_buffer();
-		return;
 	}
 
-	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
-	cpuc->intel_cp_status &= ~(1ull << hwc->idx);
-
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
-		intel_pmu_disable_fixed(hwc);
-	else
-		x86_pmu_disable_event(event);
-
 	/*
 	 * Needs to be called after x86_pmu_disable_event,
 	 * so we don't trigger the event without PEBS bit set.
@@ -2238,33 +2256,22 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 static void intel_pmu_enable_event(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-
-	if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
-		if (!__this_cpu_read(cpu_hw_events.enabled))
-			return;
-
-		intel_pmu_enable_bts(hwc->config);
-		return;
-	}
-
-	if (event->attr.exclude_host)
-		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
-	if (event->attr.exclude_guest)
-		cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
-
-	if (unlikely(event_is_checkpointed(event)))
-		cpuc->intel_cp_status |= (1ull << hwc->idx);
+	int idx = hwc->idx;
 
 	if (unlikely(event->attr.precise_ip))
 		intel_pmu_pebs_enable(event);
 
-	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+	if (idx < INTEL_PMC_IDX_FIXED) {
+		intel_set_masks(event, idx);
+		__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+	} else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+		intel_set_masks(event, idx);
 		intel_pmu_enable_fixed(event);
-		return;
+	} else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
+		if (!__this_cpu_read(cpu_hw_events.enabled))
+			return;
+		intel_pmu_enable_bts(hwc->config);
 	}
-
-	__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
 }
 
 static void intel_pmu_add_event(struct perf_event *event)

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: perf/core] perf/x86: Fix variable types for LBR registers
  2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
@ 2020-07-03  8:01   ` tip-bot2 for Wei Wang
  2020-11-09  6:34   ` [PATCH v12 01/11] " Andi Kleen
  1 sibling, 0 replies; 34+ messages in thread
From: tip-bot2 for Wei Wang @ 2020-07-03  8:01 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Wei Wang, x86, LKML

The following commit has been merged into the perf/core branch of tip:

Commit-ID:     3cb9d5464c1ceea86f6225089b2f7965989cf316
Gitweb:        https://git.kernel.org/tip/3cb9d5464c1ceea86f6225089b2f7965989cf316
Author:        Wei Wang <wei.w.wang@intel.com>
AuthorDate:    Sat, 13 Jun 2020 16:09:46 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 02 Jul 2020 15:51:45 +02:00

perf/x86: Fix variable types for LBR registers

The MSR variable type can be 'unsigned int', which uses less memory than
the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr won't
be a negative number, so make it 'unsigned int' as well.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200613080958.132489-2-like.xu@linux.intel.com
---
 arch/x86/events/perf_event.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index e17a3d8..eb37f6c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -673,8 +673,8 @@ struct x86_pmu {
 	/*
 	 * Intel LBR
 	 */
-	unsigned long	lbr_tos, lbr_from, lbr_to; /* MSR base regs       */
-	int		lbr_nr;			   /* hardware stack size */
+	unsigned int	lbr_tos, lbr_from, lbr_to,
+			lbr_nr;			   /* LBR base regs and size */
 	u64		lbr_sel_mask;		   /* LBR_SELECT valid bits */
 	const int	*lbr_sel_map;		   /* lbr_select mappings */
 	bool		lbr_double_abort;	   /* duplicated lbr aborts */

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling
  2020-07-03  7:56       ` Peter Zijlstra
@ 2020-07-03  8:04         ` Xu, Like
  0 siblings, 0 replies; 34+ messages in thread
From: Xu, Like @ 2020-07-03  8:04 UTC (permalink / raw)
  To: Peter Zijlstra, Paolo Bonzini
  Cc: Liang, Kan, Like Xu, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm, Liang, Kan

On 2020/7/3 15:56, Peter Zijlstra wrote:
> On Thu, Jul 02, 2020 at 03:58:42PM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
>>> On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
>>>> On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
>>>>> Like Xu (10):
>>>>>     perf/x86/core: Refactor hw->idx checks and cleanup
>>>>>     perf/x86/lbr: Add interface to get LBR information
>>>>>     perf/x86: Add constraint to create guest LBR event without hw counter
>>>>>     perf/x86: Keep LBR records unchanged in host context for guest usage
>>>>> Wei Wang (1):
>>>>>     perf/x86: Fix variable types for LBR registers
>>>>>    arch/x86/events/core.c            |  26 +--
>>>>>    arch/x86/events/intel/core.c      | 109 ++++++++-----
>>>>>    arch/x86/events/intel/lbr.c       |  51 +++++-
>>>>>    arch/x86/events/perf_event.h      |   8 +-
>>>>>    arch/x86/include/asm/perf_event.h |  34 +++-
>>>> These look good to me; but at the same time Kan is sending me
>>>> Architectural LBR patches.
>>>>
>>>> Kan, if I take these perf patches and stick them in a tip/perf/vlbr
>>>> topic branch, can you rebase the arch lbr stuff on top, or is there
>>>> anything in the arch-lbr series that badly conflicts with this work?
>>>>
>>> Yes, I can rebase the arch lbr patches on top of them.
>>> Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.
>> For now I have:
>>
>>    git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr
>>
>> Once the 0day robot comes back all-green, I'll push it out to
>> tip/perf/vlbr and merge it into tip/perf/core.
> tip/perf/vlbr now exists, thanks!
Hi Peter,

Thanks for your patience and professional support on this feature!

Thanks,
Like Xu


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-06-13  9:42     ` Xu, Like
@ 2020-07-07 20:21       ` Sean Christopherson
  2020-07-08  1:37         ` Xiaoyao Li
  2020-07-08  7:06         ` Xu, Like
  0 siblings, 2 replies; 34+ messages in thread
From: Sean Christopherson @ 2020-07-07 20:21 UTC (permalink / raw)
  To: Xu, Like
  Cc: Xiaoyao Li, Like Xu, Paolo Bonzini, Peter Zijlstra,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, ak,
	wei.w.wang, linux-kernel, kvm

On Sat, Jun 13, 2020 at 05:42:50PM +0800, Xu, Like wrote:
> On 2020/6/13 17:14, Xiaoyao Li wrote:
> >On 6/13/2020 4:09 PM, Like Xu wrote:
> >>When the LBR feature is reported by the vmx_get_perf_capabilities(),
> >>the LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked.
> >>
> >>The debugctl msr is handled separately in vmx/svm and they're not
> >>completely identical, hence remove the common msr handling code.

I would prefer to put the "remove DEBUGCTRL handling from common x86" in a
separate patch.  Without digging into SVM, it's not obvious that dropping
MSR_IA32_DEBUGCTLMSR from kvm_set_msr_common() is a nop for SVM.
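
For reference, a hedged, from-memory sketch of the SVM-side handling
that makes this a nop (the actual code in svm.c may differ in detail):

	case MSR_IA32_DEBUGCTLMSR:
		if (!boot_cpu_has(X86_FEATURE_LBRV)) {
			/* No LBR virtualization: the write is dropped. */
			break;
		}
		if (data & DEBUGCTL_RESERVED_BITS)
			return 1;
		/* SVM tracks the value in its own VMCB save area. */
		svm->vmcb->save.dbgctl = data;
		break;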

> >>Signed-off-by: Like Xu <like.xu@linux.intel.com>
> >>---
> >>  arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++
> >>  arch/x86/kvm/vmx/pmu_intel.c    | 19 +++++++++++++++++++
> >>  arch/x86/kvm/x86.c              | 13 -------------
> >>  3 files changed, 31 insertions(+), 13 deletions(-)
> >>
> >>diff --git a/arch/x86/kvm/vmx/capabilities.h
> >>b/arch/x86/kvm/vmx/capabilities.h
> >>index b633a90320ee..f6fcfabb1026 100644
> >>--- a/arch/x86/kvm/vmx/capabilities.h
> >>+++ b/arch/x86/kvm/vmx/capabilities.h
> >>@@ -21,6 +21,8 @@ extern int __read_mostly pt_mode;
> >>  #define PMU_CAP_FW_WRITES    (1ULL << 13)
> >>  #define PMU_CAP_LBR_FMT        0x3f
> >>  +#define DEBUGCTLMSR_LBR_MASK        (DEBUGCTLMSR_LBR |
> >>DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
> >>+
> >>  struct nested_vmx_msrs {
> >>      /*
> >>       * We only store the "true" versions of the VMX capability MSRs. We
> >>@@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void)
> >>      return perf_cap;
> >>  }
> >>  +static inline u64 vmx_get_supported_debugctl(void)
> >>+{
> >>+    u64 val = 0;
> >>+
> >>+    if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)
> >>+        val |= DEBUGCTLMSR_LBR_MASK;
> >>+
> >>+    return val;
> >>+}
> >>+
> >>  #endif /* __KVM_X86_VMX_CAPS_H */
> >>diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> >>index a953c7d633f6..d92e95b64c74 100644
> >>--- a/arch/x86/kvm/vmx/pmu_intel.c
> >>+++ b/arch/x86/kvm/vmx/pmu_intel.c
> >>@@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu
> >>*vcpu, u32 msr)
> >>      case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
> >>          ret = pmu->version > 1;
> >>          break;
> >>+    case MSR_IA32_DEBUGCTLMSR:
> >>      case MSR_IA32_PERF_CAPABILITIES:
> >>          ret = 1;
> >>          break;
> >>@@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
> >>struct msr_data *msr_info)
> >>              return 1;
> >>          msr_info->data = vcpu->arch.perf_capabilities;
> >>          return 0;
> >>+    case MSR_IA32_DEBUGCTLMSR:
> >>+        msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
> >
> >Can we put the emulation of MSR_IA32_DEBUGCTLMSR in vmx_{get/set}_msr()?
> >AFAIK, MSR_IA32_DEBUGCTLMSR is not a pure PMU-related MSR; e.g., its
> >bit 2 enables #DB for bus lock.
> >We already have a "case MSR_IA32_DEBUGCTLMSR" handler in vmx_set_msr(),
> >and you may apply your bus lock changes in that handler.

Hrm, but that'd be a weird dependency, as vmx_set_msr() would need to check for
#DB bus lock support but not actually write GUEST_IA32_DEBUGCTL, or we'd end
up writing it twice when both bus lock and LBR are supported.

I don't see anything in the series that takes action on writes to
MSR_IA32_DEBUGCTLMSR beyond updating the VMCS, i.e. AFAICT there isn't any
reason to call into the PMU, VMX can simply query vmx_get_perf_capabilities()
to check if it's legal to enable DEBUGCTLMSR_LBR_MASK.

A question for both LBR and bus lock: would it make sense to cache the
guest's value in vcpu_vmx so that querying the guest value doesn't require
a VMREAD?  I don't have a good feel for how frequently it would be accessed.
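
A minimal sketch of that caching idea, assuming a new (hypothetical)
guest_debugctl field in struct vcpu_vmx:

	/* Keep a shadow of GUEST_IA32_DEBUGCTL in struct vcpu_vmx. */
	static void vmx_cache_debugctl(struct vcpu_vmx *vmx, u64 data)
	{
		vmx->guest_debugctl = data;
		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
	}

	static u64 vmx_cached_debugctl(struct vcpu_vmx *vmx)
	{
		return vmx->guest_debugctl;	/* avoids a VMREAD */
	}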

> >>+        return 0;
> >>      default:
> >>          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> >>              (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> >>@@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct
> >>kvm_vcpu *vcpu)
> >>      return true;
> >>  }
> >>  +static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu)
> >>+{
> >>+    u64 debugctlmsr = vmx_get_supported_debugctl();
> >>+
> >>+    if (!lbr_is_enabled(vcpu))
> >>+        debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK;
> >>+
> >>+    return debugctlmsr;
> >>+}
> >>+
> >>  static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data
> >>*msr_info)
> >>  {
> >>      struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> >>@@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu,
> >>struct msr_data *msr_info)
> >>          }
> >>          vcpu->arch.perf_capabilities = data;
> >>          return 0;
> >>+    case MSR_IA32_DEBUGCTLMSR:
> >>+        if (data & ~vcpu_get_supported_debugctl(vcpu))
> >>+            return 1;
> >>+        vmcs_write64(GUEST_IA32_DEBUGCTL, data);
> >>+        return 0;
> >>      default:
> >>          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> >>              (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> >>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>index 00c88c2f34e4..56f275eb4554 100644
> >>--- a/arch/x86/kvm/x86.c
> >>+++ b/arch/x86/kvm/x86.c
> >>@@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu,
> >>struct msr_data *msr_info)
> >>              return 1;
> >>          }
> >>          break;
> >>-    case MSR_IA32_DEBUGCTLMSR:
> >>-        if (!data) {
> >>-            /* We support the non-activated case already */
> >>-            break;
> >>-        } else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {
> >
> >So after this patch, a guest trying to set the DEBUGCTLMSR_BTF bit will
> >get a #GP instead of the write being ignored with a kernel log message.
> >
> 
> Since BTF is not implemented in KVM at all,
> I propose not to leave this kind of dummy handling in future KVM code.
> 
> Let's see if NetWare or any BTF user will complain about this change.

If you want to drop that behavior it needs be done in a separate patch.
Personally I don't see the point in doing so, it's a trivial amount of code
in KVM and there's no harm in dropping the bits on write.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-07-07 20:21       ` Sean Christopherson
@ 2020-07-08  1:37         ` Xiaoyao Li
  2020-07-08  7:06         ` Xu, Like
  1 sibling, 0 replies; 34+ messages in thread
From: Xiaoyao Li @ 2020-07-08  1:37 UTC (permalink / raw)
  To: Sean Christopherson, Xu, Like
  Cc: Like Xu, Paolo Bonzini, Peter Zijlstra, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, ak, wei.w.wang,
	linux-kernel, kvm

On 7/8/2020 4:21 AM, Sean Christopherson wrote:
> On Sat, Jun 13, 2020 at 05:42:50PM +0800, Xu, Like wrote:
>> On 2020/6/13 17:14, Xiaoyao Li wrote:
>>> On 6/13/2020 4:09 PM, Like Xu wrote:
[...]
>>>> @@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
>>>> struct msr_data *msr_info)
>>>>                return 1;
>>>>            msr_info->data = vcpu->arch.perf_capabilities;
>>>>            return 0;
>>>> +    case MSR_IA32_DEBUGCTLMSR:
>>>> +        msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
>>>
>>> Can we put the emulation of MSR_IA32_DEBUGCTLMSR in vmx_{get/set}_msr()?
>>> AFAIK, MSR_IA32_DEBUGCTLMSR is not a pure PMU-related MSR; e.g., its
>>> bit 2 enables #DB for bus lock.
>> We already have a "case MSR_IA32_DEBUGCTLMSR" handler in vmx_set_msr(),
>> and you may apply your bus lock changes in that handler.
> 
> Hrm, but that'd be a weird dependency, as vmx_set_msr() would need to check for
> #DB bus lock support but not actually write GUEST_IA32_DEBUGCTL, or we'd end
> up writing it twice when both bus lock and LBR are supported.

Yeah. That's what I was concerned about as well.

> I don't see anything in the series that takes action on writes to
> MSR_IA32_DEBUGCTLMSR beyond updating the VMCS, i.e. AFAICT there isn't any
> reason to call into the PMU, VMX can simply query vmx_get_perf_capabilities()
> to check if it's legal to enable DEBUGCTLMSR_LBR_MASK.
> 
> A question for both LBR and bus lock: would it make sense to cache the
> guest's value in vcpu_vmx so that querying the guest value doesn't require
> a VMREAD?  I don't have a good feel for how frequently it would be accessed.

Caching the guest's value is OK, even though the #DB bus lock bit
wouldn't be toggled frequently in a normal OS.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-07-07 20:21       ` Sean Christopherson
  2020-07-08  1:37         ` Xiaoyao Li
@ 2020-07-08  7:06         ` Xu, Like
  2020-07-10 16:28           ` Sean Christopherson
  1 sibling, 1 reply; 34+ messages in thread
From: Xu, Like @ 2020-07-08  7:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Xiaoyao Li, Like Xu, Paolo Bonzini, Peter Zijlstra,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, ak,
	wei.w.wang, linux-kernel, kvm

Hi Sean,

First of all, are you going to queue the LBR patch series in your tree
considering the host perf patches have already been queued in Peter's tree?

On 2020/7/8 4:21, Sean Christopherson wrote:
> On Sat, Jun 13, 2020 at 05:42:50PM +0800, Xu, Like wrote:
>> On 2020/6/13 17:14, Xiaoyao Li wrote:
>>> On 6/13/2020 4:09 PM, Like Xu wrote:
>>>> When the LBR feature is reported by the vmx_get_perf_capabilities(),
>>>> the LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked.
>>>>
>>>> The debugctl msr is handled separately in vmx/svm and they're not
>>>> completely identical, hence remove the common msr handling code.
> I would prefer to put the "remove DEBUGCTRL handling from common x86" in a
> separate patch.  Without digging into SVM, it's not obvious that dropping
> MSR_IA32_DEBUGCTLMSR from kvm_set_msr_common() is a nop for SVM.
Sure, I'll do it in a separate patch.
>
>>>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
>>>> ---
>>>>    arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++
>>>>    arch/x86/kvm/vmx/pmu_intel.c    | 19 +++++++++++++++++++
>>>>    arch/x86/kvm/x86.c              | 13 -------------
>>>>    3 files changed, 31 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx/capabilities.h
>>>> b/arch/x86/kvm/vmx/capabilities.h
>>>> index b633a90320ee..f6fcfabb1026 100644
>>>> --- a/arch/x86/kvm/vmx/capabilities.h
>>>> +++ b/arch/x86/kvm/vmx/capabilities.h
>>>> @@ -21,6 +21,8 @@ extern int __read_mostly pt_mode;
>>>>    #define PMU_CAP_FW_WRITES    (1ULL << 13)
>>>>    #define PMU_CAP_LBR_FMT        0x3f
>>>>    +#define DEBUGCTLMSR_LBR_MASK        (DEBUGCTLMSR_LBR |
>>>> DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)
>>>> +
>>>>    struct nested_vmx_msrs {
>>>>        /*
>>>>         * We only store the "true" versions of the VMX capability MSRs. We
>>>> @@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void)
>>>>        return perf_cap;
>>>>    }
>>>>    +static inline u64 vmx_get_supported_debugctl(void)
>>>> +{
>>>> +    u64 val = 0;
>>>> +
>>>> +    if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)
>>>> +        val |= DEBUGCTLMSR_LBR_MASK;
>>>> +
>>>> +    return val;
>>>> +}
>>>> +
>>>>    #endif /* __KVM_X86_VMX_CAPS_H */
>>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>>> index a953c7d633f6..d92e95b64c74 100644
>>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>>> @@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu
>>>> *vcpu, u32 msr)
>>>>        case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
>>>>            ret = pmu->version > 1;
>>>>            break;
>>>> +    case MSR_IA32_DEBUGCTLMSR:
>>>>        case MSR_IA32_PERF_CAPABILITIES:
>>>>            ret = 1;
>>>>            break;
>>>> @@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu,
>>>> struct msr_data *msr_info)
>>>>                return 1;
>>>>            msr_info->data = vcpu->arch.perf_capabilities;
>>>>            return 0;
>>>> +    case MSR_IA32_DEBUGCTLMSR:
>>>> +        msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
>>> Can we put the emulation of MSR_IA32_DEBUGCTLMSR in vmx_{get/set}_msr()?
>>> AFAIK, MSR_IA32_DEBUGCTLMSR is not a pure PMU-related MSR; e.g., its
>>> bit 2 enables #DB for bus lock.
>> We already have a "case MSR_IA32_DEBUGCTLMSR" handler in vmx_set_msr(),
>> and you may apply your bus lock changes in that handler.
> Hrm, but that'd be a weird dependency, as vmx_set_msr() would need to check for
> #DB bus lock support but not actually write GUEST_IA32_DEBUGCTL, or we'd end
> up writing it twice when both bus lock and LBR are supported.
Yes, you're right about the multiple writes on GUEST_IA32_DEBUGCTL.

I'll move the handler to vmx_set/get_msr() for other DEBUGCTL users.
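
A hedged sketch of the moved handler, reusing
vcpu_get_supported_debugctl() from this patch (the exact placement in
vmx_set_msr() is an assumption):

	case MSR_IA32_DEBUGCTLMSR:
		/* Reject bits that aren't supported for this guest. */
		if (data & ~vcpu_get_supported_debugctl(vcpu))
			return 1;
		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
		return 0;
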
>
> I don't see anything in the series that takes action on writes to
> MSR_IA32_DEBUGCTLMSR beyond updating the VMCS, i.e. AFAICT there isn't any
> reason to call into the PMU, VMX can simply query vmx_get_perf_capabilities()
> to check if it's legal to enable DEBUGCTLMSR_LBR_MASK.
There's a gap in enabling DEBUGCTLMSR_LBR_MASK:

vmx_get_perf_capabilities() is queried per-KVM, while
vcpu_get_supported_debugctl() is queried per-guest.

>
> A question for both LBR and bus lock: would it make sense to cache the
> guest's value in vcpu_vmx so that querying the guest value doesn't require
> a VMREAD?  I don't have a good feel for how frequently it would be accessed.
I'm OK with caching the value for this field; AFAIK, it will benefit
the legacy_freezing_lbrs_on_pmi emulation if a VMREAD is heavier than
a normal cache/memory access.

>
>>>> +        return 0;
>>>>        default:
>>>>            if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>>>                (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>>>> @@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct
>>>> kvm_vcpu *vcpu)
>>>>        return true;
>>>>    }
>>>>    +static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    u64 debugctlmsr = vmx_get_supported_debugctl();
>>>> +
>>>> +    if (!lbr_is_enabled(vcpu))
>>>> +        debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK;
>>>> +
>>>> +    return debugctlmsr;
>>>> +}
>>>> +
>>>>    static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data
>>>> *msr_info)
>>>>    {
>>>>        struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>>>> @@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu,
>>>> struct msr_data *msr_info)
>>>>            }
>>>>            vcpu->arch.perf_capabilities = data;
>>>>            return 0;
>>>> +    case MSR_IA32_DEBUGCTLMSR:
>>>> +        if (data & ~vcpu_get_supported_debugctl(vcpu))
>>>> +            return 1;
>>>> +        vmcs_write64(GUEST_IA32_DEBUGCTL, data);
>>>> +        return 0;
>>>>        default:
>>>>            if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>>>                (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 00c88c2f34e4..56f275eb4554 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu,
>>>> struct msr_data *msr_info)
>>>>                return 1;
>>>>            }
>>>>            break;
>>>> -    case MSR_IA32_DEBUGCTLMSR:
>>>> -        if (!data) {
>>>> -            /* We support the non-activated case already */
>>>> -            break;
>>>> -        } else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {
>>> So after this patch, a guest trying to set the DEBUGCTLMSR_BTF bit will
>>> get a #GP instead of the write being ignored with a kernel log message.
>>>
>> Since BTF is not implemented in KVM at all,
>> I propose not to leave this kind of dummy handling in future KVM code.
>>
>> Let's see if NetWare or any BTF user will complain about this change.
> If you want to drop that behavior it needs be done in a separate patch.
> Personally I don't see the point in doing so, it's a trivial amount of code
> in KVM and there's no harm in dropping the bits on write.
No harm in dropping the bits on write? Interesting.
I may keep the semantics unchanged for the LBR patches and make this a
separate proposal.

Thanks,
Like Xu


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
  2020-06-13  8:09 ` [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES Like Xu
@ 2020-07-08 13:36   ` Andi Kleen
  2020-07-08 14:38     ` Xu, Like
  0 siblings, 1 reply; 34+ messages in thread
From: Andi Kleen @ 2020-07-08 13:36 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Peter Zijlstra, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	wei.w.wang, linux-kernel, kvm

> +	/*
> +	 * As a first step, a guest can only enable the LBR feature if its
> +	 * CPU model is the same as the host's, because the LBR registers
> +	 * would be passed through to the guest and they're model specific.
> +	 */
> +	if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
> +		return false;

Could we relax this in a follow-on patch? (after this series is merged)

It's enough if the perf cap LBR version matches; a full model number
match isn't needed. This would require a way to configure the LBR
version from qemu.

This would allow more flexibility, for example migration from
Icelake to Skylake and vice versa.
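
A hedged sketch of such a relaxed check (PMU_CAP_LBR_FMT as defined
earlier in this series; treating a matching LBR format as sufficient is
the assumption here):

	u64 host_lbr_fmt = vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT;
	u64 guest_lbr_fmt = vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT;

	/* Require only a matching LBR format, not a full model match. */
	if (!guest_lbr_fmt || guest_lbr_fmt != host_lbr_fmt)
		return false;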

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
  2020-07-08 13:36   ` Andi Kleen
@ 2020-07-08 14:38     ` Xu, Like
  0 siblings, 0 replies; 34+ messages in thread
From: Xu, Like @ 2020-07-08 14:38 UTC (permalink / raw)
  To: Andi Kleen, Like Xu
  Cc: Paolo Bonzini, Peter Zijlstra, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	wei.w.wang, linux-kernel, kvm

On 2020/7/8 21:36, Andi Kleen wrote:
>> +	/*
>> +	 * As a first step, a guest can only enable the LBR feature if its
>> +	 * CPU model is the same as the host's, because the LBR registers
>> +	 * would be passed through to the guest and they're model specific.
>> +	 */
>> +	if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
>> +		return false;
> Could we relax this in a follow-on patch? (after this series is merged)
Sure, there would be a follow-on patch to relax this check after it's merged.
>
> It's enough if the perf cap LBR version matches; a full model number
> match isn't needed.
I assume you are referring to the LBR_FMT value in the perf_capabilities.
> This would require a way to configure the LBR version
> from qemu.
Sure, I may propose this configuration in the QEMU community.
>
> This would allow more flexibility, for example migration from
> Icelake to Skylake and vice versa.
Yes, we need this flexibility to cover as many platforms as possible.

Thanks,
Like Xu
>
> -Andi


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
  2020-07-08  7:06         ` Xu, Like
@ 2020-07-10 16:28           ` Sean Christopherson
  0 siblings, 0 replies; 34+ messages in thread
From: Sean Christopherson @ 2020-07-10 16:28 UTC (permalink / raw)
  To: Xu, Like
  Cc: Xiaoyao Li, Like Xu, Paolo Bonzini, Peter Zijlstra,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, ak,
	wei.w.wang, linux-kernel, kvm

On Wed, Jul 08, 2020 at 03:06:57PM +0800, Xu, Like wrote:
> Hi Sean,
> 
> First of all, are you going to queue the LBR patch series in your tree
> considering the host perf patches have already been queued in Peter's tree?

No, I'll let Paolo take 'em directly, I'm nowhere near knowledgeable enough
with respect to the PMU to feel comfortable taking them.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers
  2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
  2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Wei Wang
@ 2020-11-09  6:34   ` Andi Kleen
  2020-11-11  2:14     ` Xu, Like
  1 sibling, 1 reply; 34+ messages in thread
From: Andi Kleen @ 2020-11-09  6:34 UTC (permalink / raw)
  To: Like Xu
  Cc: Paolo Bonzini, Peter Zijlstra, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	wei.w.wang, linux-kernel, kvm

On Sat, Jun 13, 2020 at 04:09:46PM +0800, Like Xu wrote:
> From: Wei Wang <wei.w.wang@intel.com>
> 
> The MSR variable type can be 'unsigned int', which uses less memory than
> the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr won't
> be a negative number, so make it 'unsigned int' as well.

Hi, 

What's the status of this patchkit? It would be quite useful to me (and
various other people) to use LBRs in the guest. I reviewed it earlier and
the patches all looked good to me. But I don't see it in any -next tree.

Reviewed-by: Andi Kleen <ak@linux.intel.com>

Could it please be merged?

Thanks,

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers
  2020-11-09  6:34   ` [PATCH v12 01/11] " Andi Kleen
@ 2020-11-11  2:14     ` Xu, Like
  0 siblings, 0 replies; 34+ messages in thread
From: Xu, Like @ 2020-11-11  2:14 UTC (permalink / raw)
  To: Andi Kleen, Paolo Bonzini, Stephane Eranian
  Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, wei.w.wang, linux-kernel,
	kvm

Hi Paolo,

As you may know, the host perf support is now in Linus' tree,
which provides a clear path for enabling guest LBR.

Will we merge the remaining LBR KVM patch set?

---

[PATCH RESEND v13 00/10] Guest Last Branch Recording Enabling
https://lore.kernel.org/kvm/20201030035220.102403-1-like.xu@linux.intel.com/

Thanks,
Like Xu

On 2020/11/9 14:34, Andi Kleen wrote:
> Hi,
>
> What's the status of this patchkit? It would be quite useful to me (and
> various other people) to use LBRs in the guest. I reviewed it earlier and
> the patches all looked good to me. But I don't see it in any -next tree.
>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
>
> Could it please be merged?
>
> Thanks,
>
> -Andi


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2020-11-11  2:14 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-13  8:09 [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
2020-06-13  8:09 ` [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers Like Xu
2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Wei Wang
2020-11-09  6:34   ` [PATCH v12 01/11] " Andi Kleen
2020-11-11  2:14     ` Xu, Like
2020-06-13  8:09 ` [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup Like Xu
2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Like Xu
2020-06-13  8:09 ` [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information Like Xu
2020-07-03  8:01   ` [tip: perf/core] " tip-bot2 for Like Xu
2020-06-13  8:09 ` [PATCH v12 04/11] perf/x86: Add constraint to create guest LBR event without hw counter Like Xu
2020-06-13  8:09 ` [PATCH v12 05/11] perf/x86: Keep LBR records unchanged in host context for guest usage Like Xu
2020-06-13  8:09 ` [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES Like Xu
2020-07-08 13:36   ` Andi Kleen
2020-07-08 14:38     ` Xu, Like
2020-06-13  8:09 ` [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation Like Xu
2020-06-13  9:14   ` Xiaoyao Li
2020-06-13  9:42     ` Xu, Like
2020-07-07 20:21       ` Sean Christopherson
2020-07-08  1:37         ` Xiaoyao Li
2020-07-08  7:06         ` Xu, Like
2020-07-10 16:28           ` Sean Christopherson
2020-06-13  8:09 ` [PATCH v12 08/11] KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled Like Xu
2020-06-13  8:09 ` [PATCH v12 09/11] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI Like Xu
2020-06-13  8:09 ` [PATCH v12 10/11] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation Like Xu
2020-06-13  8:09 ` [PATCH v12 11/11] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism Like Xu
2020-06-13  8:09 ` [Qemu-devel] [PATCH 1/2] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES Like Xu
2020-06-13  8:09 ` [Qemu-devel] [PATCH 2/2] target/i386: add -cpu,lbr=true support to enable guest LBR Like Xu
2020-06-23 13:13 ` [PATCH v12 00/11] Guest Last Branch Recording Enabling Like Xu
2020-07-01  2:38   ` Like Xu
2020-07-02  7:40 ` Peter Zijlstra
2020-07-02 13:11   ` Liang, Kan
2020-07-02 13:58     ` Peter Zijlstra
2020-07-03  7:56       ` Peter Zijlstra
2020-07-03  8:04         ` Xu, Like
