* [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-06 11:43 ` Jan Dakinevich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jan Dakinevich, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

Windows Server 2016 Essentials fails to boot if the underlying hypervisor
lacks support for LBR MSRs. To work around the issue, it is suggested to
ignore these MSRs (but not all of them).

The information about which MSRs are supported on a specific platform is
taken from perf; this is the subject of the first patch. The second patch
makes the pmu_intel code of KVM ignore these MSRs.
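
To make the intent concrete, the ignore logic of the second patch amounts to
something along these lines (a hand-written sketch, not the literal patch: it
assumes struct kvm_pmu gains a struct x86_pmu_lbr member filled from the
guest family/model via intel_pmu_lbr_fill(), and lbr_msr_ignored() and
intel_pmu_get_lbr_msr() are hypothetical helper names):

	/* Sketch only: range check against the LBR MSRs known for the guest CPU. */
	static bool lbr_msr_ignored(const struct x86_pmu_lbr *lbr, u32 msr)
	{
		if (!lbr->nr)
			return false;

		return msr == lbr->tos ||
		       (msr >= lbr->from && msr < lbr->from + lbr->nr) ||
		       (msr >= lbr->to   && msr < lbr->to   + lbr->nr);
	}

	/*
	 * Sketch of the read side: the guest sees the MSR as present but
	 * always reading 0; the write side would drop the value the same
	 * way instead of injecting #GP.
	 */
	static int intel_pmu_get_lbr_msr(struct kvm_pmu *pmu, u32 msr, u64 *data)
	{
		if (!lbr_msr_ignored(&pmu->lbr, msr))
			return 1;	/* not an LBR MSR, fall back to normal handling */

		*data = 0;
		return 0;
	}

Taking the supported-MSR ranges from perf rather than hard-coding them in
KVM keeps the family/model knowledge in one place.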

TODO: use MSR load/store areas to fully support LBR debugging.
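
A very rough sketch of what that TODO might involve is shown below
(illustrative only; it leans on the existing add_atomic_switch_msr() helper
in vmx.c for the VM-entry/VM-exit MSR-load areas, and vmx_switch_lbr_msrs()
with its guest_from/guest_to arrays is hypothetical):

	/*
	 * Illustrative only: instead of ignoring the MSRs, switch real LBR
	 * state on VM-entry/VM-exit via the VMX MSR load/store areas.  A
	 * real implementation would also save the host values rather than
	 * restoring zeros.
	 */
	static void vmx_switch_lbr_msrs(struct vcpu_vmx *vmx,
					const struct x86_pmu_lbr *lbr,
					const u64 *guest_from, const u64 *guest_to)
	{
		int i;

		for (i = 0; i < lbr->nr; i++) {
			add_atomic_switch_msr(vmx, lbr->from + i, guest_from[i], 0);
			add_atomic_switch_msr(vmx, lbr->to   + i, guest_to[i],   0);
		}
	}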

Jan Dakinevich (2):
  perf/x86/intel: make reusable LBR initialization code
  KVM: x86/vPMU: ignore access to LBR-related MSRs

 arch/x86/events/core.c            |   8 +-
 arch/x86/events/intel/core.c      |  59 +++------
 arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
 arch/x86/events/perf_event.h      |  27 +---
 arch/x86/include/asm/kvm_host.h   |   2 +
 arch/x86/include/asm/perf_event.h |  11 ++
 arch/x86/kvm/pmu_intel.c          |  33 +++++
 7 files changed, 250 insertions(+), 162 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC 1/2] perf/x86/intel: make reusable LBR initialization code
  2017-12-06 11:43 ` Jan Dakinevich
@ 2017-12-06 11:43   ` Jan Dakinevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jan Dakinevich, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Zhou Chengming,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman,
	Jin Yao, kvm

This patch introduces the globally visible intel_pmu_lbr_fill() routine,
which gathers information about which LBR MSRs are supported for a specific
CPU family/model.

The routine is intended to be used in KVM code, with guest CPU information
as its input. For this reason, it should not have any side effects that
could affect the host system.

 * LBR information is moved to a separate structure, `struct x86_pmu_lbr';
 * All family-specific tweaks to the gathered information are applied only
   to the global x86_pmu.lbr, to keep the current perf initialization
   behavior.
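
As a usage sketch (hypothetical caller, not part of this patch), a consumer
such as KVM would fill a private structure from the guest's family/model and
leave the global x86_pmu untouched:

	/*
	 * Hypothetical caller -- guest_family/guest_model would come e.g.
	 * from the guest CPUID, not from boot_cpu_data.
	 */
	struct x86_pmu_lbr lbr;

	intel_pmu_lbr_fill(&lbr, guest_family, guest_model);
	if (lbr.nr)
		pr_debug("guest LBR: %d entries, TOS at MSR %#lx\n",
			 lbr.nr, lbr.tos);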

Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
---
 arch/x86/events/core.c            |   8 +-
 arch/x86/events/intel/core.c      |  59 +++------
 arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
 arch/x86/events/perf_event.h      |  27 +---
 arch/x86/include/asm/perf_event.h |  11 ++
 5 files changed, 215 insertions(+), 162 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 140d332..72fc174 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -370,7 +370,7 @@ int x86_add_exclusive(unsigned int what)
 	 * When lbr_pt_coexist we allow PT to coexist with either LBR or BTS.
 	 * LBR and BTS are still mutually exclusive.
 	 */
-	if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt)
+	if (x86_pmu.lbr.pt_coexist && what == x86_lbr_exclusive_pt)
 		return 0;
 
 	if (!atomic_inc_not_zero(&x86_pmu.lbr_exclusive[what])) {
@@ -393,7 +393,7 @@ int x86_add_exclusive(unsigned int what)
 
 void x86_del_exclusive(unsigned int what)
 {
-	if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt)
+	if (x86_pmu.lbr.pt_coexist && what == x86_lbr_exclusive_pt)
 		return;
 
 	atomic_dec(&x86_pmu.lbr_exclusive[what]);
@@ -496,7 +496,7 @@ int x86_pmu_max_precise(void)
 		precise++;
 
 		/* Support for IP fixup */
-		if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
+		if (x86_pmu.lbr.nr || x86_pmu.intel_cap.pebs_format >= 2)
 			precise++;
 
 		if (x86_pmu.pebs_prec_dist)
@@ -1311,7 +1311,7 @@ void perf_event_print_debug(void)
 			rdmsrl(MSR_IA32_PEBS_ENABLE, pebs);
 			pr_info("CPU#%d: pebs:       %016llx\n", cpu, pebs);
 		}
-		if (x86_pmu.lbr_nr) {
+		if (x86_pmu.lbr.nr) {
 			rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
 			pr_info("CPU#%d: debugctl:   %016llx\n", cpu, debugctl);
 		}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 09c26a4..2ac799d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2182,7 +2182,7 @@ static void intel_pmu_reset(void)
 	}
 
 	/* Reset LBRs and LBR freezing */
-	if (x86_pmu.lbr_nr) {
+	if (x86_pmu.lbr.nr) {
 		update_debugctlmsr(get_debugctlmsr() &
 			~(DEBUGCTLMSR_FREEZE_LBRS_ON_PMI|DEBUGCTLMSR_LBR));
 	}
@@ -3262,7 +3262,7 @@ static int intel_pmu_cpu_prepare(int cpu)
 {
 	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
 
-	if (x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
+	if (x86_pmu.extra_regs || x86_pmu.lbr.sel_map) {
 		cpuc->shared_regs = allocate_shared_regs(cpu);
 		if (!cpuc->shared_regs)
 			goto err;
@@ -3343,7 +3343,7 @@ static void intel_pmu_cpu_starting(int cpu)
 		cpuc->shared_regs->refcnt++;
 	}
 
-	if (x86_pmu.lbr_sel_map)
+	if (x86_pmu.lbr.sel_map)
 		cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
 
 	if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
@@ -3595,9 +3595,9 @@ static void intel_snb_check_microcode(void)
 
 static bool is_lbr_from(unsigned long msr)
 {
-	unsigned long lbr_from_nr = x86_pmu.lbr_from + x86_pmu.lbr_nr;
+	unsigned long lbr_from_nr = x86_pmu.lbr.from + x86_pmu.lbr.nr;
 
-	return x86_pmu.lbr_from <= msr && msr < lbr_from_nr;
+	return x86_pmu.lbr.from <= msr && msr < lbr_from_nr;
 }
 
 /*
@@ -3814,7 +3814,7 @@ static ssize_t branches_show(struct device *cdev,
 			     struct device_attribute *attr,
 			     char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu.lbr_nr);
+	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu.lbr.nr);
 }
 
 static DEVICE_ATTR_RO(branches);
@@ -3915,6 +3915,8 @@ __init int intel_pmu_init(void)
 
 	intel_ds_init();
 
+	intel_pmu_lbr_init();
+
 	x86_add_quirk(intel_arch_events_quirk); /* Install first, so it runs last */
 
 	/*
@@ -3934,8 +3936,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_event_ids, core2_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
 
-		intel_pmu_lbr_init_core();
-
 		x86_pmu.event_constraints = intel_core2_event_constraints;
 		x86_pmu.pebs_constraints = intel_core2_pebs_event_constraints;
 		pr_cont("Core2 events, ");
@@ -3950,8 +3950,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, nehalem_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_nhm();
-
 		x86_pmu.event_constraints = intel_nehalem_event_constraints;
 		x86_pmu.pebs_constraints = intel_nehalem_pebs_event_constraints;
 		x86_pmu.enable_all = intel_pmu_nhm_enable_all;
@@ -3983,8 +3981,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_event_ids, atom_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
 
-		intel_pmu_lbr_init_atom();
-
 		x86_pmu.event_constraints = intel_gen_event_constraints;
 		x86_pmu.pebs_constraints = intel_atom_pebs_event_constraints;
 		x86_pmu.pebs_aliases = intel_pebs_aliases_core2;
@@ -4000,8 +3996,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, slm_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_slm();
-
 		x86_pmu.event_constraints = intel_slm_event_constraints;
 		x86_pmu.pebs_constraints = intel_slm_pebs_event_constraints;
 		x86_pmu.extra_regs = intel_slm_extra_regs;
@@ -4019,8 +4013,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, glm_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_skl();
-
 		x86_pmu.event_constraints = intel_slm_event_constraints;
 		x86_pmu.pebs_constraints = intel_glm_pebs_event_constraints;
 		x86_pmu.extra_regs = intel_glm_extra_regs;
@@ -4031,7 +4023,6 @@ __init int intel_pmu_init(void)
 		 */
 		x86_pmu.pebs_aliases = NULL;
 		x86_pmu.pebs_prec_dist = true;
-		x86_pmu.lbr_pt_coexist = true;
 		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
 		x86_pmu.cpu_events = glm_events_attrs;
 		extra_attr = slm_format_attr;
@@ -4045,8 +4036,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, glp_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_skl();
-
 		x86_pmu.event_constraints = intel_slm_event_constraints;
 		x86_pmu.pebs_constraints = intel_glp_pebs_event_constraints;
 		x86_pmu.extra_regs = intel_glm_extra_regs;
@@ -4056,7 +4045,6 @@ __init int intel_pmu_init(void)
 		 */
 		x86_pmu.pebs_aliases = NULL;
 		x86_pmu.pebs_prec_dist = true;
-		x86_pmu.lbr_pt_coexist = true;
 		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
 		x86_pmu.get_event_constraints = glp_get_event_constraints;
 		x86_pmu.cpu_events = glm_events_attrs;
@@ -4075,8 +4063,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, nehalem_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_nhm();
-
 		x86_pmu.event_constraints = intel_westmere_event_constraints;
 		x86_pmu.enable_all = intel_pmu_nhm_enable_all;
 		x86_pmu.pebs_constraints = intel_westmere_pebs_event_constraints;
@@ -4107,8 +4093,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_snb();
-
 		x86_pmu.event_constraints = intel_snb_event_constraints;
 		x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
 		x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
@@ -4148,8 +4132,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs,
 		       sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_snb();
-
 		x86_pmu.event_constraints = intel_ivb_event_constraints;
 		x86_pmu.pebs_constraints = intel_ivb_pebs_event_constraints;
 		x86_pmu.pebs_aliases = intel_pebs_aliases_ivb;
@@ -4184,8 +4166,6 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids));
 		memcpy(hw_cache_extra_regs, hsw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
 
-		intel_pmu_lbr_init_hsw();
-
 		x86_pmu.event_constraints = intel_hsw_event_constraints;
 		x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
 		x86_pmu.extra_regs = intel_snbep_extra_regs;
@@ -4198,7 +4178,6 @@ __init int intel_pmu_init(void)
 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = hsw_get_event_constraints;
 		x86_pmu.cpu_events = get_hsw_events_attrs();
-		x86_pmu.lbr_double_abort = true;
 		extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
 			hsw_format_attr : nhm_format_attr;
 		pr_cont("Haswell events, ");
@@ -4223,8 +4202,6 @@ __init int intel_pmu_init(void)
 		hw_cache_extra_regs[C(NODE)][C(OP_WRITE)][C(RESULT_ACCESS)] = HSW_DEMAND_WRITE|
 									      BDW_L3_MISS_LOCAL|HSW_SNOOP_DRAM;
 
-		intel_pmu_lbr_init_hsw();
-
 		x86_pmu.event_constraints = intel_bdw_event_constraints;
 		x86_pmu.pebs_constraints = intel_bdw_pebs_event_constraints;
 		x86_pmu.extra_regs = intel_snbep_extra_regs;
@@ -4250,7 +4227,6 @@ __init int intel_pmu_init(void)
 		       slm_hw_cache_event_ids, sizeof(hw_cache_event_ids));
 		memcpy(hw_cache_extra_regs,
 		       knl_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
-		intel_pmu_lbr_init_knl();
 
 		x86_pmu.event_constraints = intel_slm_event_constraints;
 		x86_pmu.pebs_constraints = intel_slm_pebs_event_constraints;
@@ -4272,7 +4248,6 @@ __init int intel_pmu_init(void)
 		x86_pmu.late_ack = true;
 		memcpy(hw_cache_event_ids, skl_hw_cache_event_ids, sizeof(hw_cache_event_ids));
 		memcpy(hw_cache_extra_regs, skl_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
-		intel_pmu_lbr_init_skl();
 
 		/* INT_MISC.RECOVERY_CYCLES has umask 1 in Skylake */
 		event_attr_td_recovery_bubbles.event_str_noht =
@@ -4365,19 +4340,19 @@ __init int intel_pmu_init(void)
 	 * Check all LBT MSR here.
 	 * Disable LBR access if any LBR MSRs can not be accessed.
 	 */
-	if (x86_pmu.lbr_nr && !check_msr(x86_pmu.lbr_tos, 0x3UL))
-		x86_pmu.lbr_nr = 0;
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
-		if (!(check_msr(x86_pmu.lbr_from + i, 0xffffUL) &&
-		      check_msr(x86_pmu.lbr_to + i, 0xffffUL)))
-			x86_pmu.lbr_nr = 0;
+	if (x86_pmu.lbr.nr && !check_msr(x86_pmu.lbr.tos, 0x3UL))
+		x86_pmu.lbr.nr = 0;
+	for (i = 0; i < x86_pmu.lbr.nr; i++) {
+		if (!(check_msr(x86_pmu.lbr.from + i, 0xffffUL) &&
+		      check_msr(x86_pmu.lbr.to + i, 0xffffUL)))
+			x86_pmu.lbr.nr = 0;
 	}
 
 	x86_pmu.caps_attrs = intel_pmu_caps_attrs;
 
-	if (x86_pmu.lbr_nr) {
+	if (x86_pmu.lbr.nr) {
 		x86_pmu.caps_attrs = merge_attr(x86_pmu.caps_attrs, lbr_attrs);
-		pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);
+		pr_cont("%d-deep LBR, ", x86_pmu.lbr.nr);
 	}
 
 	/*
@@ -4390,7 +4365,7 @@ __init int intel_pmu_init(void)
 			er->extra_msr_access = check_msr(er->msr, 0x11UL);
 			/* Disable LBR select mapping */
 			if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
-				x86_pmu.lbr_sel_map = NULL;
+				x86_pmu.lbr.sel_map = NULL;
 		}
 	}
 
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index ae64d0b..f6dba2d 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -5,6 +5,7 @@
 #include <asm/perf_event.h>
 #include <asm/msr.h>
 #include <asm/insn.h>
+#include <asm/intel-family.h>
 
 #include "../perf_event.h"
 
@@ -167,7 +168,7 @@ static void __intel_pmu_lbr_enable(bool pmi)
 	 * did not change.
 	 */
 	if (cpuc->lbr_sel)
-		lbr_select = cpuc->lbr_sel->config & x86_pmu.lbr_sel_mask;
+		lbr_select = cpuc->lbr_sel->config & x86_pmu.lbr.sel_mask;
 	if (!pmi && cpuc->lbr_sel)
 		wrmsrl(MSR_LBR_SELECT, lbr_select);
 
@@ -198,17 +199,17 @@ static void intel_pmu_lbr_reset_32(void)
 {
 	int i;
 
-	for (i = 0; i < x86_pmu.lbr_nr; i++)
-		wrmsrl(x86_pmu.lbr_from + i, 0);
+	for (i = 0; i < x86_pmu.lbr.nr; i++)
+		wrmsrl(x86_pmu.lbr.from + i, 0);
 }
 
 static void intel_pmu_lbr_reset_64(void)
 {
 	int i;
 
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
-		wrmsrl(x86_pmu.lbr_from + i, 0);
-		wrmsrl(x86_pmu.lbr_to   + i, 0);
+	for (i = 0; i < x86_pmu.lbr.nr; i++) {
+		wrmsrl(x86_pmu.lbr.from + i, 0);
+		wrmsrl(x86_pmu.lbr.to   + i, 0);
 		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
 			wrmsrl(MSR_LBR_INFO_0 + i, 0);
 	}
@@ -216,7 +217,7 @@ static void intel_pmu_lbr_reset_64(void)
 
 void intel_pmu_lbr_reset(void)
 {
-	if (!x86_pmu.lbr_nr)
+	if (!x86_pmu.lbr.nr)
 		return;
 
 	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_32)
@@ -232,7 +233,7 @@ static inline u64 intel_pmu_lbr_tos(void)
 {
 	u64 tos;
 
-	rdmsrl(x86_pmu.lbr_tos, tos);
+	rdmsrl(x86_pmu.lbr.tos, tos);
 	return tos;
 }
 
@@ -306,19 +307,19 @@ static u64 lbr_from_signext_quirk_rd(u64 val)
 static inline void wrlbr_from(unsigned int idx, u64 val)
 {
 	val = lbr_from_signext_quirk_wr(val);
-	wrmsrl(x86_pmu.lbr_from + idx, val);
+	wrmsrl(x86_pmu.lbr.from + idx, val);
 }
 
 static inline void wrlbr_to(unsigned int idx, u64 val)
 {
-	wrmsrl(x86_pmu.lbr_to + idx, val);
+	wrmsrl(x86_pmu.lbr.to + idx, val);
 }
 
 static inline u64 rdlbr_from(unsigned int idx)
 {
 	u64 val;
 
-	rdmsrl(x86_pmu.lbr_from + idx, val);
+	rdmsrl(x86_pmu.lbr.from + idx, val);
 
 	return lbr_from_signext_quirk_rd(val);
 }
@@ -327,7 +328,7 @@ static inline u64 rdlbr_to(unsigned int idx)
 {
 	u64 val;
 
-	rdmsrl(x86_pmu.lbr_to + idx, val);
+	rdmsrl(x86_pmu.lbr.to + idx, val);
 
 	return val;
 }
@@ -344,7 +345,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 		return;
 	}
 
-	mask = x86_pmu.lbr_nr - 1;
+	mask = x86_pmu.lbr.nr - 1;
 	tos = task_ctx->tos;
 	for (i = 0; i < tos; i++) {
 		lbr_idx = (tos - i) & mask;
@@ -354,7 +355,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
 			wrmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]);
 	}
-	wrmsrl(x86_pmu.lbr_tos, tos);
+	wrmsrl(x86_pmu.lbr.tos, tos);
 	task_ctx->lbr_stack_state = LBR_NONE;
 }
 
@@ -369,7 +370,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 		return;
 	}
 
-	mask = x86_pmu.lbr_nr - 1;
+	mask = x86_pmu.lbr.nr - 1;
 	tos = intel_pmu_lbr_tos();
 	for (i = 0; i < tos; i++) {
 		lbr_idx = (tos - i) & mask;
@@ -424,7 +425,7 @@ void intel_pmu_lbr_add(struct perf_event *event)
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct x86_perf_task_context *task_ctx;
 
-	if (!x86_pmu.lbr_nr)
+	if (!x86_pmu.lbr.nr)
 		return;
 
 	cpuc->br_sel = event->hw.branch_reg.reg;
@@ -463,7 +464,7 @@ void intel_pmu_lbr_del(struct perf_event *event)
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct x86_perf_task_context *task_ctx;
 
-	if (!x86_pmu.lbr_nr)
+	if (!x86_pmu.lbr.nr)
 		return;
 
 	if (branch_user_callstack(cpuc->br_sel) &&
@@ -495,11 +496,11 @@ void intel_pmu_lbr_disable_all(void)
 
 static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 {
-	unsigned long mask = x86_pmu.lbr_nr - 1;
+	unsigned long mask = x86_pmu.lbr.nr - 1;
 	u64 tos = intel_pmu_lbr_tos();
 	int i;
 
-	for (i = 0; i < x86_pmu.lbr_nr; i++) {
+	for (i = 0; i < x86_pmu.lbr.nr; i++) {
 		unsigned long lbr_idx = (tos - i) & mask;
 		union {
 			struct {
@@ -509,7 +510,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 			u64     lbr;
 		} msr_lastbranch;
 
-		rdmsrl(x86_pmu.lbr_from + lbr_idx, msr_lastbranch.lbr);
+		rdmsrl(x86_pmu.lbr.from + lbr_idx, msr_lastbranch.lbr);
 
 		cpuc->lbr_entries[i].from	= msr_lastbranch.from;
 		cpuc->lbr_entries[i].to		= msr_lastbranch.to;
@@ -532,12 +533,12 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 {
 	bool need_info = false;
-	unsigned long mask = x86_pmu.lbr_nr - 1;
+	unsigned long mask = x86_pmu.lbr.nr - 1;
 	int lbr_format = x86_pmu.intel_cap.lbr_format;
 	u64 tos = intel_pmu_lbr_tos();
 	int i;
 	int out = 0;
-	int num = x86_pmu.lbr_nr;
+	int num = x86_pmu.lbr.nr;
 
 	if (cpuc->lbr_sel) {
 		need_info = !(cpuc->lbr_sel->config & LBR_NO_INFO);
@@ -595,7 +596,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 		 * If the abort just happened outside the window
 		 * the extra entry cannot be removed.
 		 */
-		if (abort && x86_pmu.lbr_double_abort && out > 0)
+		if (abort && x86_pmu.lbr.double_abort && out > 0)
 			out--;
 
 		cpuc->lbr_entries[out].from	 = from;
@@ -711,7 +712,7 @@ static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
 		if (!(br_type & (1ULL << i)))
 			continue;
 
-		v = x86_pmu.lbr_sel_map[i];
+		v = x86_pmu.lbr.sel_map[i];
 		if (v == LBR_NOT_SUPP)
 			return -EOPNOTSUPP;
 
@@ -729,7 +730,7 @@ static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
 	 * But the 10th bit LBR_CALL_STACK does not operate
 	 * in suppress mode.
 	 */
-	reg->config = mask ^ (x86_pmu.lbr_sel_mask & ~LBR_CALL_STACK);
+	reg->config = mask ^ (x86_pmu.lbr.sel_mask & ~LBR_CALL_STACK);
 
 	if ((br_type & PERF_SAMPLE_BRANCH_NO_CYCLES) &&
 	    (br_type & PERF_SAMPLE_BRANCH_NO_FLAGS) &&
@@ -746,7 +747,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 	/*
 	 * no LBR on this PMU
 	 */
-	if (!x86_pmu.lbr_nr)
+	if (!x86_pmu.lbr.nr)
 		return -EOPNOTSUPP;
 
 	/*
@@ -759,7 +760,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 	/*
 	 * setup HW LBR filter, if any
 	 */
-	if (x86_pmu.lbr_sel_map)
+	if (x86_pmu.lbr.sel_map)
 		ret = intel_pmu_setup_hw_lbr_filter(event);
 
 	return ret;
@@ -1091,12 +1092,12 @@ static const int hsw_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
 };
 
 /* core */
-void __init intel_pmu_lbr_init_core(void)
+static void intel_pmu_lbr_init_core(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr     = 4;
-	x86_pmu.lbr_tos    = MSR_LBR_TOS;
-	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
-	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
+	lbr->nr     = 4;
+	lbr->tos    = MSR_LBR_TOS;
+	lbr->from   = MSR_LBR_CORE_FROM;
+	lbr->to     = MSR_LBR_CORE_TO;
 
 	/*
 	 * SW branch filter usage:
@@ -1105,15 +1106,15 @@ void __init intel_pmu_lbr_init_core(void)
 }
 
 /* nehalem/westmere */
-void __init intel_pmu_lbr_init_nhm(void)
+static void intel_pmu_lbr_init_nhm(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr     = 16;
-	x86_pmu.lbr_tos    = MSR_LBR_TOS;
-	x86_pmu.lbr_from   = MSR_LBR_NHM_FROM;
-	x86_pmu.lbr_to     = MSR_LBR_NHM_TO;
+	lbr->nr     = 16;
+	lbr->tos    = MSR_LBR_TOS;
+	lbr->from   = MSR_LBR_NHM_FROM;
+	lbr->to     = MSR_LBR_NHM_TO;
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = nhm_lbr_sel_map;
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = nhm_lbr_sel_map;
 
 	/*
 	 * SW branch filter usage:
@@ -1125,15 +1126,15 @@ void __init intel_pmu_lbr_init_nhm(void)
 }
 
 /* sandy bridge */
-void __init intel_pmu_lbr_init_snb(void)
+static void intel_pmu_lbr_init_snb(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr	 = 16;
-	x86_pmu.lbr_tos	 = MSR_LBR_TOS;
-	x86_pmu.lbr_from = MSR_LBR_NHM_FROM;
-	x86_pmu.lbr_to   = MSR_LBR_NHM_TO;
+	lbr->nr   = 16;
+	lbr->tos  = MSR_LBR_TOS;
+	lbr->from = MSR_LBR_NHM_FROM;
+	lbr->to   = MSR_LBR_NHM_TO;
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = snb_lbr_sel_map;
 
 	/*
 	 * SW branch filter usage:
@@ -1144,30 +1145,27 @@ void __init intel_pmu_lbr_init_snb(void)
 }
 
 /* haswell */
-void intel_pmu_lbr_init_hsw(void)
+static void intel_pmu_lbr_init_hsw(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr	 = 16;
-	x86_pmu.lbr_tos	 = MSR_LBR_TOS;
-	x86_pmu.lbr_from = MSR_LBR_NHM_FROM;
-	x86_pmu.lbr_to   = MSR_LBR_NHM_TO;
+	lbr->nr   = 16;
+	lbr->tos  = MSR_LBR_TOS;
+	lbr->from = MSR_LBR_NHM_FROM;
+	lbr->to   = MSR_LBR_NHM_TO;
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = hsw_lbr_sel_map;
-
-	if (lbr_from_signext_quirk_needed())
-		static_branch_enable(&lbr_from_quirk_key);
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = hsw_lbr_sel_map;
 }
 
 /* skylake */
-__init void intel_pmu_lbr_init_skl(void)
+static void intel_pmu_lbr_init_skl(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr	 = 32;
-	x86_pmu.lbr_tos	 = MSR_LBR_TOS;
-	x86_pmu.lbr_from = MSR_LBR_NHM_FROM;
-	x86_pmu.lbr_to   = MSR_LBR_NHM_TO;
+	lbr->nr   = 32;
+	lbr->tos  = MSR_LBR_TOS;
+	lbr->from = MSR_LBR_NHM_FROM;
+	lbr->to   = MSR_LBR_NHM_TO;
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = hsw_lbr_sel_map;
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = hsw_lbr_sel_map;
 
 	/*
 	 * SW branch filter usage:
@@ -1178,23 +1176,12 @@ __init void intel_pmu_lbr_init_skl(void)
 }
 
 /* atom */
-void __init intel_pmu_lbr_init_atom(void)
+static void intel_pmu_lbr_init_atom(struct x86_pmu_lbr *lbr)
 {
-	/*
-	 * only models starting at stepping 10 seems
-	 * to have an operational LBR which can freeze
-	 * on PMU interrupt
-	 */
-	if (boot_cpu_data.x86_model == 28
-	    && boot_cpu_data.x86_mask < 10) {
-		pr_cont("LBR disabled due to erratum");
-		return;
-	}
-
-	x86_pmu.lbr_nr	   = 8;
-	x86_pmu.lbr_tos    = MSR_LBR_TOS;
-	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
-	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
+	lbr->nr     = 8;
+	lbr->tos    = MSR_LBR_TOS;
+	lbr->from   = MSR_LBR_CORE_FROM;
+	lbr->to     = MSR_LBR_CORE_TO;
 
 	/*
 	 * SW branch filter usage:
@@ -1203,31 +1190,128 @@ void __init intel_pmu_lbr_init_atom(void)
 }
 
 /* slm */
-void __init intel_pmu_lbr_init_slm(void)
+static void intel_pmu_lbr_init_slm(struct x86_pmu_lbr *lbr)
 {
-	x86_pmu.lbr_nr	   = 8;
-	x86_pmu.lbr_tos    = MSR_LBR_TOS;
-	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
-	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
+	lbr->nr     = 8;
+	lbr->tos    = MSR_LBR_TOS;
+	lbr->from   = MSR_LBR_CORE_FROM;
+	lbr->to     = MSR_LBR_CORE_TO;
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = nhm_lbr_sel_map;
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = nhm_lbr_sel_map;
 
 	/*
 	 * SW branch filter usage:
 	 * - compensate for lack of HW filter
 	 */
-	pr_cont("8-deep LBR, ");
 }
 
 /* Knights Landing */
-void intel_pmu_lbr_init_knl(void)
+static void intel_pmu_lbr_init_knl(struct x86_pmu_lbr *lbr)
+{
+	lbr->nr     = 8;
+	lbr->tos    = MSR_LBR_TOS;
+	lbr->from   = MSR_LBR_NHM_FROM;
+	lbr->to     = MSR_LBR_NHM_TO;
+
+	lbr->sel_mask = LBR_SEL_MASK;
+	lbr->sel_map  = snb_lbr_sel_map;
+}
+
+static void __intel_pmu_lbr_fill(struct x86_pmu_lbr *lbr, u8 family, u8 model)
+{
+	bool apply_tweaks = lbr == &x86_pmu.lbr;
+
+	memset(lbr, 0, sizeof(struct x86_pmu_lbr));
+
+	if (family != 0x6)
+		return;
+
+	switch (model) {
+	case INTEL_FAM6_CORE_YONAH:
+		break;
+	case INTEL_FAM6_CORE2_MEROM:
+	case INTEL_FAM6_CORE2_MEROM_L:
+	case INTEL_FAM6_CORE2_PENRYN:
+	case INTEL_FAM6_CORE2_DUNNINGTON:
+		intel_pmu_lbr_init_core(lbr);
+		break;
+	case INTEL_FAM6_NEHALEM:
+	case INTEL_FAM6_NEHALEM_EP:
+	case INTEL_FAM6_NEHALEM_EX:
+	case INTEL_FAM6_WESTMERE:
+	case INTEL_FAM6_WESTMERE_EP:
+	case INTEL_FAM6_WESTMERE_EX:
+		intel_pmu_lbr_init_nhm(lbr);
+		break;
+	case INTEL_FAM6_ATOM_PINEVIEW:
+	case INTEL_FAM6_ATOM_LINCROFT:
+	case INTEL_FAM6_ATOM_PENWELL:
+	case INTEL_FAM6_ATOM_CLOVERVIEW:
+	case INTEL_FAM6_ATOM_CEDARVIEW:
+		/*
+		 * only models starting at stepping 10 seems
+		 * to have an operational LBR which can freeze
+		 * on PMU interrupt
+		 */
+		if (apply_tweaks &&
+		    boot_cpu_data.x86_model == INTEL_FAM6_ATOM_PINEVIEW &&
+		    boot_cpu_data.x86_mask < 10) {
+			pr_cont("LBR disabled due to erratum");
+			return;
+		}
+		intel_pmu_lbr_init_atom(lbr);
+		break;
+	case INTEL_FAM6_ATOM_SILVERMONT1:
+	case INTEL_FAM6_ATOM_SILVERMONT2:
+	case INTEL_FAM6_ATOM_AIRMONT:
+		intel_pmu_lbr_init_slm(lbr);
+		break;
+	case INTEL_FAM6_ATOM_GOLDMONT:
+	case INTEL_FAM6_ATOM_DENVERTON:
+	case INTEL_FAM6_ATOM_GEMINI_LAKE:
+		lbr->pt_coexist = true;
+	case INTEL_FAM6_SKYLAKE_MOBILE:
+	case INTEL_FAM6_SKYLAKE_DESKTOP:
+	case INTEL_FAM6_SKYLAKE_X:
+	case INTEL_FAM6_KABYLAKE_MOBILE:
+	case INTEL_FAM6_KABYLAKE_DESKTOP:
+		intel_pmu_lbr_init_skl(lbr);
+		break;
+	case INTEL_FAM6_SANDYBRIDGE:
+	case INTEL_FAM6_SANDYBRIDGE_X:
+	case INTEL_FAM6_IVYBRIDGE:
+	case INTEL_FAM6_IVYBRIDGE_X:
+		intel_pmu_lbr_init_snb(lbr);
+		break;
+	case INTEL_FAM6_HASWELL_CORE:
+	case INTEL_FAM6_HASWELL_X:
+	case INTEL_FAM6_HASWELL_ULT:
+	case INTEL_FAM6_HASWELL_GT3E:
+		lbr->double_abort = true;
+	case INTEL_FAM6_BROADWELL_CORE:
+	case INTEL_FAM6_BROADWELL_XEON_D:
+	case INTEL_FAM6_BROADWELL_GT3E:
+	case INTEL_FAM6_BROADWELL_X:
+		intel_pmu_lbr_init_hsw(lbr);
+		if (apply_tweaks && lbr_from_signext_quirk_needed())
+			static_branch_enable(&lbr_from_quirk_key);
+		break;
+	case INTEL_FAM6_XEON_PHI_KNL:
+	case INTEL_FAM6_XEON_PHI_KNM:
+		intel_pmu_lbr_init_knl(lbr);
+		break;
+	}
+}
+
+void __init intel_pmu_lbr_init(void)
 {
-	x86_pmu.lbr_nr	   = 8;
-	x86_pmu.lbr_tos    = MSR_LBR_TOS;
-	x86_pmu.lbr_from   = MSR_LBR_NHM_FROM;
-	x86_pmu.lbr_to     = MSR_LBR_NHM_TO;
+	__intel_pmu_lbr_fill(&x86_pmu.lbr, boot_cpu_data.x86,
+			     boot_cpu_data.x86_model);
+}
 
-	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
-	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+void intel_pmu_lbr_fill(struct x86_pmu_lbr *lbr, u8 family, u8 model)
+{
+	__intel_pmu_lbr_fill(lbr, family, model);
 }
+EXPORT_SYMBOL_GPL(intel_pmu_lbr_fill);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index f7aaadf..0c436ac 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -627,12 +627,7 @@ struct x86_pmu {
 	/*
 	 * Intel LBR
 	 */
-	unsigned long	lbr_tos, lbr_from, lbr_to; /* MSR base regs       */
-	int		lbr_nr;			   /* hardware stack size */
-	u64		lbr_sel_mask;		   /* LBR_SELECT valid bits */
-	const int	*lbr_sel_map;		   /* lbr_select mappings */
-	bool		lbr_double_abort;	   /* duplicated lbr aborts */
-	bool		lbr_pt_coexist;		   /* (LBR|BTS) may coexist with PT */
+	struct x86_pmu_lbr lbr;
 
 	/*
 	 * Intel PT/LBR/BTS are exclusive
@@ -711,8 +706,8 @@ extern struct x86_pmu x86_pmu __read_mostly;
 
 static inline bool x86_pmu_has_lbr_callstack(void)
 {
-	return  x86_pmu.lbr_sel_map &&
-		x86_pmu.lbr_sel_map[PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT] > 0;
+	return  x86_pmu.lbr.sel_map &&
+		x86_pmu.lbr.sel_map[PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT] > 0;
 }
 
 DECLARE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
@@ -955,21 +950,9 @@ void intel_pmu_lbr_disable_all(void);
 
 void intel_pmu_lbr_read(void);
 
-void intel_pmu_lbr_init_core(void);
+void intel_pmu_lbr_init(void);
 
-void intel_pmu_lbr_init_nhm(void);
-
-void intel_pmu_lbr_init_atom(void);
-
-void intel_pmu_lbr_init_slm(void);
-
-void intel_pmu_lbr_init_snb(void);
-
-void intel_pmu_lbr_init_hsw(void);
-
-void intel_pmu_lbr_init_skl(void);
-
-void intel_pmu_lbr_init_knl(void);
+void intel_pmu_lbr_fill(struct x86_pmu_lbr *lbr, u8 family, u8 model);
 
 void intel_pmu_pebs_data_source_nhm(void);
 
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 12f5408..6fc722f 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -125,6 +125,15 @@ struct x86_pmu_capability {
 	int		events_mask_len;
 };
 
+struct x86_pmu_lbr {
+	unsigned long	tos, from, to;	/* MSR base regs */
+	int		nr;		/* hardware stack size */
+	u64		sel_mask;	/* LBR_SELECT valid bits */
+	const int	*sel_map;	/* lbr_select mappings */
+	bool		double_abort;	/* duplicated lbr aborts */
+	bool		pt_coexist;	/* (LBR|BTS) may coexist with PT */
+};
+
 /*
  * Fixed-purpose performance events:
  */
@@ -300,4 +309,6 @@ static inline void perf_check_microcode(void) { }
 
 #define arch_perf_out_copy_user copy_from_user_nmi
 
+extern void intel_pmu_lbr_fill(struct x86_pmu_lbr *lbr, u8 family, u8 model);
+
 #endif /* _ASM_X86_PERF_EVENT_H */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
  2017-12-06 11:43 ` Jan Dakinevich
@ 2017-12-06 11:43   ` Jan Dakinevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jan Dakinevich, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior, Jin Yao, kvm

Windows Server 2016 Essentials (for an as yet unknown reason) attempts to
access MSR_LBR_TOS and other LBR-related registers at startup.  These
are not currently handled by KVM, so the guest gets #GP and crashes.

To prevent that, identify LBR-related MSRs pertinent to the CPU model
exposed to the guest, and dummy handle them (ignore writes and return
zero on reads).

Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/pmu_intel.c        | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 977de5f..04324dd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -34,6 +34,7 @@
 #include <asm/msr-index.h>
 #include <asm/asm.h>
 #include <asm/kvm_page_track.h>
+#include <asm/perf_event.h>
 
 #define KVM_MAX_VCPUS 288
 #define KVM_SOFT_MAX_VCPUS 240
@@ -416,6 +417,7 @@ struct kvm_pmu {
 	struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
 	struct irq_work irq_work;
 	u64 reprogram_pmi;
+	struct x86_pmu_lbr lbr;
 };
 
 struct kvm_pmu_ops;
diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c
index 5ab4a36..7edf191 100644
--- a/arch/x86/kvm/pmu_intel.c
+++ b/arch/x86/kvm/pmu_intel.c
@@ -142,6 +142,24 @@ static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu,
 	return &counters[idx];
 }
 
+static bool intel_is_lbr_msr(struct kvm_vcpu *vcpu, u32 msr)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct x86_pmu_lbr *lbr = &pmu->lbr;
+
+	if (!lbr->nr)
+		return false;
+
+	if (msr == lbr->tos)
+		return true;
+	if (msr >= lbr->from && msr < lbr->from + lbr->nr)
+		return true;
+	if (msr >= lbr->to && msr < lbr->to + lbr->nr)
+		return true;
+
+	return false;
+}
+
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -155,6 +173,10 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = pmu->version > 1;
 		break;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr)) {
+			ret = true;
+			break;
+		}
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
 			get_fixed_pmc(pmu, msr);
@@ -183,6 +205,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
 		*data = pmu->global_ovf_ctrl;
 		return 0;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr)) {
+			*data = 0;
+			return 0;
+		}
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_fixed_pmc(pmu, msr))) {
 			*data = pmc_read_counter(pmc);
@@ -235,6 +261,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		break;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr))
+			return 0;
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_fixed_pmc(pmu, msr))) {
 			if (!msr_info->host_initiated)
@@ -303,6 +331,11 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	    (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
 	    (entry->ebx & (X86_FEATURE_HLE|X86_FEATURE_RTM)))
 		pmu->reserved_bits ^= HSW_IN_TX|HSW_IN_TX_CHECKPOINTED;
+
+	entry = kvm_find_cpuid_entry(vcpu, 1, 0);
+	if (entry)
+		intel_pmu_lbr_fill(&pmu->lbr,
+			x86_family(entry->eax), x86_model(entry->eax));
 }
 
 static void intel_pmu_init(struct kvm_vcpu *vcpu)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
@ 2017-12-06 11:43   ` Jan Dakinevich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 11:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jan Dakinevich, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior

Windows Server 2016 Essentials (for an as yet unknown reason) attempts to
access MSR_LBR_TOS and other LBR-related registers at startup.  These
are not currently handled by KVM, so the guest gets #GP and crashes.

To prevent that, identify LBR-related MSRs pertinent to the CPU model
exposed to the guest, and dummy handle them (ignore writes and return
zero on reads).

Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/pmu_intel.c        | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 977de5f..04324dd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -34,6 +34,7 @@
 #include <asm/msr-index.h>
 #include <asm/asm.h>
 #include <asm/kvm_page_track.h>
+#include <asm/perf_event.h>
 
 #define KVM_MAX_VCPUS 288
 #define KVM_SOFT_MAX_VCPUS 240
@@ -416,6 +417,7 @@ struct kvm_pmu {
 	struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
 	struct irq_work irq_work;
 	u64 reprogram_pmi;
+	struct x86_pmu_lbr lbr;
 };
 
 struct kvm_pmu_ops;
diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c
index 5ab4a36..7edf191 100644
--- a/arch/x86/kvm/pmu_intel.c
+++ b/arch/x86/kvm/pmu_intel.c
@@ -142,6 +142,24 @@ static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu,
 	return &counters[idx];
 }
 
+static bool intel_is_lbr_msr(struct kvm_vcpu *vcpu, u32 msr)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct x86_pmu_lbr *lbr = &pmu->lbr;
+
+	if (!lbr->nr)
+		return false;
+
+	if (msr == lbr->tos)
+		return true;
+	if (msr >= lbr->from && msr < lbr->from + lbr->nr)
+		return true;
+	if (msr >= lbr->to && msr < lbr->to + lbr->nr)
+		return true;
+
+	return false;
+}
+
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -155,6 +173,10 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = pmu->version > 1;
 		break;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr)) {
+			ret = true;
+			break;
+		}
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
 			get_fixed_pmc(pmu, msr);
@@ -183,6 +205,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
 		*data = pmu->global_ovf_ctrl;
 		return 0;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr)) {
+			*data = 0;
+			return 0;
+		}
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_fixed_pmc(pmu, msr))) {
 			*data = pmc_read_counter(pmc);
@@ -235,6 +261,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		break;
 	default:
+		if (intel_is_lbr_msr(vcpu, msr))
+			return 0;
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_fixed_pmc(pmu, msr))) {
 			if (!msr_info->host_initiated)
@@ -303,6 +331,11 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	    (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
 	    (entry->ebx & (X86_FEATURE_HLE|X86_FEATURE_RTM)))
 		pmu->reserved_bits ^= HSW_IN_TX|HSW_IN_TX_CHECKPOINTED;
+
+	entry = kvm_find_cpuid_entry(vcpu, 1, 0);
+	if (entry)
+		intel_pmu_lbr_fill(&pmu->lbr,
+			x86_family(entry->eax), x86_model(entry->eax));
 }
 
 static void intel_pmu_init(struct kvm_vcpu *vcpu)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 1/2] perf/x86/intel: make reusable LBR initialization code
  2017-12-06 11:43   ` Jan Dakinevich
@ 2017-12-06 12:51     ` Peter Zijlstra
  -1 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2017-12-06 12:51 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Zhou Chengming,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman,
	Jin Yao, kvm

On Wed, Dec 06, 2017 at 02:43:02PM +0300, Jan Dakinevich wrote:
> This patch introduces globally visible intel_pmu_lbr_fill() routine,
> which gathers information which LBR MSRs are support for specific CPU
> family/model.
> 
> It is supposed that the routine would be used in KVM code, using guest
> CPU information as an input. By this reason, it should not have any side
> effect which could affect host system.
> 
>  * LBR information moved to separate structure `struct x86_pmu_lbr';
>  * All family-specific tweaks on gathered information are applied only
>    for global x86_pmu.lbr to keep current perf initialization behavior.
> 
> Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>

Hurch, that's a lot of churn. Nothing bad stood out though.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 1/2] perf/x86/intel: make reusable LBR initialization code
@ 2017-12-06 12:51     ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2017-12-06 12:51 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Zhou Chengming,
	Sebastian Andrzej Siewior, Colin King, Greg Kroah-Hartman

On Wed, Dec 06, 2017 at 02:43:02PM +0300, Jan Dakinevich wrote:
> This patch introduces globally visible intel_pmu_lbr_fill() routine,
> which gathers information which LBR MSRs are support for specific CPU
> family/model.
> 
> It is supposed that the routine would be used in KVM code, using guest
> CPU information as an input. By this reason, it should not have any side
> effect which could affect host system.
> 
>  * LBR information moved to separate structure `struct x86_pmu_lbr';
>  * All family-specific tweaks on gathered information are applied only
>    for global x86_pmu.lbr to keep current perf initialization behavior.
> 
> Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>

Hurch, that's a lot of churn. Nothing bad stood out though.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
  2017-12-06 11:43   ` Jan Dakinevich
  (?)
@ 2017-12-06 12:52   ` Peter Zijlstra
  -1 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2017-12-06 12:52 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior, Jin Yao, kvm

On Wed, Dec 06, 2017 at 02:43:03PM +0300, Jan Dakinevich wrote:
> Windows Server 2016 Essentials (for an as yet unknown reason) attempts to
> access MSR_LBR_TOS and other LBR-related registers at startup.  These
> are not currently handled by KVM, so the guest gets #GP and crashes.

Shiny feature :-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
  2017-12-06 11:43 ` Jan Dakinevich
@ 2017-12-06 15:06   ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 22+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-12-06 15:06 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> w2k16 essentials fails to boot if underlying hypervisor lacks of support for 
> LBR MSRs. To workaround the issue, it suggessted to ignore these MSRs (but not 
> all).

This is without any hyperv enablement? Meaning normal stock guest?

> 
> The information, which MSRs are supported for specific platform is taken from
> perf, it is the subject of the first patch. The second patch adds ignoring for
> these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
>  arch/x86/events/perf_event.h      |  27 +---
>  arch/x86/include/asm/kvm_host.h   |   2 +
>  arch/x86/include/asm/perf_event.h |  11 ++
>  arch/x86/kvm/pmu_intel.c          |  33 +++++
>  7 files changed, 250 insertions(+), 162 deletions(-)
> 
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-06 15:06   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 22+ messages in thread
From: Konrad Rzeszutek Wilk @ 2017-12-06 15:06 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman

On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> w2k16 essentials fails to boot if underlying hypervisor lacks of support for 
> LBR MSRs. To workaround the issue, it suggessted to ignore these MSRs (but not 
> all).

This is without any hyperv enablement? Meaning normal stock guest?

> 
> The information, which MSRs are supported for specific platform is taken from
> perf, it is the subject of the first patch. The second patch adds ignoring for
> these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
>  arch/x86/events/perf_event.h      |  27 +---
>  arch/x86/include/asm/kvm_host.h   |   2 +
>  arch/x86/include/asm/perf_event.h |  11 ++
>  arch/x86/kvm/pmu_intel.c          |  33 +++++
>  7 files changed, 250 insertions(+), 162 deletions(-)
> 
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
  2017-12-06 11:43   ` Jan Dakinevich
  (?)
  (?)
@ 2017-12-06 15:57   ` Andi Kleen
  2017-12-06 17:02     ` Jan Dakinevich
  -1 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2017-12-06 15:57 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior, Jin Yao, kvm


If you do all this it's only a small step to fully enable LBRs for guests. 

Just need to allow them to be written, expose PERF_CAPABILITIES too, and
start/stop them on entry/exit, and enable context switching through perf in
the host.
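
For the PERF_CAPABILITIES part, a rough untested sketch of what passing
through only the host's LBR format field could look like; the helper name
guest_perf_capabilities() and the exact masking are made up for
illustration, not an existing KVM interface:

/*
 * Sketch only: report IA32_PERF_CAPABILITIES to the guest with just the
 * LBR format field (bits 5:0 per the SDM) taken from the host and all
 * other bits masked off.  Relies on MSR_IA32_PERF_CAPABILITIES and
 * rdmsrl_safe() from <asm/msr.h>.
 */
static u64 guest_perf_capabilities(void)
{
	u64 host_caps = 0;

	rdmsrl_safe(MSR_IA32_PERF_CAPABILITIES, &host_caps);

	return host_caps & 0x3f;	/* LBR format field only */
}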

That would be far better than creating a frankenstate where LBR is there but
mostly broken on some KVM versions.

-Andi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
  2017-12-06 11:43 ` Jan Dakinevich
@ 2017-12-06 16:23   ` Jiri Olsa
  -1 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2017-12-06 16:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Denis V . Lunev, Jan Dakinevich, Roman Kagan,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> w2k16 essentials fails to boot if underlying hypervisor lacks of support for 
> LBR MSRs. To workaround the issue, it suggessted to ignore these MSRs (but not 
> all).
> 
> The information, which MSRs are supported for specific platform is taken from
> perf, it is the subject of the first patch. The second patch adds ignoring for
> these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
>  arch/x86/events/perf_event.h      |  27 +---
>  arch/x86/include/asm/kvm_host.h   |   2 +
>  arch/x86/include/asm/perf_event.h |  11 ++
>  arch/x86/kvm/pmu_intel.c          |  33 +++++
>  7 files changed, 250 insertions(+), 162 deletions(-)

unrelated, but while looking at this, don't we miss the task_ctx_data
allocation for the child context? Like in the attached change:

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 494eca1bc760..7c5de1160545 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10699,6 +10699,7 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	      struct perf_event_context *child_ctx)
 {
 	enum perf_event_state parent_state = parent_event->state;
+	void *task_ctx_data = NULL;
 	struct perf_event *child_event;
 	unsigned long flags;
 
@@ -10719,6 +10720,19 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	if (IS_ERR(child_event))
 		return child_event;
 
+
+	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
+	    !child_ctx->task_ctx_data) {
+		struct pmu *pmu = child_event->pmu;
+
+		task_ctx_data = kzalloc(pmu->task_ctx_size, GFP_KERNEL);
+		if (!task_ctx_data) {
+			free_event(child_event);
+			return NULL;
+		}
+		child_ctx->task_ctx_data = task_ctx_data;
+	}
+
 	/*
 	 * is_orphaned_event() and list_add_tail(&parent_event->child_list)
 	 * must be under the same lock in order to serialize against
@@ -10729,6 +10743,7 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	if (is_orphaned_event(parent_event) ||
 	    !atomic_long_inc_not_zero(&parent_event->refcount)) {
 		mutex_unlock(&parent_event->child_mutex);
+		/* task_ctx_data is freed with child_ctx */
 		free_event(child_event);
 		return NULL;
 	}

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-06 16:23   ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2017-12-06 16:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Denis V . Lunev, Jan Dakinevich, Roman Kagan,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman

On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> w2k16 essentials fails to boot if underlying hypervisor lacks of support for 
> LBR MSRs. To workaround the issue, it suggessted to ignore these MSRs (but not 
> all).
> 
> The information, which MSRs are supported for specific platform is taken from
> perf, it is the subject of the first patch. The second patch adds ignoring for
> these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272 +++++++++++++++++++++++++-------------
>  arch/x86/events/perf_event.h      |  27 +---
>  arch/x86/include/asm/kvm_host.h   |   2 +
>  arch/x86/include/asm/perf_event.h |  11 ++
>  arch/x86/kvm/pmu_intel.c          |  33 +++++
>  7 files changed, 250 insertions(+), 162 deletions(-)

unrelated, but while looking at this, don't we miss the task_ctx_data
allocation for the child context? Like in the attached change:

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 494eca1bc760..7c5de1160545 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10699,6 +10699,7 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	      struct perf_event_context *child_ctx)
 {
 	enum perf_event_state parent_state = parent_event->state;
+	void *task_ctx_data = NULL;
 	struct perf_event *child_event;
 	unsigned long flags;
 
@@ -10719,6 +10720,19 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	if (IS_ERR(child_event))
 		return child_event;
 
+
+	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
+	    !child_ctx->task_ctx_data) {
+		struct pmu *pmu = child_event->pmu;
+
+		task_ctx_data = kzalloc(pmu->task_ctx_size, GFP_KERNEL);
+		if (!task_ctx_data) {
+			free_event(child_event);
+			return NULL;
+		}
+		child_ctx->task_ctx_data = task_ctx_data;
+	}
+
 	/*
 	 * is_orphaned_event() and list_add_tail(&parent_event->child_list)
 	 * must be under the same lock in order to serialize against
@@ -10729,6 +10743,7 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
 	if (is_orphaned_event(parent_event) ||
 	    !atomic_long_inc_not_zero(&parent_event->refcount)) {
 		mutex_unlock(&parent_event->child_mutex);
+		/* task_ctx_data is freed with child_ctx */
 		free_event(child_event);
 		return NULL;
 	}

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
  2017-12-06 15:06   ` Konrad Rzeszutek Wilk
@ 2017-12-06 16:39     ` Jan Dakinevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 16:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

On Wed, 6 Dec 2017 10:06:48 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> > w2k16 essentials fails to boot if underlying hypervisor lacks of
> > support for LBR MSRs. To workaround the issue, it suggessted to
> > ignore these MSRs (but not all).
> 
> This is without any hyperv enablement? Meaning normal stock guest?
> 

Yes, it is a normal guest. No hyperv enlightenments were enabled, and
"-cpu host" was specified on the QEMU command line.

> > 
> > The information, which MSRs are supported for specific platform is
> > taken from perf, it is the subject of the first patch. The second
> > patch adds ignoring for these MSRs to pmu_intel code of KVM.
> > 
> > TODO: use MSR load/store areas to make full support of LBR debug.
> > 
> > Jan Dakinevich (2):
> >   perf/x86/intel: make reusable LBR initialization code
> >   KVM: x86/vPMU: ignore access to LBR-related MSRs
> > 
> >  arch/x86/events/core.c            |   8 +-
> >  arch/x86/events/intel/core.c      |  59 +++------
> >  arch/x86/events/intel/lbr.c       | 272
> > +++++++++++++++++++++++++-------------
> > arch/x86/events/perf_event.h      |  27 +---
> > arch/x86/include/asm/kvm_host.h   |   2 +
> > arch/x86/include/asm/perf_event.h |  11 ++
> > arch/x86/kvm/pmu_intel.c          |  33 +++++ 7 files changed, 250
> > insertions(+), 162 deletions(-)
> > 
> > -- 
> > 2.1.4
> > 



-- 
Best regards
Jan Dakinevich

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-06 16:39     ` Jan Dakinevich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 16:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman

On Wed, 6 Dec 2017 10:06:48 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
> > w2k16 essentials fails to boot if underlying hypervisor lacks of
> > support for LBR MSRs. To workaround the issue, it suggessted to
> > ignore these MSRs (but not all).
> 
> This is without any hyperv enablement? Meaning normal stock guest?
> 

Yes, it is a normal guest. No hyperv enlightenments were enabled, and
"-cpu host" was specified on the QEMU command line.

> > 
> > The information, which MSRs are supported for specific platform is
> > taken from perf, it is the subject of the first patch. The second
> > patch adds ignoring for these MSRs to pmu_intel code of KVM.
> > 
> > TODO: use MSR load/store areas to make full support of LBR debug.
> > 
> > Jan Dakinevich (2):
> >   perf/x86/intel: make reusable LBR initialization code
> >   KVM: x86/vPMU: ignore access to LBR-related MSRs
> > 
> >  arch/x86/events/core.c            |   8 +-
> >  arch/x86/events/intel/core.c      |  59 +++------
> >  arch/x86/events/intel/lbr.c       | 272
> > +++++++++++++++++++++++++-------------
> > arch/x86/events/perf_event.h      |  27 +---
> > arch/x86/include/asm/kvm_host.h   |   2 +
> > arch/x86/include/asm/perf_event.h |  11 ++
> > arch/x86/kvm/pmu_intel.c          |  33 +++++ 7 files changed, 250
> > insertions(+), 162 deletions(-)
> > 
> > -- 
> > 2.1.4
> > 



-- 
Best regards
Jan Dakinevich

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
  2017-12-06 16:39     ` Jan Dakinevich
@ 2017-12-06 16:45       ` Denis V. Lunev
  -1 siblings, 0 replies; 22+ messages in thread
From: Denis V. Lunev @ 2017-12-06 16:45 UTC (permalink / raw)
  To: Jan Dakinevich, Konrad Rzeszutek Wilk
  Cc: linux-kernel, Roman Kagan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

On 12/06/2017 07:39 PM, Jan Dakinevich wrote:
> On Wed, 6 Dec 2017 10:06:48 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
>> On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
>>> w2k16 essentials fails to boot if underlying hypervisor lacks of
>>> support for LBR MSRs. To workaround the issue, it suggessted to
>>> ignore these MSRs (but not all).
>> This is without any hyperv enablement? Meaning normal stock guest?
>>
> Yes, it is a normal guest. No hyperv enlightenments were enabled, and
> "-cpu host" was specified on the QEMU command line.

The problem also exists with HyperV enabled.

Den

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-06 16:45       ` Denis V. Lunev
  0 siblings, 0 replies; 22+ messages in thread
From: Denis V. Lunev @ 2017-12-06 16:45 UTC (permalink / raw)
  To: Jan Dakinevich, Konrad Rzeszutek Wilk
  Cc: linux-kernel, Roman Kagan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao

On 12/06/2017 07:39 PM, Jan Dakinevich wrote:
> On Wed, 6 Dec 2017 10:06:48 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
>> On Wed, Dec 06, 2017 at 02:43:01PM +0300, Jan Dakinevich wrote:
>>> w2k16 essentials fails to boot if underlying hypervisor lacks of
>>> support for LBR MSRs. To workaround the issue, it suggessted to
>>> ignore these MSRs (but not all).
>> This is without any hyperv enablement? Meaning normal stock guest?
>>
> Yes, it is a normal guest. No hyperv enlightenments were enabled, and
> "-cpu host" was specified on the QEMU command line.

The problem also exists with HyperV enabled.

Den

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
  2017-12-06 15:57   ` Andi Kleen
@ 2017-12-06 17:02     ` Jan Dakinevich
  2017-12-06 17:55       ` Andi Kleen
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-06 17:02 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior, Jin Yao, kvm

On Wed, 6 Dec 2017 07:57:28 -0800
Andi Kleen <ak@linux.intel.com> wrote:

> 
> If you do all this it's only a small step to fully enable LBRs for
> guests. 

It is quite simple in the case where the guest LBR-related MSRs match the
host ones. They could be handled by the MSR load/store areas, I suppose.

In other cases, a different number of these MSRs and different base
values could be expected (e.g. Nehalem vs Core). The guest MSRs could be
either a subset or a superset of the host MSRs, so additional effort
would be required to support this.
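
For the matching case, a rough (completely untested) sketch of how a
VM-entry MSR-load area could be filled from the pmu->lbr description
added in patch 2/2.  lbr_guest_val() is a made-up placeholder for
wherever the guest-visible values would be kept, and sizing of the area
is ignored here:

/*
 * Sketch only: fill a VM-entry MSR-load area (struct vmx_msr_entry from
 * <asm/vmx.h>, entries assumed pre-zeroed) with the guest's LBR values.
 * A VM-exit MSR-store area with the same indices would save the guest
 * values back on exit; the caller would write the returned count to
 * VM_ENTRY_MSR_LOAD_COUNT.
 */
static int lbr_fill_entry_load_area(struct kvm_pmu *pmu, struct vmx_msr_entry *e)
{
	struct x86_pmu_lbr *lbr = &pmu->lbr;
	int i, n = 0;

	for (i = 0; i < lbr->nr; i++) {
		e[n].index = lbr->from + i;
		e[n].value = lbr_guest_val(pmu, lbr->from + i);	/* hypothetical */
		n++;
		e[n].index = lbr->to + i;
		e[n].value = lbr_guest_val(pmu, lbr->to + i);	/* hypothetical */
		n++;
	}

	e[n].index = lbr->tos;
	e[n].value = lbr_guest_val(pmu, lbr->tos);		/* hypothetical */
	n++;

	return n;
}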

> 
> Just need to allow them to be written, expose PERF_CAPABILITIES too,
> and start/stop them on entry/exit, and enable context switching
> through perf in the host.
> 
> That would be far better than creating a frankenstate where LBR is
> there but mostly broken on some KVM versions.
> 
> -Andi



-- 
Best regards
Jan Dakinevich

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs
  2017-12-06 17:02     ` Jan Dakinevich
@ 2017-12-06 17:55       ` Andi Kleen
  0 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2017-12-06 17:55 UTC (permalink / raw)
  To: Jan Dakinevich
  Cc: linux-kernel, Denis V . Lunev, Roman Kagan, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Kan Liang, Stephane Eranian, Colin King,
	Sebastian Andrzej Siewior, Jin Yao, kvm

On Wed, Dec 06, 2017 at 08:02:07PM +0300, Jan Dakinevich wrote:
> On Wed, 6 Dec 2017 07:57:28 -0800
> Andi Kleen <ak@linux.intel.com> wrote:
> 
> > 
> > If you do all this it's only a small step to fully enable LBRs for
> > guests. 
> 
> It is quite simple in the case where the guest LBR-related MSRs match the
> host ones. They could be handled by the MSR load/store areas, I suppose.

There is already an LBR control to enable/disable it, I believe.
You don't want to save/restore all the MSRs on every entry/exit
because that would be slow. The normal Linux context switch can do it.
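
The control referred to here is presumably the LBR enable bit in
IA32_DEBUGCTL; a sketch of toggling it, roughly what the host perf code
does in its LBR enable/disable paths:

/* Sketch: LBR recording is started/stopped via the DEBUGCTL LBR bit. */
static void lbr_set_enabled(bool enable)
{
	u64 debugctl;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
	if (enable)
		debugctl |= DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
	else
		debugctl &= ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
}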
 
> 
> In other cases, a different number of these MSRs and different base
> values could be expected (e.g. Nehalem vs Core). The guest MSRs could be
> either a subset or a superset of the host MSRs, so additional effort
> would be required to support this.

In this case, ignoring would be sufficient, I suppose. But for the case
where everything matches, it should work.

-Andi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
  2017-12-06 11:43 ` Jan Dakinevich
@ 2017-12-20 16:26   ` Jan Dakinevich
  -1 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-20 16:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Denis V . Lunev, Roman Kagan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman, Jin Yao, kvm

On Wed, 6 Dec 2017 14:43:01 +0300
Jan Dakinevich <jan.dakinevich@virtuozzo.com> wrote:

> w2k16 essentials fails to boot if underlying hypervisor lacks of
> support for LBR MSRs. To workaround the issue, it suggessted to
> ignore these MSRs (but not all).
> 
> The information, which MSRs are supported for specific platform is
> taken from perf, it is the subject of the first patch. The second
> patch adds ignoring for these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272
> +++++++++++++++++++++++++-------------
> arch/x86/events/perf_event.h      |  27 +---
> arch/x86/include/asm/kvm_host.h   |   2 +
> arch/x86/include/asm/perf_event.h |  11 ++
> arch/x86/kvm/pmu_intel.c          |  33 +++++ 7 files changed, 250
> insertions(+), 162 deletions(-)
> 

ping

Please tell me whether these patches are applicable (as a KVM workaround)
or whether it is assumed they should be reworked following Andi Kleen's
recommendation.

-- 
Best regards
Jan Dakinevich

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC 0/2] ignore LBR-related MSRs
@ 2017-12-20 16:26   ` Jan Dakinevich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Dakinevich @ 2017-12-20 16:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Denis V . Lunev, Roman Kagan, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, H. Peter Anvin, x86,
	Paolo Bonzini, Radim Krčmář,
	Andi Kleen, Kan Liang, Colin King, Stephane Eranian,
	Sebastian Andrzej Siewior, Greg Kroah-Hartman

On Wed, 6 Dec 2017 14:43:01 +0300
Jan Dakinevich <jan.dakinevich@virtuozzo.com> wrote:

> w2k16 essentials fails to boot if underlying hypervisor lacks of
> support for LBR MSRs. To workaround the issue, it suggessted to
> ignore these MSRs (but not all).
> 
> The information, which MSRs are supported for specific platform is
> taken from perf, it is the subject of the first patch. The second
> patch adds ignoring for these MSRs to pmu_intel code of KVM.
> 
> TODO: use MSR load/store areas to make full support of LBR debug.
> 
> Jan Dakinevich (2):
>   perf/x86/intel: make reusable LBR initialization code
>   KVM: x86/vPMU: ignore access to LBR-related MSRs
> 
>  arch/x86/events/core.c            |   8 +-
>  arch/x86/events/intel/core.c      |  59 +++------
>  arch/x86/events/intel/lbr.c       | 272
> +++++++++++++++++++++++++-------------
> arch/x86/events/perf_event.h      |  27 +---
> arch/x86/include/asm/kvm_host.h   |   2 +
> arch/x86/include/asm/perf_event.h |  11 ++
> arch/x86/kvm/pmu_intel.c          |  33 +++++ 7 files changed, 250
> insertions(+), 162 deletions(-)
> 

ping

Please tell me whether these patches are applicable (as a KVM workaround)
or whether it is assumed they should be reworked following Andi Kleen's
recommendation.

-- 
Best regards
Jan Dakinevich

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-12-20 16:27 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-06 11:43 [PATCH RFC 0/2] ignore LBR-related MSRs Jan Dakinevich
2017-12-06 11:43 ` Jan Dakinevich
2017-12-06 11:43 ` [PATCH RFC 1/2] perf/x86/intel: make reusable LBR initialization code Jan Dakinevich
2017-12-06 11:43   ` Jan Dakinevich
2017-12-06 12:51   ` Peter Zijlstra
2017-12-06 12:51     ` Peter Zijlstra
2017-12-06 11:43 ` [PATCH RFC 2/2] KVM: x86/vPMU: ignore access to LBR-related MSRs Jan Dakinevich
2017-12-06 11:43   ` Jan Dakinevich
2017-12-06 12:52   ` Peter Zijlstra
2017-12-06 15:57   ` Andi Kleen
2017-12-06 17:02     ` Jan Dakinevich
2017-12-06 17:55       ` Andi Kleen
2017-12-06 15:06 ` [PATCH RFC 0/2] ignore " Konrad Rzeszutek Wilk
2017-12-06 15:06   ` Konrad Rzeszutek Wilk
2017-12-06 16:39   ` Jan Dakinevich
2017-12-06 16:39     ` Jan Dakinevich
2017-12-06 16:45     ` Denis V. Lunev
2017-12-06 16:45       ` Denis V. Lunev
2017-12-06 16:23 ` Jiri Olsa
2017-12-06 16:23   ` Jiri Olsa
2017-12-20 16:26 ` Jan Dakinevich
2017-12-20 16:26   ` Jan Dakinevich
