* [PATCH 0/4] KVM: x86/pmu: An insightful refactoring of vPMU code
@ 2021-11-16 12:20 Like Xu
  2021-11-16 12:20 ` [PATCH 1/4] KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs Like Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Like Xu @ 2021-11-16 12:20 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

Hi,

This patch set was essentially triggered by Jim's patch set [1]
(especially patches 01 and 04).

The new idea is to set up and maintain pmc->eventsel for fixed counters
as well, which unifies the fixed/gp code paths around the same
"pmc->eventsel" semantics. (Patches 01-03 demonstrate this; more can be done.)

[1] https://lore.kernel.org/kvm/96170437-1e00-7841-260e-39d181e7886d@gmail.com/T/#t
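
As a rough sketch of the unified semantics (illustrative only, not code
from this series): once pmc->eventsel is also populated for fixed
counters, a single decode path can serve both counter types, e.g.:

	/* valid for both gp and fixed counters after patch 01 */
	u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;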

Please check each commit message for more details,
and let me know if there is any room for improvement.

Thanks.

Like Xu (4):
  KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs
  KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id()
  KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}()

 arch/x86/kvm/pmu.c           | 74 ++++++++++++++++++------------------
 arch/x86/kvm/pmu.h           |  4 +-
 arch/x86/kvm/svm/pmu.c       | 19 ++++-----
 arch/x86/kvm/vmx/pmu_intel.c | 54 +++++++++++++++++---------
 4 files changed, 83 insertions(+), 68 deletions(-)

-- 
2.33.1



* [PATCH 1/4] KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs
  2021-11-16 12:20 [PATCH 0/4] KVM: x86/pmu: An insightful refactoring of vPMU code Like Xu
@ 2021-11-16 12:20 ` Like Xu
  2021-11-16 12:20 ` [PATCH 2/4] KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id() Like Xu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Like Xu @ 2021-11-16 12:20 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

The pmc->eventsel field is currently unused for fixed counters. It can
be set up for every available fixed counter, since there is a mapping
between the fixed pmc index and the intel_arch_events array.

Whether a counter is gp or fixed, this simplifies the later checks for
consistency between eventsel and perf_hw_id.
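
For example (a sketch using the architectural event encoding from the
Intel SDM, not code from this patch): fixed counter 0 counts
"instructions retired", i.e. event select 0xC0 with unit mask 0x00, so
after this patch its eventsel would read:

	/* fixed counter 0, "instructions retired" (illustrative only) */
	pmc->eventsel = (0x00 << 8) | 0xC0;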

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b8e0d21b7c8a..51b00dbb2d1e 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -460,6 +460,21 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return 1;
 }
 
+static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
+{
+	size_t size = ARRAY_SIZE(fixed_pmc_events);
+	struct kvm_pmc *pmc;
+	u32 event;
+	int i;
+
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+		pmc = &pmu->fixed_counters[i];
+		event = fixed_pmc_events[array_index_nospec(i, size)];
+		pmc->eventsel = (intel_arch_events[event].unit_mask << 8) |
+			intel_arch_events[event].eventsel;
+	}
+}
+
 static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -507,6 +522,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 			edx.split.bit_width_fixed, x86_pmu.bit_width_fixed);
 		pmu->counter_bitmask[KVM_PMC_FIXED] =
 			((u64)1 << edx.split.bit_width_fixed) - 1;
+		setup_fixed_pmc_eventsel(pmu);
 	}
 
 	pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
-- 
2.33.1



* [PATCH 2/4] KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id()
  2021-11-16 12:20 [PATCH 0/4] KVM: x86/pmu: An insightful refactoring of vPMU code Like Xu
  2021-11-16 12:20 ` [PATCH 1/4] KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs Like Xu
@ 2021-11-16 12:20 ` Like Xu
  2021-11-18 13:29   ` Paolo Bonzini
  2021-11-16 12:20 ` [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event() Like Xu
  2021-11-16 12:20 ` [PATCH 4/4] KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}() Like Xu
  3 siblings, 1 reply; 12+ messages in thread
From: Like Xu @ 2021-11-16 12:20 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

find_arch_event() returns an "unsigned int" value, which
pmc_reprogram_counter() uses to program a PERF_TYPE_HARDWARE
perf_event.

The returned value is actually the kernel-defined generic perf_hw_id,
so rename the callback to find_perf_hw_id() and simplify its incoming
parameters for better self-explanation.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c           | 8 +-------
 arch/x86/kvm/pmu.h           | 3 +--
 arch/x86/kvm/svm/pmu.c       | 8 ++++----
 arch/x86/kvm/vmx/pmu_intel.c | 9 +++++----
 4 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0772bad9165c..903dc6a532cc 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -174,7 +174,6 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 {
 	unsigned config, type = PERF_TYPE_RAW;
-	u8 event_select, unit_mask;
 	struct kvm *kvm = pmc->vcpu->kvm;
 	struct kvm_pmu_event_filter *filter;
 	int i;
@@ -206,17 +205,12 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 	if (!allow_event)
 		return;
 
-	event_select = eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
-	unit_mask = (eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
-
 	if (!(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE |
 			  ARCH_PERFMON_EVENTSEL_INV |
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |
 			  HSW_IN_TX_CHECKPOINTED))) {
-		config = kvm_x86_ops.pmu_ops->find_arch_event(pmc_to_pmu(pmc),
-						      event_select,
-						      unit_mask);
+		config = kvm_x86_ops.pmu_ops->find_perf_hw_id(pmc);
 		if (config != PERF_COUNT_HW_MAX)
 			type = PERF_TYPE_HARDWARE;
 	}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0e4f2b1fa9fb..e7a5d4b6fa94 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -24,8 +24,7 @@ struct kvm_event_hw_type_mapping {
 };
 
 struct kvm_pmu_ops {
-	unsigned (*find_arch_event)(struct kvm_pmu *pmu, u8 event_select,
-				    u8 unit_mask);
+	unsigned int (*find_perf_hw_id)(struct kvm_pmc *pmc);
 	unsigned (*find_fixed_event)(int idx);
 	bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
 	struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..1d31bd5c6803 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -134,10 +134,10 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 	return &pmu->gp_counters[msr_to_index(msr)];
 }
 
-static unsigned amd_find_arch_event(struct kvm_pmu *pmu,
-				    u8 event_select,
-				    u8 unit_mask)
+static unsigned int amd_find_perf_hw_id(struct kvm_pmc *pmc)
 {
+	u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
+	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(amd_event_mapping); i++)
@@ -320,7 +320,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 }
 
 struct kvm_pmu_ops amd_pmu_ops = {
-	.find_arch_event = amd_find_arch_event,
+	.find_perf_hw_id = amd_find_perf_hw_id,
 	.find_fixed_event = amd_find_fixed_event,
 	.pmc_is_enabled = amd_pmc_is_enabled,
 	.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 51b00dbb2d1e..f1cc6192ead7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -68,10 +68,11 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
 		reprogram_counter(pmu, bit);
 }
 
-static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
-				      u8 event_select,
-				      u8 unit_mask)
+static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
 {
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+	u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
+	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
@@ -720,7 +721,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 }
 
 struct kvm_pmu_ops intel_pmu_ops = {
-	.find_arch_event = intel_find_arch_event,
+	.find_perf_hw_id = intel_find_perf_hw_id,
 	.find_fixed_event = intel_find_fixed_event,
 	.pmc_is_enabled = intel_pmc_is_enabled,
 	.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
-- 
2.33.1



* [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-16 12:20 [PATCH 0/4] KVM: x86/pmu: An insightful refactoring of vPMU code Like Xu
  2021-11-16 12:20 ` [PATCH 1/4] KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs Like Xu
  2021-11-16 12:20 ` [PATCH 2/4] KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id() Like Xu
@ 2021-11-16 12:20 ` Like Xu
  2021-11-18 14:43   ` Paolo Bonzini
  2021-11-18 15:00   ` Paolo Bonzini
  2021-11-16 12:20 ` [PATCH 4/4] KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}() Like Xu
  3 siblings, 2 replies; 12+ messages in thread
From: Like Xu @ 2021-11-16 12:20 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Since pmc->eventsel now carries the same semantic event value for fixed
counters, looking up the perf_hw_id of a fixed counter via
find_fixed_event() can be painlessly replaced by find_perf_hw_id() plus
a pmc_is_fixed() check.
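
For reference, pmc_is_fixed() is an existing helper in pmu.h; a sketch
of its form at the time:

	static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
	{
		return pmc->type == KVM_PMC_FIXED;
	}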

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c           |  2 +-
 arch/x86/kvm/pmu.h           |  1 -
 arch/x86/kvm/svm/pmu.c       | 11 ++++-------
 arch/x86/kvm/vmx/pmu_intel.c | 29 ++++++++++++++++-------------
 4 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 903dc6a532cc..3c45467b4275 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -262,7 +262,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
 
 	pmc->current_config = (u64)ctrl;
 	pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
-			      kvm_x86_ops.pmu_ops->find_fixed_event(idx),
+			      kvm_x86_ops.pmu_ops->find_perf_hw_id(pmc),
 			      !(en_field & 0x2), /* exclude user */
 			      !(en_field & 0x1), /* exclude kernel */
 			      pmi, false, false);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index e7a5d4b6fa94..354339710d0d 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -25,7 +25,6 @@ struct kvm_event_hw_type_mapping {
 
 struct kvm_pmu_ops {
 	unsigned int (*find_perf_hw_id)(struct kvm_pmc *pmc);
-	unsigned (*find_fixed_event)(int idx);
 	bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
 	struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
 	struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 1d31bd5c6803..eeaeb58d501b 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -140,6 +140,10 @@ static unsigned int amd_find_perf_hw_id(struct kvm_pmc *pmc)
 	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
 	int i;
 
+	/* return PERF_COUNT_HW_MAX as AMD doesn't have fixed events */
+	if (pmc_is_fixed(pmc))
+		return PERF_COUNT_HW_MAX;
+
 	for (i = 0; i < ARRAY_SIZE(amd_event_mapping); i++)
 		if (amd_event_mapping[i].eventsel == event_select
 		    && amd_event_mapping[i].unit_mask == unit_mask)
@@ -151,12 +155,6 @@ static unsigned int amd_find_perf_hw_id(struct kvm_pmc *pmc)
 	return amd_event_mapping[i].event_type;
 }
 
-/* return PERF_COUNT_HW_MAX as AMD doesn't have fixed events */
-static unsigned amd_find_fixed_event(int idx)
-{
-	return PERF_COUNT_HW_MAX;
-}
-
 /* check if a PMC is enabled by comparing it against global_ctrl bits. Because
  * AMD CPU doesn't have global_ctrl MSR, all PMCs are enabled (return TRUE).
  */
@@ -321,7 +319,6 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 
 struct kvm_pmu_ops amd_pmu_ops = {
 	.find_perf_hw_id = amd_find_perf_hw_id,
-	.find_fixed_event = amd_find_fixed_event,
 	.pmc_is_enabled = amd_pmc_is_enabled,
 	.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
 	.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f1cc6192ead7..8ba8b4ab1fb7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -68,6 +68,19 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
 		reprogram_counter(pmu, bit);
 }
 
+static inline unsigned int intel_find_fixed_event(int idx)
+{
+	u32 event;
+	size_t size = ARRAY_SIZE(fixed_pmc_events);
+
+	if (idx >= size)
+		return PERF_COUNT_HW_MAX;
+
+	event = fixed_pmc_events[array_index_nospec(idx, size)];
+	return intel_arch_events[event].event_type;
+}
+
+
 static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
@@ -75,6 +88,9 @@ static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
 	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
 	int i;
 
+	if (pmc_is_fixed(pmc))
+		return intel_find_fixed_event(pmc->idx - INTEL_PMC_IDX_FIXED);
+
 	for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
 		if (intel_arch_events[i].eventsel == event_select
 		    && intel_arch_events[i].unit_mask == unit_mask
@@ -87,18 +103,6 @@ static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
 	return intel_arch_events[i].event_type;
 }
 
-static unsigned intel_find_fixed_event(int idx)
-{
-	u32 event;
-	size_t size = ARRAY_SIZE(fixed_pmc_events);
-
-	if (idx >= size)
-		return PERF_COUNT_HW_MAX;
-
-	event = fixed_pmc_events[array_index_nospec(idx, size)];
-	return intel_arch_events[event].event_type;
-}
-
 /* check if a PMC is enabled by comparing it with globl_ctrl bits. */
 static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
 {
@@ -722,7 +726,6 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 
 struct kvm_pmu_ops intel_pmu_ops = {
 	.find_perf_hw_id = intel_find_perf_hw_id,
-	.find_fixed_event = intel_find_fixed_event,
 	.pmc_is_enabled = intel_pmc_is_enabled,
 	.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
 	.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
-- 
2.33.1



* [PATCH 4/4] KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}()
  2021-11-16 12:20 [PATCH 0/4] KVM: x86/pmu: An insightful refactoring of vPMU code Like Xu
                   ` (2 preceding siblings ...)
  2021-11-16 12:20 ` [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event() Like Xu
@ 2021-11-16 12:20 ` Like Xu
  2021-11-18 14:53   ` Paolo Bonzini
  3 siblings, 1 reply; 12+ messages in thread
From: Like Xu @ 2021-11-16 12:20 UTC (permalink / raw)
  To: Paolo Bonzini, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

From: Like Xu <likexu@tencent.com>

Depending on whether an interrupt should be triggered, KVM registers
one of two different event overflow callbacks in the perf_event context.

The code skeletons of these two functions are very similar, so
need_overflow_intr() is introduced to increase code reuse.

There is a trade-off between extra cycles spent in irq context and a
smaller instruction footprint, which is friendlier to the
micro-architectural branch predictor.

Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/kvm/pmu.c | 64 +++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3c45467b4275..ef4bba8be7f7 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -55,43 +55,50 @@ static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 	kvm_pmu_deliver_pmi(vcpu);
 }
 
-static void kvm_perf_overflow(struct perf_event *perf_event,
-			      struct perf_sample_data *data,
-			      struct pt_regs *regs)
+static inline bool need_overflow_intr(struct kvm_pmc *pmc)
 {
-	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 
-	if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) {
-		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
-		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
-	}
+	if (pmc_is_gp(pmc))
+		return (pmc->eventsel & ARCH_PERFMON_EVENTSEL_INT);
+	else
+		return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
+			pmc->idx - INTEL_PMC_IDX_FIXED) & 0x8;
 }
 
-static void kvm_perf_overflow_intr(struct perf_event *perf_event,
-				   struct perf_sample_data *data,
-				   struct pt_regs *regs)
+static inline void kvm_pmu_counter_overflow(struct kvm_pmc *pmc, bool intr)
 {
-	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 
-	if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) {
-		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
-		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+	__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 
-		/*
-		 * Inject PMI. If vcpu was in a guest mode during NMI PMI
-		 * can be ejected on a guest mode re-entry. Otherwise we can't
-		 * be sure that vcpu wasn't executing hlt instruction at the
-		 * time of vmexit and is not going to re-enter guest mode until
-		 * woken up. So we should wake it, but this is impossible from
-		 * NMI context. Do it from irq work instead.
-		 */
-		if (!kvm_is_in_guest())
-			irq_work_queue(&pmc_to_pmu(pmc)->irq_work);
-		else
-			kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
-	}
+	if (!intr)
+		return;
+
+	/*
+	 * Inject PMI. If vcpu was in a guest mode during NMI PMI
+	 * can be ejected on a guest mode re-entry. Otherwise we can't
+	 * be sure that vcpu wasn't executing hlt instruction at the
+	 * time of vmexit and is not going to re-enter guest mode until
+	 * woken up. So we should wake it, but this is impossible from
+	 * NMI context. Do it from irq work instead.
+	 */
+	if (!kvm_is_in_guest())
+		irq_work_queue(&pmc_to_pmu(pmc)->irq_work);
+	else
+		kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
+}
+
+static void kvm_perf_overflow(struct perf_event *perf_event,
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs)
+{
+	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+	if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
+		kvm_pmu_counter_overflow(pmc, need_overflow_intr(pmc));
 }
 
 static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
@@ -126,7 +133,6 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
-						 intr ? kvm_perf_overflow_intr :
 						 kvm_perf_overflow, pmc);
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
-- 
2.33.1



* Re: [PATCH 2/4] KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id()
  2021-11-16 12:20 ` [PATCH 2/4] KVM: x86/pmu: Refactoring find_arch_event() to find_perf_hw_id() Like Xu
@ 2021-11-18 13:29   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-18 13:29 UTC (permalink / raw)
  To: Like Xu, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

On 11/16/21 13:20, Like Xu wrote:
> From: Like Xu<likexu@tencent.com>
> 
> find_arch_event() returns an "unsigned int" value, which
> pmc_reprogram_counter() uses to program a PERF_TYPE_HARDWARE
> perf_event.
> 
> The returned value is actually the kernel-defined generic perf_hw_id,
> so rename the callback to find_perf_hw_id() and simplify its incoming
> parameters for better self-explanation.

Since the argument is a pmc, let's rename it to pmc_perf_hw_id().
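
That is, something like (a sketch of the resulting declaration):

	struct kvm_pmu_ops {
		unsigned int (*pmc_perf_hw_id)(struct kvm_pmc *pmc);
		/* other callbacks unchanged */
	};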

Paolo



* Re: [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-16 12:20 ` [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event() Like Xu
@ 2021-11-18 14:43   ` Paolo Bonzini
  2021-11-18 15:00   ` Paolo Bonzini
  1 sibling, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-18 14:43 UTC (permalink / raw)
  To: Like Xu, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

On 11/16/21 13:20, Like Xu wrote:
> +	/* return PERF_COUNT_HW_MAX as AMD doesn't have fixed events */
> +	if (pmc_is_fixed(pmc))
> +		return PERF_COUNT_HW_MAX;
> +

This should have a WARN_ON, since reprogram_fixed_counter will never see 
an enabled fixed counter on AMD.
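
One possible form (a sketch, not code from this thread):

	/* AMD has no fixed counters; warn if one ever shows up here */
	if (WARN_ON(pmc_is_fixed(pmc)))
		return PERF_COUNT_HW_MAX;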

Paolo



* Re: [PATCH 4/4] KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}()
  2021-11-16 12:20 ` [PATCH 4/4] KVM: x86/pmu: Refactoring kvm_perf_overflow{_intr}() Like Xu
@ 2021-11-18 14:53   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-18 14:53 UTC (permalink / raw)
  To: Like Xu, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

On 11/16/21 13:20, Like Xu wrote:
> -	}
> +	if (!intr)
> +		return;
> +
> +	/*
> +	 * Inject PMI. If vcpu was in a guest mode during NMI PMI
> +	 * can be ejected on a guest mode re-entry. Otherwise we can't
> +	 * be sure that vcpu wasn't executing hlt instruction at the
> +	 * time of vmexit and is not going to re-enter guest mode until
> +	 * woken up. So we should wake it, but this is impossible from
> +	 * NMI context. Do it from irq work instead.
> +	 */
> +	if (!kvm_is_in_guest())
> +		irq_work_queue(&pmc_to_pmu(pmc)->irq_work);
> +	else
> +		kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
> +}
> +
> +static void kvm_perf_overflow(struct perf_event *perf_event,
> +			      struct perf_sample_data *data,
> +			      struct pt_regs *regs)
> +{
> +	struct kvm_pmc *pmc = perf_event->overflow_handler_context;
> +	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> +
> +	if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
> +		kvm_pmu_counter_overflow(pmc, need_overflow_intr(pmc));
>   }

It could be even better to make this a single function; instead of
calling need_overflow_intr(pmc), you should store the value into pmc
from pmc_reprogram_counter().  Like this:

	/* Ignore counters that have been reported already.  */
	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
		return;

	__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);

	if (pmc->intr) {
		/*
		 * Inject PMI. If vcpu was in a guest mode during NMI PMI
		 * can be ejected on a guest mode re-entry. Otherwise we can't
		 * be sure that vcpu wasn't executing hlt instruction at the
		 * time of vmexit and is not going to re-enter guest mode until
		 * woken up. So we should wake it, but this is impossible from
		 * NMI context. Do it from irq work instead.
		 */
		if (!kvm_is_in_guest())
			irq_work_queue(pmu->irq_work);
		else
			kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
	}
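
The storing side would then be a one-liner (a sketch; the new 'intr'
field in struct kvm_pmc is implied above but not shown):

	/* in pmc_reprogram_counter(), before creating the perf_event */
	pmc->intr = intr;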

Paolo



* Re: [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-16 12:20 ` [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event() Like Xu
  2021-11-18 14:43   ` Paolo Bonzini
@ 2021-11-18 15:00   ` Paolo Bonzini
  2021-11-19  7:16     ` Like Xu
  1 sibling, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-18 15:00 UTC (permalink / raw)
  To: Like Xu, Jim Mattson
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel

On 11/16/21 13:20, Like Xu wrote:
> +static inline unsigned int intel_find_fixed_event(int idx)
> +{
> +	u32 event;
> +	size_t size = ARRAY_SIZE(fixed_pmc_events);
> +
> +	if (idx >= size)
> +		return PERF_COUNT_HW_MAX;
> +
> +	event = fixed_pmc_events[array_index_nospec(idx, size)];
> +	return intel_arch_events[event].event_type;
> +}
> +
> +
>   static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
>   {
>   	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> @@ -75,6 +88,9 @@ static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
>   	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
>   	int i;
>   
> +	if (pmc_is_fixed(pmc))
> +		return intel_find_fixed_event(pmc->idx - INTEL_PMC_IDX_FIXED);

Is intel_find_fixed_event() needed at all?  As you point out in the
commit message, eventsel/unit_mask are now valid for fixed counters too,
so you can do:
@@ -88,13 +75,11 @@ static unsigned int intel_pmc_perf_hw_id(struct kvm_pmc *pmc)
  	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
  	int i;
  
-	if (pmc_is_fixed(pmc))
-		return intel_find_fixed_event(pmc->idx - INTEL_PMC_IDX_FIXED);
-
  	for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
  		if (intel_arch_events[i].eventsel == event_select
  		    && intel_arch_events[i].unit_mask == unit_mask
-		    && (pmu->available_event_types & (1 << i)))
+		    && (pmc_is_fixed(pmc) ||
+			pmu->available_event_types & (1 << i)))
  			break;
  
  	if (i == ARRAY_SIZE(intel_arch_events))

What do you think?  It's less efficient but makes fixed/gp more similar.

Can you please resubmit the series based on the review feedback?

Thanks,



* Re: [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-18 15:00   ` Paolo Bonzini
@ 2021-11-19  7:16     ` Like Xu
  2021-11-19 10:29       ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Like Xu @ 2021-11-19  7:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel, Jim Mattson

On 18/11/2021 11:00 pm, Paolo Bonzini wrote:
> On 11/16/21 13:20, Like Xu wrote:
>> +static inline unsigned int intel_find_fixed_event(int idx)
>> +{
>> +    u32 event;
>> +    size_t size = ARRAY_SIZE(fixed_pmc_events);
>> +
>> +    if (idx >= size)
>> +        return PERF_COUNT_HW_MAX;
>> +
>> +    event = fixed_pmc_events[array_index_nospec(idx, size)];
>> +    return intel_arch_events[event].event_type;
>> +}
>> +
>> +
>>   static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
>>   {
>>       struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>> @@ -75,6 +88,9 @@ static unsigned int intel_find_perf_hw_id(struct kvm_pmc *pmc)
>>       u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
>>       int i;
>> +    if (pmc_is_fixed(pmc))
>> +        return intel_find_fixed_event(pmc->idx - INTEL_PMC_IDX_FIXED);
> 
> Is intel_find_fixed_event needed at all?  As you point out in the commit
> message, eventsel/unit_mask are valid so you can do
> 
> @@ -88,13 +75,11 @@ static unsigned int intel_pmc_perf_hw_id(struct kvm_pmc *pmc)
>       u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
>       int i;
> 
> -    if (pmc_is_fixed(pmc))
> -        return intel_find_fixed_event(pmc->idx - INTEL_PMC_IDX_FIXED);
> -
>       for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
>           if (intel_arch_events[i].eventsel == event_select
>               && intel_arch_events[i].unit_mask == unit_mask
> -            && (pmu->available_event_types & (1 << i)))
> +            && (pmc_is_fixed(pmc) ||
> +            pmu->available_event_types & (1 << i)))

It's a good move, but the tricky thing I've found recently is this:

the events tracked in pmu->available_event_types are just *Intel CPUID*
events (they're neither a subset nor a superset of the *kernel generic*
hw events), some of which can be programmed and enabled on either a
generic or a fixed counter.

According to the Intel SDM, when an Intel CPUID event (e.g. "instructions
retired") is marked unavailable in pmu->available_event_types (configured
via Intel CPUID.0AH.EBX), the guest should not use a generic or fixed
counter to count/sample this event.

This issue is detailed in another patch set [1], which is still collecting
review comments.

I propose to get [V2] merged and to continue reviewing the fixes from [1]
seamlessly, and then further unify all the fixed/gp logic, including
intel_find_fixed_event(), as a follow-up.

[1] https://lore.kernel.org/kvm/20211112095139.21775-1-likexu@tencent.com/
[V2] https://lore.kernel.org/kvm/20211119064856.77948-1-likexu@tencent.com/

>               break;
> 
>       if (i == ARRAY_SIZE(intel_arch_events))
> 
> What do you think?  It's less efficient but makes fixed/gp more similar.
> 
> Can you please resubmit the series based on the review feedback?
> 
> Thanks,
> 
> 


* Re: [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-19  7:16     ` Like Xu
@ 2021-11-19 10:29       ` Paolo Bonzini
  2021-11-19 11:10         ` Like Xu
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2021-11-19 10:29 UTC (permalink / raw)
  To: Like Xu
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel, Jim Mattson

On 11/19/21 08:16, Like Xu wrote:
> 
> I propose to get [V2] merged and to continue reviewing the fixes from
> [1] seamlessly, and then further unify all the fixed/gp logic,
> including intel_find_fixed_event(), as a follow-up.

I agree and I'll review it soon.  Though, why not add the

+            && (pmc_is_fixed(pmc) ||
+            pmu->available_event_types & (1 << i)))

version in v2 of this patch? :)

Paolo

> [1] https://lore.kernel.org/kvm/20211112095139.21775-1-likexu@tencent.com/
> [V2] https://lore.kernel.org/kvm/20211119064856.77948-1-likexu@tencent.com/



* Re: [PATCH 3/4] KVM: x86/pmu: Reuse find_perf_hw_id() and drop find_fixed_event()
  2021-11-19 10:29       ` Paolo Bonzini
@ 2021-11-19 11:10         ` Like Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Like Xu @ 2021-11-19 11:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm, linux-kernel, Jim Mattson

On 19/11/2021 6:29 pm, Paolo Bonzini wrote:
> On 11/19/21 08:16, Like Xu wrote:
>>
>> I propose to get [V2] merged and to continue reviewing the fixes from [1]
>> seamlessly, and then further unify all the fixed/gp logic, including
>> intel_find_fixed_event(), as a follow-up.
> 
> I agree and I'll review it soon.  Though, why not add the
> 
> +            && (pmc_is_fixed(pmc) ||
> +            pmu->available_event_types & (1 << i)))
> 

Suppose we have fixed ctr 0 for the "instructions retired" event,
but bit 01 of the guest CPUID 0AH.EBX leaf is set (event not available).

In that case, we get true from "pmc_is_fixed(pmc)" and false from
"pmu->available_event_types & (1 << i)", so the loop breaks and we
continue to program a perf_event for the pmc.

(The SDM says: Bit 01: Instruction retired event not available if 1 or
if EAX[31:24] < 2.)

But the right behavior is that KVM should not program a perf_event for
this pmc, since the event is not available (whether it's gp or fixed);
the counter MSR pair can still be accessed but does not count.

The proposed final code may look like:

/* UMask and Event Select Encodings for Intel CPUID Events */
static inline bool is_intel_cpuid_event(u8 event_select, u8 unit_mask)
{
	if ((!unit_mask && event_select == 0x3C) ||
	    (!unit_mask && event_select == 0xC0) ||
	    (unit_mask == 0x01 && event_select == 0x3C) ||
	    (unit_mask == 0x4F && event_select == 0x2E) ||
	    (unit_mask == 0x41 && event_select == 0x2E) ||
	    (!unit_mask && event_select == 0xC4) ||
	    (!unit_mask && event_select == 0xC5))
		return true;

	/* the unimplemented topdown.slots event check is skipped. */
	return false;
}

static unsigned int intel_pmc_perf_hw_id(struct kvm_pmc *pmc)
{
	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
	u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
	u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
	int i;

	for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
		if (kernel_generic_events[i].eventsel != event_select ||
		    kernel_generic_events[i].unit_mask != unit_mask)
			continue;

		if (is_intel_cpuid_event(event_select, unit_mask) &&
		    !test_bit(i, pmu->avail_cpuid_events))
			return PERF_COUNT_HW_MAX + 1;

		break;
	}

	return (i == PERF_COUNT_HW_MAX) ? i : kernel_generic_events[i].event_type;
}


> version in v2 of this patch? :)
> 
> Paolo
> 
>> [1] https://lore.kernel.org/kvm/20211112095139.21775-1-likexu@tencent.com/
>> [V2] https://lore.kernel.org/kvm/20211119064856.77948-1-likexu@tencent.com/
> 

