* [PATCH v8 0/7] Introduce and test masked events
@ 2022-12-20 16:12 Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup Aaron Lewis
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

This series introduces the concept of masked events to the pmu event
filter.  Masked events reduce the number of entries needed in the pmu
event filter by allowing a more generalized match on the unit mask when
filtering guest events in the pmu.  With masked events, if an event
select should be restricted from the guest, a single masked event that
generalizes the unit mask values can be added instead of one filter
entry for every event select + unit mask pair.
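
For example, a minimal sketch (hypothetical event select 0xD0, encoded
with the KVM_PMU_ENCODE_MASKED_ENTRY() macro introduced in patch 4): a
single masked event that matches every unit mask of one event select:

	/* umask & 0x00 == 0x00, i.e. any unit mask of event 0xD0 matches. */
	__u64 entry = KVM_PMU_ENCODE_MASKED_ENTRY(0xD0, 0x00, 0x00, false);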

Reviewed-by: Sean Christopherson <seanjc@google.com>

v7 -> v8
 - Rebased on top of the latest in kvm/queue.
 - s/num_gp_counters/X86_PROPERTY_PMU_NR_GP_COUNTERS/ in the masked
   events test.

v6 -> v7
 - Rebased on top of the latest in kvm/queue. [Like]
 - Patch #7, Added comment about how counter values are used. [Sean]

v5 -> v6
 - The following changes were based on Sean's feedback.
 - Patch #1, Use a const for EVENTSEL_EVENT rather than a callback.
 - Patch #2, s/invalid/impossible and removed helpers.
 - Patch #3, Ditched internal event struct.  Sticking with arch layout.
 - Patch #4,
     - Switched masked events to follow arch layout.
     - Created an internal struct for the pmu event filter.
     - Track separate lists for include events and exclude events.
     - General refactors as a result of these changes.

v4 -> v5
 - Patch #3, Simplified the return logic in filter_contains_match(). [Jim]
 - Patch #4, Corrected documentation for errors and returns. [Jim]
 - Patch #6, Added TEST_REQUIRE for KVM_CAP_PMU_EVENT_MASKED_EVENTS. [Jim]
 - Patch #7,
     - Removed TEST_REQUIRE for KVM_CAP_PMU_EVENT_MASKED_EVENTS (moved it
       to patch #6).
     - Changed the assert to a branch in supports_event_mem_inst_retired().
       [Jim]
     - Added a check to make sure 3 GP counters are available. [Jim]

v3 -> v4
 - Patch #1, Fix the mask for the guest event select used in bsearch.
 - Patch #2, Remove invalid events from the pmu event filter.
 - Patch #3, Create an internal/common representation for filter events.
 - Patch #4,
     - Use a common filter event to simplify kernel code. [Jim]
     - s/invalid/exclude for masked events. More descriptive. [Sean]
     - Simplified masked event layout. There was no need to complicate it.
     - Add KVM_CAP_PMU_EVENT_MASKED_EVENTS.
 - Patch #7, Rewrote the masked events tests using MEM_INST_RETIRED (0xd0)
   on Intel and LS Dispatch (0x29) on AMD.  They each have multiple unit
   masks, which were leveraged for improved masked events testing.

v2 -> v3
 - Reworked and documented the invert flag usage.  It was possible to
   get ambiguous results when using it.  That should not be possible
   now.
 - Added testing for inverted masked events to validate the updated
   implementation.
 - Removed testing for inverted masked events from the masked events test.
   They were meaningless with the updated implementation.  More meaningful
   tests were added separately.

v1 -> v2
 - Made has_invalid_event() static to fix warning.
 - Fixed checkpatch.pl errors and warnings.
 - Updated to account for KVM_X86_PMU_OP().

Aaron Lewis (7):
  kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
  kvm: x86/pmu: Remove impossible events from the pmu event filter
  kvm: x86/pmu: prepare the pmu event filter for masked events
  kvm: x86/pmu: Introduce masked events to the pmu event filter
  selftests: kvm/x86: Add flags when creating a pmu event filter
  selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER
  selftests: kvm/x86: Test masked events

 Documentation/virt/kvm/api.rst                |  77 +++-
 arch/x86/include/asm/kvm_host.h               |  14 +-
 arch/x86/include/uapi/asm/kvm.h               |  29 ++
 arch/x86/kvm/pmu.c                            | 241 +++++++++--
 arch/x86/kvm/pmu.h                            |   2 +
 arch/x86/kvm/svm/pmu.c                        |   1 +
 arch/x86/kvm/vmx/pmu_intel.c                  |   1 +
 arch/x86/kvm/x86.c                            |   1 +
 include/uapi/linux/kvm.h                      |   1 +
 .../kvm/x86_64/pmu_event_filter_test.c        | 381 +++++++++++++++++-
 10 files changed, 697 insertions(+), 51 deletions(-)

-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-22  6:19   ` Like Xu
  2022-12-20 16:12 ` [PATCH v8 2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter Aaron Lewis
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

When checking if a pmu event the guest is attempting to program should
be filtered, only consider the event select + unit mask in that
decision. Use an architecture-specific mask to mask out all other bits,
including bits 35:32 on Intel.  Those bits are not part of the event
select and should not be considered in that decision.
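
A minimal sketch of the effect on Intel (hypothetical guest value, not
part of the patch):

	/* The guest programs event 0xD0, umask 0x81, with IN_TX (bit 32) set. */
	u64 eventsel = (1ULL << 32) | 0x81D0;
	/* The old key kept bit 32 and missed a filter entry of 0x81D0 ... */
	u64 old_key = eventsel & AMD64_RAW_EVENT_MASK_NB;	/* 0x1000081D0 */
	/* ... the arch-specific mask reduces the key to just 0x81D0. */
	u64 new_key = eventsel & (ARCH_PERFMON_EVENTSEL_EVENT |
				  ARCH_PERFMON_EVENTSEL_UMASK);	/* 0x81D0 */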

Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/pmu.c           | 3 ++-
 arch/x86/kvm/pmu.h           | 2 ++
 arch/x86/kvm/svm/pmu.c       | 1 +
 arch/x86/kvm/vmx/pmu_intel.c | 1 +
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 684393c22105..760a09ff65cd 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -277,7 +277,8 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 		goto out;
 
 	if (pmc_is_gp(pmc)) {
-		key = pmc->eventsel & AMD64_RAW_EVENT_MASK_NB;
+		key = pmc->eventsel & (kvm_pmu_ops.EVENTSEL_EVENT |
+				       ARCH_PERFMON_EVENTSEL_UMASK);
 		if (bsearch(&key, filter->events, filter->nevents,
 			    sizeof(__u64), cmp_u64))
 			allow_event = filter->action == KVM_PMU_EVENT_ALLOW;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 85ff3c0588ba..5b070c563a97 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -40,6 +40,8 @@ struct kvm_pmu_ops {
 	void (*reset)(struct kvm_vcpu *vcpu);
 	void (*deliver_pmi)(struct kvm_vcpu *vcpu);
 	void (*cleanup)(struct kvm_vcpu *vcpu);
+
+	const u64 EVENTSEL_EVENT;
 };
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 0e313fbae055..d3ae261d56a6 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -229,4 +229,5 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.refresh = amd_pmu_refresh,
 	.init = amd_pmu_init,
 	.reset = amd_pmu_reset,
+	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
 };
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e5cec07ca8d9..edf23115f2ef 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -810,4 +810,5 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.reset = intel_pmu_reset,
 	.deliver_pmi = intel_pmu_deliver_pmi,
 	.cleanup = intel_pmu_cleanup,
+	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
 };
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events Aaron Lewis
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

If it's not possible for an event in the pmu event filter to match a
pmu event being programmed by the guest, it's pointless to have it in
the list.  Opt for a shorter list by removing those events.

Because this is established uAPI, the pmu event filter can't outright
reject these events as garbage and return an error.  Instead, play
nice and remove them from the list.
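
As a hypothetical illustration (the value is chosen for this example
only), an entry that sets a bit outside the event select + unit mask,
e.g. the edge-detect bit on Intel, can never equal a lookup key and is
now dropped:

	/*
	 * Bit 18 (EDGE) is never part of the lookup key, so this entry
	 * could never match a guest event and is removed from the list.
	 */
	__u64 impossible = 0x81D0ULL | ARCH_PERFMON_EVENTSEL_EDGE;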

Also, opportunistically rewrite the comment when the filter is set to
clarify that it guards against *all* TOCTOU attacks on the verified
data.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/pmu.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 760a09ff65cd..51aac749cb5f 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -592,6 +592,21 @@ void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_trigger_event);
 
+static void remove_impossible_events(struct kvm_pmu_event_filter *filter)
+{
+	int i, j;
+
+	for (i = 0, j = 0; i < filter->nevents; i++) {
+		if (filter->events[i] & ~(kvm_pmu_ops.EVENTSEL_EVENT |
+					  ARCH_PERFMON_EVENTSEL_UMASK))
+			continue;
+
+		filter->events[j++] = filter->events[i];
+	}
+
+	filter->nevents = j;
+}
+
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_pmu_event_filter tmp, *filter;
@@ -622,9 +637,11 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 	if (copy_from_user(filter, argp, size))
 		goto cleanup;
 
-	/* Ensure nevents can't be changed between the user copies. */
+	/* Restore the verified state to guard against TOCTOU attacks. */
 	*filter = tmp;
 
+	remove_impossible_events(filter);
+
 	/*
 	 * Sort the in-kernel list so that we can search it with bsearch.
 	 */
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-22  6:34   ` Like Xu
  2022-12-20 16:12 ` [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter Aaron Lewis
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

Refactor check_pmu_event_filter() in preparation for masked events.

No functional changes intended.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 arch/x86/kvm/pmu.c | 56 +++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 51aac749cb5f..b09a43b66dbe 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -261,41 +261,51 @@ static int cmp_u64(const void *pa, const void *pb)
 	return (a > b) - (a < b);
 }
 
+static u64 *find_filter_entry(struct kvm_pmu_event_filter *filter, u64 key)
+{
+	return bsearch(&key, filter->events, filter->nevents,
+		       sizeof(filter->events[0]), cmp_u64);
+}
+
+static bool is_gp_event_allowed(struct kvm_pmu_event_filter *filter, u64 eventsel)
+{
+	if (find_filter_entry(filter, eventsel & (kvm_pmu_ops.EVENTSEL_EVENT |
+						  ARCH_PERFMON_EVENTSEL_UMASK)))
+		return filter->action == KVM_PMU_EVENT_ALLOW;
+
+	return filter->action == KVM_PMU_EVENT_DENY;
+}
+
+static bool is_fixed_event_allowed(struct kvm_pmu_event_filter *filter, int idx)
+{
+	int fixed_idx = idx - INTEL_PMC_IDX_FIXED;
+
+	if (filter->action == KVM_PMU_EVENT_DENY &&
+	    test_bit(fixed_idx, (ulong *)&filter->fixed_counter_bitmap))
+		return false;
+	if (filter->action == KVM_PMU_EVENT_ALLOW &&
+	    !test_bit(fixed_idx, (ulong *)&filter->fixed_counter_bitmap))
+		return false;
+
+	return true;
+}
+
 static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu_event_filter *filter;
 	struct kvm *kvm = pmc->vcpu->kvm;
-	bool allow_event = true;
-	__u64 key;
-	int idx;
 
 	if (!static_call(kvm_x86_pmu_hw_event_available)(pmc))
 		return false;
 
 	filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
 	if (!filter)
-		goto out;
+		return true;
 
-	if (pmc_is_gp(pmc)) {
-		key = pmc->eventsel & (kvm_pmu_ops.EVENTSEL_EVENT |
-				       ARCH_PERFMON_EVENTSEL_UMASK);
-		if (bsearch(&key, filter->events, filter->nevents,
-			    sizeof(__u64), cmp_u64))
-			allow_event = filter->action == KVM_PMU_EVENT_ALLOW;
-		else
-			allow_event = filter->action == KVM_PMU_EVENT_DENY;
-	} else {
-		idx = pmc->idx - INTEL_PMC_IDX_FIXED;
-		if (filter->action == KVM_PMU_EVENT_DENY &&
-		    test_bit(idx, (ulong *)&filter->fixed_counter_bitmap))
-			allow_event = false;
-		if (filter->action == KVM_PMU_EVENT_ALLOW &&
-		    !test_bit(idx, (ulong *)&filter->fixed_counter_bitmap))
-			allow_event = false;
-	}
+	if (pmc_is_gp(pmc))
+		return is_gp_event_allowed(filter, pmc->eventsel);
 
-out:
-	return allow_event;
+	return is_fixed_event_allowed(filter, pmc->idx);
 }
 
 static void reprogram_counter(struct kvm_pmc *pmc)
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
                   ` (2 preceding siblings ...)
  2022-12-20 16:12 ` [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-22  7:40   ` Like Xu
  2022-12-20 16:12 ` [PATCH v8 5/7] selftests: kvm/x86: Add flags when creating a " Aaron Lewis
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

When building a list of filter events, it can sometimes be a challenge
to fit all the events needed to adequately restrict the guest into the
limited space available in the pmu event filter.  This stems from the
fact that the pmu event filter requires each event (i.e. event select +
unit mask) be listed, when the intention might be to restrict the
event select altogether, regardless of its unit mask.  Instead of
increasing the number of filter events in the pmu event filter, add a
new encoding that is able to do a more generalized match on the unit mask.

Introduce masked events as another encoding the pmu event filter
understands.  Masked events have the fields: mask, match, and exclude.
When filtering based on these events, the mask is applied to the guest's
unit mask to see if it matches the match value (i.e. umask & mask ==
match).  The exclude bit can then be used to exclude events from that
match.  E.g. for a given event select, if it's easier to say which unit
mask values shouldn't be filtered, a masked event can be set up to match
all possible unit mask values, then another masked event can be set up to
match the unit mask values that shouldn't be filtered.
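
For instance, a minimal sketch of that last case (hypothetical values,
encoded with the KVM_PMU_ENCODE_MASKED_ENTRY() macro added below): in an
allow list, permit every unit mask of event select 0xD0 except 0x83:

	__u64 events[] = {
		/* umask & 0x00 == 0x00: include all unit masks of 0xD0. */
		KVM_PMU_ENCODE_MASKED_ENTRY(0xD0, 0x00, 0x00, false),
		/* umask & 0xFF == 0x83: exclude unit mask 0x83. */
		KVM_PMU_ENCODE_MASKED_ENTRY(0xD0, 0xFF, 0x83, true),
	};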

Userspace can query to see if this feature exists by looking for the
capability, KVM_CAP_PMU_EVENT_MASKED_EVENTS.

This feature is enabled by setting the flags field in the pmu event
filter to KVM_PMU_EVENT_FLAG_MASKED_EVENTS.

Events can be encoded by using KVM_PMU_ENCODE_MASKED_ENTRY().

It is an error to have a bit set outside the valid bits for a masked
event, and calls to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in
such cases, including the high bits of the event select (35:32) if
called on Intel.

With these updates the filter matching code has been updated to match on
a common event.  Masked events were flexible enough to handle both event
types, so they were used as the common event.  This changes how guest
events get filtered because regardless of the type of event used in the
uAPI, they will be converted to masked events.  Because of this there
could be a slight performance hit because instead of matching the filter
event with a lookup on event select + unit mask, it does a lookup on event
select then walks the unit masks to find the match.  This shouldn't be a
big problem because I would expect the set of common event selects to be
small, and if it isn't, the set can likely be reduced by using masked
events to generalize the unit mask.  Using one type of event when
filtering guest events allows for a common code path to be used.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 Documentation/virt/kvm/api.rst  |  77 +++++++++++--
 arch/x86/include/asm/kvm_host.h |  14 ++-
 arch/x86/include/uapi/asm/kvm.h |  29 +++++
 arch/x86/kvm/pmu.c              | 197 +++++++++++++++++++++++++++-----
 arch/x86/kvm/x86.c              |   1 +
 include/uapi/linux/kvm.h        |   1 +
 6 files changed, 281 insertions(+), 38 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 778c6460d1de..9f6256b56e0e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4997,6 +4997,15 @@ using this ioctl.
 :Parameters: struct kvm_pmu_event_filter (in)
 :Returns: 0 on success, -1 on error
 
+Errors:
+
+  ======     ============================================================
+  EFAULT     args[0] cannot be accessed
+  EINVAL     args[0] contains invalid data in the filter or filter events
+  E2BIG      nevents is too large
+  EBUSY      not enough memory to allocate the filter
+  ======     ============================================================
+
 ::
 
   struct kvm_pmu_event_filter {
@@ -5008,14 +5017,68 @@ using this ioctl.
 	__u64 events[0];
   };
 
-This ioctl restricts the set of PMU events that the guest can program.
-The argument holds a list of events which will be allowed or denied.
-The eventsel+umask of each event the guest attempts to program is compared
-against the events field to determine whether the guest should have access.
-The events field only controls general purpose counters; fixed purpose
-counters are controlled by the fixed_counter_bitmap.
+This ioctl restricts the set of PMU events the guest can program by limiting
+which event select and unit mask combinations are permitted.
+
+The argument holds a list of filter events which will be allowed or denied.
+
+Filter events only control general purpose counters; fixed purpose counters
+are controlled by the fixed_counter_bitmap.
+
+Valid values for 'flags'::
+
+``0``
+
+To use this mode, clear the 'flags' field.
+
+In this mode each event will contain an event select + unit mask.
+
+When the guest attempts to program the PMU the guest's event select +
+unit mask is compared against the filter events to determine whether the
+guest should have access.
+
+``KVM_PMU_EVENT_FLAG_MASKED_EVENTS``
+:Capability: KVM_CAP_PMU_EVENT_MASKED_EVENTS
+
+In this mode each filter event will contain an event select, mask, match, and
+exclude value.  To encode a masked event use::
+
+  KVM_PMU_ENCODE_MASKED_ENTRY()
+
+An encoded event will follow this layout::
+
+  Bits   Description
+  ----   -----------
+  7:0    event select (low bits)
+  15:8   umask match
+  31:16  unused
+  35:32  event select (high bits)
+  54:36  unused
+  55     exclude bit
+  63:56  umask mask
+
+When the guest attempts to program the PMU, these steps are followed in
+determining if the guest should have access:
+ 1. Match the event select from the guest against the filter events.
+ 2. If a match is found, match the guest's unit mask to the mask and match
+    values of the included filter events.
+    I.e. (unit mask & mask) == match && !exclude.
+ 3. If a match is found, match the guest's unit mask to the mask and match
+    values of the excluded filter events.
+    I.e. (unit mask & mask) == match && exclude.
+ 4.
+   a. If an included match is found and an excluded match is not found, filter
+      the event.
+   b. For everything else, do not filter the event.
+ 5.
+   a. If the event is filtered and it's an allow list, allow the guest to
+      program the event.
+   b. If the event is filtered and it's a deny list, do not allow the guest to
+      program the event.
 
-No flags are defined yet, the field must be zero.
+When setting a new pmu event filter, -EINVAL will be returned if any of the
+unused fields are set or if any of the high bits (35:32) in the event
+select are set when called on Intel.
 
 Valid values for 'action'::
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index aa4eb8cfcd7e..112b70a7e403 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1132,6 +1132,18 @@ struct kvm_x86_msr_filter {
 	struct msr_bitmap_range ranges[16];
 };
 
+struct kvm_x86_pmu_event_filter {
+	__u32 action;
+	__u32 nevents;
+	__u32 fixed_counter_bitmap;
+	__u32 flags;
+	__u32 nr_includes;
+	__u32 nr_excludes;
+	__u64 *includes;
+	__u64 *excludes;
+	__u64 events[];
+};
+
 enum kvm_apicv_inhibit {
 
 	/********************************************************************/
@@ -1337,7 +1349,7 @@ struct kvm_arch {
 	/* Guest can access the SGX PROVISIONKEY. */
 	bool sgx_provisioning_allowed;
 
-	struct kvm_pmu_event_filter __rcu *pmu_event_filter;
+	struct kvm_x86_pmu_event_filter __rcu *pmu_event_filter;
 	struct task_struct *nx_huge_page_recovery_thread;
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index e48deab8901d..f142f3ebf4e4 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -525,6 +525,35 @@ struct kvm_pmu_event_filter {
 #define KVM_PMU_EVENT_ALLOW 0
 #define KVM_PMU_EVENT_DENY 1
 
+#define KVM_PMU_EVENT_FLAG_MASKED_EVENTS BIT(0)
+#define KVM_PMU_EVENT_FLAGS_VALID_MASK (KVM_PMU_EVENT_FLAG_MASKED_EVENTS)
+
+/*
+ * Masked event layout.
+ * Bits   Description
+ * ----   -----------
+ * 7:0    event select (low bits)
+ * 15:8   umask match
+ * 31:16  unused
+ * 35:32  event select (high bits)
+ * 54:36  unused
+ * 55     exclude bit
+ * 63:56  umask mask
+ */
+
+#define KVM_PMU_ENCODE_MASKED_ENTRY(event_select, mask, match, exclude) \
+	(((event_select) & 0xFFULL) | (((event_select) & 0XF00ULL) << 24) | \
+	(((mask) & 0xFFULL) << 56) | \
+	(((match) & 0xFFULL) << 8) | \
+	((__u64)(!!(exclude)) << 55))
+
+#define KVM_PMU_MASKED_ENTRY_EVENT_SELECT \
+	(GENMASK_ULL(7, 0) | GENMASK_ULL(35, 32))
+#define KVM_PMU_MASKED_ENTRY_UMASK_MASK		(GENMASK_ULL(63, 56))
+#define KVM_PMU_MASKED_ENTRY_UMASK_MATCH	(GENMASK_ULL(15, 8))
+#define KVM_PMU_MASKED_ENTRY_EXCLUDE		(BIT_ULL(55))
+#define KVM_PMU_MASKED_ENTRY_UMASK_MASK_SHIFT	(56)
+
 /* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
 #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
 #define   KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index b09a43b66dbe..6558bd36a200 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -253,30 +253,99 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 	return true;
 }
 
-static int cmp_u64(const void *pa, const void *pb)
+static int filter_cmp(const void *pa, const void *pb, u64 mask)
 {
-	u64 a = *(u64 *)pa;
-	u64 b = *(u64 *)pb;
+	u64 a = *(u64 *)pa & mask;
+	u64 b = *(u64 *)pb & mask;
 
 	return (a > b) - (a < b);
 }
 
-static u64 *find_filter_entry(struct kvm_pmu_event_filter *filter, u64 key)
+
+static int filter_sort_cmp(const void *pa, const void *pb)
+{
+	return filter_cmp(pa, pb, (KVM_PMU_MASKED_ENTRY_EVENT_SELECT |
+				   KVM_PMU_MASKED_ENTRY_EXCLUDE));
+}
+
+/*
+ * For the event filter, searching is done on the 'includes' list and
+ * 'excludes' list separately rather than on the 'events' list (which
+ * has both).  As a result the exclude bit can be ignored.
+ */
+static int filter_event_cmp(const void *pa, const void *pb)
+{
+	return filter_cmp(pa, pb, (KVM_PMU_MASKED_ENTRY_EVENT_SELECT));
+}
+
+static int find_filter_index(u64 *events, u64 nevents, u64 key)
+{
+	u64 *fe = bsearch(&key, events, nevents, sizeof(events[0]),
+			  filter_event_cmp);
+
+	if (!fe)
+		return -1;
+
+	return fe - events;
+}
+
+static bool is_filter_entry_match(u64 filter_event, u64 umask)
+{
+	u64 mask = filter_event >> (KVM_PMU_MASKED_ENTRY_UMASK_MASK_SHIFT - 8);
+	u64 match = filter_event & KVM_PMU_MASKED_ENTRY_UMASK_MATCH;
+
+	BUILD_BUG_ON((KVM_PMU_ENCODE_MASKED_ENTRY(0, 0xff, 0, false) >>
+		     (KVM_PMU_MASKED_ENTRY_UMASK_MASK_SHIFT - 8)) !=
+		     ARCH_PERFMON_EVENTSEL_UMASK);
+
+	return (umask & mask) == match;
+}
+
+static bool filter_contains_match(u64 *events, u64 nevents, u64 eventsel)
 {
-	return bsearch(&key, filter->events, filter->nevents,
-		       sizeof(filter->events[0]), cmp_u64);
+	u64 event_select = eventsel & kvm_pmu_ops.EVENTSEL_EVENT;
+	u64 umask = eventsel & ARCH_PERFMON_EVENTSEL_UMASK;
+	int i, index;
+
+	index = find_filter_index(events, nevents, event_select);
+	if (index < 0)
+		return false;
+
+	/*
+	 * Entries are sorted by the event select.  Walk the list in both
+	 * directions to process all entries with the targeted event select.
+	 */
+	for (i = index; i < nevents; i++) {
+		if (filter_event_cmp(&events[i], &event_select))
+			break;
+
+		if (is_filter_entry_match(events[i], umask))
+			return true;
+	}
+
+	for (i = index - 1; i >= 0; i--) {
+		if (filter_event_cmp(&events[i], &event_select))
+			break;
+
+		if (is_filter_entry_match(events[i], umask))
+			return true;
+	}
+
+	return false;
 }
 
-static bool is_gp_event_allowed(struct kvm_pmu_event_filter *filter, u64 eventsel)
+static bool is_gp_event_allowed(struct kvm_x86_pmu_event_filter *f,
+				u64 eventsel)
 {
-	if (find_filter_entry(filter, eventsel & (kvm_pmu_ops.EVENTSEL_EVENT |
-						  ARCH_PERFMON_EVENTSEL_UMASK)))
-		return filter->action == KVM_PMU_EVENT_ALLOW;
+	if (filter_contains_match(f->includes, f->nr_includes, eventsel) &&
+	    !filter_contains_match(f->excludes, f->nr_excludes, eventsel))
+		return f->action == KVM_PMU_EVENT_ALLOW;
 
-	return filter->action == KVM_PMU_EVENT_DENY;
+	return f->action == KVM_PMU_EVENT_DENY;
 }
 
-static bool is_fixed_event_allowed(struct kvm_pmu_event_filter *filter, int idx)
+static bool is_fixed_event_allowed(struct kvm_x86_pmu_event_filter *filter,
+				   int idx)
 {
 	int fixed_idx = idx - INTEL_PMC_IDX_FIXED;
 
@@ -292,7 +361,7 @@ static bool is_fixed_event_allowed(struct kvm_pmu_event_filter *filter, int idx)
 
 static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 {
-	struct kvm_pmu_event_filter *filter;
+	struct kvm_x86_pmu_event_filter *filter;
 	struct kvm *kvm = pmc->vcpu->kvm;
 
 	if (!static_call(kvm_x86_pmu_hw_event_available)(pmc))
@@ -602,60 +671,128 @@ void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
 }
 EXPORT_SYMBOL_GPL(kvm_pmu_trigger_event);
 
-static void remove_impossible_events(struct kvm_pmu_event_filter *filter)
+static bool is_masked_filter_valid(const struct kvm_x86_pmu_event_filter *filter)
+{
+	u64 mask = kvm_pmu_ops.EVENTSEL_EVENT |
+		   KVM_PMU_MASKED_ENTRY_UMASK_MASK |
+		   KVM_PMU_MASKED_ENTRY_UMASK_MATCH |
+		   KVM_PMU_MASKED_ENTRY_EXCLUDE;
+	int i;
+
+	for (i = 0; i < filter->nevents; i++) {
+		if (filter->events[i] & ~mask)
+			return false;
+	}
+
+	return true;
+}
+
+static void convert_to_masked_filter(struct kvm_x86_pmu_event_filter *filter)
 {
 	int i, j;
 
 	for (i = 0, j = 0; i < filter->nevents; i++) {
+		/*
+		 * Skip events that are impossible to match against a guest
+		 * event.  When filtering, only the event select + unit mask
+		 * of the guest event is used.  To maintain backwards
+		 * compatibility, impossible filters can't be rejected :-(
+		 */
 		if (filter->events[i] & ~(kvm_pmu_ops.EVENTSEL_EVENT |
 					  ARCH_PERFMON_EVENTSEL_UMASK))
 			continue;
-
-		filter->events[j++] = filter->events[i];
+		/*
+		 * Convert userspace events to a common in-kernel event so
+		 * only one code path is needed to support both events.  For
+		 * the in-kernel events use masked events because they are
+		 * flexible enough to handle both cases.  To convert to masked
+		 * events all that's needed is to add an "all ones" umask_mask,
+		 * (unmasked filter events don't support EXCLUDE).
+		 */
+		filter->events[j++] = filter->events[i] |
+				      (0xFFULL << KVM_PMU_MASKED_ENTRY_UMASK_MASK_SHIFT);
 	}
 
 	filter->nevents = j;
 }
 
+static int prepare_filter_lists(struct kvm_x86_pmu_event_filter *filter)
+{
+	int i;
+
+	if (!(filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS))
+		convert_to_masked_filter(filter);
+	else if (!is_masked_filter_valid(filter))
+		return -EINVAL;
+
+	/*
+	 * Sort entries by event select and includes vs. excludes so that all
+	 * entries for a given event select can be processed efficiently during
+	 * filtering.  The EXCLUDE flag uses a more significant bit than the
+	 * event select, and so the sorted list is also effectively split into
+	 * includes and excludes sub-lists.
+	 */
+	sort(&filter->events, filter->nevents, sizeof(filter->events[0]),
+	     filter_sort_cmp, NULL);
+
+	i = filter->nevents;
+	/* Find the first EXCLUDE event (only supported for masked events). */
+	if (filter->flags & KVM_PMU_EVENT_FLAG_MASKED_EVENTS) {
+		for (i = 0; i < filter->nevents; i++) {
+			if (filter->events[i] & KVM_PMU_MASKED_ENTRY_EXCLUDE)
+				break;
+		}
+	}
+
+	filter->nr_includes = i;
+	filter->nr_excludes = filter->nevents - filter->nr_includes;
+	filter->includes = filter->events;
+	filter->excludes = filter->events + filter->nr_includes;
+
+	return 0;
+}
+
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 {
-	struct kvm_pmu_event_filter tmp, *filter;
+	struct kvm_pmu_event_filter __user *user_filter = argp;
+	struct kvm_x86_pmu_event_filter *filter;
+	struct kvm_pmu_event_filter tmp;
 	struct kvm_vcpu *vcpu;
 	unsigned long i;
 	size_t size;
 	int r;
 
-	if (copy_from_user(&tmp, argp, sizeof(tmp)))
+	if (copy_from_user(&tmp, user_filter, sizeof(tmp)))
 		return -EFAULT;
 
 	if (tmp.action != KVM_PMU_EVENT_ALLOW &&
 	    tmp.action != KVM_PMU_EVENT_DENY)
 		return -EINVAL;
 
-	if (tmp.flags != 0)
+	if (tmp.flags & ~KVM_PMU_EVENT_FLAGS_VALID_MASK)
 		return -EINVAL;
 
 	if (tmp.nevents > KVM_PMU_EVENT_FILTER_MAX_EVENTS)
 		return -E2BIG;
 
 	size = struct_size(filter, events, tmp.nevents);
-	filter = kmalloc(size, GFP_KERNEL_ACCOUNT);
+	filter = kzalloc(size, GFP_KERNEL_ACCOUNT);
 	if (!filter)
 		return -ENOMEM;
 
+	filter->action = tmp.action;
+	filter->nevents = tmp.nevents;
+	filter->fixed_counter_bitmap = tmp.fixed_counter_bitmap;
+	filter->flags = tmp.flags;
+
 	r = -EFAULT;
-	if (copy_from_user(filter, argp, size))
+	if (copy_from_user(filter->events, user_filter->events,
+			   sizeof(filter->events[0]) * filter->nevents))
 		goto cleanup;
 
-	/* Restore the verified state to guard against TOCTOU attacks. */
-	*filter = tmp;
-
-	remove_impossible_events(filter);
-
-	/*
-	 * Sort the in-kernel list so that we can search it with bsearch.
-	 */
-	sort(&filter->events, filter->nevents, sizeof(__u64), cmp_u64, NULL);
+	r = prepare_filter_lists(filter);
+	if (r)
+		goto cleanup;
 
 	mutex_lock(&kvm->lock);
 	filter = rcu_replace_pointer(kvm->arch.pmu_event_filter, filter,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 312aea1854ae..d2023076f363 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4401,6 +4401,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
  	case KVM_CAP_SPLIT_IRQCHIP:
 	case KVM_CAP_IMMEDIATE_EXIT:
 	case KVM_CAP_PMU_EVENT_FILTER:
+	case KVM_CAP_PMU_EVENT_MASKED_EVENTS:
 	case KVM_CAP_GET_MSR_FEATURES:
 	case KVM_CAP_MSR_PLATFORM_INFO:
 	case KVM_CAP_EXCEPTION_PAYLOAD:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 20522d4ba1e0..0b16b9ed3b23 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
+#define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 5/7] selftests: kvm/x86: Add flags when creating a pmu event filter
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
                   ` (3 preceding siblings ...)
  2022-12-20 16:12 ` [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER Aaron Lewis
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

Now that the flags field can be non-zero, pass it in when creating a
pmu event filter.

This is needed in preparation for testing masked events.

No functional change intended.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 .../testing/selftests/kvm/x86_64/pmu_event_filter_test.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index 2de98fce7edd..d50c8c160658 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -198,14 +198,15 @@ static struct kvm_pmu_event_filter *alloc_pmu_event_filter(uint32_t nevents)
 
 
 static struct kvm_pmu_event_filter *
-create_pmu_event_filter(const uint64_t event_list[],
-			int nevents, uint32_t action)
+create_pmu_event_filter(const uint64_t event_list[], int nevents,
+			uint32_t action, uint32_t flags)
 {
 	struct kvm_pmu_event_filter *f;
 	int i;
 
 	f = alloc_pmu_event_filter(nevents);
 	f->action = action;
+	f->flags = flags;
 	for (i = 0; i < nevents; i++)
 		f->events[i] = event_list[i];
 
@@ -216,7 +217,7 @@ static struct kvm_pmu_event_filter *event_filter(uint32_t action)
 {
 	return create_pmu_event_filter(event_list,
 				       ARRAY_SIZE(event_list),
-				       action);
+				       action, 0);
 }
 
 /*
@@ -263,7 +264,7 @@ static void test_amd_deny_list(struct kvm_vcpu *vcpu)
 	struct kvm_pmu_event_filter *f;
 	uint64_t count;
 
-	f = create_pmu_event_filter(&event, 1, KVM_PMU_EVENT_DENY);
+	f = create_pmu_event_filter(&event, 1, KVM_PMU_EVENT_DENY, 0);
 	count = test_with_filter(vcpu, f);
 
 	free(f);
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
                   ` (4 preceding siblings ...)
  2022-12-20 16:12 ` [PATCH v8 5/7] selftests: kvm/x86: Add flags when creating a " Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2022-12-20 16:12 ` [PATCH v8 7/7] selftests: kvm/x86: Test masked events Aaron Lewis
  2023-01-19 20:57 ` [PATCH v8 0/7] Introduce and test " Sean Christopherson
  7 siblings, 0 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

Test that masked events are not using invalid bits, and if they are,
ensure the pmu event filter is not accepted by KVM_SET_PMU_EVENT_FILTER.
The only valid bits that can be used for masked events are set when
using KVM_PMU_ENCODE_MASKED_ENTRY() with one exception: If any of the
high bits (35:32) of the event select are set when using Intel, the pmu
event filter will fail.

Also, because validation was not being done prior to the introduction
of masked events, only expect validation to fail when masked events
are used.  E.g. in the first test a filter event with all its bits set
is accepted by KVM_SET_PMU_EVENT_FILTER when flags = 0.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 .../kvm/x86_64/pmu_event_filter_test.c        | 36 +++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index d50c8c160658..a96830243195 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -404,6 +404,39 @@ static bool use_amd_pmu(void)
 		 is_zen3(family, model));
 }
 
+static int run_filter_test(struct kvm_vcpu *vcpu, const uint64_t *events,
+			   int nevents, uint32_t flags)
+{
+	struct kvm_pmu_event_filter *f;
+	int r;
+
+	f = create_pmu_event_filter(events, nevents, KVM_PMU_EVENT_ALLOW, flags);
+	r = __vm_ioctl(vcpu->vm, KVM_SET_PMU_EVENT_FILTER, f);
+	free(f);
+
+	return r;
+}
+
+static void test_filter_ioctl(struct kvm_vcpu *vcpu)
+{
+	uint64_t e = ~0ul;
+	int r;
+
+	/*
+	 * Unfortunately having invalid bits set in event data is expected to
+	 * pass when flags == 0 (bits other than eventsel+umask).
+	 */
+	r = run_filter_test(vcpu, &e, 1, 0);
+	TEST_ASSERT(r == 0, "Valid PMU Event Filter is failing");
+
+	r = run_filter_test(vcpu, &e, 1, KVM_PMU_EVENT_FLAG_MASKED_EVENTS);
+	TEST_ASSERT(r != 0, "Invalid PMU Event Filter is expected to fail");
+
+	e = KVM_PMU_EVENT_ENCODE_MASKED_ENTRY(0xff, 0xff, 0xff, 0xf);
+	r = run_filter_test(vcpu, &e, 1, KVM_PMU_EVENT_FLAG_MASKED_EVENTS);
+	TEST_ASSERT(r == 0, "Valid PMU Event Filter is failing");
+}
+
 int main(int argc, char *argv[])
 {
 	void (*guest_code)(void);
@@ -411,6 +444,7 @@ int main(int argc, char *argv[])
 	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_PMU_EVENT_FILTER));
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_PMU_EVENT_MASKED_EVENTS));
 
 	TEST_REQUIRE(use_intel_pmu() || use_amd_pmu());
 	guest_code = use_intel_pmu() ? intel_guest_code : amd_guest_code;
@@ -431,6 +465,8 @@ int main(int argc, char *argv[])
 	test_not_member_deny_list(vcpu);
 	test_not_member_allow_list(vcpu);
 
+	test_filter_ioctl(vcpu);
+
 	kvm_vm_free(vm);
 
 	test_pmu_config_disable(guest_code);
-- 
2.39.0.314.g84b9a713c41-goog



* [PATCH v8 7/7] selftests: kvm/x86: Test masked events
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
                   ` (5 preceding siblings ...)
  2022-12-20 16:12 ` [PATCH v8 6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER Aaron Lewis
@ 2022-12-20 16:12 ` Aaron Lewis
  2023-01-19 20:57 ` [PATCH v8 0/7] Introduce and test " Sean Christopherson
  7 siblings, 0 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-20 16:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, jmattson, seanjc, like.xu.linux, Aaron Lewis

Add testing to show that a pmu event can be filtered with a generalized
match on its unit mask.

These tests set up test cases to demonstrate various ways of filtering
a pmu event that has multiple unit mask values.  Each test case sets up
the filter in KVM with the masked events provided, then enables three
pmu counters in the guest.  The test then verifies that the counters
agree with which events should be counting and which should be
filtered, for both a sparse filter list and a dense filter list.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
---
 .../kvm/x86_64/pmu_event_filter_test.c        | 338 +++++++++++++++++-
 1 file changed, 336 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index a96830243195..253e4304bbe3 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -404,6 +404,331 @@ static bool use_amd_pmu(void)
 		 is_zen3(family, model));
 }
 
+/*
+ * "MEM_INST_RETIRED.ALL_LOADS", "MEM_INST_RETIRED.ALL_STORES", and
+ * "MEM_INST_RETIRED.ANY" from https://perfmon-events.intel.com/
+ * supported on Intel Xeon processors:
+ *  - Sapphire Rapids, Ice Lake, Cascade Lake, Skylake.
+ */
+#define MEM_INST_RETIRED		0xD0
+#define MEM_INST_RETIRED_LOAD		EVENT(MEM_INST_RETIRED, 0x81)
+#define MEM_INST_RETIRED_STORE		EVENT(MEM_INST_RETIRED, 0x82)
+#define MEM_INST_RETIRED_LOAD_STORE	EVENT(MEM_INST_RETIRED, 0x83)
+
+static bool supports_event_mem_inst_retired(void)
+{
+	uint32_t eax, ebx, ecx, edx;
+
+	cpuid(1, &eax, &ebx, &ecx, &edx);
+	if (x86_family(eax) == 0x6) {
+		switch (x86_model(eax)) {
+		/* Sapphire Rapids */
+		case 0x8F:
+		/* Ice Lake */
+		case 0x6A:
+		/* Skylake */
+		/* Cascade Lake */
+		case 0x55:
+			return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * "LS Dispatch", from Processor Programming Reference
+ * (PPR) for AMD Family 17h Model 01h, Revision B1 Processors,
+ * Preliminary Processor Programming Reference (PPR) for AMD Family
+ * 17h Model 31h, Revision B0 Processors, and Preliminary Processor
+ * Programming Reference (PPR) for AMD Family 19h Model 01h, Revision
+ * B1 Processors Volume 1 of 2.
+ */
+#define LS_DISPATCH		0x29
+#define LS_DISPATCH_LOAD	EVENT(LS_DISPATCH, BIT(0))
+#define LS_DISPATCH_STORE	EVENT(LS_DISPATCH, BIT(1))
+#define LS_DISPATCH_LOAD_STORE	EVENT(LS_DISPATCH, BIT(2))
+
+#define INCLUDE_MASKED_ENTRY(event_select, mask, match) \
+	KVM_PMU_ENCODE_MASKED_ENTRY(event_select, mask, match, false)
+#define EXCLUDE_MASKED_ENTRY(event_select, mask, match) \
+	KVM_PMU_ENCODE_MASKED_ENTRY(event_select, mask, match, true)
+
+struct perf_counter {
+	union {
+		uint64_t raw;
+		struct {
+			uint64_t loads:22;
+			uint64_t stores:22;
+			uint64_t loads_stores:20;
+		};
+	};
+};
+
+static uint64_t masked_events_guest_test(uint32_t msr_base)
+{
+	uint64_t ld0, ld1, st0, st1, ls0, ls1;
+	struct perf_counter c;
+	int val;
+
+	/*
+	 * The actual values of the counters don't determine the outcome of
+	 * the test, only whether they are zero or non-zero.
+	 */
+	ld0 = rdmsr(msr_base + 0);
+	st0 = rdmsr(msr_base + 1);
+	ls0 = rdmsr(msr_base + 2);
+
+	__asm__ __volatile__("movl $0, %[v];"
+			     "movl %[v], %%eax;"
+			     "incl %[v];"
+			     : [v]"+m"(val) :: "eax");
+
+	ld1 = rdmsr(msr_base + 0);
+	st1 = rdmsr(msr_base + 1);
+	ls1 = rdmsr(msr_base + 2);
+
+	c.loads = ld1 - ld0;
+	c.stores = st1 - st0;
+	c.loads_stores = ls1 - ls0;
+
+	return c.raw;
+}
+
+static void intel_masked_events_guest_code(void)
+{
+	uint64_t r;
+
+	for (;;) {
+		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+
+		wrmsr(MSR_P6_EVNTSEL0 + 0, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | MEM_INST_RETIRED_LOAD);
+		wrmsr(MSR_P6_EVNTSEL0 + 1, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | MEM_INST_RETIRED_STORE);
+		wrmsr(MSR_P6_EVNTSEL0 + 2, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | MEM_INST_RETIRED_LOAD_STORE);
+
+		wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
+
+		r = masked_events_guest_test(MSR_IA32_PMC0);
+
+		GUEST_SYNC(r);
+	}
+}
+
+static void amd_masked_events_guest_code(void)
+{
+	uint64_t r;
+
+	for (;;) {
+		wrmsr(MSR_K7_EVNTSEL0, 0);
+		wrmsr(MSR_K7_EVNTSEL1, 0);
+		wrmsr(MSR_K7_EVNTSEL2, 0);
+
+		wrmsr(MSR_K7_EVNTSEL0, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | LS_DISPATCH_LOAD);
+		wrmsr(MSR_K7_EVNTSEL1, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | LS_DISPATCH_STORE);
+		wrmsr(MSR_K7_EVNTSEL2, ARCH_PERFMON_EVENTSEL_ENABLE |
+		      ARCH_PERFMON_EVENTSEL_OS | LS_DISPATCH_LOAD_STORE);
+
+		r = masked_events_guest_test(MSR_K7_PERFCTR0);
+
+		GUEST_SYNC(r);
+	}
+}
+
+static struct perf_counter run_masked_events_test(struct kvm_vcpu *vcpu,
+						 const uint64_t masked_events[],
+						 const int nmasked_events)
+{
+	struct kvm_pmu_event_filter *f;
+	struct perf_counter r;
+
+	f = create_pmu_event_filter(masked_events, nmasked_events,
+				    KVM_PMU_EVENT_ALLOW,
+				    KVM_PMU_EVENT_FLAG_MASKED_EVENTS);
+	r.raw = test_with_filter(vcpu, f);
+	free(f);
+
+	return r;
+}
+
+/* Matches KVM_PMU_EVENT_FILTER_MAX_EVENTS in pmu.c */
+#define MAX_FILTER_EVENTS	300
+#define MAX_TEST_EVENTS		10
+
+#define ALLOW_LOADS		BIT(0)
+#define ALLOW_STORES		BIT(1)
+#define ALLOW_LOADS_STORES	BIT(2)
+
+struct masked_events_test {
+	uint64_t intel_events[MAX_TEST_EVENTS];
+	uint64_t intel_event_end;
+	uint64_t amd_events[MAX_TEST_EVENTS];
+	uint64_t amd_event_end;
+	const char *msg;
+	uint32_t flags;
+};
+
+/*
+ * These are the test cases for the masked events tests.
+ *
+ * For each test, the guest enables 3 PMU counters (loads, stores,
+ * loads + stores).  The filter is then set in KVM with the masked events
+ * provided.  The test then verifies that the counters agree with which
+ * ones should be counting and which ones should be filtered.
+ */
+const struct masked_events_test test_cases[] = {
+	{
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFF, 0x81),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xFF, BIT(0)),
+		},
+		.msg = "Only allow loads.",
+		.flags = ALLOW_LOADS,
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFF, 0x82),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xFF, BIT(1)),
+		},
+		.msg = "Only allow stores.",
+		.flags = ALLOW_STORES,
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFF, 0x83),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xFF, BIT(2)),
+		},
+		.msg = "Only allow loads + stores.",
+		.flags = ALLOW_LOADS_STORES,
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0x7C, 0),
+			EXCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFF, 0x83),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, ~(BIT(0) | BIT(1)), 0),
+		},
+		.msg = "Only allow loads and stores.",
+		.flags = ALLOW_LOADS | ALLOW_STORES,
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0x7C, 0),
+			EXCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFF, 0x82),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xF8, 0),
+			EXCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xFF, BIT(1)),
+		},
+		.msg = "Only allow loads and loads + stores.",
+		.flags = ALLOW_LOADS | ALLOW_LOADS_STORES
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0xFE, 0x82),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xF8, 0),
+			EXCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xFF, BIT(0)),
+		},
+		.msg = "Only allow stores and loads + stores.",
+		.flags = ALLOW_STORES | ALLOW_LOADS_STORES
+	}, {
+		.intel_events = {
+			INCLUDE_MASKED_ENTRY(MEM_INST_RETIRED, 0x7C, 0),
+		},
+		.amd_events = {
+			INCLUDE_MASKED_ENTRY(LS_DISPATCH, 0xF8, 0),
+		},
+		.msg = "Only allow loads, stores, and loads + stores.",
+		.flags = ALLOW_LOADS | ALLOW_STORES | ALLOW_LOADS_STORES
+	},
+};
+
+static int append_test_events(const struct masked_events_test *test,
+			      uint64_t *events, int nevents)
+{
+	const uint64_t *evts;
+	int i;
+
+	evts = use_intel_pmu() ? test->intel_events : test->amd_events;
+	for (i = 0; i < MAX_TEST_EVENTS; i++) {
+		if (evts[i] == 0)
+			break;
+
+		events[nevents + i] = evts[i];
+	}
+
+	return nevents + i;
+}
+
+static bool bool_eq(bool a, bool b)
+{
+	return a == b;
+}
+
+static void run_masked_events_tests(struct kvm_vcpu *vcpu, uint64_t *events,
+				    int nevents)
+{
+	int ntests = ARRAY_SIZE(test_cases);
+	struct perf_counter c;
+	int i, n;
+
+	for (i = 0; i < ntests; i++) {
+		const struct masked_events_test *test = &test_cases[i];
+
+		/* Do any test case events overflow MAX_TEST_EVENTS? */
+		assert(test->intel_event_end == 0);
+		assert(test->amd_event_end == 0);
+
+		n = append_test_events(test, events, nevents);
+
+		c = run_masked_events_test(vcpu, events, n);
+		TEST_ASSERT(bool_eq(c.loads, test->flags & ALLOW_LOADS) &&
+			    bool_eq(c.stores, test->flags & ALLOW_STORES) &&
+			    bool_eq(c.loads_stores,
+				    test->flags & ALLOW_LOADS_STORES),
+			    "%s  loads: %u, stores: %u, loads + stores: %u",
+			    test->msg, c.loads, c.stores, c.loads_stores);
+	}
+}
+
+static void add_dummy_events(uint64_t *events, int nevents)
+{
+	int i;
+
+	for (i = 0; i < nevents; i++) {
+		int event_select = i % 0xFF;
+		bool exclude = ((i % 4) == 0);
+
+		if (event_select == MEM_INST_RETIRED ||
+		    event_select == LS_DISPATCH)
+			event_select++;
+
+		events[i] = KVM_PMU_ENCODE_MASKED_ENTRY(event_select, 0,
+							0, exclude);
+	}
+}
+
+static void test_masked_events(struct kvm_vcpu *vcpu)
+{
+	int nevents = MAX_FILTER_EVENTS - MAX_TEST_EVENTS;
+	uint64_t events[MAX_FILTER_EVENTS];
+
+	/* Run the test cases against a sparse PMU event filter. */
+	run_masked_events_tests(vcpu, events, 0);
+
+	/* Run the test cases against a dense PMU event filter. */
+	add_dummy_events(events, MAX_FILTER_EVENTS);
+	run_masked_events_tests(vcpu, events, nevents);
+}
+
 static int run_filter_test(struct kvm_vcpu *vcpu, const uint64_t *events,
 			   int nevents, uint32_t flags)
 {
@@ -432,7 +757,7 @@ static void test_filter_ioctl(struct kvm_vcpu *vcpu)
 	r = run_filter_test(vcpu, &e, 1, KVM_PMU_EVENT_FLAG_MASKED_EVENTS);
 	TEST_ASSERT(r != 0, "Invalid PMU Event Filter is expected to fail");
 
-	e = KVM_PMU_EVENT_ENCODE_MASKED_ENTRY(0xff, 0xff, 0xff, 0xf);
+	e = KVM_PMU_ENCODE_MASKED_ENTRY(0xff, 0xff, 0xff, 0xf);
 	r = run_filter_test(vcpu, &e, 1, KVM_PMU_EVENT_FLAG_MASKED_EVENTS);
 	TEST_ASSERT(r == 0, "Valid PMU Event Filter is failing");
 }
@@ -440,7 +765,7 @@ static void test_filter_ioctl(struct kvm_vcpu *vcpu)
 int main(int argc, char *argv[])
 {
 	void (*guest_code)(void);
-	struct kvm_vcpu *vcpu;
+	struct kvm_vcpu *vcpu, *vcpu2 = NULL;
 	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_PMU_EVENT_FILTER));
@@ -465,6 +790,15 @@ int main(int argc, char *argv[])
 	test_not_member_deny_list(vcpu);
 	test_not_member_allow_list(vcpu);
 
+	if (use_intel_pmu() &&
+	    supports_event_mem_inst_retired() &&
+	    kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS) >= 3)
+		vcpu2 = vm_vcpu_add(vm, 2, intel_masked_events_guest_code);
+	else if (use_amd_pmu())
+		vcpu2 = vm_vcpu_add(vm, 2, amd_masked_events_guest_code);
+
+	if (vcpu2)
+		test_masked_events(vcpu2);
 	test_filter_ioctl(vcpu);
 
 	kvm_vm_free(vm);
-- 
2.39.0.314.g84b9a713c41-goog



* Re: [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
  2022-12-20 16:12 ` [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup Aaron Lewis
@ 2022-12-22  6:19   ` Like Xu
  2022-12-28 20:00     ` Aaron Lewis
  0 siblings, 1 reply; 16+ messages in thread
From: Like Xu @ 2022-12-22  6:19 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: pbonzini, jmattson, seanjc, kvm list

On 21/12/2022 12:12 am, Aaron Lewis wrote:
> When checking if a pmu event the guest is attempting to program should
> be filtered, only consider the event select + unit mask in that
> decision. Use an architecture specific mask to mask out all other bits,
> including bits 35:32 on Intel.  Those bits are not part of the event
> select and should not be considered in that decision.
> 
> Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>   arch/x86/kvm/pmu.c           | 3 ++-
>   arch/x86/kvm/pmu.h           | 2 ++
>   arch/x86/kvm/svm/pmu.c       | 1 +
>   arch/x86/kvm/vmx/pmu_intel.c | 1 +
>   4 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 684393c22105..760a09ff65cd 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -277,7 +277,8 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
>   		goto out;
>   
>   	if (pmc_is_gp(pmc)) {
> -		key = pmc->eventsel & AMD64_RAW_EVENT_MASK_NB;
> +		key = pmc->eventsel & (kvm_pmu_ops.EVENTSEL_EVENT |
> +				       ARCH_PERFMON_EVENTSEL_UMASK);
>   		if (bsearch(&key, filter->events, filter->nevents,
>   			    sizeof(__u64), cmp_u64))
>   			allow_event = filter->action == KVM_PMU_EVENT_ALLOW;
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 85ff3c0588ba..5b070c563a97 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -40,6 +40,8 @@ struct kvm_pmu_ops {
>   	void (*reset)(struct kvm_vcpu *vcpu);
>   	void (*deliver_pmi)(struct kvm_vcpu *vcpu);
>   	void (*cleanup)(struct kvm_vcpu *vcpu);
> +
> +	const u64 EVENTSEL_EVENT;

Isn't it weird when the new thing added here is
not of the same type as the existing members?

Doesn't "pmu->raw_event_mask" help here?

>   };
>   
>   void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index 0e313fbae055..d3ae261d56a6 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -229,4 +229,5 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
>   	.refresh = amd_pmu_refresh,
>   	.init = amd_pmu_init,
>   	.reset = amd_pmu_reset,
> +	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
>   };
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index e5cec07ca8d9..edf23115f2ef 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -810,4 +810,5 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
>   	.reset = intel_pmu_reset,
>   	.deliver_pmi = intel_pmu_deliver_pmi,
>   	.cleanup = intel_pmu_cleanup,
> +	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
>   };


* Re: [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events
  2022-12-20 16:12 ` [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events Aaron Lewis
@ 2022-12-22  6:34   ` Like Xu
  0 siblings, 0 replies; 16+ messages in thread
From: Like Xu @ 2022-12-22  6:34 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: pbonzini, jmattson, seanjc, kvm list

On 21/12/2022 12:12 am, Aaron Lewis wrote:
> +static bool is_fixed_event_allowed(struct kvm_pmu_event_filter *filter, int idx)
> +{
> +	int fixed_idx = idx - INTEL_PMC_IDX_FIXED;
> +
> +	if (filter->action == KVM_PMU_EVENT_DENY &&
> +	    test_bit(fixed_idx, (ulong *)&filter->fixed_counter_bitmap))
> +		return false;
> +	if (filter->action == KVM_PMU_EVENT_ALLOW &&
> +	    !test_bit(fixed_idx, (ulong *)&filter->fixed_counter_bitmap))
> +		return false;
> +
> +	return true;
> +}

Before masked events are introduced, the decision logic for fixed events is
not yet covered by existing or new selftests, which affects the review of new
changes to this part.

More selftests on fixed events are appreciated and required.


* Re: [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter
  2022-12-20 16:12 ` [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter Aaron Lewis
@ 2022-12-22  7:40   ` Like Xu
  2022-12-28 20:00     ` Aaron Lewis
  0 siblings, 1 reply; 16+ messages in thread
From: Like Xu @ 2022-12-22  7:40 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: pbonzini, jmattson, seanjc, kvm list

On 21/12/2022 12:12 am, Aaron Lewis wrote:
> When building a list of filter events, it can sometimes be a challenge
> to fit all the events needed to adequately restrict the guest into the
> limited space available in the pmu event filter.  This stems from the
> fact that the pmu event filter requires each event (i.e. event select +
> unit mask) be listed, when the intention might be to restrict the
> event select all together, regardless of it's unit mask.  Instead of

This memory/cpu saving requirement is quite reasonable. To clarify,

PMU events with different unit masks for the same event select don't
point to the same or pipeline-associated microarchitectural component.

In practice, there will be a lot of cases where the "exclusion bit" is required.
That's one of the reasons I was asking about real use cases in another thread.

> increasing the number of filter events in the pmu event filter, add a
> new encoding that is able to do a more generalized match on the unit mask.
> 
> Introduce masked events as another encoding the pmu event filter
> understands.  Masked events has the fields: mask, match, and exclude.
> When filtering based on these events, the mask is applied to the guest's
> unit mask to see if it matches the match value (i.e. umask & mask ==
> match).  The exclude bit can then be used to exclude events from that
> match.  E.g. for a given event select, if it's easier to say which unit
> mask values shouldn't be filtered, a masked event can be set up to match
> all possible unit mask values, then another masked event can be set up to
> match the unit mask values that shouldn't be filtered.
> 
> Userspace can query to see if this feature exists by looking for the
> capability, KVM_CAP_PMU_EVENT_MASKED_EVENTS.
> 
> This feature is enabled by setting the flags field in the pmu event
> filter to KVM_PMU_EVENT_FLAG_MASKED_EVENTS.
> 
> Events can be encoded by using KVM_PMU_ENCODE_MASKED_ENTRY().
> 
> It is an error to have a bit set outside the valid bits for a masked
> event, and calls to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in
> such cases, including the high bits of the event select (35:32) if
> called on Intel.

Some users want to count Intel TSX events and their VM instances need:

#define HSW_IN_TX					(1ULL << 32)
#define HSW_IN_TX_CHECKPOINTED				(1ULL << 33)

> 
> With these updates the filter matching code has been updated to match on
> a common event.  Masked events were flexible enough to handle both event
> types, so they were used as the common event.  This changes how guest
> events get filtered because regardless of the type of event used in the
> uAPI, they will be converted to masked events.  Because of this there
> could be a slight performance hit because instead of matching the filter

Bonus: if this performance loss were quantified, we could make a side-by-side
comparison of alternative solutions, considering that wasting memory seems to
be a habit of many kernel developers.

> event with a lookup on event select + unit mask, it does a lookup on event
> select then walks the unit masks to find the match.  This shouldn't be a
> big problem because I would expect the set of common event selects to be

A quick, rough statistic about the Intel PMU based on
https://github.com/intel/perfmon.git: our filtering mechanism needs to
consider 388 different EventCodes and 183 different UMasks; prioritizing
filtering on event_select will instead bring more entries.

> small, and if they aren't the set can likely be reduced by using masked
> events to generalize the unit mask.  Using one type of event when
> filtering guest events allows for a common code path to be used.
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> ---
>   Documentation/virt/kvm/api.rst  |  77 +++++++++++--
>   arch/x86/include/asm/kvm_host.h |  14 ++-
>   arch/x86/include/uapi/asm/kvm.h |  29 +++++
>   arch/x86/kvm/pmu.c              | 197 +++++++++++++++++++++++++++-----
>   arch/x86/kvm/x86.c              |   1 +
>   include/uapi/linux/kvm.h        |   1 +
>   6 files changed, 281 insertions(+), 38 deletions(-)
> 

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 312aea1854ae..d2023076f363 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4401,6 +4401,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>    	case KVM_CAP_SPLIT_IRQCHIP:
>   	case KVM_CAP_IMMEDIATE_EXIT:
>   	case KVM_CAP_PMU_EVENT_FILTER:
> +	case KVM_CAP_PMU_EVENT_MASKED_EVENTS:

How about reusing KVM_CAP_PMU_CAPABILITY to advertise this new cap ?

>   	case KVM_CAP_GET_MSR_FEATURES:
>   	case KVM_CAP_MSR_PLATFORM_INFO:
>   	case KVM_CAP_EXCEPTION_PAYLOAD:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 20522d4ba1e0..0b16b9ed3b23 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
>   #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
>   #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
>   #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
> +#define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
>   
>   #ifdef KVM_CAP_IRQ_ROUTING
>   

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
  2022-12-22  6:19   ` Like Xu
@ 2022-12-28 20:00     ` Aaron Lewis
  2023-01-03 18:28       ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Aaron Lewis @ 2022-12-28 20:00 UTC (permalink / raw)
  To: Like Xu; +Cc: pbonzini, jmattson, seanjc, kvm list

> > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> > index 85ff3c0588ba..5b070c563a97 100644
> > --- a/arch/x86/kvm/pmu.h
> > +++ b/arch/x86/kvm/pmu.h
> > @@ -40,6 +40,8 @@ struct kvm_pmu_ops {
> >       void (*reset)(struct kvm_vcpu *vcpu);
> >       void (*deliver_pmi)(struct kvm_vcpu *vcpu);
> >       void (*cleanup)(struct kvm_vcpu *vcpu);
> > +
> > +     const u64 EVENTSEL_EVENT;
>
> Isn't it weird when the new thing added here is
> not of the same type as the existing members ?
>
> Doesn't "pmu->raw_event_mask" help here ?
>

In earlier revisions I had this as a callback, but it was a little
awkward given that I really just wanted a const that differs depending
on the platform.  Making it a const fits more naturally at its use
sites than the callback approach did, so despite it being different
from the other members here I think it works better overall.

> >   };
> >
> >   void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
> > diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> > index 0e313fbae055..d3ae261d56a6 100644
> > --- a/arch/x86/kvm/svm/pmu.c
> > +++ b/arch/x86/kvm/svm/pmu.c
> > @@ -229,4 +229,5 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
> >       .refresh = amd_pmu_refresh,
> >       .init = amd_pmu_init,
> >       .reset = amd_pmu_reset,
> > +     .EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
> >   };
> > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > index e5cec07ca8d9..edf23115f2ef 100644
> > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > @@ -810,4 +810,5 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
> >       .reset = intel_pmu_reset,
> >       .deliver_pmi = intel_pmu_deliver_pmi,
> >       .cleanup = intel_pmu_cleanup,
> > +     .EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
> >   };

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter
  2022-12-22  7:40   ` Like Xu
@ 2022-12-28 20:00     ` Aaron Lewis
  0 siblings, 0 replies; 16+ messages in thread
From: Aaron Lewis @ 2022-12-28 20:00 UTC (permalink / raw)
  To: Like Xu; +Cc: pbonzini, jmattson, seanjc, kvm list

> >
> > It is an error to have a bit set outside the valid bits for a masked
> > event, and calls to KVM_SET_PMU_EVENT_FILTER will return -EINVAL in
> > such cases, including the high bits of the event select (35:32) if
> > called on Intel.
>
> Some users want to count Intel TSX events and their VM instances need:
>
> #define HSW_IN_TX                                       (1ULL << 32)
> #define HSW_IN_TX_CHECKPOINTED                          (1ULL << 33)

The purpose of the PMU event filter is to restrict events based solely
on the event select + unit mask.  What the guest decides to do with
HSW_IN_TX and HSW_IN_TX_CHECKPOINTED is done at the guest's discretion,
and the PMU event filter should not be involved.

Patch 1/7 in this series fixes a bug to ensure that's true, then the
next patch removes events from the PMU event filter that attempt to
filter bits outside the event select + unit mask.  Masked events take
this one step further by rejecting the entire filter and returning an
error if there are invalid bits set in any of the events.
Unfortunately, legacy events weren't implemented that way, so removing
such events is the best we can do.
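
To make the encoding concrete, a hedged sketch of a valid masked-events
filter is below.  It assumes the KVM_PMU_ENCODE_MASKED_ENTRY(event_select,
mask, match, exclude) argument order and the KVM_PMU_EVENT_FLAG_MASKED_EVENTS
flag added in patch 4, and uses Intel's MEM_INST_RETIRED (0xd0) purely as an
example.

  /*
   * Sketch: allow every unit mask of event select 0xd0 except 0x81.  The
   * first entry matches all unit masks (umask & 0x00 == 0x00); the second
   * carves out unit mask 0x81 via the exclude bit.
   */
  struct {
  	struct kvm_pmu_event_filter filter;
  	__u64 events[2];
  } f = {
  	.filter = {
  		.action  = KVM_PMU_EVENT_ALLOW,
  		.flags   = KVM_PMU_EVENT_FLAG_MASKED_EVENTS,
  		.nevents = 2,
  	},
  	.events = {
  		KVM_PMU_ENCODE_MASKED_ENTRY(0xd0, 0x00, 0x00, 0),
  		KVM_PMU_ENCODE_MASKED_ENTRY(0xd0, 0xff, 0x81, 1),
  	},
  };

  /* ioctl(vm_fd, KVM_SET_PMU_EVENT_FILTER, &f.filter); */

Any bit set outside the documented masked-event fields would then make
KVM_SET_PMU_EVENT_FILTER fail with -EINVAL, as described above.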

> >
> > With these updates the filter matching code has been updated to match on
> > a common event.  Masked events were flexible enough to handle both event
> > types, so they were used as the common event.  This changes how guest
> > events get filtered because regardless of the type of event used in the
> > uAPI, they will be converted to masked events.  Because of this there
> > could be a slight performance hit because instead of matching the filter
>
> Bonus, if this performance loss is quantified, we could make a side-by-side
> comparison of alternative solutions, considering wasting memory seems to
> be a habit of many kernel developers.
>
> > event with a lookup on event select + unit mask, it does a lookup on event
> > select then walks the unit masks to find the match.  This shouldn't be a
> > big problem because I would expect the set of common event selects to be
>
> A quick, rough statistic about the Intel PMU based on
> https://github.com/intel/perfmon.git: our filtering mechanism needs to
> consider 388 different EventCodes and 183 different UMasks; prioritizing
> filtering on event_select will instead bring more entries.

I don't see how we could end up with more entries, considering that the
pathological case would result in the same number of filter entries as
before.

How did you come up with 388 event selects and 183 different unit
masks?  Are those all from the same uarch?
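
To be clear about what the walk looks like, the select-then-walk matching
boils down to roughly the sketch below.  It is illustrative C only, not the
kernel code, and it skips the initial bsearch on event select that the real
lookup would do first.

  #include <stdbool.h>
  #include <stdint.h>

  struct masked_entry {
  	uint16_t event_select;
  	uint8_t  mask;
  	uint8_t  match;
  	bool     exclude;
  };

  /* True if the filter "contains" the guest event: included and not excluded. */
  static bool filter_contains(const struct masked_entry *e, int n,
  			    uint16_t event_select, uint8_t umask)
  {
  	bool included = false, excluded = false;
  	int i;

  	for (i = 0; i < n; i++) {
  		if (e[i].event_select != event_select)
  			continue;	/* the real code narrows this via bsearch */
  		if ((umask & e[i].mask) != e[i].match)
  			continue;
  		if (e[i].exclude)
  			excluded = true;
  		else
  			included = true;
  	}
  	return included && !excluded;
  }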

>
> > small, and if they aren't the set can likely be reduced by using masked
> > events to generalize the unit mask.  Using one type of event when
> > filtering guest events allows for a common code path to be used.
> >
> > Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> > ---
> >   Documentation/virt/kvm/api.rst  |  77 +++++++++++--
> >   arch/x86/include/asm/kvm_host.h |  14 ++-
> >   arch/x86/include/uapi/asm/kvm.h |  29 +++++
> >   arch/x86/kvm/pmu.c              | 197 +++++++++++++++++++++++++++-----
> >   arch/x86/kvm/x86.c              |   1 +
> >   include/uapi/linux/kvm.h        |   1 +
> >   6 files changed, 281 insertions(+), 38 deletions(-)
> >
>
> ...
>
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 312aea1854ae..d2023076f363 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4401,6 +4401,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >       case KVM_CAP_SPLIT_IRQCHIP:
> >       case KVM_CAP_IMMEDIATE_EXIT:
> >       case KVM_CAP_PMU_EVENT_FILTER:
> > +     case KVM_CAP_PMU_EVENT_MASKED_EVENTS:
>
> How about reusing KVM_CAP_PMU_CAPABILITY to advertise this new cap ?

The purpose of KVM_CAP_PMU_CAPABILITY is to allow userspace to adjust
the PMU virtualization capabilities on a VM.
KVM_CAP_PMU_EVENT_MASKED_EVENTS isn't meant to be adjustable, but rather
to advertise that this feature exists.
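
A minimal sketch of how userspace might probe for it (assuming a kvm_fd from
opening /dev/kvm):

  #include <stdbool.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* KVM_CHECK_EXTENSION returns > 0 when the capability is supported. */
  static bool supports_masked_events(int kvm_fd)
  {
  	return ioctl(kvm_fd, KVM_CHECK_EXTENSION,
  		     KVM_CAP_PMU_EVENT_MASKED_EVENTS) > 0;
  }

If it is supported, the filter's flags field can then be set to
KVM_PMU_EVENT_FLAG_MASKED_EVENTS.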

>
> >       case KVM_CAP_GET_MSR_FEATURES:
> >       case KVM_CAP_MSR_PLATFORM_INFO:
> >       case KVM_CAP_EXCEPTION_PAYLOAD:
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 20522d4ba1e0..0b16b9ed3b23 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
> >   #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
> >   #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
> >   #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
> > +#define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226
> >
> >   #ifdef KVM_CAP_IRQ_ROUTING
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
  2022-12-28 20:00     ` Aaron Lewis
@ 2023-01-03 18:28       ` Sean Christopherson
  0 siblings, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2023-01-03 18:28 UTC (permalink / raw)
  To: Aaron Lewis; +Cc: Like Xu, pbonzini, jmattson, kvm list

On Wed, Dec 28, 2022, Aaron Lewis wrote:
> > > diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> > > index 85ff3c0588ba..5b070c563a97 100644
> > > --- a/arch/x86/kvm/pmu.h
> > > +++ b/arch/x86/kvm/pmu.h
> > > @@ -40,6 +40,8 @@ struct kvm_pmu_ops {
> > >       void (*reset)(struct kvm_vcpu *vcpu);
> > >       void (*deliver_pmi)(struct kvm_vcpu *vcpu);
> > >       void (*cleanup)(struct kvm_vcpu *vcpu);
> > > +
> > > +     const u64 EVENTSEL_EVENT;
> >
> > Isn't it weird when the new thing added here is
> > not of the same type as the existing members ?
> >
> > Doesn't "pmu->raw_event_mask" help here ?
> >
> 
> In earlier revisions I had this as a callback, but it was a little
> awkward given that I really just wanted a const that differs depending
> on the platform.  Making it a const fits more naturally at its use
> sites than the callback approach did, so despite it being different
> from the other members here I think it works better overall.

And KVM already does similar things in kvm_x86_ops.
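
Roughly, the contrast being discussed looks like the lines below; the
callback name is hypothetical, and the kvm_pmu_ops instance name is assumed
to match how pmu.c reaches the other ops.

  /* Const member (what the series does): a plain read at the use site. */
  u64 sel_a = eventsel & kvm_pmu_ops.EVENTSEL_EVENT;

  /* Hypothetical callback alternative: an indirect call to fetch a constant. */
  u64 sel_b = eventsel & kvm_pmu_ops.get_eventsel_event_mask();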

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 0/7] Introduce and test masked events
  2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
                   ` (6 preceding siblings ...)
  2022-12-20 16:12 ` [PATCH v8 7/7] selftests: kvm/x86: Test masked events Aaron Lewis
@ 2023-01-19 20:57 ` Sean Christopherson
  2023-01-20 18:11   ` Sean Christopherson
  7 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2023-01-19 20:57 UTC (permalink / raw)
  To: Sean Christopherson, kvm, Aaron Lewis; +Cc: pbonzini, jmattson, like.xu.linux

On Tue, 20 Dec 2022 16:12:29 +0000, Aaron Lewis wrote:
> This series introduces the concept of masked events to the pmu event
> filter. Masked events can help reduce the number of events needed in the
> pmu event filter by allowing a more generalized matching method to be
> used for the unit mask when filtering guest events in the pmu.  With
> masked events, if an event select should be restricted from the guest,
> instead of having to add an entry to the pmu event filter for each
> event select + unit mask pair, a masked event can be added to generalize
> the unit mask values.
> 
> [...]

Applied to kvm-x86 pmu, thanks!

[1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
      https://github.com/kvm-x86/linux/commit/aa570a7481c3
[2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter
      https://github.com/kvm-x86/linux/commit/778e86e3a2fd
[3/7] kvm: x86/pmu: prepare the pmu event filter for masked events
      https://github.com/kvm-x86/linux/commit/11794a3da07f
[4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter
      https://github.com/kvm-x86/linux/commit/651daa44b11c
[5/7] selftests: kvm/x86: Add flags when creating a pmu event filter
      https://github.com/kvm-x86/linux/commit/6a6b17a7c594
[6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER
      https://github.com/kvm-x86/linux/commit/b1a865796643
[7/7] selftests: kvm/x86: Test masked events
      https://github.com/kvm-x86/linux/commit/5ed12ae83c4c

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v8 0/7] Introduce and test masked events
  2023-01-19 20:57 ` [PATCH v8 0/7] Introduce and test " Sean Christopherson
@ 2023-01-20 18:11   ` Sean Christopherson
  0 siblings, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2023-01-20 18:11 UTC (permalink / raw)
  To: kvm, Aaron Lewis; +Cc: pbonzini, jmattson, like.xu.linux

On Thu, Jan 19, 2023, Sean Christopherson wrote:
> On Tue, 20 Dec 2022 16:12:29 +0000, Aaron Lewis wrote:
> > This series introduces the concept of masked events to the pmu event
> > filter. Masked events can help reduce the number of events needed in the
> > pmu event filter by allowing a more generalized matching method to be
> > used for the unit mask when filtering guest events in the pmu.  With
> > masked events, if an event select should be restricted from the guest,
> > instead of having to add an entry to the pmu event filter for each
> > event select + unit mask pair, a masked event can be added to generalize
> > the unit mask values.
> > 
> > [...]
> 
> Applied to kvm-x86 pmu, thanks!
> 
> [1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup
>       https://github.com/kvm-x86/linux/commit/aa570a7481c3
> [2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter
>       https://github.com/kvm-x86/linux/commit/778e86e3a2fd
> [3/7] kvm: x86/pmu: prepare the pmu event filter for masked events
>       https://github.com/kvm-x86/linux/commit/11794a3da07f
> [4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter
>       https://github.com/kvm-x86/linux/commit/651daa44b11c
> [5/7] selftests: kvm/x86: Add flags when creating a pmu event filter
>       https://github.com/kvm-x86/linux/commit/6a6b17a7c594
> [6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER
>       https://github.com/kvm-x86/linux/commit/b1a865796643
> [7/7] selftests: kvm/x86: Test masked events
>       https://github.com/kvm-x86/linux/commit/5ed12ae83c4c

FYI, I force pushed to pmu to fix a minor warning in the docs and decided to
opportunistically update all of the shortlogs to fix (again minor) issues with
the scope since this series is currently the only thing sitting in pmu.

[1/7] KVM: x86/pmu: Correct the mask used in a pmu event filter lookup
      https://github.com/kvm-x86/linux/commit/1d2489485e28
[2/7] KVM: x86/pmu: Remove impossible events from the pmu event filter
      https://github.com/kvm-x86/linux/commit/1110a9128858
[3/7] KVM: x86/pmu: prepare the pmu event filter for masked events
      https://github.com/kvm-x86/linux/commit/3b049af387c4
[4/7] KVM: x86/pmu: Introduce masked events to the pmu event filter
      https://github.com/kvm-x86/linux/commit/22f39725f0f6
[5/7] KVM: selftests: Add flags when creating a pmu event filter
      https://github.com/kvm-x86/linux/commit/963fd783390f
[6/7] KVM: selftests: Add testing for KVM_SET_PMU_EVENT_FILTER
      https://github.com/kvm-x86/linux/commit/1c1045925ed4
[7/7] KVM: selftests: Test masked events in PMU filter
      https://github.com/kvm-x86/linux/commit/526d9f600225

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-01-20 18:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-20 16:12 [PATCH v8 0/7] Introduce and test masked events Aaron Lewis
2022-12-20 16:12 ` [PATCH v8 1/7] kvm: x86/pmu: Correct the mask used in a pmu event filter lookup Aaron Lewis
2022-12-22  6:19   ` Like Xu
2022-12-28 20:00     ` Aaron Lewis
2023-01-03 18:28       ` Sean Christopherson
2022-12-20 16:12 ` [PATCH v8 2/7] kvm: x86/pmu: Remove impossible events from the pmu event filter Aaron Lewis
2022-12-20 16:12 ` [PATCH v8 3/7] kvm: x86/pmu: prepare the pmu event filter for masked events Aaron Lewis
2022-12-22  6:34   ` Like Xu
2022-12-20 16:12 ` [PATCH v8 4/7] kvm: x86/pmu: Introduce masked events to the pmu event filter Aaron Lewis
2022-12-22  7:40   ` Like Xu
2022-12-28 20:00     ` Aaron Lewis
2022-12-20 16:12 ` [PATCH v8 5/7] selftests: kvm/x86: Add flags when creating a " Aaron Lewis
2022-12-20 16:12 ` [PATCH v8 6/7] selftests: kvm/x86: Add testing for KVM_SET_PMU_EVENT_FILTER Aaron Lewis
2022-12-20 16:12 ` [PATCH v8 7/7] selftests: kvm/x86: Test masked events Aaron Lewis
2023-01-19 20:57 ` [PATCH v8 0/7] Introduce and test " Sean Christopherson
2023-01-20 18:11   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).