linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Luwei Kang <luwei.kang@intel.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org, tglx@linutronix.de,
	bp@alien8.de, hpa@zytor.com, pbonzini@redhat.com,
	sean.j.christopherson@intel.com, vkuznets@redhat.com,
	wanpengli@tencent.com, jmattson@google.com, joro@8bytes.org,
	pawan.kumar.gupta@linux.intel.com, ak@linux.intel.com,
	thomas.lendacky@amd.com, fenghua.yu@intel.com,
	kan.liang@linux.intel.com, like.xu@linux.intel.com
Subject: [PATCH v1 02/11] perf/x86/ds: Handle guest PEBS events overflow and inject fake PMI
Date: Fri,  6 Mar 2020 01:56:56 +0800	[thread overview]
Message-ID: <1583431025-19802-3-git-send-email-luwei.kang@intel.com> (raw)
In-Reply-To: <1583431025-19802-1-git-send-email-luwei.kang@intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

With PEBS virtualization, the PEBS record gets delivered to the guest,
but host still sees the PMI. This would normally result in a spurious
PEBS PMI that is ignored. But we need to inject the PMI into the guest,
so that the guest PMI handler can handle the PEBS record.

Check for this case in the perf PEBS handler. If a guest PEBS counter
overflowed, a fake event will be triggered. The fake event results in
calling the KVM PMI callback, which injects the PMI into the guest.

No matter how many PEBS counters are overflowed, only triggering one
fake event is enough. Then the guest handler would retrieve the correct
information from its own PEBS records including the guest state.

Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/ds.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index dc43cc1..6722f39 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1721,6 +1721,62 @@ void intel_pmu_auto_reload_read(struct perf_event *event)
 	return 0;
 }
 
+/*
+ * We may be running with virtualized PEBS, so the PEBS record
+ * was logged into the guest's DS and is invisible to host.
+ *
+ * For guest-dedicated counters we always have to check if the
+ * counters are overflowed, because PEBS thresholds
+ * are not reported in the PERF_GLOBAL_STATUS.
+ *
+ * In this case we just trigger a fake event for KVM to forward
+ * to the guest as PMI. The guest will then see the real PEBS
+ * record and read the counter values.
+ *
+ * The contents of the event do not matter.
+ */
+static int intel_pmu_handle_guest_pebs(struct cpu_hw_events *cpuc,
+				       struct pt_regs *iregs,
+				       struct debug_store *ds)
+{
+	struct perf_sample_data data;
+	struct perf_event *event;
+	int bit;
+
+	/*
+	 * Ideally, we should check guest DS to understand if the
+	 * guest-dedicated PEBS counters are overflowed.
+	 *
+	 * However, it brings high overhead to retrieve guest DS in host.
+	 * The host and guest cannot have pending PEBS events simultaneously.
+	 * So we check host DS instead.
+	 *
+	 * If PEBS interrupt threshold on host is not exceeded in a NMI,
+	 * the guest-dedicated counter must be overflowed.
+	 */
+	if (!cpuc->intel_ctrl_guest_dedicated_mask || !in_nmi() ||
+	    (ds->pebs_interrupt_threshold <= ds->pebs_index))
+		return 0;
+
+	for_each_set_bit(bit,
+		(unsigned long *)&cpuc->intel_ctrl_guest_dedicated_mask,
+			INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+
+		event = cpuc->events[bit];
+		if (!event->attr.precise_ip)
+			continue;
+
+		perf_sample_data_init(&data, 0, event->hw.last_period);
+		if (perf_event_overflow(event, &data, iregs))
+			x86_pmu_stop(event, 0);
+
+		/* Inject one fake event is enough. */
+		return 1;
+	}
+
+	return 0;
+}
+
 static void __intel_pmu_pebs_event(struct perf_event *event,
 				   struct pt_regs *iregs,
 				   void *base, void *top,
@@ -1954,6 +2010,9 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs)
 	if (!x86_pmu.pebs_active)
 		return;
 
+	if (intel_pmu_handle_guest_pebs(cpuc, iregs, ds))
+		return;
+
 	base = (struct pebs_basic *)(unsigned long)ds->pebs_buffer_base;
 	top = (struct pebs_basic *)(unsigned long)ds->pebs_index;
 
-- 
1.8.3.1


  parent reply	other threads:[~2020-03-05  9:59 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-05 17:56 [PATCH v1 00/11] PEBS virtualization enabling via DS Luwei Kang
2020-03-05 16:51 ` Paolo Bonzini
2020-03-05 17:56 ` [PATCH v1 01/11] perf/x86/core: Support KVM to assign a dedicated counter for guest PEBS Luwei Kang
2020-03-06 13:53   ` Peter Zijlstra
2020-03-06 14:42     ` Liang, Kan
2020-03-09 10:04       ` Peter Zijlstra
2020-03-09 13:12         ` Liang, Kan
2020-03-09 15:05           ` Peter Zijlstra
2020-03-09 19:28             ` Liang, Kan
2020-03-12 10:28               ` Kang, Luwei
2020-03-26 14:03               ` Liang, Kan
2020-04-07 12:34                 ` Kang, Luwei
2020-06-12  5:28             ` Kang, Luwei
2020-06-19  9:30               ` Kang, Luwei
2020-08-20  3:32               ` Like Xu
2020-03-09 15:44         ` Andi Kleen
2020-03-05 17:56 ` Luwei Kang [this message]
2020-03-05 17:56 ` [PATCH v1 03/11] perf/x86: Expose a function to disable auto-reload Luwei Kang
2020-03-05 17:56 ` [PATCH v1 04/11] KVM: x86/pmu: Decouple event enablement from event creation Luwei Kang
2020-03-05 17:56 ` [PATCH v1 05/11] KVM: x86/pmu: Add support to reprogram PEBS event for guest counters Luwei Kang
2020-03-06 16:28   ` kbuild test robot
2020-03-09  0:58     ` Xu, Like
2020-03-05 17:57 ` [PATCH v1 06/11] KVM: x86/pmu: Implement is_pebs_via_ds_supported pmu ops Luwei Kang
2020-03-05 17:57 ` [PATCH v1 07/11] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64 Luwei Kang
2020-03-05 17:57 ` [PATCH v1 08/11] KVM: x86/pmu: PEBS MSRs emulation Luwei Kang
2020-03-05 17:57 ` [PATCH v1 09/11] KVM: x86/pmu: Expose PEBS feature to guest Luwei Kang
2020-03-05 17:57 ` [PATCH v1 10/11] KVM: x86/pmu: Introduce the mask value for fixed counter Luwei Kang
2020-03-05 17:57 ` [PATCH v1 11/11] KVM: x86/pmu: Adaptive PEBS virtualization enabling Luwei Kang
2020-03-05 22:48 ` [PATCH v1 00/11] PEBS virtualization enabling via DS Andi Kleen
2020-03-06  5:37   ` Kang, Luwei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1583431025-19802-3-git-send-email-luwei.kang@intel.com \
    --to=luwei.kang@intel.com \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=jolsa@redhat.com \
    --cc=joro@8bytes.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sean.j.christopherson@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).